All Functions

Functions

`compress_memberships`(*memberships)	Compress membership vectors to int values, preserving index compatibility.
`clusters_to_graph`(clusters)	Transform clusters dictionary into Graph.
`clusters_to_membership`(clusters)	Transform clusters dictionary into membership vector.
`clusters_to_pairs`(clusters)	Transform clusters dictionary into pairs list.
`graph_to_clusters`(graph)	Transform Graph into clusters dictionary.
`graph_to_membership`(graph)	Transform Graph into membership vector.
`graph_to_pairs`(graph)	Transform Graph into pairs list.
`isclusters`(obj)	Check if given object is a clusters dictionary.
`isgraph`(obj)	Check if given object is an iGraph `Graph`.
`ismembership`(obj)	Check if given object is a membership vector.
`ispairs`(obj)	Check if given object is a pairs list.
`membership_to_clusters`(membership)	Transform membership vector into clusters dictionary.
`membership_to_graph`(membership)	Transform membership vector into Graph.
`membership_to_pairs`(membership)	Transform membership vector into pairs list.
`pairs_to_clusters`(pairs, indices)	Transform pairs list into clusters dictionary.
`pairs_to_graph`(pairs, indices)	Transform pairs list into Graph.
`pairs_to_membership`(pairs, indices)	Transform pairs list into membership vector.
`load_pv_data`()	Load PatentsView dataset.
`load_pv_disambiguations`()	Load reference disambiguation and predicted disambiguations for the PatentsView dataset.
`load_rldata500`()	Load RLdata500 dataset.
`load_rldata500_disambiguations`()	Load reference and predicted disambiguations for the RLdata500 dataset.
`load_rldata10000`()	Load RLdata10000 dataset.
`load_rldata10000_disambiguations`()	Load reference and predicted disambiguations for the RLdata10000 dataset.
`count_extra`(prediction, sample)	Count the number of extraneous elements to sampled clusters.
`count_missing`(prediction, sample)	Count the number of missing elements to sampled clusters.
`error_indicator`(prediction, sample)	Error indicator metric.
`error_metrics`(prediction, sample)	Compute canonical set of error metrics from record error table.
`expected_extra`(prediction, sample)	Expected number of extraneous elements to records in sampled clusters.
`expected_missing`(prediction, sample)	Expected number of missing elements to records in sampled clusters.
`expected_relative_extra`(prediction, sample)	Expected relative number of extraneous elements to records in sampled clusters.
`expected_relative_missing`(prediction, sample)	Expected relative number of missing elements to records in sampled clusters.
`expected_size_difference`(prediction, sample)	Expected size difference between predicted and sampled clusters.
`splitting_entropy`(prediction, sample[, alpha])	Splitting entropy of true clusters.
`cluster_sizes_from_table`(error_table)	Compute cluster sizes from record error table.
`error_indicator_from_table`(error_table)	Compute error indicator from record error table.
`error_metrics_from_table`(error_table)	Compute canonical set of error metrics from record error table.
`expected_extra_from_table`(error_table)	Compute expected extra elements from record error table.
`expected_missing_from_table`(error_table)	Compute expected missing elements from record error table.
`expected_relative_extra_from_table`(error_table)	Compute expected relative extra elements from record error table.
`expected_relative_missing_from_table`(error_table)	Compute expected relative missing elements from record error table.
`expected_size_difference_from_table`(error_table)	Compute expected size difference from record error table.
`fit_dt_regressor`(X, y[, numerical_features, ...])	Fits a decision tree regressor model with optional preprocessing for numerical and categorical features.
`pred_cluster_sizes_from_table`(error_table)	Compute predicted cluster sizes from record error table.
`record_error_table`(prediction, sample)	Compute record error table.
`b_cubed_precision_estimator`(prediction, ...)	B-cubed precision design estimator.
`b_cubed_recall_estimator`(prediction, sample, ...)	B-cubed recall design estimator.
`cluster_f_estimator`(prediction, sample, weights)	Cluster F-score design estimator.
`cluster_precision_estimator`(prediction, ...)	Cluster precision design estimator.
`cluster_recall_estimator`(prediction, sample, ...)	Cluster recall design estimator.
`estimates_table`(predictions, samples_weights)	Create table of estimates applied to all combinations of predictions and (sample, weights) pairs.
`pairwise_f_estimator`(prediction, sample, weights)	Design estimator for pairwise F-score.
`pairwise_precision_estimator`(prediction, ...)	Design estimator for pairwise precision.
`pairwise_recall_estimator`(prediction, ...)	Design estimator for pairwise recall.
`avg_cluster_size_estimator`(sample, weights)	Compute the average cluster size estimator for the given sample, weights, prediction, and names.
`homonymy_rate_estimator`(sample, weights[, ...])	Compute the homonymy rate estimator for the given sample, weights, prediction, and names.
`matching_rate_estimator`(sample, weights[, ...])	Compute the matching rate estimator for the given sample, weights, prediction, and names.
`name_variation_estimator`(sample, weights[, ...])	Compute the name variation estimator for the given sample, weights, prediction, and names.
`summary_estimates_table`(sample, weights, ...)	Generate a summary estimates table for the given sample, weights, predictions, and names.
`adjusted_rand_score`(prediction, reference)	Compute the adjusted Rand index.
`b_cubed_f`(prediction, reference[, beta])	B-cubed F score for the inner join of two clusterings.
`b_cubed_precision`(prediction, reference)	B-cubed precision for the inner join of two clusterings, with equal weight placed on each ground truth cluster.
`b_cubed_recall`(prediction, reference)	B-cubed recall for the inner join of two clusterings, with equal weight placed on each ground truth cluster.
`cluster_completeness`(prediction, reference)	Cluster completeness score (based on conditional entropy).
`cluster_f`(prediction, reference[, beta])	Cluster F score for the inner join of two clusterings.
`cluster_homogeneity`(prediction, reference)	Cluster homogeneity score (based on conditional entropy).
`cluster_precision`(prediction, reference)	Cluster precision for the inner join of two clusterings.
`cluster_recall`(prediction, reference)	Cluster recall for the inner join of two clusterings.
`cluster_v_measure`(prediction, reference[, beta])	Compute the V-measure.
`metrics_table`(predictions, references[, metrics])	Apply a set of metrics to all combinations of prediction and reference membership vectors.
`pairwise_f`(prediction, reference[, beta])	Pairwise F score for the inner join of two clusterings.
`pairwise_precision`(prediction, reference)	Pairwise precision for the inner join of two clusterings.
`pairwise_recall`(prediction, reference)	Pairwise recall for the inner join of two clusterings.
`rand_score`(prediction, reference)	Compute the Rand index.
`add_ests_to_summaries`(fig, predictions, ...)
`compare_plots`(*figs[, names, marker, ...])	Combine multiple figures into one.
`plot_cluster_errors`(prediction, reference[, ...])	Scatter plot of two cluster-wise error metrics.
`plot_cluster_sizes_distribution`(membership)	Plot the cluster size distribution.
`plot_comparison`(predictions[, metrics, ...])	Plot metrics computed for all prediction pairs.
`make_dt_regressor_plot`(error_metrics, ...[, ...])	Fit a decision tree regressor to the data and create an interactive sunburst chart visualization of the resulting tree.
`plot_dt_regressor_sunburst`(dt_regressor, X, ...)	Creates a sunburst plot of a decision tree regressor.
`plot_dt_regressor_tree`(dt_regressor, ...)	Creates a tree plot of a decision tree regressor.
`plot_dt_regressor_treemap`(dt_regressor, X, ...)	Creates a treemap plot of a decision tree regressor.
`plot_entropy_curve`(membership[, q_range, ...])	Plot the Hill number entropy curve.
`plot_estimates`(predictions, sample_weights)	Plot representative performance estimates.
`plot_metrics`(predictions, reference[, ...])	Plot performance metrics.
`plot_performance_disparities`(prediction, ...)	Plot largest performance disparities among predefined subgroups.
`plot_summaries`(predictions[, names, type, ...])	Plot summary statistics.
`expand_grid`(**kwargs)	Create DataFrame from all combinations of elements.
`load_module_parquet`(module, filename)	Load parquet file from a submodule using pyarrow engine.
`load_module_tsv`(module, filename[, dtype])	Load tsv file from a submodule.
`relevant_prediction_subset`(prediction, sample)	Return predicted clusters which intersect sampled clusters.
`sample_clusters`(membership[, weights, ...])	Sample clusters from a membership vector.
`average_cluster_size`(membership)	Compute the average cluster size.
`cluster_hill_number`(membership[, alpha])	Compute Hill number of a given order.
`cluster_sizes`(membership)	Compute the size of each cluster.
`cluster_sizes_distribution`(membership)	Compute the cluster size distribution.
`homonymy_rate`(membership, names)	Compute the homonymy rate of a given clustering with a set of associated names.
`matching_rate`(membership)	Compute the matching rate for a given clustering.
`name_variation_rate`(membership, names)	Compute the name variation rate of a given clustering with a set of associated names.
`number_of_clusters`(membership)	Number of clusters in a given clustering.
`number_of_links`(membership)	Number of pairwise links associated with a given clustering.
`summary_statistics`(membership[, names])	Compute canonical set of summary statistics.
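
Many of the functions above take or return one of three clustering representations named in their descriptions: a membership vector, a clusters dictionary, and a pairs list. The sketch below is a plain-Python illustration of how those representations relate, and of how pairwise precision and recall over the "inner join" of two clusterings might be computed. It does not call the library. The helper names `to_clusters`, `to_pairs`, and `pairwise_scores` are hypothetical, the use of plain dicts in place of the library's data structures is an assumption, and returning 1.0 when a clustering contains no pairs is a convention chosen only for this example.

```python
from collections import defaultdict
from itertools import combinations


def to_clusters(membership):
    """Membership mapping {record: label} -> clusters dict {label: [records]}."""
    clusters = defaultdict(list)
    for record, label in membership.items():
        clusters[label].append(record)
    return dict(clusters)


def to_pairs(clusters):
    """Clusters dict -> set of unordered record pairs that share a cluster."""
    pairs = set()
    for records in clusters.values():
        pairs.update(combinations(sorted(records), 2))
    return pairs


def pairwise_scores(prediction, reference):
    """Pairwise (precision, recall) restricted to records in both clusterings."""
    common = prediction.keys() & reference.keys()  # the "inner join" of records
    pred_pairs = to_pairs(to_clusters({r: prediction[r] for r in common}))
    ref_pairs = to_pairs(to_clusters({r: reference[r] for r in common}))
    true_links = pred_pairs & ref_pairs
    # Assumption for this sketch: score 1.0 when there are no pairs to evaluate.
    precision = len(true_links) / len(pred_pairs) if pred_pairs else 1.0
    recall = len(true_links) / len(ref_pairs) if ref_pairs else 1.0
    return precision, recall


# Toy data (made up for illustration): r5 and r6 each appear in only one clustering.
prediction = {"r1": "A", "r2": "A", "r3": "B", "r4": "B", "r5": "C"}
reference = {"r1": 1, "r2": 1, "r3": 1, "r4": 2, "r6": 3}

precision, recall = pairwise_scores(prediction, reference)
# precision = 1/2 (one of two predicted links is correct)
# recall = 1/3 (one of three reference links is recovered)
```

The same three steps (restrict to shared records, enumerate links, compare link sets) are enough to check small cases by hand before relying on the library's own metric functions.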

Classes

`MembershipVector`([data, dropna])	Series wrapper to validate membership vector format and log potential issues.

Class Inheritance Diagram