All Functions
Functions
|
Compress membership vectors to int values, preserving index compatibility. |
|
Transform clusters dictionary into Graph. |
|
Transform clusters dictionary into membership vector. |
|
Transform clusters dictionary into pairs list. |
|
Transform Graph into clusters dictionary. |
|
Transform Graph into membership vector. |
|
Transform Graph into pairs list. |
|
Check if given object is a clusters dictionary. |
|
Check if given object is an iGraph Graph. |
|
Check if given object is a membership vector. |
|
Check if given object is a pairs list. |
|
Transform membership vector into clusters dictionary. |
|
Transform membership vector into Graph. |
|
Transform membership vector into pairs list. |
|
Transform pairs list into clusters dictionary. |
|
Transform pairs list into Graph. |
|
Transform pairs list into membership vector. |
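The representations named above can be illustrated with a small pandas sketch. It is not the library's implementation: the helper names `to_clusters`, `to_pairs`, and `compress` are hypothetical, and it assumes a membership vector is a pandas Series whose index holds record IDs and whose values hold cluster labels.

```python
import itertools
import pandas as pd

# Assumed format: index = record IDs, values = cluster labels.
membership = pd.Series(["a", "a", "b", "c", "c", "c"], index=[10, 11, 12, 13, 14, 15])

def to_clusters(membership):
    """Clusters dictionary: cluster label -> list of record IDs (hypothetical helper)."""
    return {label: list(group.index) for label, group in membership.groupby(membership)}

def to_pairs(membership):
    """Pairs list: every unordered pair of records sharing a cluster (hypothetical helper)."""
    pairs = []
    for members in to_clusters(membership).values():
        pairs.extend(itertools.combinations(sorted(members), 2))
    return pairs

def compress(membership):
    """Replace cluster labels with integer codes, keeping the index unchanged (hypothetical helper)."""
    codes, _ = pd.factorize(membership)
    return pd.Series(codes, index=membership.index)

clusters = to_clusters(membership)
pairs = to_pairs(membership)
compressed = compress(membership)
```

In this sketch the singleton cluster `"b"` contributes no pairs, so a pairs list alone cannot recover records that are linked to nothing.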
Load PatentsView dataset. |
|
Load reference disambiguation and predicted disambiguations for the PatentsView dataset. |
|
Load RLdata500 dataset. |
|
Load reference and predicted disambiguations for the RLdata500 dataset. |
|
Load RLdata10000 dataset. |
|
Load reference and predicted disambiguations for the RLdata10000 dataset. |
|
Count the number of extraneous elements to sampled clusters. |
|
Count the number of missing elements to sampled clusters. |
|
Error indicator metric. |
|
Compute canonical set of error metrics from record error table. |
|
Expected number of extraneous elements to records in sampled clusters. |
|
Expected number of missing elements to records in sampled clusters. |
|
Expected relative number of extraneous elements to records in sampled clusters. |
|
Expected relative number of missing elements to records in sampled clusters. |
|
Expected size difference between predicted and sampled clusters. |
|
Splitting entropy of true clusters. |
|
Compute cluster sizes from record error table. |
|
Compute error indicator from record error table. |
|
Compute canonical set of error metrics from record error table. |
|
Compute expected extra elements from record error table. |
|
Compute expected missing elements from record error table. |
|
Compute expected relative extra elements from record error table. |
|
Compute expected relative missing elements from record error table. |
|
Compute expected size difference from record error table. |
|
Fit a decision tree regressor model with optional preprocessing for numerical and categorical features. |
|
Compute predicted cluster sizes from record error table. |
|
Compute record error table. |
|
B-cubed precision design estimator. |
|
B-cubed recall design estimator. |
|
Cluster F-score design estimator. |
|
Cluster precision design estimator. |
|
Cluster recall design estimator. |
|
Create table of estimates applied to all combinations of predictions and (sample, weights) pairs. |
|
Design estimator for pairwise F-score. |
|
Design estimator for pairwise precision. |
|
Design estimator for pairwise recall. |
|
Compute the average cluster size estimator for the given sample, weights, prediction, and names. |
|
Compute the homonymy rate estimator for the given sample, weights, prediction, and names. |
|
Compute the matching rate estimator for the given sample, weights, prediction, and names. |
|
Compute the name variation estimator for the given sample, weights, prediction, and names. |
|
Generate a summary estimates table for the given sample, weights, predictions, and names. |
|
Compute the adjusted Rand index. |
|
B-cubed F score for the inner join of two clusterings. |
|
B-cubed precision for the inner join of two clusterings, with equal weight placed on each ground truth cluster. |
|
B-cubed recall for the inner join of two clusterings, with equal weight placed on each ground truth cluster. |
|
Cluster completeness score (based on conditional entropy). |
|
Cluster F score for the inner join of two clusterings. |
|
Cluster homogeneity score (based on conditional entropy). |
|
Cluster precision for the inner join of two clusterings. |
|
Cluster recall for the inner join of two clusterings. |
|
Compute the V-measure. |
|
Apply a set of metrics to all combinations of prediction and reference membership vectors. |
|
Pairwise F score for the inner join of two clusterings. |
|
Pairwise precision for the inner join of two clusterings. |
|
Pairwise recall for the inner join of two clusterings. |
|
Compute the Rand index. |
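As an illustration of what "for the inner join of two clusterings" can mean, here is a minimal sketch of pairwise precision and recall. It is not taken from the library: the function names are hypothetical, and it assumes both clusterings are pandas Series indexed by record ID and that the metrics are computed only on records present in both.

```python
import pandas as pd

def n_links(membership):
    """Number of record pairs that share a cluster."""
    sizes = membership.value_counts()
    return int((sizes * (sizes - 1) // 2).sum())

def pairwise_precision_recall(prediction, reference):
    """Pairwise precision and recall on the records common to both clusterings."""
    # Inner join: keep only records present in both membership vectors.
    joined = pd.concat({"prediction": prediction, "reference": reference}, axis=1, join="inner")
    # Two records are linked in both clusterings exactly when they share
    # the same (prediction, reference) label combination.
    both = joined.groupby(["prediction", "reference"]).size()
    true_positives = int((both * (both - 1) // 2).sum())
    predicted_links = n_links(joined["prediction"])
    reference_links = n_links(joined["reference"])
    # Return NaN when a denominator is zero (no links to evaluate).
    precision = true_positives / predicted_links if predicted_links else float("nan")
    recall = true_positives / reference_links if reference_links else float("nan")
    return precision, recall

prediction = pd.Series([1, 1, 1, 1, 2, 3], index=list("abcdef"))
reference = pd.Series([1, 1, 2, 2, 2], index=list("abcde"))
precision, recall = pairwise_precision_recall(prediction, reference)
```

Record `f` appears only in the prediction, so the join drops it before any pairs are counted. Of the 6 predicted links among the remaining records, 2 are also reference links, and the reference has 4 links in total.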
|
Combine multiple figures into one. |
|
Scatter plot of two cluster-wise error metrics. |
|
Plot the cluster size distribution. |
|
Plot metrics computed for all prediction pairs. |
|
Fit a decision tree regressor to the data and create an interactive sunburst chart visualization of the resulting tree. |
|
Create a sunburst plot of a decision tree regressor. |
|
Create a tree plot of a decision tree regressor. |
|
Create a treemap plot of a decision tree regressor. |
|
Plot the Hill number entropy curve. |
|
Plot representative performance estimates. |
|
Plot performance metrics. |
|
Plot largest performance disparities among predefined subgroups. |
|
Plot summary statistics. |
|
Create DataFrame from all combinations of elements. |
|
Load parquet file from a submodule using pyarrow engine. |
|
Load tsv file from a submodule. |
|
Return predicted clusters which intersect sampled clusters. |
|
Sample clusters from a membership vector. |
|
Compute the average cluster size. |
|
Compute Hill number of a given order. |
|
Compute the size of each cluster. |
|
Compute the cluster size distribution. |
|
Compute the homonymy rate of a given clustering with a set of associated names. |
|
Compute the matching rate for a given clustering. |
|
Compute the name variation rate of a given clustering with a set of associated names. |
|
Number of clusters in a given clustering. |
|
Number of pairwise links associated with a given clustering. |
|
Compute canonical set of summary statistics. |
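A few of the summary statistics listed above can be sketched in plain pandas and NumPy. The definitions here are assumptions rather than the library's documented behavior: the matching rate is taken to be the share of records in clusters of size two or more, and the Hill number is computed over the relative frequencies of cluster sizes. The function names are local helpers, not the library's API.

```python
import numpy as np
import pandas as pd

def cluster_sizes(membership):
    """Size of each cluster, indexed by cluster label."""
    return membership.value_counts()

def average_cluster_size(membership):
    return float(cluster_sizes(membership).mean())

def matching_rate(membership):
    """Assumed definition: proportion of records in clusters of size >= 2."""
    sizes = cluster_sizes(membership)
    return float(sizes[sizes > 1].sum() / sizes.sum())

def hill_number(membership, q=1.0):
    """Hill number of order q of the cluster size distribution (assumed definition)."""
    # p[i] = proportion of clusters having a given size.
    p = cluster_sizes(membership).value_counts(normalize=True).to_numpy()
    if np.isclose(q, 1.0):
        # Order 1 is the limit case: exponential of the Shannon entropy.
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

membership = pd.Series(["a", "a", "b", "c", "c", "c"])
avg_size = average_cluster_size(membership)
rate = matching_rate(membership)
hill_0 = hill_number(membership, q=0)
hill_1 = hill_number(membership, q=1)
```

In this example the three clusters have three distinct sizes, so the size distribution is uniform and the Hill number equals 3 at every order.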
Classes
|
Series wrapper to validate membership vector format and log potential issues. |