er_evaluation.estimators

Estimators Based on Ground Truth Clusters

The estimators module provides a set of functions to estimate performance metrics such as pairwise precision and recall, cluster precision and recall, F-scores, and B-cubed metrics, as well as summary statistics such as the matching rate, homonymy rate, and name variation rate. The functions take as input a predicted disambiguation, a set of ground truth clusters, and a set of cluster sampling weights (e.g., inverse probability weights for each cluster). They return an estimate of the performance metric or summary statistic, along with an estimate of the standard deviation of the estimate.
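The "estimate plus standard deviation" return pattern described above can be illustrated with a generic weighted ratio estimator. This is a minimal sketch, not the module's implementation: the function name `ratio_estimate`, the plain-list inputs, and the linearization-based variance formula (which assumes clusters are sampled with replacement) are all assumptions made for illustration.

```python
from math import sqrt


def ratio_estimate(numerators, denominators, weights):
    """Weighted ratio estimate with a linearization-based standard deviation.

    Hypothetical helper for illustration only. Each position i corresponds
    to one sampled ground truth cluster:
      numerators[i]   -- per-cluster count in the metric's numerator
      denominators[i] -- per-cluster count in the metric's denominator
      weights[i]      -- sampling weight of the cluster
    """
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two sampled clusters")

    num_total = sum(w * a for w, a in zip(weights, numerators))
    den_total = sum(w * b for w, b in zip(weights, denominators))
    if den_total == 0:
        raise ValueError("weighted denominator is zero")

    estimate = num_total / den_total

    # Linearized residuals of the ratio; they sum to zero by construction.
    residuals = [
        w * (a - estimate * b) / den_total
        for w, a, b in zip(weights, numerators, denominators)
    ]
    # With-replacement variance approximation (an assumption of this sketch).
    variance = n / (n - 1) * sum(z * z for z in residuals)
    return estimate, sqrt(variance)


estimate, std = ratio_estimate([1, 1], [3, 1], [1.0, 2.0])
```

Returning the pair `(estimate, std)` lets a caller report the metric together with a rough measure of its sampling uncertainty.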

Representative performance estimators are necessary for accurate evaluation of entity resolution algorithms for the following reasons:

It is typically infeasible to manually label enough data to cover an entire population of interest.
Naively computing performance metrics on benchmark datasets leads to highly biased and over-optimistic results that are not representative of real-world performance. This is due to the non-linear scaling of entity resolution: while it might be easy to disambiguate a small benchmark dataset, the number of record pairs that could be wrongly linked grows quadratically with the dataset size.
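To make the quadratic growth concrete, a dataset of n records contains n(n-1)/2 distinct record pairs. The short sketch below computes this count for a benchmark-sized and a population-sized dataset; the helper name `num_record_pairs` is illustrative only.

```python
from math import comb


def num_record_pairs(n_records):
    """Number of distinct unordered record pairs among n_records records."""
    return comb(n_records, 2)


benchmark_pairs = num_record_pairs(1_000)        # 499500
population_pairs = num_record_pairs(1_000_000)   # 499999500000
```

A thousandfold increase in records yields roughly a millionfold increase in pairs, which is why error rates measured on a small benchmark do not carry over to the full population.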

To address these issues, we use performance estimators based on a weighted sample of ground truth clusters. This provides an efficient and accurate way of evaluating the performance of entity resolution algorithms on large datasets, while also taking into account sampling processes and biases.
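As a rough illustration of how a weighted sample of ground truth clusters can yield a performance estimate, the sketch below computes a weighted ratio estimate of pairwise recall. It is a simplified example under stated assumptions, not the module's implementation: the function name `pairwise_recall_sketch`, the dictionary-based inputs, and the absence of any bias adjustment or standard deviation are choices made here for brevity.

```python
from collections import Counter
from math import comb


def pairwise_recall_sketch(prediction, sample, weights):
    """Weighted ratio estimate of pairwise recall (illustrative only).

    prediction -- dict: record id -> predicted cluster id (whole population)
    sample     -- dict: record id -> ground truth cluster id (sampled clusters)
    weights    -- dict: ground truth cluster id -> sampling weight
    """
    # Group the sampled records by ground truth cluster.
    clusters = {}
    for record, true_id in sample.items():
        clusters.setdefault(true_id, []).append(record)

    numerator = 0.0
    denominator = 0.0
    for true_id, records in clusters.items():
        # All pairs in the ground truth cluster are true matches.
        true_pairs = comb(len(records), 2)
        # True pairs that the prediction also places in the same cluster.
        predicted_sizes = Counter(prediction[r] for r in records)
        linked_pairs = sum(comb(k, 2) for k in predicted_sizes.values())
        numerator += weights[true_id] * linked_pairs
        denominator += weights[true_id] * true_pairs

    if denominator == 0:
        raise ValueError("sample contains no record pairs")
    return numerator / denominator


prediction = {1: "a", 2: "a", 3: "b", 4: "c", 5: "c", 6: "d"}
sample = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y"}
weights = {"x": 1.0, "y": 2.0}
recall = pairwise_recall_sketch(prediction, sample, weights)  # 0.6
```

In this toy input, cluster "x" contributes 1 recovered pair out of 3 with weight 1, and cluster "y" contributes 1 out of 1 with weight 2, giving (1 + 2) / (3 + 2) = 0.6.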

Note: In order to obtain representative performance estimators, the set of predicted clusters given as an argument to estimator functions should cover the entire population of interest. Typically, this set of predicted clusters will be much larger than the set of sampled clusters.
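The following toy example suggests why the prediction should cover the whole population. Records that are wrongly merged with a sampled cluster are only visible when the prediction includes records outside the sample. The helper `records_linked_outside` and the data are hypothetical and are not part of the module.

```python
def records_linked_outside(prediction, true_cluster):
    """Count records wrongly merged with a ground truth cluster.

    prediction   -- dict: record id -> predicted cluster id
    true_cluster -- iterable of record ids forming one ground truth cluster
    """
    true_cluster = set(true_cluster)
    # Predicted clusters that contain at least one record of the true cluster.
    touched = {prediction[r] for r in true_cluster if r in prediction}
    return sum(
        1
        for record, cluster_id in prediction.items()
        if cluster_id in touched and record not in true_cluster
    )


# Prediction over the whole population: r3 is wrongly merged with r1 and r2.
full_prediction = {"r1": "A", "r2": "A", "r3": "A", "r4": "B", "r5": "B"}
sampled_cluster = ["r1", "r2"]

# Prediction restricted to the sampled records only.
restricted_prediction = {
    r: c for r, c in full_prediction.items() if r in sampled_cluster
}

errors_full = records_linked_outside(full_prediction, sampled_cluster)              # 1
errors_restricted = records_linked_outside(restricted_prediction, sampled_cluster)  # 0
```

With the restricted prediction the false link to `r3` disappears, so any precision-type metric computed from it would look better than it really is.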

Functions

`b_cubed_precision_estimator`(prediction, ...)	B-cubed precision design estimator.
`b_cubed_recall_estimator`(prediction, sample, ...)	B-cubed recall design estimator.
`cluster_f_estimator`(prediction, sample, weights)	Cluster F-score design estimator.
`cluster_precision_estimator`(prediction, ...)	Cluster precision design estimator.
`cluster_recall_estimator`(prediction, sample, ...)	Cluster recall design estimator.
`estimates_table`(predictions, samples_weights)	Create table of estimates applied to all combinations of predictions and (sample, weights) pairs.
`pairwise_f_estimator`(prediction, sample, weights)	Design estimator for pairwise F-score.
`pairwise_precision_estimator`(prediction, ...)	Design estimator for pairwise precision.
`pairwise_recall_estimator`(prediction, ...)	Design estimator for pairwise recall.
`avg_cluster_size_estimator`(sample, weights)	Compute the average cluster size estimator for the given sample and weights.
`homonymy_rate_estimator`(sample, weights[, ...])	Compute the homonymy rate estimator for the given sample, weights, prediction, and names.
`matching_rate_estimator`(sample, weights[, ...])	Compute the matching rate estimator for the given sample, weights, prediction, and names.
`name_variation_estimator`(sample, weights[, ...])	Compute the name variation estimator for the given sample, weights, prediction, and names.
`summary_estimates_table`(sample, weights, ...)	Generate a summary estimates table for the given sample, weights, predictions, and names.