error_metrics
- er_evaluation.error_metrics(prediction, sample)
Compute the canonical set of error metrics from the record error table.
Error metrics included:
- Expected extra links (see er_evaluation.error_analysis.expected_extra())
- Expected relative extra links (see er_evaluation.error_analysis.expected_relative_extra())
- Expected missing elements (see er_evaluation.error_analysis.expected_missing())
- Expected relative missing elements (see er_evaluation.error_analysis.expected_relative_missing())
- Error indicator (see er_evaluation.error_analysis.error_indicator())
- Parameters:
prediction (Series) – Membership vector representing a predicted disambiguation.
sample (Series) – Membership vector representing a set of true clusters.
- Returns:
DataFrame indexed by cluster identifier, with one column per error metric.
- Return type:
DataFrame
- Examples
>>> prediction = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> sample = pd.Series(index=[1,2,3,4,5,6,7,8], data=["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c3"])
>>> error_metrics(prediction, sample)
           expected_extra  expected_relative_extra  expected_missing  expected_relative_missing  error_indicator
reference
c1               0.333333                 0.166667          1.333333                   0.444444                1
c2               0.500000                 0.250000          1.000000                   0.500000                1
c3               1.000000                 0.333333          0.000000                   0.000000                0
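To make the missing-element columns of the example more concrete, here is a minimal sketch written with plain pandas. It is not the library's implementation. It assumes that a record's "missing" count is the number of records in its true cluster that are absent from its predicted cluster, and that each metric is the average of the per-record values over the true cluster. The function name `missing_metrics_sketch` and the helper column `n` are hypothetical.

```python
import pandas as pd


def missing_metrics_sketch(prediction: pd.Series, sample: pd.Series) -> pd.DataFrame:
    """Illustrative only: per-cluster averages of missing-element counts."""
    # Keep only sampled records that also appear in the prediction.
    sample = sample[sample.index.isin(prediction.index)]
    pred = prediction.loc[sample.index]

    # One row per sampled record; "n" is a helper column used for counting.
    df = pd.DataFrame({"prediction": pred, "reference": sample, "n": 1})

    # Size of the record's true cluster.
    ref_size = df.groupby("reference")["n"].transform("sum")
    # Number of true-cluster members that share the record's predicted cluster.
    overlap = df.groupby(["reference", "prediction"])["n"].transform("sum")

    # Assumed definition: true-cluster members not found in the predicted cluster.
    df["missing"] = ref_size - overlap
    df["relative_missing"] = df["missing"] / ref_size

    # Average the per-record values within each true cluster.
    out = df.groupby("reference")[["missing", "relative_missing"]].mean()
    out.columns = ["expected_missing", "expected_relative_missing"]
    return out


prediction = pd.Series(index=[1, 2, 3, 4, 5, 6, 7, 8], data=[1, 1, 2, 3, 2, 4, 4, 4])
sample = pd.Series(
    index=[1, 2, 3, 4, 5, 6, 7, 8],
    data=["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c3"],
)
print(missing_metrics_sketch(prediction, sample))
```

Under these assumptions, the sketch gives the same expected_missing and expected_relative_missing values as the example output for c1, c2 and c3.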
Notes
The sample is restricted to the set of records which are present in the prediction.
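As a rough illustration of this restriction, the sketch below filters a sample down to the records that appear in the prediction's index. It shows the behaviour described in the note using plain pandas and is an assumption about its effect, not the library's own code. The data and the variable name `restricted` are made up for the example.

```python
import pandas as pd

# Hypothetical data: record 9 is in the sample but was never predicted.
prediction = pd.Series(index=[1, 2, 3], data=[1, 1, 2])
sample = pd.Series(index=[1, 2, 3, 9], data=["c1", "c1", "c2", "c2"])

# Keep only the sampled records whose identifiers appear in the prediction.
restricted = sample[sample.index.isin(prediction.index)]
print(restricted)
```

Here record 9 is dropped, so cluster c2 is evaluated on record 3 alone.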