record_error_table

er_evaluation.record_error_table(prediction, sample)

Compute the record error table, which compares each sampled record's predicted cluster with its reference cluster.

Parameters:
  • prediction (Series) – Membership vector representation of the predicted clustering.

  • sample (Series) – Membership vector representation of the sampled reference clustering.

Returns:

Record error table.

Return type:

DataFrame

Examples

>>> prediction = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> sample = pd.Series(index=[1,2,3,4,5,6,7], data=["c1", "c1", "c1", "c2", "c2", "c3", "c3"])
>>> record_error_table(prediction, sample)  
       prediction  reference  pred_cluster_size  ref_cluster_size  extra  missing
index
1               1         c1                  2               3.0    0.0      1.0
2               1         c1                  2               3.0    0.0      1.0
3               2         c1                  2               3.0    1.0      2.0
4               3         c2                  1               2.0    0.0      1.0
5               2         c2                  2               2.0    1.0      1.0
6               4         c3                  3               2.0    1.0      0.0
7               4         c3                  3               2.0    1.0      0.0

Notes

sample is restricted to the indices that are also present in prediction.
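
The extra and missing columns in the example are consistent with simple set differences between a record's predicted cluster and its reference cluster. The following is a minimal sketch under that assumption, not the library's implementation: it uses plain dicts instead of pandas Series, and the helper name sketch_record_errors is hypothetical.

```python
from collections import defaultdict


def sketch_record_errors(prediction, sample):
    """Hypothetical helper: per-record error counts from {record: cluster} dicts."""
    # Restrict the sample to records that also appear in the prediction.
    sample = {rec: label for rec, label in sample.items() if rec in prediction}

    # Invert each membership mapping into cluster label -> set of records.
    pred_clusters = defaultdict(set)
    for rec, label in prediction.items():
        pred_clusters[label].add(rec)
    ref_clusters = defaultdict(set)
    for rec, label in sample.items():
        ref_clusters[label].add(rec)

    rows = {}
    for rec, ref_label in sample.items():
        pred_set = pred_clusters[prediction[rec]]
        ref_set = ref_clusters[ref_label]
        rows[rec] = {
            "pred_cluster_size": len(pred_set),
            "ref_cluster_size": len(ref_set),
            # Assumed meaning: records predicted together with rec but outside its reference cluster.
            "extra": len(pred_set - ref_set),
            # Assumed meaning: records in rec's reference cluster that the prediction left out.
            "missing": len(ref_set - pred_set),
        }
    return rows


# Same data as the example above, written as dicts.
prediction = dict(zip(range(1, 9), [1, 1, 2, 3, 2, 4, 4, 4]))
sample = dict(zip(range(1, 8), ["c1", "c1", "c1", "c2", "c2", "c3", "c3"]))

rows = sketch_record_errors(prediction, sample)
for rec, row in rows.items():
    print(rec, row)
```

Record 8 appears only in prediction, so it gets no row of its own, but it still counts toward the predicted cluster size and the extra count of records 6 and 7.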