cluster_sizes_from_table

er_evaluation.error_analysis.cluster_sizes_from_table(error_table)

Compute the size of each reference cluster from a record error table.

Parameters:

error_table (DataFrame) – Record error table, such as the one returned by record_error_table.

Returns:

Size of each reference cluster, indexed by reference cluster label.

Return type:

Series

Examples

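>>> import pandas as pd
>>> from er_evaluation.error_analysis import record_error_table, cluster_sizes_from_table
>>> prediction = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])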
>>> sample = pd.Series(index=[1,2,3,4,5,6,7], data=["c1", "c1", "c1", "c2", "c2", "c3", "c3"])
>>> error_table = record_error_table(prediction, sample)
>>> cluster_sizes_from_table(error_table)
reference
c1    3.0
c2    2.0
c3    2.0
Name: ref_cluster_size, dtype: float64
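
The sizes in the example above can be cross-checked without the library by counting records per reference cluster directly from the sample. This is a minimal sketch that assumes only pandas. It is not the library's implementation, and the variable name expected_sizes is illustrative.

```python
import pandas as pd

# Same reference sample as in the example above: record id -> reference cluster.
sample = pd.Series(
    index=[1, 2, 3, 4, 5, 6, 7],
    data=["c1", "c1", "c1", "c2", "c2", "c3", "c3"],
)

# Count how many records fall in each reference cluster.
expected_sizes = sample.groupby(sample).size()
print(expected_sizes)
```

The counts should agree with the output of cluster_sizes_from_table shown above (3, 2 and 2), although this sketch yields integers rather than floats.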