cluster_sizes_distribution

er_evaluation.cluster_sizes_distribution(membership)

Compute the cluster size distribution, i.e. the number of clusters of each size.

Parameters:

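membership (Series) – Membership vector representation of a clustering: a pandas Series whose index holds the element identifiers and whose values hold the cluster identifier assigned to each element.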

Returns:

pandas Series indexed by the distinct cluster sizes, with values giving the number of clusters of each size.

Return type:

Series

Examples

>>> import pandas as pd
>>> from er_evaluation import cluster_sizes_distribution
>>> membership = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> cluster_sizes_distribution(membership)
1    1
2    2
3    1
dtype: int64
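
For intuition, the same distribution can be reproduced with plain pandas by counting twice: first the number of elements per cluster, then the number of clusters per size. The following is a minimal sketch under that assumption, not necessarily how the library implements it; the name cluster_sizes_distribution_sketch is hypothetical and not part of er_evaluation.

```python
import pandas as pd


def cluster_sizes_distribution_sketch(membership: pd.Series) -> pd.Series:
    """Hypothetical re-implementation for illustration only."""
    # First count: how many elements fall in each cluster (cluster id -> size).
    cluster_sizes = membership.value_counts()
    # Second count: how many clusters have each size (size -> number of clusters),
    # sorted by cluster size.
    return cluster_sizes.value_counts().sort_index()


membership = pd.Series(index=[1, 2, 3, 4, 5, 6, 7, 8], data=[1, 1, 2, 3, 2, 4, 4, 4])
result = cluster_sizes_distribution_sketch(membership)
```

Here clusters 1 and 2 each contain two elements, cluster 3 contains one, and cluster 4 contains three, so result maps size 1 to one cluster, size 2 to two clusters, and size 3 to one cluster. Depending on the pandas version, the returned Series may carry a name or index name (such as "count") that the library's output does not.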