homonymy_rate

er_evaluation.summary.homonymy_rate(membership, names)

Compute the homonymy rate of a given clustering with a set of associated names.

Homonymy rate:

The homonymy rate is the proportion of clusters that share a name with another cluster, that is, clusters containing at least one element whose name also appears in a different cluster.

Parameters:
  • membership (Series) – Membership vector representation of a clustering.

  • names (Series) – Series indexed by cluster elements, with values giving the name associated with each element. The index of membership must be a subset of the index of names.

Returns:

Homonymy rate, a proportion between 0 and 1.

Return type:

float

Examples

>>> import pandas as pd
>>> from er_evaluation.summary import homonymy_rate
>>> membership = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> names = pd.Series(index=[1,2,3,4,5,6,7,8], data=["n1", "n2", "n3", "n4", "n3", "n1", "n2", "n8"])
>>> homonymy_rate(membership, names)
0.5
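
To make the definition concrete, the following is a minimal sketch of how the homonymy rate could be computed with pandas directly. It is an illustration of the definition, not the library's implementation; the function name homonymy_rate_sketch and the column names "cluster" and "name" are hypothetical, and the sketch assumes that a cluster "shares a name" when at least one of its elements has a name that also occurs in another cluster.

```python
import pandas as pd


def homonymy_rate_sketch(membership: pd.Series, names: pd.Series) -> float:
    """Illustrative re-implementation of the definition (hypothetical helper)."""
    # Keep only the names of elements that are actually clustered.
    df = pd.DataFrame({
        "cluster": membership,
        "name": names.reindex(membership.index),
    })
    # Number of distinct clusters in which each name appears.
    clusters_per_name = df.groupby("name")["cluster"].nunique()
    # Flag elements whose name appears in more than one cluster.
    shared = df["name"].map(clusters_per_name) > 1
    # A cluster counts as a homonym if any of its elements carries a shared name.
    return float(shared.groupby(df["cluster"]).any().mean())
```

Applied to the membership and names series from the example above, clusters 1 and 4 both contain "n1" and "n2" while clusters 2 and 3 have names found nowhere else, so two of the four clusters share a name and the sketch returns 0.5.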