pairwise_precision

er_evaluation.pairwise_precision(prediction, reference)

Pairwise precision for the inner join of two clusterings.

Pairwise precision:

Consider two clusterings of a set of records, referred to as the predicted and reference clusterings. Let \(T\) be the set of record pairs which appear in the same reference cluster, and let \(P\) be the set of record pairs which appear in the same predicted cluster. Pairwise precision is then defined as

\[\text{Precision} = \frac{\lvert T \cap P \rvert}{\lvert P \rvert}\]

This is the proportion of correctly predicted links among all predicted links.
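The definition can be checked by enumerating record pairs directly. The following is a minimal standard-library sketch, not the library's implementation: the helper names `same_cluster_pairs` and `pairwise_precision_sketch` are hypothetical, membership vectors are represented as plain dicts mapping record to cluster label, and returning NaN when there are no predicted pairs is an assumption of this sketch.

```python
from itertools import combinations


def same_cluster_pairs(membership):
    """Return the set of unordered record pairs that share a cluster label."""
    clusters = {}
    for record, label in membership.items():
        clusters.setdefault(label, []).append(record)
    pairs = set()
    for records in clusters.values():
        # frozenset makes each pair unordered, so (a, b) == (b, a).
        pairs.update(frozenset(pair) for pair in combinations(records, 2))
    return pairs


def pairwise_precision_sketch(prediction, reference):
    """Compute |T ∩ P| / |P| on the records present in both clusterings."""
    common = prediction.keys() & reference.keys()
    predicted_pairs = same_cluster_pairs({r: prediction[r] for r in common})
    reference_pairs = same_cluster_pairs({r: reference[r] for r in common})
    if not predicted_pairs:
        # Assumption of this sketch: precision is undefined without predicted links.
        return float("nan")
    return len(reference_pairs & predicted_pairs) / len(predicted_pairs)


records = range(1, 9)
prediction = dict(zip(records, [1, 1, 2, 3, 2, 4, 4, 4]))
reference = dict(zip(records, ["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c4"]))
print(pairwise_precision_sketch(prediction, reference))  # 0.4
```

Here the predicted pairs are {1,2}, {3,5}, {6,7}, {6,8} and {7,8}, of which only {1,2} and {6,7} also share a reference cluster, giving 2/5.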

Parameters:
  • prediction (Series) – Membership vector for the predicted clustering.

  • reference (Series) – Membership vector for the reference clustering.

Returns:

Pairwise precision for the inner join of prediction and reference.

Return type:

float

Examples

>>> import pandas as pd
>>> from er_evaluation import pairwise_precision
>>> prediction = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> reference = pd.Series(index=[1,2,3,4,5,6,7,8], data=["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c4"])
>>> pairwise_precision(prediction, reference)
0.4

Notes

NA values are dropped from membership vectors prior to computing the metric.
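As a rough illustration of which records end up being compared, the sketch below drops NA values and then keeps only the records indexed in both vectors. It uses plain pandas calls and is one reading of the NA handling and the "inner join" described above, not the library's internal code; the example data is made up.

```python
import pandas as pd

# Record 3 has no predicted cluster; record 4 is missing from the reference
# and record 5 is missing from the prediction.
prediction = pd.Series(index=[1, 2, 3, 4], data=[1, 1, None, 2])
reference = pd.Series(index=[1, 2, 3, 5], data=["a", "a", "a", "b"])

# Drop NA memberships first, then restrict to records present in both vectors.
pred = prediction.dropna()
ref = reference.dropna()
common = pred.index.intersection(ref.index)

print(list(common))  # [1, 2]
```

Under this reading, only records 1 and 2 would contribute pairs to the metric.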