pairwise_recall

er_evaluation.pairwise_recall(prediction, reference)

Pairwise recall for the inner join of two clusterings.

Pairwise recall:

Consider two clusterings of a set of records, referred to as the predicted and reference clusterings. Let \(T\) be the set of record pairs which appear in the same reference cluster, and let \(P\) be the set of record pairs which appear in the same predicted cluster. Pairwise recall is then defined as

\[R = \frac{\lvert T \cap P \rvert}{\lvert T \rvert}\]

This is the proportion of correctly predicted links among all true links.
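
The definition can be illustrated with a minimal sketch that enumerates \(T\) and \(P\) explicitly. This is not the library's implementation: the helper names `same_cluster_pairs` and `naive_pairwise_recall` are hypothetical, both membership vectors are assumed to cover the same records with no missing labels, and returning NaN when \(T\) is empty is an assumption made here for illustration.

```python
from collections import defaultdict
from itertools import combinations

import pandas as pd


def same_cluster_pairs(membership: pd.Series) -> set:
    """Return the set of unordered record pairs that share a cluster label."""
    # Group record identifiers (the index) by cluster label (the values).
    clusters = defaultdict(list)
    for record, label in membership.items():
        clusters[label].append(record)

    # Every pair of records within one cluster is a link.
    pairs = set()
    for records in clusters.values():
        pairs.update(frozenset(pair) for pair in combinations(records, 2))
    return pairs


def naive_pairwise_recall(prediction: pd.Series, reference: pd.Series) -> float:
    """Compute |T ∩ P| / |T| by brute-force pair enumeration."""
    T = same_cluster_pairs(reference)   # true links
    P = same_cluster_pairs(prediction)  # predicted links
    if not T:
        # Assumption: recall is treated as undefined when there are no true links.
        return float("nan")
    return len(T & P) / len(T)


prediction = pd.Series(index=[1, 2, 3, 4, 5, 6, 7, 8], data=[1, 1, 2, 3, 2, 4, 4, 4])
reference = pd.Series(
    index=[1, 2, 3, 4, 5, 6, 7, 8],
    data=["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c4"],
)
result = naive_pairwise_recall(prediction, reference)
```

On this data the reference clustering has five true links, (1,2), (1,3), (2,3), (4,5) and (6,7), of which (1,2) and (6,7) are also predicted links, giving 2/5. Enumerating pairs is quadratic in cluster size, so this approach is only suitable for small inputs.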

Parameters:

prediction (Series) – Membership vector for the predicted clustering.
reference (Series) – Membership vector for the reference clustering.

Returns:

Pairwise recall computed on the inner join of the predicted and reference membership vectors, that is, on the records present in both.

Return type:

float

Examples

>>> import pandas as pd
>>> from er_evaluation import pairwise_recall
>>> prediction = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> reference = pd.Series(index=[1,2,3,4,5,6,7,8], data=["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c4"])
>>> pairwise_recall(prediction, reference)
0.4

Notes

NA values are dropped from membership vectors prior to computing the metric.
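
To make the preprocessing concrete, the following sketch shows one way NA dropping and the inner join could be expressed with pandas. The helper name `align_memberships` is hypothetical, and the exact order of operations inside the library is an assumption; only `dropna` and `pd.concat` are standard pandas calls.

```python
import pandas as pd


def align_memberships(prediction: pd.Series, reference: pd.Series):
    """Drop NA labels, then keep only the records present in both vectors."""
    joined = pd.concat(
        {"prediction": prediction.dropna(), "reference": reference.dropna()},
        axis=1,
        join="inner",  # keep index values common to both series
    )
    return joined["prediction"], joined["reference"]


# Record 3 has no predicted label; records 1 and 5 appear in only one vector.
prediction = pd.Series(index=[1, 2, 3, 4], data=[1, 1, None, 2])
reference = pd.Series(index=[2, 3, 4, 5], data=["a", "a", "b", "b"])

aligned_prediction, aligned_reference = align_memberships(prediction, reference)
```

Here only records 2 and 4 survive, so the metric would be computed on those two records alone. Records missing from either clustering therefore do not count toward the true links in the denominator.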