b_cubed_precision
- er_evaluation.metrics.b_cubed_precision(prediction, reference)
B-cubed precision for the inner join of two clusterings, with equal weight placed on each ground truth cluster.
Mathematically, this is defined as
\[P_{B^3} = \frac{1}{\lvert \mathcal{C}\rvert}\sum_{c \in \mathcal{C}} \frac{1}{\lvert c \rvert} \sum_{r \in c} \frac{\lvert c(r) \cap \hat c(r)\rvert }{\lvert \hat c(r) \rvert}\]
where
\(\mathcal{C}\) is the set of ground truth clusters,
\(c\) is a ground truth cluster,
\(r\) is a mention in \(c\),
\(c(r)\) is the cluster associated with \(r\) in the ground truth clustering,
\(\hat c(r)\) is the cluster associated with \(r\) in the predicted clustering.
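The formula can be traced step by step with plain pandas. The following is a minimal sketch, not the library's implementation: the name `b_cubed_precision_sketch` is hypothetical, and the NA dropping and inner join are assumed to behave as described on this page.

```python
import pandas as pd


def b_cubed_precision_sketch(prediction: pd.Series, reference: pd.Series) -> float:
    """Illustrative re-derivation of the formula above (hypothetical helper)."""
    # Assumption: drop NA labels, then keep only elements present in both
    # membership vectors (the "inner join" of the two clusterings).
    df = pd.concat(
        {"pred": prediction.dropna(), "ref": reference.dropna()},
        axis=1,
        join="inner",
    )
    # |c_hat(r)|: size of the predicted cluster that contains element r.
    pred_size = df.groupby("pred")["pred"].transform("count")
    # |c(r) ∩ c_hat(r)|: elements sharing both the reference and the
    # predicted label of r.
    overlap = df.groupby(["ref", "pred"])["pred"].transform("count")
    # Inner sum: average the per-element ratio within each reference cluster.
    per_cluster = (overlap / pred_size).groupby(df["ref"]).mean()
    # Outer sum: equal weight on each reference cluster.
    return float(per_cluster.mean())
```

Averaging within each ground truth cluster first, and only then across clusters, is what gives every ground truth cluster equal weight regardless of its size. On the membership vectors from the Examples section, this sketch evaluates to 31/48, approximately 0.6458.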
- Parameters:
prediction (Series) – Membership vector for the predicted clustering.
reference (Series) – Membership vector for the reference clustering.
- Returns:
B-cubed precision for the inner join of prediction and reference.
- Return type:
float
Examples
>>> import pandas as pd
>>> from er_evaluation.metrics import b_cubed_precision
>>> prediction = pd.Series(index=[1,2,3,4,5,6,7,8], data=[1,1,2,3,2,4,4,4])
>>> reference = pd.Series(index=[1,2,3,4,5,6,7,8], data=["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c4"])
>>> b_cubed_precision(prediction, reference)
0.6458333333333334
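The value above can be checked by hand. The ground truth clusters are {1, 2, 3}, {4, 5}, {6, 7} and {8}, and the predicted clusters are {1, 2}, {3, 5}, {4} and {6, 7, 8}. Substituting the per-element ratios \(\lvert c(r) \cap \hat c(r)\rvert / \lvert \hat c(r) \rvert\) into the definition gives

\[P_{B^3} = \frac{1}{4}\left(\frac{1}{3}\left(1 + 1 + \tfrac{1}{2}\right) + \frac{1}{2}\left(1 + \tfrac{1}{2}\right) + \frac{1}{2}\left(\tfrac{2}{3} + \tfrac{2}{3}\right) + \tfrac{1}{3}\right) = \frac{1}{4}\left(\tfrac{5}{6} + \tfrac{3}{4} + \tfrac{2}{3} + \tfrac{1}{3}\right) = \frac{31}{48} \approx 0.6458\]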
Notes
NA values are dropped from membership vectors prior to computing the metric.
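To see which elements end up being compared, the NA dropping and the inner join can be mimicked with pandas alone. This is a sketch under assumptions: the membership vectors are made up for illustration, and the code shows the described behaviour rather than the library's internals.

```python
import numpy as np
import pandas as pd

# Hypothetical membership vectors: element 3 has no predicted label,
# and element 5 is absent from the reference.
prediction = pd.Series(index=[1, 2, 3, 4, 5], data=[1, 1, np.nan, 2, 2])
reference = pd.Series(index=[1, 2, 3, 4], data=["c1", "c1", "c2", "c2"])

# Drop NA labels first, then keep only elements that appear in both vectors.
aligned = pd.concat(
    {"prediction": prediction.dropna(), "reference": reference.dropna()},
    axis=1,
    join="inner",
)
compared = sorted(aligned.index.tolist())
print(compared)
```

Here only elements 1, 2 and 4 survive, so under these assumptions the metric would be computed on those three elements only.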