load_rldata500
- er_evaluation.datasets.load_rldata500()
Load RLdata500 dataset.
Dataset with 500 rows, including 50 noisy duplicate records, from the RecordLinkage R package.
Unique identifiers for each row can be obtained from
er_evaluation.datasets.load_rldata500_disambiguations().
Columns are:
fname_c1: First name, first component.
fname_c2: First name, second component.
lname_c1: Last name, first component.
lname_c2: Last name, second component.
by: Year of birth.
bm: Month of birth.
bd: Day of birth.
- Returns:
RLdata500 dataset.
- Return type:
DataFrame
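Example:
A minimal usage sketch, assuming the er_evaluation package is installed. It loads the dataset and checks that the columns documented above are present. The names EXPECTED_COLUMNS and missing_columns are hypothetical helpers introduced for illustration only and are not part of er_evaluation. The import is guarded so that the snippet still runs when the package is unavailable.

```python
# Column names as listed in the documentation above.
EXPECTED_COLUMNS = ["fname_c1", "fname_c2", "lname_c1", "lname_c2", "by", "bm", "bd"]


def missing_columns(columns):
    """Return the documented column names that are absent from `columns`.

    Hypothetical helper for illustration; not part of er_evaluation.
    """
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


try:
    from er_evaluation.datasets import load_rldata500
except ImportError:
    # er_evaluation is not installed; skip the live call below.
    load_rldata500 = None

if load_rldata500 is not None:
    df = load_rldata500()  # documented return type: DataFrame
    print(df.shape)  # (number of rows, number of columns)
    print(missing_columns(df.columns))  # documented columns not found, if any
```

The check is deliberately tolerant: it reports documented columns that are missing without assuming the DataFrame contains only those columns.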