dedupe
Custom predicates for blocking
I have a really large dataset (more than 750,000 rows). I want to create a custom predicate that requires column1 and column3 to be identical within every comparison pair.
I could define it like this using dedupe's predicate classes:
predicates.CompoundPredicate([
    predicates.SimplePredicate(predicates.wholeFieldPredicate, 'pltr_gross_amt'),
    predicates.SimplePredicate(predicates.wholeFieldPredicate, 'pltr_tran_date'),
])
Now, how do I plug in my custom predicate so that every uncertain pair shown to me for labeling adheres to it? Can I override the default predicates?
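To make the behaviour I'm after concrete, here is a minimal plain-Python sketch. It makes no dedupe calls: `block_key` and `candidate_pairs` are hypothetical helper names, and the sample records are made up. Only the two field names come from my data. It assumes that a record with an empty blocking field should never be paired.

```python
from collections import defaultdict
from itertools import combinations

# Fields that must match exactly for two records to be compared at all.
BLOCK_FIELDS = ("pltr_gross_amt", "pltr_tran_date")


def block_key(record, fields=BLOCK_FIELDS):
    """Return the compound blocking key, or None if any field is empty."""
    values = tuple(record.get(f) for f in fields)
    if any(v is None or v == "" for v in values):
        return None  # assumption: records with missing values are not blocked
    return values


def candidate_pairs(records):
    """Return the set of id pairs whose blocking keys are identical."""
    blocks = defaultdict(list)
    for record_id, record in records.items():
        key = block_key(record)
        if key is not None:
            blocks[key].append(record_id)

    pairs = set()
    for ids in blocks.values():
        # Only records that share a block are ever compared.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Made-up sample data, keyed by record id.
records = {
    1: {"pltr_gross_amt": "100.00", "pltr_tran_date": "2020-01-05", "name": "ACME Ltd"},
    2: {"pltr_gross_amt": "100.00", "pltr_tran_date": "2020-01-05", "name": "Acme Limited"},
    3: {"pltr_gross_amt": "100.00", "pltr_tran_date": "2020-02-01", "name": "ACME Ltd"},
    4: {"pltr_gross_amt": "55.10", "pltr_tran_date": "2020-01-05", "name": "Other Co"},
}

print(sorted(candidate_pairs(records)))  # [(1, 2)]
```

Records 1 and 2 agree on both amount and date, so they are the only pair I would expect to be asked about. Records 3 and 4 each differ on one of the two fields and should never appear together with the others.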