dedupe-examples
Understanding the labeling sample size
Hi @fgregg, could you please guide me on data labeling? I ran the CSV example on my own CSV file, which has 384,984 rows, and the interactive shell asks me to label only 10 pairs of records. How do we know how many positive pairs need to be labeled?
How many labeled pairs are recommended for it to work well?
And how do I configure the labeling sample size in the CSV example?