Forest Gregg

Results 351 comments of Forest Gregg

@NickCrews, there's a lot of noise in the recall/precision differences. Should we increase the repetitions?

If we did this, we would probably normalize the block table to look something like this:

| pred_id | component_idx | pred_length | value | record_id |
|-|-|-|-|-|
| 0...

> A while ago, I experimented with using infomap and louvain instead of HAC. It seems to work, but I didn't have a good truth set to figure out which one...

> The main question for me is how to apply the clustering threshold in the context of community detection? Some community detection algos do have parameters, but one available option...

Are you going to explore any of these?

Many of the examples in dedupe-examples have ground truths, particularly csv_example, patent_example, and record_linkage_example.
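For anyone comparing clustering approaches against one of those ground truths, here is a minimal sketch of pairwise precision and recall in plain Python. It is not code from dedupe or dedupe-examples. The function names (`linked_pairs`, `pairwise_precision_recall`) and the toy data are hypothetical, and it assumes both the predicted clustering and the truth set are available as a mapping from record id to cluster label.

```python
from itertools import combinations


def linked_pairs(membership):
    """Return the set of unordered record-id pairs that share a cluster label.

    `membership` is assumed to map record id -> cluster label (hypothetical format).
    """
    clusters = {}
    for record_id, label in membership.items():
        clusters.setdefault(label, []).append(record_id)

    pairs = set()
    for members in clusters.values():
        # Sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs


def pairwise_precision_recall(predicted, truth):
    """Compare two clusterings by the record pairs they link together."""
    predicted_pairs = linked_pairs(predicted)
    true_pairs = linked_pairs(truth)
    true_positives = len(predicted_pairs & true_pairs)

    # With no pairs on one side, treat the score as 1.0 (a convention, not a rule).
    precision = true_positives / len(predicted_pairs) if predicted_pairs else 1.0
    recall = true_positives / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Toy data: records 1-3 are truly one entity, records 4-5 another.
truth = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b"}
predicted = {1: "x", 2: "x", 3: "y", 4: "y", 5: "z"}

precision, recall = pairwise_precision_recall(predicted, truth)
print(precision, recall)
```

Scoring at the pair level means the same function works whether the clusters came from HAC or from a community detection algorithm, since only the final record-to-cluster assignment is needed.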