Dataproofer

Test: check for name clusters similar to Open Refine

Open newsroomdev opened this issue 9 years ago • 1 comment

Please read how to create a new test if you're interested in writing this test.

Check for inconsistent spellings such as New York, Neew York, Nw York, etc. Open Refine, a desktop app for cleaning data, has a few Java-based algorithms to help with clustering, but I'm not currently aware of any existing Node modules that make this process easier. More research is needed.
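To make the idea concrete, here is a minimal sketch of a nearest-neighbor style check in plain JavaScript, using Levenshtein edit distance. It is not Dataproofer code and not a port of Open Refine's implementation; the helper name `clusterByDistance`, the greedy grouping strategy, and the `maxDistance` default are all assumptions made for illustration.

```javascript
// Classic dynamic-programming Levenshtein (edit) distance between two strings.
function levenshtein(a, b) {
  // prev holds the distances for the previous row of the DP table.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: greedily place each distinct value into the first
// cluster whose first member is within `maxDistance` edits (case-insensitive).
// Only clusters with more than one spelling are returned as suspects.
function clusterByDistance(values, maxDistance = 1) {
  const clusters = [];
  for (const value of new Set(values)) {
    const home = clusters.find(
      (c) => levenshtein(c[0].toLowerCase(), value.toLowerCase()) <= maxDistance
    );
    if (home) {
      home.push(value);
    } else {
      clusters.push([value]);
    }
  }
  return clusters.filter((c) => c.length > 1);
}

const suspects = clusterByDistance(["New York", "Neew York", "Nw York", "Boston"]);
console.log(suspects);
```

With the sample values from this issue, the three New York spellings end up in one group and Boston is left alone. The greedy comparison against each cluster's first member is order-dependent and compares every value against every cluster, so a real test would likely need something smarter for large columns.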

newsroomdev avatar Jan 13 '16 17:01 newsroomdev

Adding JavaScript implementations of Open Refine's clustering detection.

Next steps: verify that the algorithm works and test it against some data. It needs some vetting and code review first.
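The comment does not say which clustering methods were ported. As one illustration of what could be vetted, here is a rough sketch of a fingerprint-style key-collision method like the one Open Refine offers: normalize each value to a key, then group values that share a key. The function names `fingerprint` and `clusterByFingerprint` are hypothetical and the normalization steps are a simplified approximation, not the code added to the project.

```javascript
// Build a normalized key: trim, lowercase, strip diacritics and punctuation,
// then split into tokens, de-duplicate, sort, and rejoin.
function fingerprint(value) {
  const tokens = value
    .trim()
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")    // remove combining diacritical marks
    .replace(/[^\p{L}\p{N}\s]/gu, "")   // remove punctuation and symbols
    .split(/\s+/)
    .filter(Boolean);
  return Array.from(new Set(tokens)).sort().join(" ");
}

// Group distinct values by fingerprint and keep groups with 2+ spellings.
function clusterByFingerprint(values) {
  const byKey = new Map();
  for (const value of new Set(values)) {
    const key = fingerprint(value);
    if (!byKey.has(key)) byKey.set(key, []);
    byKey.get(key).push(value);
  }
  return [...byKey.values()].filter((group) => group.length > 1);
}

console.log(clusterByFingerprint(["New York", "new york ", "York, New", "Boston"]));
```

A quick check against sample data shows the limits of this approach: it catches differences in case, spacing, punctuation, and word order, but a typo such as Neew York produces a different key, so it would need a distance-based method as well.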

newsroomdev avatar Mar 30 '16 01:03 newsroomdev