Gerald Rich
Picking things up here re large datasets since that's the biggest blocker preventing CLI and the desktop app from being closer to :100: in my opinion. It's the biggest recurring...
I'm gonna break this up into some sub-tasks because there's quite a lot of data to be vacuumed up and sorted properly.
- MaxMind
  - [x] [countries](https://dev.maxmind.com/geoip/geoip2/geolite2/#Downloads)
  - [ ]...
Adding JavaScript implementations of OpenRefine clustering detection:
- [ngram-fingerprint npm module](https://github.com/finnp/ngram-fingerprint)
- [gist](https://gist.github.com/andrei-m/982927)
- [module](https://github.com/gf3/Levenshtein)
- ["fast levenshtein" module](https://github.com/hiddentao/fast-levenshtein)

Next steps here: verify the algorithm works and test it...
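For anyone verifying the algorithm, here is a rough, dependency-free sketch of the two techniques behind the links above: n-gram fingerprint keying and Levenshtein distance. The names `ngramFingerprint`, `levenshtein`, and `clusterByFingerprint` are made up for illustration and are not the APIs of the linked modules. The normalization is also simplified (ASCII letters and digits only), which is an assumption and not necessarily what OpenRefine does.

```javascript
// Sketch only: hypothetical helpers, not the API of the linked npm modules.

// Key a string by its sorted, de-duplicated character n-grams.
function ngramFingerprint(value, n = 2) {
  const cleaned = String(value)
    .toLowerCase()
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '') // strip combining diacritics
    .replace(/[^a-z0-9]/g, '');      // assumption: keep ASCII letters/digits only
  const grams = new Set();
  for (let i = 0; i + n <= cleaned.length; i++) {
    grams.add(cleaned.slice(i, i + n));
  }
  return [...grams].sort().join('');
}

// Classic dynamic-programming edit distance, keeping one row at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Group values that share a fingerprint; only groups with 2+ distinct values are clusters.
function clusterByFingerprint(values, n = 2) {
  const groups = new Map();
  for (const v of values) {
    const key = ngramFingerprint(v, n);
    if (!groups.has(key)) groups.set(key, new Set());
    groups.get(key).add(v);
  }
  return [...groups.values()].filter((g) => g.size > 1).map((g) => [...g]);
}
```

Fingerprint keying catches variants that differ in case, spacing, or punctuation cheaply. Levenshtein distance is the more expensive pairwise check and would probably only be run within or across small candidate groups.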
Note: this test can be implemented more efficiently using a Bloom or Cuckoo filter
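To make that note a bit more concrete, here is a minimal Bloom filter sketch. The note doesn't say which test it refers to, so this assumes it is a "have we seen this value before?" check such as duplicate detection. The class, the `probableDuplicates` helper, the default sizes, and the FNV-1a-style hash are all illustrative choices, not code from the project.

```javascript
// Sketch only: a tiny Bloom filter for "probably seen before" checks.
class BloomFilter {
  constructor(bits = 1 << 16, hashes = 4) {
    this.bits = bits;       // size of the bit array (assumed default)
    this.hashes = hashes;   // number of bit positions per value (assumed default)
    this.store = new Uint8Array(Math.ceil(bits / 8));
  }

  // FNV-1a-style 32-bit hash over UTF-16 code units, varied by a seed.
  static fnv1a(str, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < str.length; i++) {
      h ^= str.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  // Derive k positions from two base hashes (double hashing).
  positions(value) {
    const s = String(value);
    const h1 = BloomFilter.fnv1a(s, 0);
    const h2 = (BloomFilter.fnv1a(s, 0x9e3779b9) | 1) >>> 0;
    const out = [];
    for (let k = 0; k < this.hashes; k++) out.push((h1 + k * h2) % this.bits);
    return out;
  }

  add(value) {
    for (const p of this.positions(value)) this.store[p >> 3] |= 1 << (p & 7);
  }

  // true means "probably seen"; false means "definitely not seen".
  has(value) {
    return this.positions(value).every((p) => (this.store[p >> 3] & (1 << (p & 7))) !== 0);
  }
}

// Flag values that the filter reports as already seen.
function probableDuplicates(values) {
  const filter = new BloomFilter();
  const dupes = [];
  for (const v of values) {
    if (filter.has(v)) dupes.push(v);
    else filter.add(v);
  }
  return dupes;
}
```

The trade-off is memory for certainty: the filter never misses a value it has seen, but it can occasionally report a false positive, so flagged values would still need an exact check if precision matters.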
Sure. If a column contains dates, we want to test for major gaps between dates. We initially thought of this when we had to plot simple timeseries data, but were...
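One possible way to sketch the gap test: sort the dates, take the differences between neighbours, and flag any difference that is much larger than the typical one. What counts as "major" isn't defined above, so the median-based threshold and the `factor` parameter here are assumptions, and `findDateGaps` is a hypothetical name.

```javascript
// Sketch only: flag gaps larger than `factor` times the median gap (assumed rule).
function findDateGaps(values, factor = 3) {
  const times = values
    .map((v) => new Date(v).getTime())
    .filter((t) => !Number.isNaN(t)) // skip values that don't parse as dates
    .sort((a, b) => a - b);
  if (times.length < 3) return [];

  // Differences between consecutive dates, in milliseconds.
  const diffs = [];
  for (let i = 1; i < times.length; i++) diffs.push(times[i] - times[i - 1]);

  const sorted = [...diffs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  if (median <= 0) return [];

  const gaps = [];
  diffs.forEach((d, i) => {
    if (d > factor * median) {
      gaps.push({
        from: new Date(times[i]).toISOString(),
        to: new Date(times[i + 1]).toISOString(),
        days: d / 86400000,
      });
    }
  });
  return gaps;
}
```

Using the median rather than the mean keeps one huge gap from inflating the threshold and hiding itself.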
@pachevalier just wanted to give you a quick update that I've been working on a JSON output from the CLI. I'm currently working on the `cli` branch if you'd like...