Gerald Rich

16 comments by Gerald Rich

Picking things up here regarding large datasets, since in my opinion that's the biggest blocker keeping the CLI and the desktop app from being closer to :100:. It's the biggest recurring...

I'm gonna break this up into some sub-tasks because there's quite a lot of data to be vacuumed up and sorted properly.

- MaxMind
  - [x] [countries](https://dev.maxmind.com/geoip/geoip2/geolite2/#Downloads)
  - [ ]...

Adding JavaScript implementations of OpenRefine clustering detection:

- [ngram-fingerprint npm module](https://github.com/finnp/ngram-fingerprint)
- [gist](https://gist.github.com/andrei-m/982927)
- [module](https://github.com/gf3/Levenshtein)
- ["fast levenshtein" module](https://github.com/hiddentao/fast-levenshtein)

Next steps here: verify the algorithm works and test it...
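For anyone verifying the algorithm, here is a rough sketch of the two ideas behind the links above: n-gram fingerprint keying and Levenshtein edit distance. It is not the code from any of the linked modules. The function names, the default `n = 2`, and the ASCII-only cleanup step are assumptions made for illustration; OpenRefine's own keyer also normalizes accented characters, which this sketch skips.

```javascript
// Sketch of n-gram fingerprint keying (assumed default n = 2).
function ngramFingerprint(value, n = 2) {
  // Lowercase, then drop everything except ASCII letters and digits (simplification).
  const cleaned = value.toLowerCase().replace(/[^a-z0-9]/g, "");
  const grams = new Set();
  for (let i = 0; i + n <= cleaned.length; i++) {
    grams.add(cleaned.slice(i, i + n));
  }
  // Sort the unique n-grams and join them into a single key.
  return [...grams].sort().join("");
}

// Dynamic-programming Levenshtein distance, keeping only two rows in memory.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: group values whose fingerprints collide.
function clusterByFingerprint(values, n = 2) {
  const clusters = new Map();
  for (const v of values) {
    const key = ngramFingerprint(v, n);
    if (!clusters.has(key)) clusters.set(key, []);
    clusters.get(key).push(v);
  }
  // Only groups with more than one member are candidate clusters.
  return [...clusters.values()].filter((group) => group.length > 1);
}
```

With this sketch, `clusterByFingerprint(["New York", "new york", "Boston"])` groups the first two values together, and `levenshtein` could then rank near-misses that the fingerprint does not catch.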

Note: this test can be implemented more efficiently using a Bloom filter or a cuckoo filter.
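The note doesn't say which test it means. Assuming it is a "have I seen this value before?" check such as duplicate detection, a minimal Bloom filter could look like the sketch below. The class name, the default sizes, and the seeded FNV-1a-style hash are all illustrative assumptions. A Bloom filter can report false positives but never false negatives, so a "maybe" answer would still need an exact check.

```javascript
// Minimal Bloom filter sketch; bit count and hash count are assumed defaults.
class BloomFilter {
  constructor(bits = 1024, hashes = 3) {
    this.bits = bits;
    this.hashes = hashes;
    this.store = new Uint8Array(Math.ceil(bits / 8));
  }

  // FNV-1a-style 32-bit hash over UTF-16 code units, with a seed mixed in.
  hash(value, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < value.length; i++) {
      h ^= value.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h % this.bits;
  }

  // Set one bit per hash function.
  add(value) {
    for (let s = 0; s < this.hashes; s++) {
      const idx = this.hash(value, s);
      this.store[idx >> 3] |= 1 << (idx & 7);
    }
  }

  // false means "definitely not seen"; true means "possibly seen".
  mightContain(value) {
    for (let s = 0; s < this.hashes; s++) {
      const idx = this.hash(value, s);
      if ((this.store[idx >> 3] & (1 << (idx & 7))) === 0) return false;
    }
    return true;
  }
}
```

The memory cost is fixed by `bits` regardless of how many values are added, which is where the efficiency gain over keeping every value in a set would come from on large datasets.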

Sure. If a column contains dates, we want to test for major gaps between dates. We initially thought of this when we had to plot simple time series data, but were...
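One way such a gap test could work, as a sketch only: sort the dates, measure the interval between neighbors, and flag any interval much larger than the typical one. The function name `findDateGaps`, the median baseline, and the `factor = 3` threshold are assumptions for illustration; the comment does not define what counts as a "major" gap.

```javascript
// Flag gaps between consecutive dates that exceed `factor` times the median gap.
function findDateGaps(dates, factor = 3) {
  // Parse to timestamps, drop unparseable values, sort ascending.
  const times = dates
    .map((d) => new Date(d).getTime())
    .filter((t) => !Number.isNaN(t))
    .sort((a, b) => a - b);
  if (times.length < 3) return []; // too few points to define a typical gap

  const gaps = [];
  for (let i = 1; i < times.length; i++) gaps.push(times[i] - times[i - 1]);

  // Median gap serves as the baseline for "normal" spacing.
  const sorted = [...gaps].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;

  const found = [];
  gaps.forEach((gap, i) => {
    if (gap > factor * median) {
      found.push({
        from: new Date(times[i]).toISOString(),
        to: new Date(times[i + 1]).toISOString(),
      });
    }
  });
  return found;
}
```

For daily data with one missing month, this returns a single entry whose `from` and `to` bracket the missing stretch; evenly spaced dates return an empty array.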

@pachevalier just wanted to give you a quick update that I've been working on a JSON output from the CLI. I'm currently working on the `cli` branch if you'd like...