Dataproofer
A proofreader for your data
How do I change the separator? I have a .csv file that uses semicolons as separators. Dataproofer thinks the file only contains a single column (with all data in it),...
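As an illustration of the separator question above, here is a minimal sketch of how a semicolon-delimited file could be detected before parsing. It is written in Python with the standard `csv` module and is not Dataproofer's own code or API; the names `guess_delimiter`, `parse`, and `CANDIDATES` are hypothetical, and the heuristic (pick the candidate that splits every row into the same, largest number of columns) is an assumption rather than how any particular tool works.

```python
import csv
import io

# Hypothetical list of separators to try; extend as needed.
CANDIDATES = [",", ";", "\t", "|"]


def guess_delimiter(text, candidates=CANDIDATES):
    """Return the candidate that yields the most columns with a consistent row width.

    Falls back to a comma when no candidate splits the rows consistently.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    best, best_cols = ",", 1
    for delim in candidates:
        rows = list(csv.reader(lines, delimiter=delim))
        widths = {len(row) for row in rows}
        # Only trust a delimiter if every row has the same number of fields.
        if len(widths) == 1:
            cols = widths.pop()
            if cols > best_cols:
                best, best_cols = delim, cols
    return best


def parse(text):
    """Parse delimited text into a list of rows using the guessed delimiter."""
    delim = guess_delimiter(text)
    return list(csv.reader(io.StringIO(text), delimiter=delim))


sample = "name;age;city\nAnn;30;Oslo\nBob;25;Rome\n"
rows = parse(sample)
```

With a comma assumed, each line of `sample` would come back as a single field, which matches the single-column symptom described in the issue.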
I'm interested in exploring if we can integrate some of the functionality found here: https://github.com/datamade/dedupe
_Please read [how to create a new test](https://github.com/dataproofer/Dataproofer#creating-a-new-test) if you're interested in writing this test._ Related to #6, #7, #8. Suspicious numbers are present. If you see any of these...
This is an excellent idea from @ianrose who suggests we make it easier for users who aren't code-savvy to build their own tests. Think IFTTT but for data tests.
This is a sort of meta-issue that's more conceptual than technical. There will probably be additional issues with more specific tasks, but I want to have the overall discussion here....
### Summary Interested in seeing if we can replicate/automate the analysis discussed here: https://arxiv.org/abs/1410.6059 > We hypothesize that if election results are manipulated or forged, then, due to the well-known...
### Summary As a data cleaning tool, Dataproofer should not have a hardcoded dependency on libgconf (or gnome, for that matter) so that it can be run from a variety of...
_Please read [how to create a new test](https://github.com/dataproofer/Dataproofer#creating-a-new-test) if you're interested in writing this test._ If column header is `country`, or some variation, check to see if the names match...
_Please read [how to create a new test](https://github.com/dataproofer/Dataproofer#creating-a-new-test) if you're interested in writing this test._ Check for inconsistent spellings like New York, Neew York, Nw York, etc. [Open Refine](http://openrefine.org/), a...
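The inconsistent-spelling check described above could be sketched roughly as follows. This is a Python illustration using `difflib.SequenceMatcher` from the standard library, not a Dataproofer test and not the clustering method Open Refine uses. The function name `find_spelling_variants` and the `0.85` similarity threshold are assumptions chosen for the example.

```python
from collections import Counter
from difflib import SequenceMatcher


def find_spelling_variants(values, threshold=0.85):
    """Map each rare value to a more frequent value that it closely resembles.

    Two values are treated as variants when their case-insensitive similarity
    ratio meets the threshold and one occurs strictly less often than the other.
    """
    counts = Counter(values)
    # Most frequent values first; ties broken alphabetically for stable output.
    distinct = sorted(counts, key=lambda v: (-counts[v], v))
    variants = {}
    for i, common in enumerate(distinct):
        if common in variants:
            continue  # already flagged as a variant of something else
        for rare in distinct[i + 1:]:
            if rare in variants or counts[rare] >= counts[common]:
                continue
            ratio = SequenceMatcher(None, common.lower(), rare.lower()).ratio()
            if ratio >= threshold:
                variants[rare] = common
    return variants


column = ["New York", "New York", "New York", "Neew York", "Nw York", "Boston", "Boston"]
flagged = find_spelling_variants(column)
```

Here `flagged` maps the two misspellings to `"New York"` and leaves `"Boston"` alone. A real test would likely need a tuned threshold, since short or legitimately similar names (for example `"Austria"` and `"Australia"`) can produce false positives.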
Will inform users we cannot read PDFs, but will be able to proof their data if they convert the PDF to machine-readable data using Tabula. Please see [Tabula](http://tabula.technology/) for more...