Dataproofer
Test: check numbers against Benford's Law to look for made up data
Please read how to create a new test if you're interested in writing this test.
Benford's Law is the observation that, in many naturally occurring sets of numbers, small digits (1, 2, 3) appear as the leading digit much more frequently than large digits (7, 8, 9). In theory, Benford's Law can be used to detect anomalies in accounting practices or election results, though in practice it can easily be misapplied. If you suspect a dataset has been created or modified to deceive, Benford's Law is an excellent first test, but you should always verify your results with an expert before concluding that your data have been manipulated.
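To make the idea concrete, here is a minimal Python sketch of what such a check might compute. It is not Dataproofer's test API, and the function names (`benford_expected`, `leading_digit`, `chi_square_vs_benford`) are hypothetical. It assumes the usual statement of Benford's Law, where the expected share of leading digit d is log10(1 + 1/d), and uses a plain chi-square statistic as one possible measure of deviation.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first non-zero digit of a number (or numeric string).

    Returns None when there is no such digit (zero, empty cell, NaN, ...).
    """
    # Drop sign, leading zeros and the decimal point: "-0.042" -> "42"
    text = str(value).strip().lstrip("+-0.")
    return int(text[0]) if text and text[0] in "123456789" else None


def chi_square_vs_benford(values):
    """Chi-square statistic of observed leading digits against Benford.

    Larger values mean a larger deviation. Judging whether a deviation is
    meaningful (sample size, thresholds) is deliberately left out here.
    """
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        raise ValueError("no values with a leading digit")
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(1, 101)]  # known to track Benford closely
    suspicious = [700 + k for k in range(100)]       # every value starts with 7
    print(chi_square_vs_benford(powers_of_two))
    print(chi_square_vs_benford(suspicious))
```

A large statistic only says the leading digits deviate from Benford's distribution. It does not say why, which is where the expert review mentioned above comes in.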
More info
- Karroubi’s Unlucky 7’s?, FiveThirtyEight
- Radiolab, Numbers
It's worth pairing it with the logarithmic distribution of the numbers, so that people can also see whether the numbers are meant to follow Benford. I saw a paper that showed a grid of graphs comparing the Benford distribution with the log distribution, but I can't find it... stupid PDFs
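One rough way to sketch that pairing in Python: count how many values fall into each order of magnitude, and measure how many orders of magnitude the data spans. The helper names below are hypothetical. The underlying assumption, that Benford's Law tends to fit only data spread across several orders of magnitude, is a common rule of thumb rather than a hard criterion.

```python
import math
from collections import Counter


def _usable(values):
    """Keep finite, non-zero numbers; log10 is undefined for the rest."""
    return [abs(v) for v in values if v and math.isfinite(v)]


def magnitude_histogram(values):
    """Count values per order of magnitude, i.e. per floor(log10(|x|))."""
    return Counter(math.floor(math.log10(v)) for v in _usable(values))


def orders_of_magnitude_spanned(values):
    """log10(max) - log10(min) over the usable values, 0.0 if there are none."""
    logs = [math.log10(v) for v in _usable(values)]
    return max(logs) - min(logs) if logs else 0.0


if __name__ == "__main__":
    heights_cm = [152.0, 160.5, 171.2, 180.0, 193.4]  # made-up illustrative values
    print(magnitude_histogram(heights_cm))
    print(orders_of_magnitude_spanned(heights_cm))
```

Data like the heights above sits within a single order of magnitude, so a poor Benford fit there would be expected and would not by itself suggest manipulation.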