benfordslaw
benfordslaw is about the frequency distribution of leading digits.
benfordslaw is a Python package to test whether an empirical (observed) distribution differs significantly from a theoretical (expected, Benford's) distribution. The law states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. This method can be used if you want to test whether your set of numbers may be artificial (or manipulated). If a certain set of values follows Benford's Law, then a model's predictions for those values should also follow Benford's Law. Normal (unmanipulated) data tends to follow Benford's Law, whereas manipulated or fraudulent data typically does not.
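To make the comparison concrete: under Benford's Law the probability of leading digit d is P(d) = log10(1 + 1/d), so a 1 is expected about 30.1% of the time and a 9 about 4.6%. The sketch below uses only the standard library and is not the benfordslaw implementation. The helper names (`benford_expected`, `leading_digit`, `chi2_statistic`) are made up for illustration, and the chi-square goodness-of-fit statistic is one common choice of test, assumed here.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities, P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of a nonzero number."""
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    # Scientific notation always starts with the first significant digit;
    # 17 significant digits are enough to avoid rounding a float upwards.
    return int(format(abs(float(x)), ".16e")[0])


def chi2_statistic(values):
    """Chi-square statistic of observed vs. Benford first-digit counts."""
    digits = [leading_digit(v) for v in values if v != 0]
    observed = Counter(digits)
    n = len(digits)
    stat = 0.0
    for d, p in benford_expected().items():
        expected = n * p
        stat += (observed.get(d, 0) - expected) ** 2 / expected
    return stat


# Critical value of the chi-square distribution with 8 degrees of
# freedom (9 digits - 1) at a significance level of 0.05.
CHI2_CRITICAL_DF8_ALPHA05 = 15.507

powers_of_two = [2 ** n for n in range(1000)]   # known to follow Benford closely
uniform_digits = list(range(1, 10)) * 100       # every leading digit equally likely

print(chi2_statistic(powers_of_two) < CHI2_CRITICAL_DF8_ALPHA05)   # no significant deviation
print(chi2_statistic(uniform_digits) > CHI2_CRITICAL_DF8_ALPHA05)  # significant deviation
```

A statistic above the critical value means the observed leading digits differ significantly from the Benford expectation at that significance level.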
Assumptions of the data:
- The numbers need to be random and not assigned, with no imposed minimums or maximums.
- The numbers should cover several orders of magnitude.
- The dataset should preferably contain at least 1000 samples, although Benford's law has been shown to hold for datasets containing as few as 50 numbers.
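Two of the assumptions above, sample size and range, can be checked roughly before running a test. The sketch below is illustrative only and is not part of benfordslaw. The function name `check_benford_assumptions` is hypothetical, and the threshold of 3 orders of magnitude is an assumed reading of "several", so adjust it to your data.

```python
import math


def check_benford_assumptions(values, min_samples=1000, min_orders=3):
    """Rough pre-check of sample size and spread (thresholds are assumptions)."""
    # Zeros have no leading significant digit, so they are ignored.
    nonzero = [abs(float(v)) for v in values if v != 0]
    n = len(nonzero)
    if n == 0:
        orders = 0.0
    else:
        # Difference of logs avoids overflow for very wide ranges.
        orders = math.log10(max(nonzero)) - math.log10(min(nonzero))
    return {
        "n_samples": n,
        "orders_of_magnitude": orders,
        "enough_samples": n >= min_samples,
        "wide_range": orders >= min_orders,
    }


report = check_benford_assumptions([2 ** n for n in range(1000)])
print(report["enough_samples"], report["wide_range"])
```

The first assumption (numbers that are not assigned and have no imposed minimum or maximum) depends on how the data was generated and cannot be verified from the values alone.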
⭐️ Star this repo if you like it ⭐️
Install benfordslaw from PyPI
pip install benfordslaw
Import benfordslaw package
from benfordslaw import benfordslaw
Documentation pages
On the documentation pages you can find detailed information about how benfordslaw works, with many examples.
Examples
References
- https://en.wikipedia.org/wiki/Benford%27s_law
- https://towardsdatascience.com/frawd-detection-using-benfords-law-python-code-9db8db474cf8
Citation
Please cite benfordslaw in your publications if it is useful for your research (see citation).
Maintainers
- Erdogan Taskesen, github: erdogant
Contribute
- All kinds of contributions are welcome!
- If you wish to buy me a coffee for this work, it is much appreciated :)
Licence
See LICENSE for details.