bounter
Add power-user docs to `bounter.py`
Expand on how the algos work under the hood, for power users: the parameters of CMS (Count-min Sketch), i.e. width and depth, memory vs. accuracy trade-offs, log8/log1024 etc.
This should either be part of the `bounter.py` module docstring (where people are most likely to look for it; we also link to it from the README), or live in a repo notebook / .md file that the `bounter.py` docstring links to.
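To make the width/depth trade-off concrete, here is a minimal sketch of a generic Count-min Sketch in pure Python. It is not bounter's actual implementation: the class, its method names and the salted `blake2b` hashing are assumptions chosen for illustration only. Memory use is proportional to `width * depth` cells; a larger width means fewer collisions per row, and a larger depth means more independent rows to take the minimum over.

```python
import hashlib


class CountMinSketch:
    """Toy Count-min Sketch: a depth x width table of integer counters.

    Illustrative only; names and hashing scheme are assumptions,
    not bounter's real API.
    """

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, key):
        # One bucket per row. The row index is used as a hash salt so that
        # each row behaves like a different hash function.
        data = key.encode("utf8")
        for row in range(self.depth):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def increment(self, key, count=1):
        for row, col in self._buckets(key):
            self.table[row][col] += count

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum across rows
        # is the tightest estimate and never undercounts.
        return min(self.table[row][col] for row, col in self._buckets(key))


# A roomy sketch: estimates are never below the true counts.
roomy = CountMinSketch(width=1024, depth=4)
# A degenerate sketch with a single column: every key collides.
tiny = CountMinSketch(width=1, depth=2)

for word in ["a", "b", "a", "c", "a", "b"]:
    roomy.increment(word)
    tiny.increment(word)

print(roomy.estimate("a"))  # at least 3
print(tiny.estimate("a"))   # 6, the total number of increments
```

The `tiny` case shows the accuracy end of the trade-off: with too little memory the sketch still answers, but the answer degrades toward the total count of everything seen.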
Related to this, looking at the docs for the parameter:
size_mb (int): Desired memory footprint of the counter.
It's unclear what the default size is. If left as None, does it expand to the available memory (this is what I'm assuming)?
@thoppe The code at https://github.com/RaRe-Technologies/bounter/blob/1585aff2afb20dca5bb7115e119497a7c22f1d2b/bounter/bounter.py#L32-L33 doesn't seem to do anything but raise an error.
@piskvorky we should probably also move the docs style to be more like the numpy docs (as is being done in gensim), wdyt?
Yeah, why not.
But either way we should make clear which parameters are mandatory (size_mb) and which are optional, and with what defaults.
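As a rough sketch of what a numpy-style docstring with explicit mandatory/optional markers could look like: the function `make_counter`, the `log_counting` parameter name and its default are hypothetical and only serve as an example. Only `size_mb`, and the fact that leaving it as None raises an error, come from this thread.

```python
def make_counter(size_mb=None, log_counting=None):
    """Describe a bounded-memory counter (hypothetical example function).

    Parameters
    ----------
    size_mb : int
        Mandatory. Desired memory footprint of the counter, in megabytes.
        There is no default size: leaving this as None raises ValueError.
    log_counting : {None, 8, 1024}, optional
        Hypothetical switch for the log8 / log1024 variants mentioned in
        this issue. Default is None (assumed here to mean plain counters).

    Returns
    -------
    dict
        The validated settings, standing in for a real counter object.

    Raises
    ------
    ValueError
        If `size_mb` is None or `log_counting` is not a supported value.
    """
    if size_mb is None:
        raise ValueError("size_mb is mandatory: pass the memory footprint in MB")
    if log_counting not in (None, 8, 1024):
        raise ValueError("log_counting must be None, 8 or 1024")
    return {"size_mb": size_mb, "log_counting": log_counting}


settings = make_counter(size_mb=64)
```

Spelling out "Mandatory" and "optional, default is ..." in the parameter descriptions answers the question about the default size directly in the docs, without the reader having to open the source.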