robosat
Implements reservoir sampler for randomly sampling a stream of features
For #7. Work in progress.
This changeset implements reservoir sampling, a randomized online algorithm for randomly sampling k items from a stream of n items, where n is not known in advance. We can use this to randomly sample, e.g., k building features in the osmium handlers without having to store all features first or make two passes.
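For reference, a minimal sketch of the idea (the classic "Algorithm R" variant), not necessarily the code in this changeset. The names `ReservoirSampler`, `push`, `reservoir` and `seen` are hypothetical and chosen for illustration only.

```python
import random


class ReservoirSampler:
    """Keeps a uniform random sample of at most `capacity` items from a stream."""

    def __init__(self, capacity, rng=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")

        self.capacity = capacity
        self.reservoir = []
        self.seen = 0  # number of items pushed so far
        self.rng = rng if rng is not None else random.Random()

    def push(self, item):
        self.seen += 1

        # Fill the reservoir with the first `capacity` items.
        if len(self.reservoir) < self.capacity:
            self.reservoir.append(item)
            return

        # Draw a slot uniformly from [0, seen). The new item is kept only
        # if the slot falls inside the reservoir, which happens with
        # probability capacity / seen; it then evicts the item in that slot.
        slot = self.rng.randrange(self.seen)
        if slot < self.capacity:
            self.reservoir[slot] = item


# Usage: sample 5 items from a stream in a single pass.
sampler = ReservoirSampler(capacity=5, rng=random.Random(0))
for feature in range(100):
    sampler.push(feature)
```

Memory use is bounded by `capacity` no matter how long the stream is, which is why a handler could call `push` once per feature and read `reservoir` when the stream ends.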
Tasks:
- [ ] Hook up to osmium handlers
- [ ] Let users pass the number of samples to draw
Refs:
- https://en.wikipedia.org/wiki/Reservoir_sampling
- https://www.paypal-engineering.com/2016/04/11/statistics-for-software/#dipping_into_the_stream