Nick Gerner
I made reservoir sampling optional, controlled by a parameter similar to `impostor_store`, with a default `auto` choice that uses the same heuristic as sparse. Setting both params to `auto` will...
You're a tough customer. Do I get a citation out of this? j/k. I think I see what you're saying, but it's kind of marginal, right? More than half the...
I converted the uniform sampling approach to use a polymorphic class with `ReservoirSampler`. This cleaned up `_find_impostors_blockwise`, and I think it accomplishes your goal of readability without duplicating the actual interesting...
Oh, I also re-ran timing and memory tests and the updated uniform sampling implementation is comparable to the previous one.
Are you suggesting creating a new file `impostor_sampling.py` to hold `ReservoirSampler` and `UniformSampler`? These aren't actually about impostors per se. They're more general than that, which is why I put...
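For readers unfamiliar with the two strategies, here is a minimal sketch of what a polymorphic pair of samplers could look like. It is not the code from the PR: the base class `Sampler`, the method names `extend` and `sample`, and the `n_samples` and `seed` arguments are all assumptions made for illustration. Only the class names `ReservoirSampler` and `UniformSampler` come from the discussion. The reservoir variant uses the classic Algorithm R, so it never holds more than `n_samples` items, while the uniform variant stores everything and subsamples once at the end.

```python
import random
from abc import ABC, abstractmethod


class Sampler(ABC):
    """Hypothetical shared interface: feed blocks of items, then read a sample."""

    def __init__(self, n_samples, seed=None):
        self.n_samples = n_samples
        self.rng = random.Random(seed)

    @abstractmethod
    def extend(self, items):
        """Consume one block of candidate items."""

    @abstractmethod
    def sample(self):
        """Return at most n_samples items drawn uniformly without replacement."""


class UniformSampler(Sampler):
    """Stores every item seen, then subsamples once (memory grows with input)."""

    def __init__(self, n_samples, seed=None):
        super().__init__(n_samples, seed)
        self._items = []

    def extend(self, items):
        self._items.extend(items)

    def sample(self):
        if len(self._items) <= self.n_samples:
            return list(self._items)
        return self.rng.sample(self._items, self.n_samples)


class ReservoirSampler(Sampler):
    """Algorithm R: keeps at most n_samples items no matter how many are seen."""

    def __init__(self, n_samples, seed=None):
        super().__init__(n_samples, seed)
        self._reservoir = []
        self._seen = 0

    def extend(self, items):
        for item in items:
            self._seen += 1
            if len(self._reservoir) < self.n_samples:
                # Fill phase: the first n_samples items are always kept.
                self._reservoir.append(item)
            else:
                # Replace a random slot with probability n_samples / seen.
                j = self.rng.randrange(self._seen)
                if j < self.n_samples:
                    self._reservoir[j] = item

    def sample(self):
        return list(self._reservoir)
```

With this shape, a blockwise caller can take any `Sampler`, call `extend` once per block, and call `sample` at the end without knowing which strategy is in use.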
Done. Let me know what else I can do to streamline the merge and publish back to PyPI. I love to give back to open source and I've learned a...
I get the feeling you're not just going to merge as is :) Seems like there are three problems: 1. some stylistic violations (i.e. PEP 8, which I could solve by using...
My top post in this thread generates a few large-ish datasets (10k examples) with different properties and shows the memory and runtime properties of both sampling strategies. A later post includes...
Thanks for the note, and agreed on not causing a regression. I'm thinking of just special-casing one param vs. two. That will make the uniform sampling approach more similar...