datasketches-java
A software library of stochastic streaming algorithms, a.k.a. sketches.
Initial commit on the journey to removing the dependency on DS-Memory. This will be a long trek!
We can make a number of improvements to reservoir sampling: * We can go from 2 random draws per accepted sample to 1 by picking a random long from 0...
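For context on the draw-count point above, here is a minimal sketch of textbook reservoir sampling (Algorithm R) in plain Java. It is not the library's implementation and not necessarily the change the excerpt proposes, since the excerpt is truncated. The class name `ReservoirSketch` and the method `sample` are hypothetical. The sketch only illustrates how a single uniform draw per streamed item can decide both whether the item is accepted and which slot it replaces.

```java
import java.util.Random;

// Hypothetical illustration class; not part of datasketches-java.
final class ReservoirSketch {

    // Textbook Algorithm R: keeps a uniform random sample of up to k items.
    static long[] sample(long[] stream, int k, Random rnd) {
        long[] reservoir = new long[Math.min(k, stream.length)];
        for (int i = 0; i < stream.length; i++) {
            if (i < k) {
                // Fill phase: the first k items are always kept.
                reservoir[i] = stream[i];
            } else {
                // One draw in [0, i]: a value below k means "accept",
                // and the same value is the slot to overwrite.
                int j = rnd.nextInt(i + 1);
                if (j < k) {
                    reservoir[j] = stream[i];
                }
            }
        }
        return reservoir;
    }
}
```

Calling `ReservoirSketch.sample(values, 10, new Random())` returns at most 10 values drawn from `values`; when the input holds fewer than `k` items, every item is returned.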
Add a function to transform a tuple sketch into a theta sketch. Theta-to-tuple conversion is available in the library via the tuple sketch union, but the reverse direction does not appear to be available.
We currently depend on version 8.0.0 of datasketches-java, which means we are stuck with Java 21 and no other version. As stated in the README of this repo,...
Merging an empty KllFloatsSketch with two KllFloatsSketch instances, holding 1 and 200 "items" respectively, does not always produce the same result. I would have expected the following test scenario to pass:...
As explained in https://github.com/apache/datasketches-java/issues/693, it would be very helpful for certain downstream projects to be able to use the KLL sketches, while still getting a deterministic result. This PR proposes...