redis-tdigest
t-digest module for Redis
This is a Redis module for the t-digest data structure, which can be used for accurate online accumulation of rank-based statistics such as quantiles and the cumulative distribution at a point. The implementation is based on the Merging Digest implementation by the author of t-digest.
Building & Loading
Before going ahead, make sure that the Redis server you're using has support for Redis modules.
First, you'll have to build the Redis t-digest module from source.
make
This should generate a shared library called tdigest.so in the root folder. You can now load it into Redis by using the following redis.conf configuration directive:
loadmodule /path/to/tdigest.so
Alternatively, you can load it on an already running Redis server by issuing the following command:
MODULE LOAD /path/to/tdigest.so
API
TDIGEST.NEW key [compression]
Initializes key to an empty t-digest structure with the compression provided, or with the default compression of 400.
Reply: "OK"
TDIGEST.ADD key value count [value count ...]
Adds each value with its specified count. If key is missing, an empty t-digest structure is initialized with a default compression of 400. Returns the sum of counts for all values added.
Reply: integer (long long)
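The value/count pairing above can be easy to get wrong when building the command programmatically. The following is a minimal sketch, not part of this module, that flattens (value, count) pairs into the argument list for TDIGEST.ADD. The helper names tdigest_add_args and expected_add_reply are hypothetical, and sending the command through a client library is left to the caller.

```python
from typing import Iterable, List, Tuple, Union

Number = Union[int, float]


def tdigest_add_args(key: str, pairs: Iterable[Tuple[Number, int]]) -> List[str]:
    """Build arguments for: TDIGEST.ADD key value count [value count ...].

    Hypothetical helper: it only formats arguments and talks to no server.
    """
    args: List[str] = ["TDIGEST.ADD", key]
    for value, count in pairs:
        args.append(repr(float(value)))  # each value is sent as a double
        args.append(str(int(count)))     # followed by its count
    if len(args) == 2:
        raise ValueError("at least one (value, count) pair is required")
    return args


def expected_add_reply(pairs: Iterable[Tuple[Number, int]]) -> int:
    """The documented reply is the sum of counts for all values added."""
    return sum(count for _, count in pairs)
```

Assuming a client that exposes a generic command method (for example execute_command in redis-py, which is an assumption about your setup), the list can be passed along as client.execute_command(*tdigest_add_args("latency", [(1.5, 2), (3, 1)])).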
TDIGEST.MERGE destkey sourcekey [sourcekey ...]
Merges one or more sourcekey into destkey. If destkey is missing, an empty t-digest structure is initialized with a default compression of 400.
Reply: "OK"
TDIGEST.CDF key value [value ...]
Returns the cumulative distribution at each provided value. Each value must be a double. Every cumulative distribution returned is between 0 and 1.
Reply: double array, or nil if key is missing
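To make the returned quantity concrete, here is a small sketch of the exact empirical cumulative distribution, which is what a t-digest approximates. It is not the module's algorithm: it sorts all samples in memory, and its treatment of ties (counting every sample less than or equal to the query) is an assumption that may differ slightly from the digest's interpolation.

```python
from bisect import bisect_right
from typing import List, Sequence


def empirical_cdf(samples: Sequence[float], values: Sequence[float]) -> List[float]:
    """Exact fraction of samples <= v, for each query value v.

    Reference computation only; a t-digest estimates this without
    storing every sample.
    """
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("samples must not be empty")
    n = len(ordered)
    # bisect_right gives the number of samples that are <= v.
    return [bisect_right(ordered, v) / n for v in values]
```

On small data sets a helper like this can serve as a sanity check against the replies from TDIGEST.CDF, keeping in mind that the digest's answers are estimates.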
TDIGEST.QUANTILE key quantile [quantile ...]
Returns the estimated values at all provided quantiles. Each quantile must be a double between 0 and 1.
Reply: double array, or nil if key is missing
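Since quantiles must lie between 0 and 1 and the reply may be nil, a caller may want to validate input and handle the missing-key case explicitly. The sketch below shows one way to do that; tdigest_quantile_args and decode_double_array are hypothetical helpers, and the assumption that a client hands back doubles as bytes, strings, or floats depends on the client library you use.

```python
from typing import List, Optional, Sequence


def tdigest_quantile_args(key: str, quantiles: Sequence[float]) -> List[str]:
    """Build arguments for: TDIGEST.QUANTILE key quantile [quantile ...]."""
    if not quantiles:
        raise ValueError("at least one quantile is required")
    for q in quantiles:
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"quantile {q!r} is outside [0, 1]")
    return ["TDIGEST.QUANTILE", key] + [repr(float(q)) for q in quantiles]


def decode_double_array(reply) -> Optional[List[float]]:
    """Convert a double-array reply to floats; None means the key is missing."""
    if reply is None:
        return None
    decoded = []
    for item in reply:
        if isinstance(item, bytes):  # some clients return raw bytes
            item = item.decode("ascii")
        decoded.append(float(item))
    return decoded
```

The same decoding step would apply to TDIGEST.CDF, whose reply is documented with the same shape.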
TDIGEST.DEBUG key
Prints debug information about the t-digest.
Reply: bulk strings array
The reply is of the form:
1) TDIGEST (<compression>, <num_centroids>, <memory size>)
2) CENTROID (<mean>, <weight>)
3) CENTROID (<mean>, <weight>)
4) CENTROID (<mean>, <weight>)
5) ...
Centroids are printed in sorted order with respect to their mean.
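For inspecting a digest programmatically, the reply form above can be parsed into a header and a list of centroids. This is a rough sketch that assumes each line looks exactly like the template shown, with comma-separated fields inside parentheses; the exact number formatting produced by the module is not specified here, so the header fields are kept as raw strings. The function name parse_debug_reply is hypothetical.

```python
import re
from typing import List, Optional, Sequence, Tuple

# Matches "TDIGEST (...)" or "CENTROID (...)" and captures the inner fields.
_LINE = re.compile(r"^(TDIGEST|CENTROID)\s*\((.*)\)\s*$")


def parse_debug_reply(lines: Sequence) -> Tuple[Optional[Tuple[str, ...]], List[Tuple[float, float]]]:
    """Split a TDIGEST.DEBUG reply into (header_fields, [(mean, weight), ...])."""
    header = None
    centroids = []
    for raw in lines:
        text = raw.decode() if isinstance(raw, bytes) else raw
        match = _LINE.match(text.strip())
        if match is None:
            raise ValueError(f"unrecognized debug line: {text!r}")
        kind, body = match.groups()
        fields = [field.strip() for field in body.split(",")]
        if kind == "TDIGEST":
            header = tuple(fields)  # compression, num_centroids, memory size
        else:
            mean, weight = fields
            centroids.append((float(mean), float(weight)))
    return header, centroids
```

Because centroids are documented as sorted by mean, a caller could additionally check that the parsed means are non-decreasing.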
Testing
The integration tests require a running Redis server, so you must have redis-server on your PATH or pass its location in an environment variable called REDIS_SERVER. Tests are written in Python and use the pytest unit testing library.
make test
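If the tests fail to start, it can help to check which redis-server binary would be picked up. The sketch below is not taken from this project's test suite; it mirrors the two options described above, and giving REDIS_SERVER precedence over PATH is an assumption. The function name find_redis_server is hypothetical.

```python
import os
import shutil
from typing import Mapping, Optional


def find_redis_server(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the redis-server location, or None if it cannot be found.

    Assumed order: the REDIS_SERVER variable first, then a PATH lookup.
    """
    env = os.environ if env is None else env
    explicit = env.get("REDIS_SERVER")
    if explicit:
        return explicit
    # shutil.which searches the given PATH string for the executable.
    return shutil.which("redis-server", path=env.get("PATH"))
```

Calling find_redis_server() with no arguments inspects the current process environment.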
Contributing
Bug reports, feature requests, and pull requests are welcome! Please add tests for any non-trivial changes you submit.