Alternative faster t-digest implementation

Open. Jackmrzhou opened this issue 3 months ago • 1 comment

Hi, thanks for the t-digest implementation for Python! I used it for my work and found that, in the end, computing and merging t-digests became the bottleneck. So I read the original paper and implemented another version of it (using the algorithm in the paper), and found the performance is better (around 50-100 times faster). I think the improvement comes from keeping a buffer and merging hundreds of values into the t-digest at once. Could I open a PR against this repo to add it as an alternative implementation, so I can use it in my day-to-day work? Thanks.

Jackmrzhou, Apr 01 '24 13:04
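
The buffering idea described in the issue can be sketched roughly as follows. This is not Jackmrzhou's implementation and not this repo's API: the class name `BufferedTDigest`, the `buffer_size` parameter, and all defaults are hypothetical, chosen only for illustration. It assumes the k1 scale function from the t-digest paper and folds buffered values into the existing centroids with a single sorted pass, instead of updating the digest once per value.

```python
import math


class BufferedTDigest:
    """Toy merging t-digest: buffer raw values, then fold them in with one sorted pass."""

    def __init__(self, compression=100.0, buffer_size=500):
        self.compression = compression  # larger -> more centroids, better accuracy
        self.buffer_size = buffer_size  # hypothetical default, not from any library
        self.centroids = []             # sorted list of (mean, weight)
        self.buffer = []                # unmerged (value, weight) pairs
        self.total = 0.0                # total weight already merged into centroids

    def update(self, x, w=1.0):
        """Append to the buffer; the expensive merge runs only when it fills up."""
        self.buffer.append((float(x), float(w)))
        if len(self.buffer) >= self.buffer_size:
            self._flush()

    def _k(self, q):
        # Scale function k1: steep near q=0 and q=1, so tail centroids stay small.
        q = min(1.0, max(0.0, q))
        return self.compression / (2 * math.pi) * math.asin(2 * q - 1)

    def _flush(self):
        if not self.buffer:
            return
        # Treat old centroids and buffered points alike, sorted by mean.
        items = sorted(self.centroids + self.buffer)
        self.buffer = []
        total = sum(w for _, w in items)
        merged = []
        mean, weight = items[0]
        done = 0.0                      # weight held by finished centroids
        k_left = self._k(0.0)
        for m, w in items[1:]:
            q_right = (done + weight + w) / total
            if self._k(q_right) - k_left <= 1.0:
                # Still within one unit of k: absorb into the current centroid.
                weight += w
                mean += (m - mean) * w / weight
            else:
                merged.append((mean, weight))
                done += weight
                k_left = self._k(done / total)
                mean, weight = m, w
        merged.append((mean, weight))
        self.centroids = merged
        self.total = total

    def merge(self, other):
        """Merge another digest by pushing its centroids through the same pass."""
        self.buffer.extend(other.centroids)
        self.buffer.extend(other.buffer)
        self._flush()

    def quantile(self, q):
        """Estimate a quantile by interpolating between centroid midpoints."""
        self._flush()
        if not self.centroids:
            raise ValueError("empty digest")
        target = q * self.total
        cum = 0.0
        prev_pos, prev_mean = 0.0, self.centroids[0][0]
        for mean, weight in self.centroids:
            pos = cum + weight / 2.0    # centroid mean sits at its weight midpoint
            if target <= pos:
                frac = (target - prev_pos) / (pos - prev_pos)
                return prev_mean + frac * (mean - prev_mean)
            prev_pos, prev_mean = pos, mean
            cum += weight
        return self.centroids[-1][0]
```

With this layout, `update` is a plain list append most of the time, and the sort-and-merge cost is paid once per `buffer_size` values. That amortization is one plausible source of the kind of speedup reported above, though the actual gain depends on the implementation being compared against.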

Hi @Jackmrzhou, unfortunately this repo isn't actively updated. You can still open a PR, but I can't promise a good review or testing, or even that it will be merged. Alternatively, you could fork this repo, PR to your own fork, and pip-install that forked version.

CamDavidsonPilon, Apr 01 '24 13:04