[Suggestion] Small exports for distributed programs

Open SGrondin opened this issue 8 years ago • 3 comments

Hi,

This is a fork used in large distributed programs where I work. It adds a Distributable class that inherits from Digest. The purpose of that class is to minimize the size of the exported state (toArray) so that a node wanting to read a percentile value can fetch lots of small internal states from each node and recompute the percentile quickly.

It implements toList(), a more compact version of toArray(). Each centroid is stored as a two-element array instead of an object, which saves the space otherwise spent repeating the mean: ..., n: ... keys for every centroid. The centroids can be pushed back into a new Distributable instance using .push(centroid[0], centroid[1]).
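To make the size difference concrete, here is a rough sketch of the idea, not the fork's actual code. The helper names `toList`, `fromList`, and `mergeInto`, the `[mean, n]` ordering, and the sample centroids are assumptions for illustration; the sink is a stand-in for anything exposing `.push(mean, n)`, such as a new Distributable instance.

```javascript
// Sample centroids in the verbose object shape that toArray() is described as exporting.
// The values are made up for illustration.
const centroids = [
  { mean: 1.5, n: 2 },
  { mean: 4.0, n: 1 },
  { mean: 9.25, n: 4 },
];

// Hypothetical helper: convert each {mean, n} object into a [mean, n] pair.
function toList(verbose) {
  return verbose.map((c) => [c.mean, c.n]);
}

// Hypothetical helper: rebuild the verbose shape from the compact pairs.
function fromList(list) {
  return list.map(([mean, n]) => ({ mean, n }));
}

// Hypothetical helper: feed the compact lists fetched from several nodes into one sink.
// The sink only needs a push(mean, n) method.
function mergeInto(sink, lists) {
  for (const list of lists) {
    for (const centroid of list) {
      sink.push(centroid[0], centroid[1]);
    }
  }
  return sink;
}

const compact = toList(centroids);

// Compare the serialized sizes of the two representations.
const verboseBytes = JSON.stringify(centroids).length;
const compactBytes = JSON.stringify(compact).length;
console.log("verbose:", verboseBytes, "compact:", compactBytes);

// Stand-in sink that just records what it receives and the total weight.
const recorder = {
  pairs: [],
  total: 0,
  push(mean, n) {
    this.pairs.push([mean, n]);
    this.total += n;
  },
};
mergeInto(recorder, [compact, compact]);
console.log("centroids received:", recorder.pairs.length, "total weight:", recorder.total);
```

The saving comes entirely from dropping the repeated key names, so it grows with the number of centroids and with the number of nodes being polled.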

I have no idea if this would be useful to you or anyone else, but I'm opening this PR in case you find it interesting and/or want to merge it.

SGrondin avatar Feb 28 '16 16:02 SGrondin

Thanks! I'll give it a look later today.

welch avatar Feb 29 '16 20:02 welch

To be honest, I think it's a very narrow use case, the settings are hardcoded, and there are no tests. I would be surprised if you merged it as is. I opened the PR because when I do work on top of open source software, I like to show the author how it's being used in case it gives them ideas :)

SGrondin avatar Mar 01 '16 21:03 SGrondin

As you say, it's not mergeable code (I'd hit you up for unit tests at least). But it's a pretty classy way to submit a feature request :smile:

I'll take this on so I can get you back to running the main line.

Thanks!

welch avatar Mar 02 '16 17:03 welch