tdigestc
tdigestc copied to clipboard
possible bug in td_trimmed_mean
https://github.com/ajwerner/tdigestc/blob/a2a61e660188bffabf687f2fce1785439c994936/go/tdigest.c#L232
I believe this condition
if (n->count < left_tail_count) {
continue;
}
should really be
if (count_seen + n->count < left_tail_count) {
continue;
}
otherwise the code may not consider some relevant centroids.
Good catch, that's pretty egregious.
Not that explaining the source of bugs is all that valuable, the source of this bug is that I ported td_trimmed_mean back from https://github.com/ajwerner/tdigest/blob/9165f388a6b2c88703b9630c9d15e59d2d212d7e/internal/tdigest/tdigest.go#L17. In the go repo, which I've invested a bit more into than this one, the counts maintain a cumulative count value when used for reading. That cumulative value enables binary search for td_value_at.
Understood. I've made plenty such bugs ;-)
Hi, did this get fixed?
Doesn't look like it.