tdigestc icon indicating copy to clipboard operation
tdigestc copied to clipboard

possible bug in td_trimmed_mean

Open tvondra opened this issue 4 years ago • 5 comments

https://github.com/ajwerner/tdigestc/blob/a2a61e660188bffabf687f2fce1785439c994936/go/tdigest.c#L232

I believe this condition

          if (n->count < left_tail_count) {
               continue;
          }

should really be

          if (count_seen + n->count < left_tail_count) {
               continue;
          }

otherwise the code may not consider some relevant centroids.

tvondra avatar Dec 27 '20 17:12 tvondra

Good catch, that's pretty egregious.

ajwerner avatar Dec 27 '20 19:12 ajwerner

Not that explaining the source of bugs is all that valuable, the source of this bug is that I ported td_trimmed_mean back from https://github.com/ajwerner/tdigest/blob/9165f388a6b2c88703b9630c9d15e59d2d212d7e/internal/tdigest/tdigest.go#L17. In the go repo, which I've invested a bit more into than this one, the counts maintain a cumulative count value when used for reading. That cumulative value enables binary search for td_value_at.

ajwerner avatar Dec 27 '20 19:12 ajwerner

Understood. I've made plenty such bugs ;-)

tvondra avatar Jan 02 '21 20:01 tvondra

Hi, did this get fixed?

kdarby-cqg avatar Aug 25 '23 01:08 kdarby-cqg

Doesn't look like it.

ajwerner avatar Aug 25 '23 02:08 ajwerner