RedisBloom

TDIGEST.QUANTILE returns NaN when there is only one observation

Open OvervCW opened this issue 7 months ago • 3 comments

To reproduce:

> tdigest.create foo
OK
> tdigest.add foo 123
OK
> tdigest.quantile foo 0.9
1) "nan"

Isn't any quantile of a single observation the observation itself?
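
For comparison outside Redis, here is a minimal Python sketch of the nearest-rank quantile definition. The function name `nearest_rank_quantile` is hypothetical, and this is only one of several textbook definitions, not a description of how RedisBloom's t-digest computes `TDIGEST.QUANTILE`. Under this definition, a sample with a single observation yields that observation for every quantile.

```python
import math


def nearest_rank_quantile(values, q):
    """Nearest-rank quantile (illustrative, not RedisBloom's algorithm).

    Returns the smallest observation x such that at least a fraction q
    of the observations are <= x.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    if not values:
        # No data at all: nothing sensible to return.
        return math.nan
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so that q == 0 maps to the minimum.
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


single = nearest_rank_quantile([123], 0.9)
repeated = nearest_rank_quantile([123, 123], 0.9)
```

Here both `single` and `repeated` evaluate to `123`, which is the behaviour the question above expects from a single observation.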

OvervCW avatar Apr 24 '25 10:04 OvervCW

From a theoretical perspective, given just one observation, we can't define the value that is smaller than x% of the observations. I believe nan is a good reply. Think of it as inestimable or incalculable.

LiorKogan avatar Apr 24 '25 12:04 LiorKogan

I think this is more a limitation of the algorithm used here than a sign that the answer does not exist. For example, Redis does produce a result when there are multiple identical observations, for which "smaller than x%" is also not defined.

> tdigest.create foo
OK
> tdigest.add foo 123
OK
> tdigest.add foo 123
OK
> tdigest.quantile foo 0.9
1) "123"

OvervCW avatar May 07 '25 08:05 OvervCW

This depends on the definition. Suppose you have an ordered list of 5 observations: (123, 123, 123, 123, 123). You can define the 40th percentile as the value that is smaller than or equal to 40% of the values in this ordered list. That's not the accurate definition, but more of an intuition of how it works.
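
One way to make this intuition concrete is to compute the fraction of observations at or below a candidate value. The sketch below assumes one common reading of the definition (the p-th percentile is the smallest x such that at least p% of the observations are <= x); the helper name `fraction_at_or_below` is hypothetical, and none of this reflects RedisBloom's internals.

```python
def fraction_at_or_below(values, x):
    """Empirical CDF: the share of observations that are <= x."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(1 for v in values if v <= x) / len(values)


five = [123, 123, 123, 123, 123]
one = [123]

# Share of each sample that is <= 123, and <= a slightly smaller value.
five_at_123 = fraction_at_or_below(five, 123)
five_at_122 = fraction_at_or_below(five, 122)
one_at_123 = fraction_at_or_below(one, 123)
one_at_122 = fraction_at_or_below(one, 122)

# 123 qualifies as the 40th percentile when its share reaches 0.4
# and no smaller value does.
five_qualifies = five_at_123 >= 0.4 and five_at_122 < 0.4
one_qualifies = one_at_123 >= 0.4 and one_at_122 < 0.4
```

Under this assumed reading, the check comes out the same for the five-element list and for the single observation, so whether `nan` is appropriate depends on which definition the implementation follows.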

LiorKogan avatar May 07 '25 12:05 LiorKogan