RedisBloom
TDIGEST.QUANTILE returns NaN when there is only one observation
To reproduce:
> tdigest.create foo
OK
> tdigest.add foo 123
OK
> tdigest.quantile foo 0.9
1) "nan"
Isn't any quantile of a single observation the observation itself?
From a theoretical perspective, given just one observation, we can't define a value that is smaller than x% of the observations. I believe nan is a good reply; think of it as inestimable or incalculable.
I think this is more a limitation of the algorithm used here than a sign that the answer does not exist. For example, Redis does produce a result when there are multiple identical observations, for which "smaller than x%" is also not defined.
> tdigest.create foo
OK
> tdigest.add foo 123
OK
> tdigest.add foo 123
OK
> tdigest.quantile foo 0.9
1) "123"
This depends on the definition. Suppose you have an ordered list of 5 observations, (123, 123, 123, 123, 123). You can define the 40th percentile as the value such that 40% of the values in this ordered list are smaller than or equal to it. That's not the exact definition, but more of an intuition for how it works.
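To make the "smaller than or equal" intuition concrete, here is a minimal Python sketch of the nearest-rank quantile, one common textbook definition. It is not how RedisBloom's t-digest computes quantiles; the function name `nearest_rank_quantile` is made up for illustration. Under this definition a single observation is a valid input, and every quantile of it is the observation itself.

```python
import math


def nearest_rank_quantile(values, q):
    """Return the smallest observation x such that at least a
    fraction q of the observations are <= x (nearest-rank method).

    Illustrative only: this is NOT the t-digest algorithm.
    """
    if not values:
        raise ValueError("need at least one observation")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    # 1-based rank; clamp to 1 so that q = 0 maps to the minimum.
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Five identical observations: every quantile is 123.
print(nearest_rank_quantile([123, 123, 123, 123, 123], 0.4))  # 123

# A single observation: the definition still applies.
print(nearest_rank_quantile([123], 0.9))  # 123
```

With this definition the single-observation case needs no special handling, which is the point raised earlier in the thread: whether the answer is nan or 123 depends on the definition the implementation chooses.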