pymc
Log-probability derivation for arbitrary order statistics (for i.i.d. [univariate] random variables)
Description
Given an i.i.d. sample of univariate random variables $X_1, \dots, X_n$ with probability density function $f_X(x)$ and cumulative distribution function $F_X(x)$, the $j$th order statistic is denoted by $X_{(j)}$ and its probability density function is the following:
$$f_{X_{(j)}}(x) = \frac{n!}{(j - 1)!(n - j)!} f_X(x) \left[ F_{X}(x) \right]^{j - 1} \left[ 1 - F_{X}(x) \right]^{n - j} .$$
With the minimum and maximum statistics represented by $X_{(1)}$ and $X_{(n)}$ respectively, PyMC is already capable of deriving their log-probability densities (#6790, #6846), and this issue directly extends that line of work to arbitrary $1 \leq j \leq n$.
Wikipedia reference: https://en.wikipedia.org/wiki/Order_statistic
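To make the target concrete, here is a minimal sketch of the log-density above in plain Python. It is not PyMC's derivation machinery: the function name `order_stat_logp` and the convention of passing `logpdf`, `logcdf` and `logsf` callables are assumptions made for illustration only. The Exponential(1) helpers are just a convenient test distribution.

```python
import math


def order_stat_logp(x, j, n, logpdf, logcdf, logsf):
    """Log-density of the j-th order statistic (1-indexed) of n i.i.d. draws.

    logpdf, logcdf and logsf are callables returning log f_X(x),
    log F_X(x) and log(1 - F_X(x)) for the base distribution.
    """
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    # log( n! / ((j - 1)! (n - j)!) ) via log-gamma to avoid overflow
    log_const = math.lgamma(n + 1) - math.lgamma(j) - math.lgamma(n - j + 1)
    # Skip zero-weight terms so that 0 * (-inf) never produces NaN
    cdf_term = (j - 1) * logcdf(x) if j > 1 else 0.0
    sf_term = (n - j) * logsf(x) if j < n else 0.0
    return log_const + logpdf(x) + cdf_term + sf_term


# Base distribution used for illustration: Exponential(1)
def exp_logpdf(x):
    return -x


def exp_logcdf(x):
    return math.log(-math.expm1(-x))  # log(1 - exp(-x))


def exp_logsf(x):
    return -x


# j = 1 is the minimum, j = n is the maximum
logp_min = order_stat_logp(0.5, 1, 4, exp_logpdf, exp_logcdf, exp_logsf)
logp_max = order_stat_logp(0.5, 4, 4, exp_logpdf, exp_logcdf, exp_logsf)
```

For `j = 1` this reduces to the known result that the minimum of `n` Exponential(1) draws is Exponential(`n`), and for `j = n` it reduces to $n f_X(x) [F_X(x)]^{n-1}$, which matches the min/max cases already handled.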
CC @ricardoV94 @Dhruvanshu-Joshi
The challenge is to represent this with PyTensor. Max and min are easy, because there are Ops for them.
Then one could do sort(x)[idx], with idx == 0 or idx == -1 corresponding to min and max (for an ascending sort), but intermediate results would depend on the length of x. We need a pytensor.quantile anyway, and that would be a good candidate for how to represent orders in PyTensor: https://github.com/pymc-devs/pytensor/issues/53
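As a rough illustration of the sort-and-index idea, here is a sketch using NumPy as a stand-in for PyTensor. The helper name `order_statistic` is hypothetical, and how such a graph would actually be represented and matched in PyTensor is exactly the open question here.

```python
import numpy as np


def order_statistic(x, j):
    """Return the j-th order statistic (1-indexed) of a 1-D array."""
    x = np.asarray(x)
    n = x.shape[0]
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    # An ascending sort puts the minimum at index 0 and the maximum at index -1
    return np.sort(x)[j - 1]


x = np.array([3.0, 1.0, 2.0, 5.0, 4.0])
smallest = order_statistic(x, 1)         # same as x.min()
largest = order_statistic(x, len(x))     # same as x.max()
# The index of the middle element can only be written in terms of len(x)
middle = order_statistic(x, (len(x) + 1) // 2)
```

The min and max cases can be written with fixed indices (0 and -1), while something like the median needs an index computed from the length of `x`.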