Log-probability derivation for arbitrary order statistics (for i.i.d. [univariate] random variables)

Open larryshamalama opened this issue 1 year ago • 1 comments

Description

Given an i.i.d. sample of univariate random variables $X_1, \dots, X_n$ with probability density function $f_X(x)$ and cumulative distribution function $F_X(x)$, the $j$th order statistic is denoted by $X_{(j)}$ and its probability density function is:

$$f_{X_{(j)}}(x) = \frac{n!}{(j - 1)!(n - j)!} f_X(x) \left[ F_{X}(x) \right]^{j - 1} \left[ 1 - F_{X}(x) \right]^{n - j} .$$
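As a quick numerical sanity check of this density, here is a minimal NumPy/SciPy sketch. It is not PyMC's implementation: the helper name `order_statistic_logpdf` is hypothetical, and it assumes the normalizing coefficient $n! / [(j - 1)!(n - j)!]$. For Uniform(0, 1) draws, the $j$th order statistic follows a Beta($j$, $n - j + 1$) distribution, which gives an independent reference to compare against.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln


def order_statistic_logpdf(x, j, n, dist):
    """Log-density of the j-th order statistic (1 <= j <= n) of n i.i.d. draws.

    `dist` is a frozen scipy.stats continuous distribution.
    Hypothetical helper for illustration, not part of PyMC.
    """
    # log of n! / ((j - 1)! (n - j)!)
    log_coef = gammaln(n + 1) - gammaln(j) - gammaln(n - j + 1)
    return (
        log_coef
        + dist.logpdf(x)
        + (j - 1) * dist.logcdf(x)  # j - 1 draws fall below x
        + (n - j) * dist.logsf(x)   # n - j draws fall above x
    )


x = np.linspace(0.05, 0.95, 7)
n, j = 5, 2
ours = order_statistic_logpdf(x, j, n, stats.uniform())
reference = stats.beta(j, n - j + 1).logpdf(x)
print(np.allclose(ours, reference))
```

Working on the log scale with `logcdf` and `logsf` avoids underflow when $F_X(x)$ is close to 0 or 1.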

With the minimum and maximum represented by $X_{(1)}$ and $X_{(n)}$, respectively, PyMC can already derive their log-probability densities (#6790, #6846); this issue directly extends that line of work to arbitrary $1 \leq j \leq n$.
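For reference, the two cases already covered are recovered from the order-statistic density (with coefficient $n! / [(j - 1)!(n - j)!]$) by setting $j = 1$ and $j = n$:

$$f_{X_{(1)}}(x) = n \, f_X(x) \left[ 1 - F_X(x) \right]^{n - 1}, \qquad f_{X_{(n)}}(x) = n \, f_X(x) \left[ F_X(x) \right]^{n - 1} .$$

On the log scale, the general case adds only the log of the combinatorial coefficient and one extra log-CDF or log-survival term:

$$\log f_{X_{(j)}}(x) = \log \frac{n!}{(j - 1)!(n - j)!} + \log f_X(x) + (j - 1) \log F_X(x) + (n - j) \log \left[ 1 - F_X(x) \right] .$$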

Wikipedia reference: https://en.wikipedia.org/wiki/Order_statistic

CC @ricardoV94 @Dhruvanshu-Joshi

larryshamalama • Jan 29 '24 16:01

The challenge is to represent this in PyTensor. Max and min are easy, because there are Ops for them.

Then one could do sort(x)[idx], with idx == 0 or idx == -1 corresponding to min and max, but the index for intermediate order statistics would depend on the length of x. We need a pytensor.quantile anyway, and that would be a good candidate for how to represent orders in PyTensor: https://github.com/pymc-devs/pytensor/issues/53
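To illustrate the indexing point, here is a small sketch that uses NumPy as a stand-in for a PyTensor graph (whether the same expression is the right representation in PyTensor is exactly the open question, so treat this as an assumption). The helper `order_statistic` is hypothetical. With an ascending sort, the minimum and maximum sit at the fixed indices 0 and -1, while the rank of something like the sample median has to be computed from the length of `x`.

```python
import numpy as np


def order_statistic(x, j):
    """Return X_(j), the j-th smallest value of a 1-D sample (1-based j)."""
    return np.sort(x)[j - 1]  # ascending sort, so index 0 is the minimum


rng = np.random.default_rng(0)
for n in (5, 11):
    x = rng.normal(size=n)
    x_sorted = np.sort(x)
    # Min and max use the same index regardless of n.
    assert x_sorted[0] == x.min() and x_sorted[-1] == x.max()
    # The median's rank (for odd n) changes with the sample size.
    median_j = (n + 1) // 2
    print(n, median_j, order_statistic(x, median_j))
```

A quantile-style Op sidesteps this by taking a fraction in [0, 1] rather than an integer index tied to the sample size.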

ricardoV94 • Feb 05 '24 10:02