filterpy.stats.rand_student_t calculation question
I came across this function while reading https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/03-Gaussians.ipynb. It is copy pasted into the book:
def rand_student_t(df, mu=0, std=1):
"""return random number distributed by Student's t
distribution with `df` degrees of freedom with the
specified mean and standard deviation.
"""
x = random.gauss(0, std)
y = 2.0*random.gammavariate(0.5*df, 2.0)
return x / (math.sqrt(y / df)) + mu
This function is used to generate 5000 sample variety and then plots it into a graph:
def sense_t():
return 10 + rand_student_t(7)*2
zs = [sense_t() for i in range(5000)]
plt.plot(zs, lw=1);
I wanted to replace filterpy implementation with scipy.stats.t this way:
from scipy.stats import t
zs = t.rvs(df=7, loc=10, scale=2, size=5000)
plt.plot(zs, lw=1)
which should, if my understanding is correct, produce an equivalent result (sample from Student's t-distributed random variable with mean=10, std=2, and 7 degrees of freedom).
However when looking at the plots, it seems that filterpy version gives a bit "tighter" spread with shorter outliers. When looking into the filterpy implementation, I understood that it uses chi-square and gamma distributed random variables to express the Student's t-distributed one:
T = Z(mu, std) / sqrt(V(df)/df)
V = G(df/2, 2)
where Z(mu, std) is normally distributed random variable, and V = G(df/2, 2) is gamma distributed random variable (using scale notation).
This should be fine except the filterpy code uses V = 2*G(df/2, 2). Incidentally, when I remove multiplication by 2 from the formula, it seems to closer correspond to scipy.stats.t sample. So I wonder from where the multiplication by 2 comes?
I honestly have no idea, and believe you are correct!