best
best copied to clipboard
Sigma ≠ standard deviation ⇒ introduce Model v2
This PR fixes a bug in the model specification, which is present in the original publication and the R package as well.
The standard deviation of a t-distribution is not σ. Instead, variance = nu / (nu - 2) * sigma ** 2 if nu > 2, and variance = ∞ if 1 < nu <= 2. Therefore, any posterior sample nu <= 2 assigns a non-zero probability that the variance is ∞, which in real life is never the case.
Proof that SD is underestimated by the R package (Python output is essentially the same):
> require(BEST)
Loading required package: BEST
Loading required package: HDInterval
> N=1000
> testDataDrug = rt(N, df=4) # sigma = 1
> sd(testDataDrug)
[1] 1.417539
> BESTout = BESTmcmc(testDataDrug)
Waiting for parallel processing to complete...done.
> mean(BESTout$sigma)
[1] 1.006396
> mean(sqrt(BESTout$nu / (BESTout$nu - 2)) * BESTout$sigma)
[1] 1.47081
Samples of nu close to 2 are also problematic, as they blow up the SD estimation. Therefore I changed the sampling of nu as such:
- originally: ν ~ Exponential(1/29) + 1
- now: ν ~ Exponential(1/27.5) + 2.5
( t(df=2.5) is very similar to t(df=2) when it's not extremely far from the mean.)
The other change concerns the sampling of σ.
- Originally, σ was uniformly distributed between σ^ / 1000 and σ^ * 1000 (where σ^ is the sample standard deviation), meaning the prior probability of σ > σ^ was 1000 times that of σ < σ^. This caused an overestimation of σ at small sample sizes.
- Now, log(σ) is distributed uniformly between log( σ^ / 1000 ) and log( σ^ * 1000 ).
As a consequence, reduced the displayed limits of the std.devs in plot_posterior and plot_all to 99.5% of the samples.
(This is part 3 of 4 pull requests that build on top of one another.)