HypothesisTests.jl
Is Anderson-Darling correct with fitted distributions?
Hi!
I guess that OneSampleADTest(iris[:SepalWidth], Normal())
and OneSampleADTest(iris[:SepalWidth], fit(Normal,iris[:SepalWidth]))
should give the same output, but the output using the fitted distribution is different. Is that correct?
julia> OneSampleADTest(iris[:SepalWidth], Normal())
One sample Anderson-Darling test
--------------------------------
Population details:
parameter of interest: not implemented yet
value under h_0: NaN
point estimate: NaN
Test summary:
outcome with 95% confidence: reject h_0
two-sided p-value: 0.020226513622600077 (significant)
Details:
number of observations: 150
sample mean: 3.0573333333333337
sample SD: 0.4358662849366982
A² statistic: 0.907955047114541
julia> OneSampleADTest(iris[:SepalWidth], fit(Normal,iris[:SepalWidth]))
One sample Anderson-Darling test
--------------------------------
Population details:
parameter of interest: not implemented yet
value under h_0: NaN
point estimate: NaN
Test summary:
outcome with 95% confidence: reject h_0
two-sided p-value: 0.0 (extremely significant)
Details:
number of observations: 150
sample mean: 3.0573333333333337
sample SD: 0.4358662849366982
A² statistic: 3041.038575549485
Best,
The Anderson-Darling statistic weights the tails of the distribution more heavily. So when you shifted the distribution, the A² statistic grew larger as the differences became more prominent.
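One way to see the tail weighting: A² can be written as a weighted squared distance between the empirical CDF F_n and the hypothesized CDF F,

$$A^2 = n \int_{-\infty}^{\infty} \frac{\left(F_n(x) - F(x)\right)^2}{F(x)\left(1 - F(x)\right)} \, dF(x),$$

and the weight 1/[F(x)(1 - F(x))] blows up as F(x) approaches 0 or 1, i.e. in the tails.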
Normal() is Normal(0,1), and fit(Normal, iris[:SepalWidth]) is Normal{Float64}(3.057333333333334, 0.43441096773549437), so why would you expect the test statistic to be the same?
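Concretely (a quick check with Distributions.jl, using the same iris column as above; the printed form may differ across versions):

using Distributions

Normal()                        # standard normal: μ = 0, σ = 1
fit(Normal, iris[:SepalWidth])  # ≈ Normal(3.0573, 0.4344), estimated from the data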
OneSampleADTest does a zscore transformation of the data. It's really testing whether zscore(x), instead of x, comes from the distribution d. zscore(iris[:SepalWidth]) comes from Normal(0,1), but iris[:SepalWidth] should come from Normal{Float64}(3.057333333333334, 0.43441096773549437)... Is that correct?
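A quick check of that reading (zscore comes from StatsBase; same iris column as above):

using Statistics, StatsBase

z = zscore(iris[:SepalWidth])
mean(z), std(z)   # ≈ (0.0, 1.0): the standardized data is on the scale of Normal(0,1),
                  # not of the fitted Normal(3.0573, 0.4344)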
Another thing about OneSampleADTest: the p value is calculated according to "D'Agostino, Goodness-of-Fit Techniques, 1986, p. 127", which gives p values for testing normality. That book has other tables with p values for other distributions. So the reported p values are misleading if the distribution isn't Normal.
@diegozea It is true that the data is transformed, but the transformed data is compared to the input distribution in https://github.com/JuliaStats/HypothesisTests.jl/blob/e58291eb9edee9522842785bfc07414f1cf5a175/src/anderson_darling.jl#L19. I guess the implementation of the test only makes sense when the supplied distribution has zero mean and unit variance.
There are a few things odd about the current implementation. It doesn't work with non-standard normal distributions, for one, and I think the statistic is defined incorrectly in the first place: https://github.com/JuliaStats/HypothesisTests.jl/blob/master/src/anderson_darling.jl#L16 squares i, but I think that should be 2i. The implementation also divides by n much more than it needs to (it can be factored out) and risks roundoff errors due to sequential addition to an accumulator (sum would mitigate this by doing pairwise addition).
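For reference, a minimal sketch of that computation (my own function name, not the package's code), using the standard formula A² = -n - (1/n) Σ (2i - 1)[ln F(y_i) + ln(1 - F(y_{n+1-i}))] over the sorted sample y, with the division by n applied once at the end and sum doing the (pairwise) reduction:

using Distributions

# Sketch only: y is the sorted sample, F(t) = cdf(d, t).
function ad_statistic(x::AbstractVector{<:Real}, d::ContinuousUnivariateDistribution)
    y = sort(x)
    n = length(y)
    s = sum(i -> (2i - 1) * (log(cdf(d, y[i])) + log(1 - cdf(d, y[n + 1 - i]))), 1:n)
    return -n - s / n
end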
"squares i, but I think that should be 2i."

Where do you see i being squared? That line has i+i, which is 2i.
D'oh, sorry. I was seeing things.
If this test only works properly with normal distributions, the function should be defined only for ::Normal, and not all ::UnivariateDistributions.

Correction: it appears to work alright for all continuous distributions; discrete distributions produce nonsensical p values. So that would be ::ContinuousUnivariateDistribution.
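A sketch of what that restriction could look like (hypothetical name and body, not the package's actual definition):

using Distributions

# Hypothetical: accept only continuous univariate distributions, so that a
# discrete distribution fails with a MethodError instead of producing a
# nonsensical p value.
function one_sample_ad(x::AbstractVector{<:Real}, d::ContinuousUnivariateDistribution)
    # actual test construction would go here
    return (x, d)
end

one_sample_ad(randn(100), Normal())            # dispatches
# one_sample_ad(rand(1:6, 100), Poisson(3.0))  # MethodError: no matching method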