statsmodels
TST/Maint test failures in stats.statstools in pre-testing
https://dev.azure.com/statsmodels/statsmodels-testing/_build/results?buildId=4553&view=logs&j=3618ddf5-2391-5260-f9eb-8f35f95d270b&t=a24edd62-b2b9-54c8-9892-98a88509521b
There are also bugs in descriptivestats that look pandas-related; I have no idea about those.
The statstools failures look scipy-related.
So far I have only checked shapiro, which looks like a scipy bug: https://github.com/scipy/scipy/issues/16426
The adnorm failure might be a consequence of the data being changed by the shapiro bug.
The ncf failure might be due to https://github.com/scipy/scipy/pull/15763; I don't know why precision is reduced now.
2022-06-15T13:59:37.3464366Z ________________________________ test_noncent_f ________________________________
2022-06-15T13:59:37.3465454Z [gw1] linux -- Python 3.10.4 /opt/hostedtoolcache/Python/3.10.4/x64/bin/python
2022-06-15T13:59:37.3465924Z
2022-06-15T13:59:37.3466385Z def test_noncent_f():
2022-06-15T13:59:37.3467193Z # F(4, 75) = 3.5, confidence level = .95, two-sided CI:
2022-06-15T13:59:37.3467832Z # > lof(3.5,4,75,.95)
2022-06-15T13:59:37.3468321Z # [1] 0.7781436 0.9750039
2022-06-15T13:59:37.3469520Z # > hif(3.5,4,75,.95)
2022-06-15T13:59:37.3470738Z # [1] 29.72949219 0.02499965
2022-06-15T13:59:37.3473171Z f_stat, df1, df2 = 3.5, 4, 75
2022-06-15T13:59:37.3474526Z
2022-06-15T13:59:37.3475236Z ci_nc = [0.7781436, 29.72949219]
2022-06-15T13:59:37.3476091Z res = _noncentrality_f(f_stat, df1, df2, alpha=0.05)
2022-06-15T13:59:37.3476707Z assert_allclose(res.confint, ci_nc, rtol=0.005)
2022-06-15T13:59:37.3477268Z # verify umvue unbiased
2022-06-15T13:59:37.3477770Z mean = stats.ncf.mean(df1, df2, res.nc)
2022-06-15T13:59:37.3478671Z assert_allclose(f_stat, mean, rtol=1e-8)
2022-06-15T13:59:37.3479736Z
2022-06-15T13:59:37.3480486Z > assert_allclose(stats.ncf.cdf(f_stat, df1, df2, res.confint),
2022-06-15T13:59:37.3482118Z [0.975, 0.025], rtol=1e-10)
2022-06-15T13:59:37.3482716Z E AssertionError:
2022-06-15T13:59:37.3483793Z E Not equal to tolerance rtol=1e-10, atol=0
2022-06-15T13:59:37.3484354Z E
2022-06-15T13:59:37.3484843Z E Mismatched elements: 2 / 2 (100%)
2022-06-15T13:59:37.3485608Z E Max absolute difference: 1.36293827e-06
2022-06-15T13:59:37.3486501Z E Max relative difference: 2.51509652e-05
2022-06-15T13:59:37.3487123Z E x: array([0.975001, 0.025001])
2022-06-15T13:59:37.3487634Z E y: array([0.975, 0.025])
2022-06-15T13:59:37.3487923Z
2022-06-15T13:59:37.3488461Z statsmodels/stats/tests/test_effectsize.py:50: AssertionError
2022-06-15T13:59:37.3489168Z _________________________________ test_shapiro _________________________________
2022-06-15T13:59:37.3490153Z [gw0] linux -- Python 3.10.4 /opt/hostedtoolcache/Python/3.10.4/x64/bin/python
2022-06-15T13:59:37.3490635Z
2022-06-15T13:59:37.3491072Z def test_shapiro():
2022-06-15T13:59:37.3491561Z #tests against R fBasics
2022-06-15T13:59:37.3492034Z #testing scipy.stats
2022-06-15T13:59:37.3492545Z from scipy.stats import shapiro
2022-06-15T13:59:37.3493117Z
2022-06-15T13:59:37.3494571Z st_pv_R = np.array([0.939984787255526, 0.239621898000460])
2022-06-15T13:59:37.3495385Z sh = shapiro(x)
2022-06-15T13:59:37.3496173Z assert_almost_equal(sh, st_pv_R, 4)
2022-06-15T13:59:37.3496637Z
2022-06-15T13:59:37.3497452Z #st is ok -7.15e-06, pval agrees at -3.05e-10
2022-06-15T13:59:37.3498459Z st_pv_R = np.array([5.799574255943298e-01, 1.838456834681376e-06 * 1e4])
2022-06-15T13:59:37.3499157Z sh = shapiro(x**2) * np.array([1, 1e4])
2022-06-15T13:59:37.3499722Z > assert_almost_equal(sh, st_pv_R, 5)
2022-06-15T13:59:37.3500227Z E AssertionError:
2022-06-15T13:59:37.3500750Z E Arrays are not almost equal to 5 decimals
2022-06-15T13:59:37.3501222Z E
2022-06-15T13:59:37.3501693Z E Mismatched elements: 2 / 2 (100%)
2022-06-15T13:59:37.3502216Z E Max absolute difference: 0.03937477
2022-06-15T13:59:37.3502779Z E Max relative difference: 1.55701398
2022-06-15T13:59:37.3503309Z E x: array([0.61933, 0.04701])
2022-06-15T13:59:37.3503812Z E y: array([0.57996, 0.01838])
2022-06-15T13:59:37.3504108Z
2022-06-15T13:59:37.3504645Z statsmodels/stats/tests/test_statstools.py:127: AssertionError
2022-06-15T13:59:37.3505362Z _________________________________ test_adnorm __________________________________
2022-06-15T13:59:37.3506343Z [gw0] linux -- Python 3.10.4 /opt/hostedtoolcache/Python/3.10.4/x64/bin/python
2022-06-15T13:59:37.3506815Z
2022-06-15T13:59:37.3507242Z def test_adnorm():
2022-06-15T13:59:37.3507836Z #tests against R fBasics
2022-06-15T13:59:37.3508300Z st_pv = []
2022-06-15T13:59:37.3509029Z st_pv_R = np.array([0.5867235358882148, 0.1115380760041617])
2022-06-15T13:59:37.3509600Z ad = normal_ad(x)
2022-06-15T13:59:37.3510119Z assert_almost_equal(ad, st_pv_R, 12)
2022-06-15T13:59:37.3510629Z st_pv.append(st_pv_R)
2022-06-15T13:59:37.3511080Z
2022-06-15T13:59:37.3511886Z st_pv_R = np.array([2.976266267594575e+00, 8.753003709960645e-08])
2022-06-15T13:59:37.3513196Z ad = normal_ad(x**2)
2022-06-15T13:59:37.3514374Z > assert_almost_equal(ad, st_pv_R, 11)
2022-06-15T13:59:37.3515082Z E AssertionError:
2022-06-15T13:59:37.3516000Z E Arrays are not almost equal to 11 decimals
2022-06-15T13:59:37.3516550Z E
2022-06-15T13:59:37.3791749Z E Mismatched elements: 2 / 2 (100%)
2022-06-15T13:59:37.3792898Z E Max absolute difference: 0.13032568
2022-06-15T13:59:37.3793519Z E Max relative difference: 1.13989907
2022-06-15T13:59:37.3794746Z E x: array([2.84594059013e+00, 1.87305445261e-07])
2022-06-15T13:59:37.3796418Z E y: array([2.97626626759e+00, 8.75300370996e-08])
2022-06-15T13:59:37.3797785Z
2022-06-15T13:59:37.3798563Z statsmodels/stats/tests/test_statstools.py:149: AssertionError
Looks like copying from the Azure test log only works with the raw logfile.
About the noncentral F failure:
I guess there is now a precision issue because the root-finding now uses a different implementation than the cdf.
The cdf now uses Boost, while the root-finding for the nc parameter uses special.ncfdtrinc: https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.ncfdtrinc.html
Before, it was a round-trip unit test within one implementation in scipy.special.
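A minimal sketch of that round trip, reusing the values from test_noncent_f. It assumes that calling special.ncfdtrinc directly matches what _noncentrality_f does internally, which the log does not show. The helper max_rel_diff is hypothetical and only exists for this illustration. The size of the differences depends on the installed scipy version.

```python
import numpy as np
from scipy import special, stats

# Values taken from test_noncent_f in the log above.
f_stat, df1, df2 = 3.5, 4, 75
probs = np.array([0.975, 0.025])

# Solve cdf(f_stat; df1, df2, nc) = p for the noncentrality parameter nc.
nc = special.ncfdtrinc(df1, df2, probs, f_stat)

# Round trip that stays inside scipy.special.
p_special = special.ncfdtr(df1, df2, nc, f_stat)
# Round trip through scipy.stats, as in the failing assertion.
p_stats = stats.ncf.cdf(f_stat, df1, df2, nc)


def max_rel_diff(actual, desired):
    """Largest elementwise relative difference (hypothetical helper)."""
    return float(np.max(np.abs(actual - desired) / np.abs(desired)))


rel_special = max_rel_diff(p_special, probs)
rel_stats = max_rel_diff(p_stats, probs)

print("nc:", nc)
print("max rel diff via special.ncfdtr:", rel_special)
print("max rel diff via stats.ncf.cdf: ", rel_stats)
```

If the two cdf paths share an implementation with the root-finder, both differences should be tiny; if only stats.ncf.cdf deviates, that points to the mismatch between implementations rather than to the root-finding itself.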
Lots of SciPy instability these days. The failure in descriptive was also SciPy-related, from changes in mode.
The shapiro test should just be cut. It is only testing SciPy code.
SciPy seems to have been fixed.