statsmodels TST/Maint test failures in stats.statstools in pre-testing


https://dev.azure.com/statsmodels/statsmodels-testing/_build/results?buildId=4553&view=logs&j=3618ddf5-2391-5260-f9eb-8f35f95d270b&t=a24edd62-b2b9-54c8-9892-98a88509521b

There are also bugs in descriptivestats that look pandas-related; I have no idea about those.

The statstools failures look scipy-related.

I have only checked shapiro so far, which looks like a scipy bug: https://github.com/scipy/scipy/issues/16426

The adnorm failure might be a consequence of changes to the data made by the shapiro bug.

The ncf failure might be due to https://github.com/scipy/scipy/pull/15763; I don't know why precision is reduced now.

2022-06-15T13:59:37.3464366Z ________________________________ test_noncent_f ________________________________
2022-06-15T13:59:37.3465454Z [gw1] linux -- Python 3.10.4 /opt/hostedtoolcache/Python/3.10.4/x64/bin/python
2022-06-15T13:59:37.3465924Z 
2022-06-15T13:59:37.3466385Z     def test_noncent_f():
2022-06-15T13:59:37.3467193Z         # F(4, 75) = 3.5, confidence level = .95, two-sided CI:
2022-06-15T13:59:37.3467832Z         # > lof(3.5,4,75,.95)
2022-06-15T13:59:37.3468321Z         # [1] 0.7781436 0.9750039
2022-06-15T13:59:37.3469520Z         # > hif(3.5,4,75,.95)
2022-06-15T13:59:37.3470738Z         # [1] 29.72949219 0.02499965
2022-06-15T13:59:37.3473171Z         f_stat, df1, df2 = 3.5, 4, 75
2022-06-15T13:59:37.3474526Z     
2022-06-15T13:59:37.3475236Z         ci_nc = [0.7781436, 29.72949219]
2022-06-15T13:59:37.3476091Z         res = _noncentrality_f(f_stat, df1, df2, alpha=0.05)
2022-06-15T13:59:37.3476707Z         assert_allclose(res.confint, ci_nc, rtol=0.005)
2022-06-15T13:59:37.3477268Z         # verify umvue unbiased
2022-06-15T13:59:37.3477770Z         mean = stats.ncf.mean(df1, df2, res.nc)
2022-06-15T13:59:37.3478671Z         assert_allclose(f_stat, mean, rtol=1e-8)
2022-06-15T13:59:37.3479736Z     
2022-06-15T13:59:37.3480486Z >       assert_allclose(stats.ncf.cdf(f_stat, df1, df2, res.confint),
2022-06-15T13:59:37.3482118Z                         [0.975, 0.025], rtol=1e-10)
2022-06-15T13:59:37.3482716Z E       AssertionError: 
2022-06-15T13:59:37.3483793Z E       Not equal to tolerance rtol=1e-10, atol=0
2022-06-15T13:59:37.3484354Z E       
2022-06-15T13:59:37.3484843Z E       Mismatched elements: 2 / 2 (100%)
2022-06-15T13:59:37.3485608Z E       Max absolute difference: 1.36293827e-06
2022-06-15T13:59:37.3486501Z E       Max relative difference: 2.51509652e-05
2022-06-15T13:59:37.3487123Z E        x: array([0.975001, 0.025001])
2022-06-15T13:59:37.3487634Z E        y: array([0.975, 0.025])
2022-06-15T13:59:37.3487923Z 
2022-06-15T13:59:37.3488461Z statsmodels/stats/tests/test_effectsize.py:50: AssertionError
2022-06-15T13:59:37.3489168Z _________________________________ test_shapiro _________________________________
2022-06-15T13:59:37.3490153Z [gw0] linux -- Python 3.10.4 /opt/hostedtoolcache/Python/3.10.4/x64/bin/python
2022-06-15T13:59:37.3490635Z 
2022-06-15T13:59:37.3491072Z     def test_shapiro():
2022-06-15T13:59:37.3491561Z         #tests against R fBasics
2022-06-15T13:59:37.3492034Z         #testing scipy.stats
2022-06-15T13:59:37.3492545Z         from scipy.stats import shapiro
2022-06-15T13:59:37.3493117Z     
2022-06-15T13:59:37.3494571Z         st_pv_R = np.array([0.939984787255526, 0.239621898000460])
2022-06-15T13:59:37.3495385Z         sh = shapiro(x)
2022-06-15T13:59:37.3496173Z         assert_almost_equal(sh, st_pv_R, 4)
2022-06-15T13:59:37.3496637Z     
2022-06-15T13:59:37.3497452Z         #st is ok -7.15e-06, pval agrees at -3.05e-10
2022-06-15T13:59:37.3498459Z         st_pv_R = np.array([5.799574255943298e-01, 1.838456834681376e-06 * 1e4])
2022-06-15T13:59:37.3499157Z         sh = shapiro(x**2) * np.array([1, 1e4])
2022-06-15T13:59:37.3499722Z >       assert_almost_equal(sh, st_pv_R, 5)
2022-06-15T13:59:37.3500227Z E       AssertionError: 
2022-06-15T13:59:37.3500750Z E       Arrays are not almost equal to 5 decimals
2022-06-15T13:59:37.3501222Z E       
2022-06-15T13:59:37.3501693Z E       Mismatched elements: 2 / 2 (100%)
2022-06-15T13:59:37.3502216Z E       Max absolute difference: 0.03937477
2022-06-15T13:59:37.3502779Z E       Max relative difference: 1.55701398
2022-06-15T13:59:37.3503309Z E        x: array([0.61933, 0.04701])
2022-06-15T13:59:37.3503812Z E        y: array([0.57996, 0.01838])
2022-06-15T13:59:37.3504108Z 
2022-06-15T13:59:37.3504645Z statsmodels/stats/tests/test_statstools.py:127: AssertionError
2022-06-15T13:59:37.3505362Z _________________________________ test_adnorm __________________________________
2022-06-15T13:59:37.3506343Z [gw0] linux -- Python 3.10.4 /opt/hostedtoolcache/Python/3.10.4/x64/bin/python
2022-06-15T13:59:37.3506815Z 
2022-06-15T13:59:37.3507242Z     def test_adnorm():
2022-06-15T13:59:37.3507836Z         #tests against R fBasics
2022-06-15T13:59:37.3508300Z         st_pv = []
2022-06-15T13:59:37.3509029Z         st_pv_R = np.array([0.5867235358882148, 0.1115380760041617])
2022-06-15T13:59:37.3509600Z         ad = normal_ad(x)
2022-06-15T13:59:37.3510119Z         assert_almost_equal(ad, st_pv_R, 12)
2022-06-15T13:59:37.3510629Z         st_pv.append(st_pv_R)
2022-06-15T13:59:37.3511080Z     
2022-06-15T13:59:37.3511886Z         st_pv_R = np.array([2.976266267594575e+00, 8.753003709960645e-08])
2022-06-15T13:59:37.3513196Z         ad = normal_ad(x**2)
2022-06-15T13:59:37.3514374Z >       assert_almost_equal(ad, st_pv_R, 11)
2022-06-15T13:59:37.3515082Z E       AssertionError: 
2022-06-15T13:59:37.3516000Z E       Arrays are not almost equal to 11 decimals
2022-06-15T13:59:37.3516550Z E       
2022-06-15T13:59:37.3791749Z E       Mismatched elements: 2 / 2 (100%)
2022-06-15T13:59:37.3792898Z E       Max absolute difference: 0.13032568
2022-06-15T13:59:37.3793519Z E       Max relative difference: 1.13989907
2022-06-15T13:59:37.3794746Z E        x: array([2.84594059013e+00, 1.87305445261e-07])
2022-06-15T13:59:37.3796418Z E        y: array([2.97626626759e+00, 8.75300370996e-08])
2022-06-15T13:59:37.3797785Z 
2022-06-15T13:59:37.3798563Z statsmodels/stats/tests/test_statstools.py:149: AssertionError

Jun 17 '22 16:06 josef-pkt

Looks like copying from the Azure test log only works with the raw log file.

Jun 17 '22 16:06 josef-pkt

About the noncentral F failure:

I guess there is now a precision issue because the rootfinding now uses a different implementation than the cdf.

The cdf now uses Boost, while rootfinding for the nc parameter uses special.ncfdtrinc: https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.ncfdtrinc.html

Before, it was a roundtrip unit test within a single implementation in scipy.special.
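
A minimal sketch of that roundtrip, to see how far the two code paths disagree on a given SciPy install. It assumes that calling special.ncfdtrinc directly is a fair stand-in for what _noncentrality_f does internally, which is not confirmed here; the numbers are the ones from test_noncent_f.

```python
import numpy as np
from scipy import special, stats

# Values taken from test_noncent_f in the log above
f_stat, df1, df2 = 3.5, 4, 75
alpha = 0.05

# cdf values that define a two-sided confidence interval for nc
probs = np.array([1 - alpha / 2, alpha / 2])

# Step 1: solve for the noncentrality parameter with scipy.special
# (assumed stand-in for the rootfinding step in _noncentrality_f)
nc_ci = special.ncfdtrinc(df1, df2, probs, f_stat)

# Step 2: evaluate the distribution cdf at those nc values
cdf_back = stats.ncf.cdf(f_stat, df1, df2, nc_ci)

# Relative roundtrip error; the failing test required rtol=1e-10
rel_err = np.abs(cdf_back - probs) / probs

print("nc confint:", nc_ci)
print("cdf roundtrip:", cdf_back)
print("relative error:", rel_err)
```

If the relative error is far above 1e-10, as in the log (about 2.5e-05), then the two implementations disagree and the rtol=1e-10 check cannot pass, even though the confidence interval itself still matches the R reference at rtol=0.005.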

Jun 17 '22 16:06 josef-pkt

Lots of SciPy instability these days. The failure in descriptive was also SciPy-related, from changes in mode.

Jun 21 '22 06:06 bashtage

The shapiro test should just be cut. It is only testing SciPy code.

Jun 21 '22 07:06 bashtage

SciPy seems to have been fixed.

Jun 21 '22 07:06 bashtage
