
Some distribution tests fail under PyTorch 2.0

Open • eb8680 opened this issue 1 year ago • 1 comment

From #3192, the following distribution tests fail under torch>=2.0.0 and should be fixed prior to the next release:

  • [ ] tests/distributions/test_rejector.py::test_rejector
  • [ ] tests/distributions/test_stable.py::test_additive
  • [ ] tests/infer/reparam/test_stable.py::test_stable
  • [ ] tests/infer/reparam/test_stable.py::test_symmetric_stable
  • [ ] tests/infer/reparam/test_stable.py::test_distribution

@martinjankowiak says in #3192 that:

i think we can safely declare that and the stable errors as failing due to flakiness resulting from small numerical differences in ops

so presumably it should be sufficient to tweak the test tolerances until they pass more reliably.
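
A rough back-of-the-envelope sketch of why a fixed `pvalue > 0.05` check is fragile: the KS failures in the log all assert `pvalue > 0.05`, and for a correct sampler the p-value is approximately uniform on [0, 1], so each case fails about 5% of the time whenever the drawn samples change (for example when op-level numerics shift between torch versions). The sketch assumes independent cases and an exactly uniform null; `num_cases = 100` is a hypothetical count, not the real number of parametrized cases.

```python
def prob_any_false_failure(num_cases: int, alpha: float) -> float:
    """Probability that at least one of `num_cases` correct, independent
    tests reports a p-value at or below `alpha` by chance alone."""
    return 1.0 - (1.0 - alpha) ** num_cases


# Hypothetical count of parametrized cases, chosen only for illustration.
num_cases = 100

# Compare the current threshold with two stricter ones.
for alpha in (0.05, 0.01, 0.001):
    chance = prob_any_false_failure(num_cases, alpha)
    print(f"alpha={alpha}: P(at least one spurious failure) = {chance:.3f}")
```

Under these assumptions, lowering the p-value threshold (or increasing the sample count only where the test has power to spare) reduces spurious failures without changing what the test checks.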

Log with tracebacks:

FAILED tests/distributions/test_stable.py::test_additive[0.5-0.1-0.9--0.5-0.5] - assert 0.03662861114079258 > 0.05
 +  where 0.03662861114079258 = KstestResult(statistic=0.02, pvalue=0.03662861114079258, statistic_location=-0.10905032398437975, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.02, pvalue=0.03662861114079258, statistic_location=-0.10905032398437975, statistic_sign=-1) = ks_2samp(tensor([  4.1167,  -3.4120,   5.0946,  ...,  -2.1908,   0.6321, -41.7160]), tensor([-23.4938, 778.4523,  -1.4387,  ...,  34.6576,  20.2388,  -7.4209]))
FAILED tests/distributions/test_stable.py::test_additive[0.99-0.1-0.9--0.5--0.5] - assert 0.03966362560959423 > 0.05
 +  where 0.03966362560959423 = KstestResult(statistic=0.0198, pvalue=0.03966362560959423, statistic_location=-31.744199393953476, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.0198, pvalue=0.03966362560959423, statistic_location=-31.744199393953476, statistic_sign=-1) = ks_2samp(tensor([-31.1715, -34.2096, -31.3321,  ..., -32.4412, -32.3797, -37.9718]), tensor([-34.7358, -27.3010, -32.6543,  ..., -29.4273, -29.3065, -34.3996]))
FAILED tests/distributions/test_stable.py::test_additive[0.99-0.1-0.9--0.5-0.5] - assert 0.03966362560959423 > 0.05
 +  where 0.03966362560959423 = KstestResult(statistic=0.0198, pvalue=0.03966362560959423, statistic_location=-24.982819826249884, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.0198, pvalue=0.03966362560959423, statistic_location=-24.982819826249884, statistic_sign=-1) = ks_2samp(tensor([-24.7473, -25.2959, -24.9695,  ..., -26.0039, -25.6245, -30.5095]), tensor([-27.9319, -19.4133, -26.0759,  ..., -22.2931, -22.2170, -27.6713]))
FAILED tests/distributions/test_stable.py::test_additive[0.99-0.1-0.9--0.5-0.9] - assert 0.04822817851247244 > 0.05
 +  where 0.04822817851247244 = KstestResult(statistic=0.0193, pvalue=0.04822817851247244, statistic_location=-22.396722620761377, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.0193, pvalue=0.04822817851247244, statistic_location=-22.396722620761377, statistic_sign=-1) = ks_2samp(tensor([-22.1886, -21.8388, -22.4183,  ..., -23.4237, -22.8978, -27.6130]), tensor([-25.2121, -16.2496, -23.4453,  ..., -19.4321, -19.3740, -24.9817]))
FAILED tests/distributions/test_stable.py::test_additive[1.01-0.1-0.9--0.5--0.9] - assert 0.04126178292409227 > 0.05
 +  where 0.04126178292409227 = KstestResult(statistic=0.0197, pvalue=0.04126178292409227, statistic_location=34.410527012683424, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.0197, pvalue=0.04126178292409227, statistic_location=34.410527012683424, statistic_sign=-1) = ks_2samp(tensor([34.9792, 31.2549, 34.8617,  ..., 33.7577, 33.7210, 28.1077]), tensor([31.4479, 37.9496, 33.4788,  ..., 36.4092, 36.5678, 31.7427]))
FAILED tests/distributions/test_stable.py::test_additive[1.01-0.1-0.9--0.5--0.5] - assert 0.03518876957795689 > 0.05
 +  where 0.03518876957795689 = KstestResult(statistic=0.0201, pvalue=0.03518876957795689, statistic_location=31.911593241318528, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.0201, pvalue=0.03518876957795689, statistic_location=31.911593241318528, statistic_sign=-1) = ks_2samp(tensor([32.4639, 29.6834, 32.3071,  ..., 31.2335, 31.2668, 25.9957]), tensor([29.0656, 35.9287, 31.0151,  ..., 34.1378, 34.2820, 29.3342]))
FAILED tests/distributions/test_stable.py::test_additive[1.01-0.1-0.9--0.5-0.9] - assert 0.04639733883233797 > 0.05
 +  where 0.04639733883233797 = KstestResult(statistic=0.0194, pvalue=0.04639733883233797, statistic_location=23.462327711829396, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.0194, pvalue=0.04639733883233797, statistic_location=23.462327711829396, statistic_sign=-1) = ks_2samp(tensor([23.6211, 24.0275, 23.3972,  ..., 22.4239, 22.9014, 18.4463]), tensor([20.7213, 28.8884, 22.3887,  ..., 26.2179, 26.3120, 20.8984]))
FAILED tests/infer/reparam/test_stable.py::test_stable[LatentStableReparam-(4,)] - AssertionError: tensor([[-0.6467, -0.5359, -0.5765, -0.1425],
        [ 3.6042,  3.6126,  3.5015,  3.9583],
        [ 1.4299,  0.9980,  1.2120,  1.1910],
        [ 1.4928,  1.0505,  1.2678,  0.8807],
        [ 2.0543,  1.8109,  1.8507,  1.4393],
        [ 4.7879,  4.6888,  4.6451,  4.2709]], grad_fn=<CatBackward0>) vs tensor([[-0.5921, -0.5500, -0.5829, -0.1410],
        [ 3.5901,  3.6017,  3.5045,  3.9636],
        [ 1.4160,  0.9878,  1.2189,  1.1884],
        [ 1.4777,  1.0416,  1.2741,  0.8781],
        [ 2.0403,  1.8027,  1.8583,  1.4356],
        [ 4.7713,  4.6789,  4.6547,  4.2681]], grad_fn=<CatBackward0>)
FAILED tests/infer/reparam/test_stable.py::test_symmetric_stable[(4,)] - AssertionError: tensor([-0.0403, -0.1768, -0.0293,  0.0101]) vs tensor([-0.0199,  0.1129, -0.0009, -0.0492])
FAILED tests/infer/reparam/test_stable.py::test_distribution[LatentStableReparam-0.1--0.5] - assert 0.04393955208216105 > 0.05
 +  where 0.04393955208216105 = KstestResult(statistic=0.013800000000000034, pvalue=0.04393955208216105, statistic_location=-0.8912201065651693, statistic_sign=-1).pvalue
 +    where KstestResult(statistic=0.013800000000000034, pvalue=0.04393955208216105, statistic_location=-0.8912201065651693, statistic_sign=-1) = ks_2samp(tensor([-4.0060e-01, -3.4117e+03, -9.7309e+00,  ...,  7.9049e-02,\n        -8.5932e+10, -1.9342e-01]), tensor([7.8844e-02, 4.1522e-02, 6.1860e+07,  ..., 1.0619e+06, 4.6877e+03,\n        7.4023e-02]))
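
For the two `AssertionError` failures, a quick way to see how far a tolerance would have to move is to compute the largest elementwise gap between the two printed tensors. The sketch below does this in plain Python for the `test_symmetric_stable[(4,)]` values copied from the log; the names `left` and `right` are placeholders, since the log does not say which tensor is the expected one, and any relative-tolerance term the real assertion may use is ignored here.

```python
def max_abs_diff(left, right):
    """Largest elementwise absolute difference between two equal-length sequences."""
    return max(abs(a - b) for a, b in zip(left, right))


# Values copied from the test_symmetric_stable[(4,)] failure in the log.
left = [-0.0403, -0.1768, -0.0293, 0.0101]
right = [-0.0199, 0.1129, -0.0009, -0.0492]

gap = max_abs_diff(left, right)
# An absolute tolerance smaller than this gap would still fail on these values.
print(f"largest absolute gap: {gap:.4f}")
```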

eb8680 · May 18 '23 17:05

I can take a look at those since I wrote the failing tests.

fritzo · May 18 '23 17:05