pymc icon indicating copy to clipboard operation
pymc copied to clipboard

Skewed Student-T distribution

Open fonnesbeck opened this issue 1 year ago • 10 comments

Description

Added skewed Student T distribution (Jones and Faddy implementation).

Checklist

Type of change

  • [x] New feature / enhancement
  • [ ] Bug fix
  • [ ] Documentation
  • [ ] Maintenance
  • [ ] Other (please specify):

📚 Documentation preview 📚: https://pymc--7252.org.readthedocs.build/en/7252/

fonnesbeck avatar Apr 13 '24 18:04 fonnesbeck

Add in pymc-experimental first?

ricardoV94 avatar Apr 13 '24 18:04 ricardoV94

If something like this goes in experimental, there should be a clear roadmap to go from there back to the main repo. This is a straight-forward distribution that (I guess?) can't be easily implemented as a generative graph using only existing distributions via CustomDist. If it's in pmx, what's the criteria to move it over? It's not something like statespace or marginalmodel that proposes a huge new functionality, or something like r2d2m2 that implements a distribution that doesn't fit neatly into a typical pymc model as we present them in books/tutorials.

jessegrabowski avatar Apr 14 '24 11:04 jessegrabowski

I don't think this one has a straightforward generative graph

ricardoV94 avatar Apr 14 '24 11:04 ricardoV94

I assumed something "according to x and y" was a bit experimental, hence my suggestion

ricardoV94 avatar Apr 14 '24 11:04 ricardoV94

The "Following Jones and Faddy 2003" was to distinguish it from the range of available skewed Student T implementations (there are 3 or 4). I chose this one because it is available in SciPy and relatively straightforward to implement.

I think it will be a useful distribution because it allows one to specify a likelihood that is both skewed and overdispersed and converges to normal when a and b are both large and equal.

fonnesbeck avatar Apr 14 '24 21:04 fonnesbeck

Jax failure does not seem to be related to this PR

fonnesbeck avatar Apr 15 '24 02:04 fonnesbeck

Jax failure does not seem to be related to this PR

It's not. Has been failing since the last scipy

ricardoV94 avatar Apr 15 '24 08:04 ricardoV94

But this one is: https://github.com/pymc-devs/pymc/actions/runs/8682226852/job/23806440256?pr=7252#step:7:616

You should try running those logp/logcdf/icdf with n_samples=-1 or whatever it is, to test all combinations locally instead of only a random subset

ricardoV94 avatar Apr 15 '24 08:04 ricardoV94

You're missing the tests for the RV itself. That's also where we cover alternative parametrizations. Something like these

https://github.com/pymc-devs/pymc/blob/34c2d31484143b7b5c6130927ed4efde93a27a0f/tests/distributions/test_continuous.py#L2179-L2200

ricardoV94 avatar Apr 15 '24 08:04 ricardoV94

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.35%. Comparing base (60a6314) to head (bcb1b5d). Report is 4 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7252      +/-   ##
==========================================
+ Coverage   91.67%   92.35%   +0.68%     
==========================================
  Files         102      102              
  Lines       17017    17069      +52     
==========================================
+ Hits        15600    15764     +164     
+ Misses       1417     1305     -112     
Files Coverage Δ
pymc/distributions/__init__.py 100.00% <ø> (ø)
pymc/distributions/continuous.py 97.93% <100.00%> (+0.09%) :arrow_up:

... and 4 files with indirect coverage changes

codecov-commenter avatar Apr 28 '24 20:04 codecov-commenter

Looks like test_skewstudentt_logp is failing on main, or is at least flaky?

michaelosthege avatar May 06 '24 07:05 michaelosthege

These tests run a random subset of 100 combinations. We should set n_samples=-1 locally and see if every combination passes (both float64 and float32), I guess it does not.

CC @fonnesbeck

ricardoV94 avatar May 06 '24 08:05 ricardoV94

Running float64 this gives:

E           AssertionError: 
E           Arrays are not almost equal to 6 decimals
E           {'a': array(0.01), 'b': array(100.), 'mu': array(-2.1), 'sigma': array(0.01), 'value': array(-0.01)}
E           x and y -inf location mismatch:
E            x: array(-751.339449)
E            y: array(-inf)

michaelosthege avatar May 07 '24 12:05 michaelosthege

Which underflows? PyMC or Scipy?

ricardoV94 avatar May 07 '24 12:05 ricardoV94

Which underflows? PyMC or Scipy?

SciPy produces the -inf.

What's the fix? I have it open and can push a branch real quick (if it's simple)

michaelosthege avatar May 07 '24 12:05 michaelosthege

Which underflows? PyMC or Scipy?

SciPy produces the -inf.

What's the fix? I have it open and can push a branch real quick (if it's simple)

Try slightly less extreme parameter domains

ricardoV94 avatar May 07 '24 12:05 ricardoV94