pymc icon indicating copy to clipboard operation
pymc copied to clipboard

Add Hilbert Space Student T Process

Open ferrine opened this issue 1 year ago • 7 comments

Closes #6920


:books: Documentation preview :books:: https://pymc--6997.org.readthedocs.build/en/6997/

ferrine avatar Nov 07 '23 15:11 ferrine

Codecov Report

Merging #6997 (e597d7d) into main (ec4407d) will increase coverage by 4.33%. The diff coverage is 94.73%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6997      +/-   ##
==========================================
+ Coverage   87.78%   92.12%   +4.33%     
==========================================
  Files         100      100              
  Lines       16896    16911      +15     
==========================================
+ Hits        14832    15579     +747     
+ Misses       2064     1332     -732     
Files Coverage Δ
pymc/gp/__init__.py 100.00% <100.00%> (ø)
pymc/gp/hsgp_approx.py 92.43% <94.11%> (+0.05%) :arrow_up:

... and 8 files with indirect coverage changes

codecov[bot] avatar Nov 07 '23 15:11 codecov[bot]

After seeing this there's a couple things that make me prefer the option of passing nu as an optional kwarg to HSGP.__init__. Mostly am concerned that Periodic HSGP https://github.com/pymc-devs/pymc/pull/6877 won't fit within this pattern, which is a bit difficult because it's structured differently itself (an approximate spectral density and a different basis). Also it's a bit unfortunate that the docstring is split, so now it ends up being mostly boilerplate for the t-process version.

What about nu as a kwarg for HSGP, with a default of None? And use the _base_dist idea that you have to set it. The docstring of HSGP can explain the nu argument. Then we could add a function to be an alias for HSTP, like

def HSTP(mean_func, cov_func, nu):
    """ Hilbert Space Student-t process

    Reference: ...
    """
    return HSGP(...)

Then we don't need to think about an inheritance structure or refactor again so it works with HSGP Periodic, especially because it'd be great to have HSTP Periodic.

RE _base_dist, could it be anything? I would expect it'd only be Normal or StudentT. Might be nice to be explicit about that.

But I'm super excited to use this, I think HSTP will be used as a default before HSGP in a lot of cases. What do you think @ferrine?

bwengals avatar Nov 07 '23 22:11 bwengals

Let's wait the https://github.com/pymc-devs/pymc/issues/6883 and then continue, so it will be one round of refactoring

ferrine avatar Dec 02 '23 09:12 ferrine

Let's wait the https://github.com/pymc-devs/pymc/issues/6883 and then continue, so it will be one round of refactoring

Are you sure? Pytensorifying the GP and covariance function library is a huge refactor. I think that it'd be great to get HSTP's in ASAP, and a refactor isn't necessary to do so (see my comment from Nov 7 above here).

bwengals avatar Dec 10 '23 05:12 bwengals

I agree with @bwengals, it might take a while to do the other changes.

ricardoV94 avatar Dec 10 '23 08:12 ricardoV94

Still not sure what should be the design choice, and should this eve stop me from taking any branch I find more appealing. We'll anyway refactor everything as far as I get it

ferrine avatar Dec 13 '23 21:12 ferrine

What about just keeping the _base_dist idea like you have, and have nu be a optional kwarg to HSGP? Then _base_dist can either be StudentT if nu is not None or Normal if nu is None. That would keep it all within HSGP, and you could just modify the docstring once there. I think it'd make HSTPs easier for users to discover. Then if you wanted to make an quick alias for HSTP you could. wdyt?

bwengals avatar Dec 14 '23 06:12 bwengals

After all the refactorings in HSGP, it makes sense to create the PR from scratch

ferrine avatar Jul 09 '24 09:07 ferrine