pymc icon indicating copy to clipboard operation
pymc copied to clipboard

Add KL Divergence helper

Open ferrine opened this issue 2 years ago • 11 comments

Description

Related Issue

  • [ ] Closes issue: #
  • [ ] Related issue (not closed by this PR): #

Checklist

Type of change

  • [x] New feature / enhancement
  • [ ] Bug fix
  • [ ] Documentation
  • [ ] Maintenance
  • [ ] Other (please specify):

:books: Documentation preview :books:: https://pymc--7062.org.readthedocs.build/en/7062/

ferrine avatar Dec 11 '23 19:12 ferrine

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (94020c9) 90.17% compared to head (e3a6c3e) 90.18%. Report is 7 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7062      +/-   ##
==========================================
+ Coverage   90.17%   90.18%   +0.01%     
==========================================
  Files         101      103       +2     
  Lines       16932    16952      +20     
==========================================
+ Hits        15269    15289      +20     
  Misses       1663     1663              
Files Coverage Δ
pymc/distributions/__init__.py 100.00% <100.00%> (ø)
pymc/distributions/stats/__init__.py 100.00% <100.00%> (ø)
pymc/distributions/stats/kl_divergence.py 100.00% <100.00%> (ø)
pymc/logprob/__init__.py 100.00% <ø> (ø)
pymc/logprob/abstract.py 96.07% <100.00%> (+0.62%) :arrow_up:
pymc/logprob/basic.py 94.48% <100.00%> (+0.07%) :arrow_up:

codecov[bot] avatar Dec 11 '23 19:12 codecov[bot]

This looks very good @ferrine! One thing that you definitely need to do before you can merge this is to add a page about this to the documentation. Maybe make a folder for logprob and include a subfolder for the KL divergence, since it will start to grow as more divergences get added.

I found that our pm.Distribution like pm.Normal can pass isinstance(rv.owner.op, pm.Normal) checks, I wonder if I can further rely on such functionality

ferrine avatar Dec 12 '23 09:12 ferrine

I am also confused about the design choice, since moment implementation is scattered across the codebase...

ferrine avatar Dec 12 '23 09:12 ferrine

The base functionality should be in logprob, but the specific implementations should be in Distributions

I am also confused about the design choice, since moment implementation is scattered across the codebase...

moment was implemented at a different time for a different purpose. It's not even really moment but finite_logp_point. It's never used for logprob stuff, just samplers.

I found that our pm.Distribution like pm.Normal can pass isinstance(rv.owner.op, pm.Normal) checks, I wonder if I can further rely on such functionality

You can but it won't work for things that look like Distributions but are just helpers to create distributions like pm.OrderedLogistic, pm.ZeroInflatedBinomial, or LKJCholeskyCov. What should always be safe is the rv_op that's actually returned.

ricardoV94 avatar Dec 12 '23 14:12 ricardoV94

Moved kl into a private pymc.distribution._stats because these are functions that will be never used by anyone

ferrine avatar Dec 15 '23 12:12 ferrine

I don't like the underscore, why not just distributions/kl_div?

ricardoV94 avatar Dec 15 '23 12:12 ricardoV94

@ricardoV94 can you please reiterate on the review? Did I miss something?

ferrine avatar Jan 23 '24 07:01 ferrine

Tests are failing with import issue

ricardoV94 avatar Jan 23 '24 14:01 ricardoV94

How many pairs of distributions do we expect to actually be able to support?

ricardoV94 avatar Feb 02 '24 13:02 ricardoV94

How many pairs of distributions do we expect to actually be able to support?

Many of them https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/kl_divergence

ferrine avatar Feb 03 '24 11:02 ferrine

Just did the rebase, anything we can add or change on top of that?

ferrine avatar Jul 09 '24 08:07 ferrine