MeasureTheory.jl

Scaled canonical gaussian likelihoods

Open mschauer opened this issue 3 years ago • 5 comments

We would like to represent the bounded measure (but not necessarily probability measure) with density

[Image: screenshot of the density formula, 2022-08-22]

(so H ( in MT parlance), and F = Λμ could be called potential parameter)

For some choice of c this is a probability measure, but the actual value of c itself contains important information about the evidence of a Bayesian model with a Gaussian posterior represented in this form.

The likelihood object should support pairing with Gaussian priors (giving a Gaussian posterior), fusion (#229), and pullback:

$$\exp(\tilde c + \tilde Fx + x'\tilde Hx) = \int \exp(c + Fy + y' H y) \kappa(x, dy)$$

where

$$\kappa(x) = N(A x + b, Q)$$

is a linear Gaussian kernel, with density

$$\propto \exp(-\frac12 (y - A x - b)' Q^{-1} (y - A x - b))$$
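Completing the square gives the pullback in closed form. The following is a sketch worked out for illustration, not something stated in this issue or taken from MeasureTheory.jl. It assumes $F$ acts as a row vector, $H$ and $Q$ are symmetric, and $Q^{-1} - 2H$ is positive definite so that the integral converges. With $T = I - 2HQ$, $M = T^{-1}H$ and $\bar F' = T^{-1}F'$:

$$\tilde H = A' M A, \quad \tilde F = (\bar F + 2 b' M) A, \quad \tilde c = c - \tfrac12 \log\det T + \tfrac12 \bar F Q F' + \bar F b + b' M b$$

In Julia, with the hypothetical helper name `pullback` and $F$ stored as a column vector:

```julia
using LinearAlgebra

# Hypothetical helper, not part of MeasureTheory.jl.
# Pulls y -> exp(c + F'y + y'Hy) back through the kernel κ(x) = N(A*x + b, Q)
# and returns the coefficients of x -> exp(c̃ + F̃'x + x'H̃x).
# Assumes H and Q are symmetric and inv(Q) - 2H is positive definite.
function pullback(c, F, H, A, b, Q)
    T = I - 2 * H * Q
    Fbar = T \ F                 # linear coefficient in the kernel mean m = A*x + b
    M = T \ H                    # quadratic coefficient in the kernel mean
    c_new = c - logdet(T) / 2 + dot(Fbar, Q * F) / 2 + dot(Fbar, b) + dot(b, M * b)
    F_new = A' * (Fbar + 2 * M * b)
    H_new = A' * M * A
    return (c = c_new, F = F_new, H = H_new)
end
```

Setting $A = 0$, $b = \mu_0$ and $Q = \Sigma_0$ turns the same formula into the integral of the likelihood against a Gaussian prior $N(\mu_0, \Sigma_0)$, so in that case $\tilde c$ is the log evidence.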

mschauer, Aug 22 '22 16:08

Seems at least loosely related to energy-based models: https://arxiv.org/abs/2101.03288

Also, I was thinking of abstracting quadratic forms, and a quick search found this: https://github.com/JuliaSmoothOptimizers/QuadraticModels.jl

cscherrer, Aug 22 '22 23:08

Sorry, one correction: if I think about it, this is not a density but a likelihood, so we would like to represent the function class, not the class of measures. That is in line with what you say!

mschauer, Aug 23 '22 06:08

Great point. @nignatiadis has been thinking about related issues in https://github.com/cscherrer/MeasureTheory.jl/pull/226

I really like this idea of having more structured likelihoods, especially for these special cases where we can handle things analytically.

Also loosely related is my experimental code for exponential families here. It's currently completely undocumented, lacking even a simple example -- I'll try to add at least a commented-out example soon, and more after my current crunch time (prepping for a talk) has passed.

cscherrer, Aug 23 '22 13:08

Sometimes the conjugate posterior comes from a different class of distributions than the prior. Here, however, not only is Bayesian inversion conjugate, but fusion and pullback through linear Gaussian kernels also preserve the class. This means conjugacy holds not just for a single experiment but for an entire DAG/Tilde model, with the global posterior given by message passing/Doob's h-transform as a composition of elementary transformations.

mschauer, Aug 23 '22 13:08

Right. I think this is yet another case where multiple dispatch can help us. Let's look at a simple case:

Say you have parameter space $\Theta$ and observation space $X$. There's a prior $\pi \in \mathcal{M}(\Theta)$, and a function $f : \Theta \to \mathcal{M}(X)$, and we observe $x \in X$. I say function because in Julia it might literally be a function, or possibly a parameterized measure.

We have a few steps to go through:

  1. Create a kernel `k = kernel(f)`
  2. Compute the likelihood `lik = likelihoodof(k, x)`
  3. Compute the pointwise product `post = π ⊙ lik`

Even if the posterior is simple to compute, we still break things up into very small steps. That way, we can easily add methods to any one of these steps.
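For the linear Gaussian case, the three steps can be sketched with plain structs and dispatch. Everything here is a self-contained illustration under stated assumptions: the types `GaussianMeasure`, `QuadraticLikelihood` and `LinearGaussianKernel` and the function `likelihood` are hypothetical stand-ins, not the MeasureTheory.jl API, and the likelihood is stored in the form $\exp(c + F\theta + \theta' H \theta)$ from the top of this issue.

```julia
using LinearAlgebra

# Hypothetical stand-in for a Gaussian measure N(μ, Σ).
struct GaussianMeasure
    μ::Vector{Float64}
    Σ::Matrix{Float64}
end

# Hypothetical type: the function θ -> exp(c + F'θ + θ'Hθ), stored by its coefficients.
struct QuadraticLikelihood
    c::Float64
    F::Vector{Float64}
    H::Matrix{Float64}
end

# Step 1: a linear Gaussian kernel θ -> N(A*θ + b, Q).
struct LinearGaussianKernel
    A::Matrix{Float64}
    b::Vector{Float64}
    Q::Matrix{Float64}
end

# Step 2: the likelihood of an observation x, as a function of θ.
function likelihood(k::LinearGaussianKernel, x)
    Λ = inv(k.Q)
    r = x - k.b
    c = -(length(x) * log(2π) + logdet(k.Q) + dot(r, Λ * r)) / 2
    return QuadraticLikelihood(c, k.A' * Λ * r, -(k.A' * Λ * k.A) / 2)
end

# Step 3: the pointwise product of a Gaussian prior with a quadratic likelihood,
# normalized to the Gaussian posterior. A dedicated method makes conjugacy explicit.
function ⊙(prior::GaussianMeasure, lik::QuadraticLikelihood)
    Λ0 = inv(prior.Σ)
    Σ = inv(Λ0 - 2 * lik.H)
    return GaussianMeasure(Σ * (Λ0 * prior.μ + lik.F), Σ)
end

prior = GaussianMeasure([0.0], fill(1.0, 1, 1))
k = LinearGaussianKernel(fill(1.0, 1, 1), [0.0], fill(1.0, 1, 1))
post = prior ⊙ likelihood(k, [2.0])
```

In this scalar example the prior is $N(0, 1)$ and the observation $x = 2$ has unit noise, so `post` has mean 1 and variance 0.5. A non-conjugate pair would simply fall back to a more generic `⊙` method.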

I really think that once we have a nice structure for likelihoods, conjugacy will come very cheap. We should think about whether fusion and pullback are "atomic" or if they might also break apart into smaller steps. As a starting point... What might a type signature look like for fusion, say for a simple example? I guess that discussion should really go into a new issue.
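One possible shape for such a signature, as a minimal sketch: it assumes that fusion (#229) means the pointwise product of two likelihoods in the same variable, which is an assumption made here rather than something settled in this thread. The type `QuadraticLikelihood` and the functions `fuse` and `loglik` are hypothetical names, not MeasureTheory.jl API.

```julia
using LinearAlgebra

# Hypothetical type: the function θ -> exp(c + F'θ + θ'Hθ), stored by its coefficients.
struct QuadraticLikelihood
    c::Float64
    F::Vector{Float64}
    H::Matrix{Float64}
end

# Fusion as a pointwise product: the exponents add, so the coefficients add
# and the result stays in the same class.
fuse(a::QuadraticLikelihood, b::QuadraticLikelihood) =
    QuadraticLikelihood(a.c + b.c, a.F + b.F, a.H + b.H)

# Log of the likelihood function evaluated at θ.
loglik(l::QuadraticLikelihood, θ) = l.c + dot(l.F, θ) + dot(θ, l.H * θ)
```

Under this reading fusion is atomic: it is a single addition of coefficients, with `c` tracking the accumulated evidence contribution.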

cscherrer, Aug 23 '22 15:08