MeasureTheory.jl
Scaled canonical gaussian likelihoods
We would like to represent the bounded measure (but not necessarily probability measure) with density
$$\exp(c + F'x - \tfrac12 x'Hx)$$
(so $H$ is $\Lambda$ in MT parlance, and $F = \Lambda\mu$ could be called the potential parameter).
For some choice of c this is a probability measure, but the actual value of c itself contains important information about the evidence of a Bayesian model with Gaussian posterior represented in this form.
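To make the role of $c$ concrete, here is the standard Gaussian integral, assuming the density has the form $\exp(c + F'x - \tfrac12 x'Hx)$ on $\mathbb{R}^d$ with $H$ symmetric positive definite (this convention is an assumption, chosen to match $H = \Lambda$ and $F = \Lambda\mu$):

$$\int \exp(c + F'x - \tfrac12 x'Hx)\,dx = \exp\left(c + \tfrac12 F'H^{-1}F\right)(2\pi)^{d/2}\det(H)^{-1/2}$$

Under that assumption the measure is a probability measure exactly when $c = -\tfrac12 F'H^{-1}F - \tfrac d2\log(2\pi) + \tfrac12\log\det H$, and any other value of $c$ shifts the log of the total mass by the same amount.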
The likelihood object should support pairing with Gaussian priors (giving a Gaussian posterior), fusion #229, and pullback
$$\exp(\tilde c + \tilde F'x - \tfrac12 x'\tilde Hx) = \int \exp(c + F'y - \tfrac12 y' H y)\, \kappa(x, dy)$$
where
$$\kappa(x) = N(A x + b, Q)$$
is a linear Gaussian kernel, with density
$$ \propto \exp(-\frac12 (y - A x - b)' Q^{-1} (y - Ax - b) )$$
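For reference, completing the square in $y$ gives the pullback in closed form. This is a sketch under stated assumptions: the likelihood is written as $\exp(c + F'y - \tfrac12 y'Hy)$ (the convention with $H = \Lambda$), $H$ is symmetric positive semidefinite, $Q$ is positive definite, and $\kappa(x, dy)$ is the normalized $N(Ax + b, Q)$. Then

$$\tilde H = A'(I + HQ)^{-1}HA, \qquad \tilde F = A'(I + HQ)^{-1}(F - Hb),$$

$$\tilde c = c - \tfrac12\log\det(I + HQ) + \tfrac12 F'Q(I + HQ)^{-1}F + b'(I + HQ)^{-1}F - \tfrac12 b'(I + HQ)^{-1}Hb.$$

In the scalar case with $b = 0$ and $F = 0$ this reduces to $\tilde H = A^2H/(1 + HQ)$ and $\tilde c = c - \tfrac12\log(1 + HQ)$.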
Seems at least loosely related to energy-based models: https://arxiv.org/abs/2101.03288
Also, I was thinking of abstracting quadratic forms, and a quick search found this: https://github.com/JuliaSmoothOptimizers/QuadraticModels.jl
Sorry, one correction: if I think about it, this is not a density but a likelihood, so we would like to represent the function class, not the class of measures. That is in line with what you say!
Great point. @nignatiadis has been thinking about related issues in https://github.com/cscherrer/MeasureTheory.jl/pull/226
I really like this idea of having more structured likelihoods, especially for these special cases where we can handle things analytically.
Also loosely related is my experimental code for exponential families here. It's currently completely undocumented, lacking even a simple example -- I'll try to add at least a commented-out example soon, and more after my current crunch time (prepping for a talk) has passed.
Sometimes the conjugate distribution comes from a different class of distributions than the prior. Here, not only is Bayesian inversion conjugate, but fusion and pullback through linear Gaussian kernels also preserve the class. This means conjugacy holds not just for a single experiment, but for an entire DAG/Tilde model, with the global posterior given by message passing/Doob's h-transform as a composition of elementary transformations.
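A minimal sketch of what closure under fusion and pullback could look like in code. It assumes the convention $\log h(x) = c + F'x - \tfrac12 x'Hx$ and uses only the `LinearAlgebra` standard library. The names `Canonical`, `LinearGaussianKernel`, `fuse` and `pullback` are hypothetical and are not MeasureTheory.jl API. The pullback formula is the result of completing the square in the Gaussian integral.

```julia
using LinearAlgebra

# Hypothetical container for the function x -> exp(c + F'x - x'Hx/2).
struct Canonical
    c::Float64
    F::Vector{Float64}
    H::Matrix{Float64}
end

# Hypothetical linear Gaussian kernel x -> N(Ax + b, Q).
struct LinearGaussianKernel
    A::Matrix{Float64}
    b::Vector{Float64}
    Q::Matrix{Float64}
end

# Fusion: pointwise product of two functions of the same variable,
# which adds the canonical parameters.
fuse(a::Canonical, b::Canonical)::Canonical =
    Canonical(a.c + b.c, a.F + b.F, a.H + b.H)

# Pullback: x -> ∫ h(y) κ(x, dy) for κ(x) = N(Ax + b, Q), again in canonical form.
function pullback(k::LinearGaussianKernel, h::Canonical)::Canonical
    M = I + h.H * k.Q      # I + HQ
    MF = M \ h.F           # (I + HQ)⁻¹ F
    K = M \ h.H            # (I + HQ)⁻¹ H
    c = h.c - logdet(M) / 2 + dot(h.F, k.Q, MF) / 2 +
        dot(k.b, MF) - dot(k.b, K, k.b) / 2
    F = k.A' * (MF - K * k.b)
    H = k.A' * K * k.A
    return Canonical(c, F, H)
end

# Two leaf likelihoods, each pulled back through its own kernel and
# fused at the shared parent (one backward message-passing step).
h1 = Canonical(0.0, [0.5], fill(1.0, 1, 1))
h2 = Canonical(0.0, [0.0], fill(2.0, 1, 1))
k1 = LinearGaussianKernel(fill(2.0, 1, 1), [1.0], fill(3.0, 1, 1))
k2 = LinearGaussianKernel(fill(1.0, 1, 1), [0.0], fill(0.5, 1, 1))
m1 = pullback(k1, h1)
message = fuse(m1, pullback(k2, h2))
```

Both operations take `Canonical` values and return a `Canonical`, which is what lets them compose along a DAG. The three-argument `dot(x, A, y)` needs Julia 1.4 or later.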
Right. I think this is yet another case where multiple dispatch can help us. Let's look at a simple case:
Say you have parameter space $\Theta$ and observation space $X$. There's a prior $\pi \in \mathcal{M}(\Theta)$, and a function $f : \Theta \to \mathcal{M}(X)$, and we observe $x \in X$. I say function because in Julia it might literally be a function, or possibly a parameterized measure.
We have a few steps to go through:
- Create a kernel
      k = kernel(f)
- Compute the likelihood
      lik = likelihoodof(k, x)
- Compute the pointwise product
      post = π ⊙ lik
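The three steps above can be sketched for the linear Gaussian case as follows. This is a toy illustration, not MeasureTheory.jl's implementation. `Canonical`, `LinearGaussianKernel`, `canonical_gaussian` and `logmass` are hypothetical names, and `likelihoodof` and `⊙` are local stand-ins defined here rather than the package's own methods. It assumes the convention $\log h(\theta) = c + F'\theta - \tfrac12\theta'H\theta$.

```julia
using LinearAlgebra

# Hypothetical container for the function θ -> exp(c + F'θ - θ'Hθ/2).
struct Canonical
    c::Float64
    F::Vector{Float64}
    H::Matrix{Float64}
end

# Hypothetical linear Gaussian kernel θ -> N(Aθ + b, Q) (step 1).
struct LinearGaussianKernel
    A::Matrix{Float64}
    b::Vector{Float64}
    Q::Matrix{Float64}
end

# A normalized Gaussian N(μ, Σ) written in canonical form.
function canonical_gaussian(μ, Σ)
    F = Σ \ μ
    c = -dot(μ, F) / 2 - logdet(2π * Σ) / 2
    return Canonical(c, F, inv(Σ))
end

# Step 2, local stand-in: θ -> density of N(Aθ + b, Q) evaluated at x.
function likelihoodof(k::LinearGaussianKernel, x)
    r = x - k.b
    Qinv_r = k.Q \ r
    c = -dot(r, Qinv_r) / 2 - logdet(2π * k.Q) / 2
    return Canonical(c, k.A' * Qinv_r, k.A' * (k.Q \ k.A))
end

# Step 3, local stand-in: the pointwise product adds canonical parameters.
⊙(p::Canonical, l::Canonical) = Canonical(p.c + l.c, p.F + l.F, p.H + l.H)

# Log of the total mass of exp(c + F'θ - θ'Hθ/2);
# for prior ⊙ lik this is the log evidence.
logmass(h::Canonical) =
    h.c + dot(h.F, h.H \ h.F) / 2 + (length(h.F) * log(2π) - logdet(h.H)) / 2

prior = canonical_gaussian([0.0], fill(1.0, 1, 1))   # N(0, 1)
k = LinearGaussianKernel(fill(1.0, 1, 1), [0.0], fill(1.0, 1, 1))
lik = likelihoodof(k, [2.0])                          # observe x = 2
post = prior ⊙ lik

posterior_mean = post.H \ post.F
log_evidence = logmass(post)
```

Because each step is its own function, a specialized method for any one of them (for example a different kernel type in `likelihoodof`) can be added without touching the others. The unnormalized `post` keeps the evidence in its `c` parameter.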
Even if the posterior is simple to compute, we still break things up into very small steps. That way, we can easily add methods to any one of these steps.
I really think that once we have a nice structure for likelihoods, conjugacy will come very cheap. We should think about whether fusion and pullback are "atomic" or if they might also break apart into smaller steps. As a starting point... What might a type signature look like for fusion, say for a simple example? I guess that discussion should really go into a new issue.