pdfs up to proportionality
In many cases, we don't need to compute all the factors of a pdf/logpdf:
- In likelihoods, we can ignore any factors not involving the parameters (e.g. the factorial in the Poisson)
- Many sampling techniques don't require computing the normalisation constant, i.e. factors not involving the point at which the density is evaluated (e.g. the Bessel function in the von Mises distribution)
- Posteriors are typically only known up to proportionality, so we would only need the prior (ignoring the normalisation constant), and the likelihood (ignoring non-parameter factors)
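As a small illustration of the first bullet, here is a sketch in plain Julia with no package dependencies. The function names `poisson_logpdf` and `poisson_loglik_kernel` are made up for this example and are not an existing API. It shows that dropping the factorial leaves differences in log-likelihood between parameter values unchanged:

```julia
# Full Poisson log-pmf: k*log(λ) - λ - log(k!)
# (log(k!) is computed as a sum of logs; `init` handles k = 0)
poisson_logpdf(λ::Real, k::Integer) = k * log(λ) - λ - sum(log, 1:k; init=0.0)

# Log-likelihood kernel: drops log(k!), which does not involve λ
poisson_loglik_kernel(λ::Real, k::Integer) = k * log(λ) - λ

k = 7
# The difference between two parameter values is the same with or without
# the factorial term, so maximisation and likelihood ratios are unaffected.
full_diff   = poisson_logpdf(3.0, k) - poisson_logpdf(5.0, k)
kernel_diff = poisson_loglik_kernel(3.0, k) - poisson_loglik_kernel(5.0, k)
```

The two differences agree up to floating-point rounding, which is all that an optimiser or a likelihood-ratio computation needs.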
It would be good to have a consistent interface for all of this: we might also want to think about how this relates to autodiff functionality.
How about something like updf?
In some of the algorithms for random number generation, for efficiency reasons, some numbers should only be calculated once for given parameter values. In C you could use static variables to store the information, but I don't know the best solution in Julia. At some point I experimented with computing these numbers when the random variable object is constructed. That solution could be an alternative to an un-normalised pdf.
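A rough sketch of the "compute once at construction" idea, using a hypothetical `CachedNormal` type that is not part of any package. The field name `lognormconst` and the functions `logkernel` and `logdensity` are also just illustrative choices:

```julia
# Hypothetical type: caches the log normalising constant when the
# distribution object is constructed, so it is never recomputed.
struct CachedNormal
    μ::Float64
    σ::Float64
    lognormconst::Float64   # -log(σ) - log(2π)/2, computed once
end

# Outer constructor that fills in the cached constant from the parameters.
CachedNormal(μ::Real, σ::Real) =
    CachedNormal(Float64(μ), Float64(σ), -log(σ) - 0.5 * log(2π))

# Unnormalised log density: only the part that depends on x.
logkernel(d::CachedNormal, x::Real) = -0.5 * ((x - d.μ) / d.σ)^2

# Full log density: the kernel plus the cached constant.
logdensity(d::CachedNormal, x::Real) = logkernel(d, x) + d.lognormconst

d = CachedNormal(0, 1)
lp = logdensity(d, 0.0)
```

With this layout the full log density costs one extra addition per evaluation, and callers that only need proportionality can use the kernel directly.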
Possibly related might be the idea of non-normalized distributions which might be useful for working with mixture models. In particular, signed, non-normalized distributions (i.e. finite signed measures) form a vector space.
They form a vector space? How do you handle multiplication by large negative constants?
That's what signedness allows. See finite signed measure.
I would also like to have a standard interface for normalizing constants, e.g. lognorm and norm.
For exponential family distributions, the normalization constant has a name: partition function.
lognorm and norm would be easily confused with the typical meaning of norms.
+100 This is really important.
@sbos It's also worth noting that in many cases calculating the partition function is analytically and/or computationally intractable. This doesn't generally affect the standard distributions that the package targets, but it may be worth keeping in mind as the project evolves.
I am going to implement this (particularly for exponential family distributions). Will make a proposal with RFC pretty soon.
Having thought more about this, I decided that what is really needed is a ratio of pdfs, so I was thinking:
pdfratio(d1::Distribution, d2::Distribution, x::Real) = pdf(d1,x) / pdf(d2,x)
pdfratio(d::Distribution, x1::Real, x2::Real) = pdf(d,x1) / pdf(d,x2)
pdfratio(d1::Distribution, d2::Distribution, x1::Real, x2::Real) = pdf(d1,x1) / pdf(d2,x2)
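For the same-distribution form (the second method above), the normalising constant is common to numerator and denominator, so the ratio can be computed from an unnormalised log density alone. A minimal sketch in plain Julia follows; `ratio_from_logkernel` and `normal_logkernel` are hypothetical names for illustration. Note this shortcut does not apply when the two distributions differ, since their constants generally do not cancel.

```julia
# Hypothetical helper: ratio p(x1)/p(x2) from any unnormalised log density.
# Working on the log scale avoids dividing two very small numbers.
ratio_from_logkernel(logkernel, x1::Real, x2::Real) =
    exp(logkernel(x1) - logkernel(x2))

# Unnormalised log density of a Normal(μ, σ): the -log(σ√(2π)) term is omitted.
normal_logkernel(μ, σ) = x -> -0.5 * ((x - μ) / σ)^2

# Full density, used here only to check the result.
normal_pdf(x, μ, σ) = exp(-0.5 * ((x - μ) / σ)^2) / (σ * sqrt(2π))

r_kernel = ratio_from_logkernel(normal_logkernel(1.0, 2.0), 0.5, 3.0)
r_full   = normal_pdf(0.5, 1.0, 2.0) / normal_pdf(3.0, 1.0, 2.0)
```

This is the quantity a Metropolis-style acceptance step needs, which is one reason a ratio interface and an unnormalised-density interface overlap.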
For a multi-class conditional distribution, we still need unnormalized pdfs.
pdf( x | k ) # where k can take 1, 2, 3, ... K
This is very useful in computing posterior probability.
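A sketch of that posterior computation, assuming (purely for illustration) K = 3 Poisson classes with made-up rates and prior weights; `class_posterior` is a hypothetical helper. Only factors shared by every class, here log(x!), can be dropped. Anything that depends on k must stay.

```julia
# Posterior class probabilities p(k | x) ∝ prior[k] * p(x | k), computed from
# log weights that are known only up to a constant shared by all classes.
function class_posterior(logweights::AbstractVector{<:Real})
    m = maximum(logweights)        # subtract the max for numerical stability
    w = exp.(logweights .- m)
    return w ./ sum(w)
end

# Assumed example values: rates and prior probabilities of three classes.
λ = [1.0, 4.0, 9.0]
prior = [0.5, 0.3, 0.2]
x = 5

# log(x!) is identical for every class, so it is left out.
logw = log.(prior) .+ x .* log.(λ) .- λ
post = class_posterior(logw)
```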
As far as I can tell, given a density $p(x \mid \theta) = f(x, \theta) g(\theta) h(x)$, it would be useful to have the following functions
- logupdf(x) = $\log f(x, \theta) + \log h(x)$: used whenever parameters are fixed but arguments vary
- logulikelihood(x) = $\log f(x, \theta) + \log g(\theta)$: used whenever arguments are fixed but parameters vary
- logpdf(x) = $\log p(x \mid \theta)$: used whenever both arguments and parameters vary
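To make the factorisation concrete, here is one possible worked instance for the Poisson distribution with rate $\lambda$. The split into $f$, $g$, $h$ is not unique, so this is one choice for illustration rather than a definition:

$$
p(x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}, \qquad
f(x, \lambda) = \lambda^x, \quad g(\lambda) = e^{-\lambda}, \quad h(x) = \frac{1}{x!}
$$

$$
\begin{aligned}
\text{logupdf}(x) &= \log f(x, \lambda) + \log h(x) = x \log \lambda - \log x! \\
\text{logulikelihood}(x) &= \log f(x, \lambda) + \log g(\lambda) = x \log \lambda - \lambda \\
\text{logpdf}(x) &= x \log \lambda - \lambda - \log x!
\end{aligned}
$$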
Matrix-variate distributions have an internal method logkernel, which computes the same thing as logupdf, but I think the use of "kernel" here is too vague and doesn't generalize for the kernel of the likelihood.
This would support specification of likelihood functions and posterior distributions, as well as named distributions for which the normalization factor is unknown, intractable, or expensive.
Do the current maintainers have any thoughts on this proposal?
I implemented that logkernel business for all the matrix-variates, so I'm obviously in favor of something like this. And I agree that "kernel" is probably not the language you'd want to settle on for a package-wide, outward-facing change.