pymc4
Factories for distributions that can be reparametrized
It would be very nice to implement a pattern where we can have factories for distributions that can be re-parametrized in terms of scale/loc. A good example would be the beta where I'd love to be able to write something like:
pm.Beta.from_loc_scale(name='beta', loc=..., scale=...)
This could become very handy in a number of cases (I was thinking about regressions, but there are many more).
This can be done easily and is very handy!
Quick fix, straight from Wikipedia:
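# Assumes `import tensorflow as tf` and that `Beta` is the pymc4 distribution class.
@staticmethod
def from_loc_scale(name, loc, scale, **kwargs):
    """
    Beta distribution from `loc` and `scale` parameters.

    Parameters
    ----------
    loc : tensor, float
        Mean of the distribution
    scale : tensor, float
        Variance of the distribution (not the standard deviation)
    """
    # Moment matching: nu = alpha + beta must be strictly positive.
    nu = (loc * (1 - loc)) / scale - 1
    if tf.reduce_any(nu <= 0):
        raise ValueError("invalid value for `loc` or `scale`")
    alpha = loc * nu
    beta = (1 - loc) * nu
    # TensorFlow Probability convention: concentration1 is alpha, concentration0 is beta.
    return Beta(name=name, concentration1=alpha, concentration0=beta, **kwargs)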
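The transformation in the snippet above can be sanity-checked without TensorFlow. The following is only a minimal sketch in plain Python: the helper names `beta_shapes_from_mean_var` and `beta_mean_var` are hypothetical and not part of pymc4. It maps a mean and variance to the two shape parameters and then recovers the mean and variance from the closed-form Beta moments, so a round trip should return the inputs.

```python
def beta_shapes_from_mean_var(mean, var):
    """Return (alpha, beta) of the Beta distribution with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    # A Beta distribution can only have a variance below mean * (1 - mean).
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("var must lie strictly between 0 and mean * (1 - mean)")
    nu = mean * (1.0 - mean) / var - 1.0  # nu = alpha + beta
    return mean * nu, (1.0 - mean) * nu


def beta_mean_var(alpha, beta):
    """Closed-form mean and variance of Beta(alpha, beta)."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var


alpha, beta = beta_shapes_from_mean_var(0.3, 0.01)
mean, var = beta_mean_var(alpha, beta)
print(alpha, beta)  # approximately 6 and 14
print(mean, var)    # approximately 0.3 and 0.01
```

Checking the admissible range up front also gives a clearer error message than testing `nu` after the division.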
I wonder if this can be done for multivariate distributions though (or if it even makes sense to do so)? What do you say, @luke14free?
Yes, static methods for the win here. I am not sure how useful it would be to have this on multivariate distributions (there might be use cases, but none come to mind immediately). My point was to avoid having to recompute simple transformations every time (in the past I managed to introduce a couple of stupid bugs by transcribing the wrong transformations from paper to code).
Maybe it would make sense to have it for multivariate distributions like the Dirichlet and Multinomial, while the most used ones, like the multivariate Gaussian and Student's t, are already expressed in terms of mean/scale.
We plan to allow a single parametrization in each distribution's initialization function. PyMC3 supported multiple parametrizations in __init__ (e.g. the Normal) and that made things harder to maintain.
That being said, @luke14free, your idea of having a static factory method do this automatically is a perfectly valid approach. We just need to agree on the design here. I think that the simplest way to do this would be to implement these static methods in each distribution instance that needs them, but that would lead to essentially duplicate code in many places and would be harder to maintain. Maybe there could be some base classes that implement common reparametrizations (e.g. a normal's scale and precision) and have the appropriate classes inherit from these. I would like to hear what the others think. @twiecki, @junpenglao?
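To make the base-class idea concrete, here is a rough sketch of how a shared reparametrization could live in one place. Everything in it is an assumption for illustration: `ScalePrecisionMixin`, `from_precision`, and the stand-in `Normal` and `LogNormal` classes are hypothetical and are not the pymc4 classes. It only assumes the constructor accepts `name`, `loc` and `scale`.

```python
import math


class ScalePrecisionMixin:
    """Hypothetical mixin: adds a precision-based factory to loc/scale classes."""

    @classmethod
    def from_precision(cls, name, loc, precision, **kwargs):
        # precision = 1 / scale**2, so scale = 1 / sqrt(precision)
        if precision <= 0:
            raise ValueError("`precision` must be positive")
        return cls(name, loc=loc, scale=1.0 / math.sqrt(precision), **kwargs)


class Normal(ScalePrecisionMixin):
    """Stand-in distribution with a single parametrization in __init__."""

    def __init__(self, name, loc, scale):
        self.name = name
        self.loc = loc
        self.scale = scale


class LogNormal(ScalePrecisionMixin):
    """Second stand-in that reuses the same factory without duplicating it."""

    def __init__(self, name, loc, scale):
        self.name = name
        self.loc = loc
        self.scale = scale


x = Normal.from_precision("x", loc=0.0, precision=4.0)
y = LogNormal.from_precision("y", loc=1.0, precision=0.25)
print(type(x).__name__, x.scale)
print(type(y).__name__, y.scale)
```

Using a `classmethod` rather than a `staticmethod` lets the factory return an instance of whichever subclass it was called on, which is what makes it shareable through inheritance.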
Yeah, I like the static method approach. In PyMC3 we just supported multiple kwargs which didn't work terribly either, usually there are not more than 2 parameterizations. Any reason not to do that?
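For comparison, the multiple-kwargs style could look roughly like the sketch below. It is only loosely modelled on the idea of accepting either a standard deviation or a precision in `__init__`. The class `NormalWithKwargs` is a hypothetical stand-in, not the PyMC3 implementation, and the default of `sigma=1.0` when neither argument is given is an assumption made for this example.

```python
class NormalWithKwargs:
    """Stand-in Normal that accepts either `sigma` or `tau` in __init__."""

    def __init__(self, name, mu=0.0, sigma=None, tau=None):
        if sigma is not None and tau is not None:
            raise ValueError("pass only one of `sigma` or `tau`")
        if sigma is None and tau is None:
            sigma = 1.0  # assumed default for this sketch
        if sigma is None:
            sigma = tau ** -0.5  # tau = 1 / sigma**2
        if sigma <= 0:
            raise ValueError("`sigma` and `tau` must be positive")
        self.name = name
        self.mu = mu
        self.sigma = sigma
        self.tau = sigma ** -2


a = NormalWithKwargs("a", sigma=2.0)
b = NormalWithKwargs("b", tau=4.0)
print(a.sigma, a.tau)
print(b.sigma, b.tau)
```

With two parametrizations the branching stays small, but every extra option adds more argument checks to `__init__`, which is the maintenance cost mentioned earlier in the thread.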