pymc4

Factories for distributions that can be reparametrized

Open luke14free opened this issue 4 years ago • 4 comments

It would be very nice to implement a pattern where we can have factories for distributions that can be re-parametrized in terms of scale/loc. A good example would be the beta where I'd love to be able to write something like:

pm.Beta.from_loc_scale(name='beta', loc=..., scale=...)

This could become very handy in a number of cases (I was thinking about regressions, but there are many more).

luke14free avatar Mar 29 '20 11:03 luke14free

This can be done easily and is very handy!

Quick fix, straight from Wikipedia:
    @staticmethod
    def from_loc_scale(name, loc, scale, **kwargs):
        """
        Beta distribution from `loc` and `scale` parameters.

        Parameters
        ----------
        loc : tensor, float
            Mean of the distribution
        scale : tensor, float
            Variance of the distribution
        """
        nu = (loc * (1 - loc)) / scale - 1
        # nu must be strictly positive, otherwise alpha and beta are not valid
        if tf.reduce_any(nu <= 0):
            raise ValueError("invalid value for `loc` or `scale`")
        alpha = loc * nu
        beta = (1 - loc) * nu
        return Beta(name=name, concentration0=alpha, concentration1=beta, **kwargs)

I wonder if this can be done for multivariate distributions though (or if it even makes sense to do so)? What do you say, @luke14free?
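For anyone who wants to sanity-check the transformation outside of TensorFlow, here is a minimal pure-Python sketch of the same moment matching, with the inverse map for a round-trip check. The function names `beta_params_from_mean_var` and `beta_mean_var` are made up for illustration and are not part of PyMC4. One thing worth double-checking before merging anything like the snippet above: TensorFlow Probability's `tfd.Beta` documents `concentration1` as alpha and `concentration0` as beta, so the keyword each value is passed to matters.

```python
def beta_params_from_mean_var(mean, var):
    """Return (alpha, beta) for a Beta distribution with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("var must satisfy 0 < var < mean * (1 - mean)")
    # nu = alpha + beta, obtained by solving the Beta variance formula for nu.
    nu = mean * (1.0 - mean) / var - 1.0
    return mean * nu, (1.0 - mean) * nu


def beta_mean_var(alpha, beta):
    """Inverse map: mean and variance of Beta(alpha, beta)."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var


# Round trip: the recovered moments should match the requested ones.
alpha, beta = beta_params_from_mean_var(0.3, 0.01)
print(alpha, beta, beta_mean_var(alpha, beta))
```

Rejecting `var >= mean * (1 - mean)` up front is the same condition as `nu <= 0`, just stated in terms of the inputs.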

tirthasheshpatel avatar Mar 29 '20 13:03 tirthasheshpatel

Yes, static methods for the win here. I am not sure how useful it would be to have this on multivariate distributions (there might be use cases, but none come to mind immediately). My point was to avoid having to recompute simple transformations every time (in the past I managed to introduce a couple of stupid bugs by transcribing the wrong transformations from paper to code).

Maybe it would make sense to have it for multivariate distributions like the Dirichlet and the Multinomial, while the most used ones, like the multivariate Gaussian and Student's t, are already expressed in terms of mean/scale.

luke14free avatar Mar 30 '20 10:03 luke14free

We plan to allow a single parametrization in each distribution's initialization function. PyMC3 supported multiple parametrizations in __init__ (e.g. the Normal), and that made things harder to maintain. That being said, @luke14free, your idea of having a static factory method do this automatically is a perfectly valid approach. We just need to agree on the design here. I think the simplest way to do this would be to implement these static methods in each distribution class that needs them, but that would lead to essentially duplicate code in many places and would be harder to maintain. Maybe there could be some base classes that implement common reparametrizations (e.g. a normal's scale and precision), with the appropriate classes inheriting from them. I would like to hear what the others think. @twiecki, @junpenglao?
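To make the base-class idea concrete, here is a rough sketch of what a shared reparametrization mixin could look like. Everything in it is hypothetical: `ScalePrecisionMixin`, `from_precision`, and the toy `Normal` and `LogNormal` classes are stand-ins for illustration, not PyMC4 classes, and it assumes each inheriting class takes `loc` and `scale` in its `__init__`.

```python
import math


class ScalePrecisionMixin:
    """Hypothetical mixin for classes whose __init__ accepts (name, loc, scale)."""

    @classmethod
    def from_precision(cls, name, loc, tau, **kwargs):
        """Build the distribution from a precision tau = 1 / scale**2."""
        if tau <= 0:
            raise ValueError("`tau` must be positive")
        return cls(name, loc=loc, scale=1.0 / math.sqrt(tau), **kwargs)


class Normal(ScalePrecisionMixin):
    """Toy stand-in with a single parametrization in __init__."""

    def __init__(self, name, loc, scale):
        self.name, self.loc, self.scale = name, loc, scale


class LogNormal(ScalePrecisionMixin):
    """Second toy class reusing the same factory without duplicating it."""

    def __init__(self, name, loc, scale):
        self.name, self.loc, self.scale = name, loc, scale


x = Normal.from_precision("x", loc=0.0, tau=4.0)
y = LogNormal.from_precision("y", loc=1.0, tau=0.25)
print(type(x).__name__, x.scale, type(y).__name__, y.scale)
```

Using a `classmethod` rather than a `staticmethod` lets the factory return an instance of whichever subclass it was called on, which is what keeps the transformation in one place.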

lucianopaz avatar Mar 31 '20 21:03 lucianopaz

Yeah, I like the static method approach. In PyMC3 we just supported multiple kwargs, which didn't work terribly either; usually there are no more than 2 parameterizations. Any reason not to do that?
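For comparison, a toy version of the multiple-kwargs style might look like the sketch below. It is not PyMC3's actual implementation: `ToyNormal` and `sigma_from_kwargs` are hypothetical names, and the choice to require exactly one of `sigma` or `tau` is an assumption made to keep the example small.

```python
import math


def sigma_from_kwargs(sigma=None, tau=None):
    """Resolve a standard deviation from exactly one of `sigma` or `tau`."""
    if (sigma is None) == (tau is None):
        raise ValueError("pass exactly one of `sigma` or `tau`")
    if tau is not None:
        if tau <= 0:
            raise ValueError("`tau` must be positive")
        # precision tau = 1 / sigma**2
        return 1.0 / math.sqrt(tau)
    if sigma <= 0:
        raise ValueError("`sigma` must be positive")
    return float(sigma)


class ToyNormal:
    """Toy class accepting two parametrizations directly in __init__."""

    def __init__(self, name, mu=0.0, sigma=None, tau=None):
        self.name = name
        self.mu = mu
        self.sigma = sigma_from_kwargs(sigma=sigma, tau=tau)


a = ToyNormal("a", sigma=2.0)
b = ToyNormal("b", tau=4.0)
print(a.sigma, b.sigma)
```

The trade-off against factory methods is visible here: `__init__` has to validate which combination of keywords was passed, and that logic grows with each extra parametrization.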

twiecki avatar Apr 01 '20 06:04 twiecki