
Bayesian ElasticNet

Open · tlienart opened this issue Oct 30, 2019 · 5 comments

Would be a first non-sklearn thing

  • requires the ability to use a "product" prior (Laplace × Normal)
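
For context, this is just the standard elastic-net correspondence (the parameterization here is illustrative): the penalty λ₁|β| + λ₂β² is the negative log of a prior density that factors pointwise into a Laplace kernel and a Gaussian kernel,

    p(β) ∝ exp(−λ₁|β| − λ₂β²) = exp(−λ₁|β|) · exp(−λ₂β²),

i.e. (up to normalization) a Laplace(0, 1/λ₁) density times a Normal(0, 1/√(2λ₂)) density.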

tlienart avatar Oct 30 '19 18:10 tlienart

This would be a great example to have.

DilumAluthge avatar Aug 06 '20 16:08 DilumAluthge

I agree this would be at least interesting, and possibly very useful. Somewhere in that interval ;)

The challenge here isn't specific to SossMLJ, or even to Soss. The density of a mixture distribution is a linear combination of the densities of its components. But here, the combination we want for the prior is a linear combination not of densities but of log-densities. So this "just" requires building a "geometric mixture".

Is that even a thing? I have no idea. But defining this (even entirely outside Soss) would make it pretty easy to express this model.
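
A minimal sketch of what such a "geometric mixture" could look like, entirely outside Soss (the name and signature are hypothetical; the normalizing constant is dropped, and in general it depends on the weight and on the component parameters):

    using Distributions

    # The density is the weighted geometric mean p1(x)^a * p2(x)^(1-a),
    # so the log-density is a convex combination of the component
    # log-densities. Unnormalized: only correct up to an additive
    # constant in log space.
    struct GeometricMixture{D1,D2,T}
        d1 :: D1
        d2 :: D2
        a  :: T   # weight in [0, 1]
    end

    logdensity(g::GeometricMixture, x) =
        g.a * logpdf(g.d1, x) + (1 - g.a) * logpdf(g.d2, x)

With d1 = Laplace(0, s), d2 = Normal(0, s), and weight a, this is exactly the elastic-net prior; it's also what the Elastic struct in the comment below implements.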

cscherrer avatar Aug 18 '20 14:08 cscherrer

I'm not sure what the problem is (sorry if I'm being dense). I mean, products of distributions pop up everywhere in graphical models, right? Is the issue that it's potentially hard in general to sample from a product, or that there is currently no object to represent a product of distributions?

tlienart avatar Aug 18 '20 15:08 tlienart

Got this to work:

julia> using Distributions

julia> using Soss

julia> import TransformVariables

julia> struct Elastic{T}
           L1 :: T   # weight on the Laplace (ℓ1) component, in [0, 1]
           σ :: T    # common scale for both components
       end

julia> function Distributions.logpdf(α::Elastic, x)
           ℓ1 = α.L1 * logpdf(Laplace(0,σ), x)
           ℓ2 = (1 - α.L1) * logpdf(Normal(0,σ), x)
           return ℓ1 + ℓ2
       end

julia> # Just a hack to account for another hack
       Base.rand(::Elastic) = 0.0

julia> Soss.xform(::Elastic, _data) = TransformVariables.asℝ

julia> mt = @model X begin   # "truth" model with a plain Normal prior, used to simulate data
           n = size(X,1)
           k = size(X,2)
           p ~ Uniform()
           priorScale ~ HalfNormal()
           noise ~ HalfNormal()
           β ~ Normal(0,priorScale) |> iid(k)
           yhat = X * β
           y ~ For(n) do j
               Normal(yhat[j], noise)
           end
       end;

julia> m = @model X begin   # inference model with the Elastic prior
           n = size(X,1)
           k = size(X,2)
           p ~ Uniform()
           priorScale ~ HalfNormal()
           noise ~ HalfNormal()
           β ~ Elastic(p,priorScale) |> iid(k)
           yhat = X * β
           y ~ For(n) do j
               Normal(yhat[j], noise)
           end
       end;

julia> X = rand(10,3);

julia> truth = rand(mt(X=X));   # simulate data from the "truth" model

julia> post = dynamicHMC(m(X=X), (y=truth.y,));   # posterior for the Elastic model given y

julia> particles(post)
(priorScale = 0.825 ± 0.57, p = 0.481 ± 0.29, noise = 0.424 ± 0.12, β = Particles{Float64,1000}[0.394 ± 0.38, -0.367 ± 0.27, -0.173 ± 0.34])

The product in graphical models is a product distribution, as in tuples of components. Here it's a product as in pointwise multiplication of densities. Gaussians behave nicely for this, but it's not so common in general. And here it's a bit weirder still, since we're actually using fractional exponents.
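
For example, the pointwise product of two Normal densities is again a Normal density up to a constant, with precisions adding. A quick numerical check (standard Gaussian fact, not from the thread):

    using Distributions

    d1, d2 = Normal(0, 1), Normal(1, 2)

    # Precisions (1/σ²) add; means combine precision-weighted.
    τ = 1/var(d1) + 1/var(d2)
    μ = (mean(d1)/var(d1) + mean(d2)/var(d2)) / τ
    d12 = Normal(μ, sqrt(1/τ))

    # Compare log-density *differences* so the unknown constant cancels:
    x, y = 0.3, -1.7
    (logpdf(d1, x) + logpdf(d2, x)) - (logpdf(d1, y) + logpdf(d2, y)) ≈
        logpdf(d12, x) - logpdf(d12, y)   # true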

Anyway, if you work in log space it's nice enough.

Sorry about the rand bit; that's just a hack because dynamicHMC uses rand to get a starting point. There's probably a way to work around that. And there may be a nice way to sample from "geometric mean" distributions like this, but I'm not sure.
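
For instance, one way around the starting-point hack (an untested sketch, assuming rand is only ever used for initialization) is to draw from the Normal component, which has positive density wherever the Elastic density does:

    # Hypothetical replacement for the rand hack above: any point with
    # positive density works as an HMC starting point.
    Base.rand(α::Elastic) = rand(Normal(0, α.σ))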

cscherrer avatar Aug 19 '20 00:08 cscherrer

Shouldn't the σ in your logpdf be read from α (i.e. α.σ)? Other than that, cool!

(PS: I don't get the distinction between Normal(r; params) * Laplace(x; params) * Normal(x; params) and an undirected graphical model with two nodes, but we might be talking past each other, and I doubt it's a very important point.)

tlienart avatar Aug 19 '20 07:08 tlienart
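
For reference, the logpdf with that fix (σ read from the struct field) would look like this:

    function Distributions.logpdf(α::Elastic, x)
        ℓ1 = α.L1 * logpdf(Laplace(0, α.σ), x)
        ℓ2 = (1 - α.L1) * logpdf(Normal(0, α.σ), x)
        return ℓ1 + ℓ2
    end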