MeasureBase.jl

Some small `pushfwd` issues

Open cscherrer opened this issue 2 years ago • 8 comments

I've been exploring our pushforward basics in MeasureBase.jl...

Say we have a uniform measure on $(-\pi/2, \pi/2)$ and want to push that through asin. So we do

# We'll use the others soon
using MeasureBase, ForwardDiff, ChangesOfVariables, IrrationalConstants

μ = Lebesgue(MeasureBase.BoundedReals(-halfπ, halfπ))

function f(x)
    asin(x)
end

function finv(y)
    if -halfπ ≤ y ≤ halfπ
        return sin(y)
    else
        @error "finv is only defined on [-π/2, π/2]"
    end
end

Now, we want to be able to do

ν = pushfwd(f, finv, μ)
logdensityof(ν, π/4)

We can't use InverseFunctions.jl to get the inverse automatically, because sin only has a one-sided inverse (not currently handled by that package). This three-argument form of pushfwd lets us say "this inverse works on the domain of $\mu$".

This won't quite work yet, because it doesn't know how to get the logjac. But we can define

function withlogjac(f, x)
    dx = ForwardDiff.Dual{ForwardDiff.Tag{typeof(f)}}(x, 1.0)
    dy = f(dx)
    value = dy.value
    deriv = first(dy.partials)
    (value, log(abs(deriv)))
end

function ChangesOfVariables.with_logabsdet_jacobian(::typeof(f), x)
    withlogjac(f, x)
end

function ChangesOfVariables.with_logabsdet_jacobian(::typeof(finv), x)
    withlogjac(finv, x)
end

And we can test this implementation:

julia> ChangesOfVariables.test_with_logabsdet_jacobian(f, 2 * rand() - 1, ForwardDiff.derivative)
Test Summary:                                                | Pass  Total  Time
test_with_logabsdet_jacobian: f with input 0.359581995576044 |    2      2  0.0s

julia> ChangesOfVariables.test_with_logabsdet_jacobian(finv, π * rand() - halfπ, ForwardDiff.derivative)
Test Summary:                                                    | Pass  Total  Time
test_with_logabsdet_jacobian: finv with input 1.0808047560079297 |    2      2  0.0s

So now our pushforward works:

julia> ν = pushfwd(f, finv, μ)
PushforwardMeasure(
    f,
    finv,
    Lebesgue(MeasureBase.BoundedReals{Float64, Irrational{:halfπ}}(-1.5707963267948966, halfπ)))

julia> logdensityof(ν, π/4)
-0.3465735902799726
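As a sanity check on that number, here is the scalar change-of-variables formula worked by hand. The notation $p_\mu$, $p_\nu$ for the densities is mine, not MeasureBase's, and I'm assuming $\mu$ has log-density 0 on its support, which holds at $\sin(\pi/4) \approx 0.707$:

```latex
\begin{aligned}
\log p_\nu(y) &= \log p_\mu\big(f^{-1}(y)\big) + \log\left|\frac{d}{dy} f^{-1}(y)\right|,
  \qquad f^{-1}(y) = \sin y \\
\log p_\nu(\pi/4) &= 0 + \log\cos(\pi/4) = -\tfrac{1}{2}\log 2 \approx -0.3466
\end{aligned}
```

This agrees with the value printed above, so the only term contributing is the log-Jacobian of finv.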

But let's make a small update to our functions to see what they're really doing:


function f(x)
    @info "calling f($x)"
    asin(x)
end

function finv(y)
    @info "calling finv($y)"
    if -halfπ ≤ y ≤ halfπ
        return sin(y)
    else
        @error "finv is only defined on [-π/2, π/2]"
    end
end

Our final call now looks like this:

julia> logdensityof(ν, π/4)
[ Info: calling finv(Dual{ForwardDiff.Tag{typeof(finv)}}(0.7853981633974483,1.0))
[ Info: calling finv(0.7853981633974483)
[ Info: calling finv(0.7853981633974483)
-0.3465735902799726

So this is making three calls to finv. These are

  1. To check insupport
  2. Calling logdensity_def
  3. Calling logdensity_def on the base measure

The base measure is

julia> basemeasure(ν)
PushforwardMeasure(f, finv, MeasureBase.LebesgueBase())

So this raises a few questions/comments:

  1. Can we update this to only call finv once? It's not so bad in this case, but it could sometimes get very expensive. Previously I had a MapsTo type for this sort of thing, maybe we need to bring that back?
  2. The fact that it only calls finv (and never f) gets me back to thinking it's much more natural in lots of cases to work in terms of a pullback.
  3. Currently there's no way to get a density of ν with respect to, say, Lebesgue(). I think the right way to do this is to push Lebesgue() through finv (or pull back through f) and compare the result with μ.

cscherrer • Sep 15 '22 16:09

BTW for AffinePushfwd (which should share more code with this than it currently does) I handled (1) above by defining e.g.

@inline function logdensity_def(d::AffinePushfwd{(:λ,)}, x::AbstractArray)
    z = d.λ * x
    MeasureBase.unsafe_logdensityof(d.parent, z)
end

So computing logdensity_def on an AffinePushfwd goes all the way to the root measure in one shot. This works, but sacrifices some of the advantages of our basemeasure system.

cscherrer • Sep 15 '22 16:09

Instead of pushfwd(f, finv, μ), we could do pushfwd(setinverse(f, finv), μ) if we implement JuliaMath/InverseFunctions.jl#18. A setinverse construct would be convenient to have anyway, I think.

Regarding with_logabsdet_jacobian: ChangesOfVariables can't depend on AD packages; that would make it way too heavy. Also, ForwardDiff isn't necessarily always the right default mechanism. A further complication with logabsdet-via-AD is that the result often needs to be auto-diffed again, e.g. to get gradients of transformed densities/measures, and not all AD packages support nested AD (though ForwardDiff does). This might work as a clean approach for cases where that would perform well: we could have a package AutoDiffLADJs.jl or so (name could be better) that depends on AbstractDifferentiation.jl and provides a function ladj_via_ad(f, backend) or so, which returns a wrapper around f that supports with_logabsdet_jacobian.
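A rough sketch of what such a wrapper might look like, for the scalar case only. The type name LADJViaAD is made up for illustration, and instead of an AbstractDifferentiation.jl backend this takes a plain derivative function with the call shape derivative(f, x), which ForwardDiff.derivative happens to have. Treat it as a starting point, not a proposal for the actual package API.

```julia
using ChangesOfVariables, ForwardDiff

# Hypothetical wrapper type; not part of any existing package.
struct LADJViaAD{F,D}
    f::F
    derivative::D   # any callable with the shape derivative(f, x)
end

# The wrapper behaves like f when called.
(w::LADJViaAD)(x) = w.f(x)

# Scalar case only: log|det J| reduces to log|f'(x)|.
# Note that f is evaluated twice here, once for y and once inside derivative.
function ChangesOfVariables.with_logabsdet_jacobian(w::LADJViaAD, x::Real)
    y = w.f(x)
    dydx = w.derivative(w.f, x)
    return y, log(abs(dydx))
end

# Hypothetical constructor, named after the suggestion above.
ladj_via_ad(f, derivative = ForwardDiff.derivative) = LADJViaAD(f, derivative)

wrapped = ladj_via_ad(sin)
y, ladj = with_logabsdet_jacobian(wrapped, π / 4)
```

Because the derivative function is just a field, swapping in another AD package only changes the second argument to ladj_via_ad, and ChangesOfVariables itself stays free of AD dependencies.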

oschulz • Sep 15 '22 18:09

> A setinverse construct would be convenient to have anyway, I think.

Nice! I like this idea, and made a comment in that issue.

> ChangesOfVariables can't depend on AD packages; that would make it way too heavy. Also, ForwardDiff isn't necessarily always the right default mechanism. A further complication with logabsdet-via-AD is that the result often needs to be auto-diffed again, e.g. to get gradients of transformed densities/measures, and not all AD packages support nested AD (though ForwardDiff does).

Yes, I understand that ForwardDiff isn't always the best approach. The withlogjac here is just one approach to get concrete results. The biggest issue here is this one:

> 1. Can we update this to only call finv once? It's not so bad in this case, but it could sometimes get very expensive. Previously I had a MapsTo type for this sort of thing, maybe we need to bring that back?

MapsTo was kind of confusing and wasn't integrated cleanly. But I think this is an important problem, so we need to find an approach that works. I'd hate for us to be stuck with redundant function evaluations.

Then on your last point...

> This might work as a clean approach for cases where that would perform well: we could have a package AutoDiffLADJs.jl or so (name could be better) that depends on AbstractDifferentiation.jl and provides a function ladj_via_ad(f, backend) or so, which returns a wrapper around f that supports with_logabsdet_jacobian.

I like this idea. I think it's important for users to be able to build pushforwards easily, but I agree an extension package would be better than making ChangesOfVariables much heavier.

cscherrer • Sep 15 '22 19:09

Regarding the multiple calls to finv: can we solve that with better forwarding mechanisms in MB somehow? Maybe we can avoid calling insupport separately, for example?

oschulz • Sep 15 '22 19:09

> can we solve that with better forwarding mechanisms in MB somehow?

That's the hope - that's why this is an MB issue ;)

cscherrer • Sep 15 '22 19:09

I think this problem of calling a function multiple times is specific to pushforwards, and doesn't come up in most cases. Here's the typically-called logdensityof:

@inline function logdensityof(μ::AbstractMeasure, x)
    result = dynamic(unsafe_logdensityof(μ, x))
    ifelse(insupport(μ, x) == true, result, oftype(result, -Inf))
end

The idea with MapsTo was that, when we have a pushforward taking x to y, we pass around (x ↦ y)::MapsTo{typeof(f)}, where we've defined f(m::MapsTo{typeof(f)}) = m.y. That's pretty rough, but that's the idea.
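To make that concrete, here is a minimal sketch of the idea using only Base. The struct layout, the mapsto helper, and slow_inverse are all made up for illustration; this is not the original MapsTo code. The counter is only there to show how many real evaluations happen.

```julia
# Hypothetical sketch: names and layout are illustrative, not the original MapsTo.
struct MapsTo{F,X,Y}
    f::F
    x::X
    y::Y
end

# Evaluate f once and remember both ends of the arrow.
mapsto(f, x) = MapsTo(f, x, f(x))

# Stand-in for an expensive inverse; the counter records real evaluations.
const CALLS = Ref(0)
function slow_inverse(y)
    CALLS[] += 1
    return sin(y)
end

# Calling the function on a MapsTo built from it returns the cached result.
slow_inverse(m::MapsTo{typeof(slow_inverse)}) = m.y

m = mapsto(slow_inverse, π / 4)   # the only real evaluation
a = slow_inverse(m)               # cached, no new evaluation
b = slow_inverse(m)               # cached, no new evaluation
```

The catch, as noted earlier in the thread, is that every function taking part has to know about MapsTo, which is probably why it never integrated cleanly.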

Oh! Or maybe we just grab https://github.com/longemen3000/CachedFunctions.jl or similar?

cscherrer • Sep 15 '22 19:09

What if we specialize logdensityof directly as well for pushforward measures?

oschulz • Sep 15 '22 19:09

Ok, let's try that first. It seemed kind of hacky when I did it for affine transforms, because it's doing an end run around this whole nice system we set up. And it forces going all the way to the root measure, which is kind of weird. But it's quick to implement and will get performance back in line.
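For reference, a toy version of what the specialization amounts to. None of these types or functions are MeasureBase's; they are stand-ins (all prefixed Toy/toy_) so the sketch runs with Base alone. The point is just that one inverse evaluation can feed both the support check and the density.

```julia
# Toy types for illustration only; these are not MeasureBase's definitions.
struct ToyLebesgue            # Lebesgue measure restricted to (lo, hi)
    lo::Float64
    hi::Float64
end
toy_insupport(μ::ToyLebesgue, x) = μ.lo < x < μ.hi
toy_logdensity(μ::ToyLebesgue, x) = 0.0   # log-density 0 on the support

struct ToyPushfwd{F,M}
    finv_with_ladj::F   # y -> (finv(y), log|finv'(y)|)
    parent::M
end

# Inverse and log-Jacobian computed together; the counter records evaluations.
const FINV_CALLS = Ref(0)
function sin_with_ladj(y)
    -π / 2 ≤ y ≤ π / 2 || throw(DomainError(y, "only defined on [-π/2, π/2]"))
    FINV_CALLS[] += 1
    return sin(y), log(abs(cos(y)))
end

# Specialized method: a single inverse evaluation is reused for the support
# check and the density, at the price of going straight to the parent measure.
function toy_logdensityof(ν::ToyPushfwd, y)
    x, ladj = ν.finv_with_ladj(y)
    toy_insupport(ν.parent, x) || return -Inf
    return toy_logdensity(ν.parent, x) + ladj
end

ν = ToyPushfwd(sin_with_ladj, ToyLebesgue(-π / 2, π / 2))
ℓ = toy_logdensityof(ν, π / 4)
```

This makes the trade-off visible: the method skips the basemeasure chain entirely, which is exactly the end run mentioned above.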

cscherrer • Sep 15 '22 20:09