LogDensityProblems.jl

RFC: how to handle implicit quantities associated with coordinates

Open tpapp opened this issue 3 years ago • 0 comments

Motivation

Suppose that for a set of parameters $x$, the equation $F(x, y) = 0$ defines $y(x)$ implicitly. E.g., $x$ could be the parameters of a problem that we approximate numerically, and $y$ the parameters of the approximation we obtain (by rootfinding etc). Given data $d$, the likelihood is defined as $\ell(d \mid x, y)$.

In theory, one could of course solve from scratch for the $y$ that belongs to each $x$. But this may be expensive and brittle. A previous solution helps: if

$$ x_2 = x_1 + \Delta $$

then, with $y_1 = y(x_1)$,

$$ \hat{y}_2 = y_1 + \frac{\partial y}{\partial x}(x_1) \Delta $$

would be a good initial guess for $y_2 = y(x_2)$.
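
To make the warm-start idea concrete, here is a minimal sketch. The scalar relation $F(x, y) = y^3 + y - x$, the helper `solve_y`, and all other names are assumptions chosen for illustration, not part of any package. It obtains $\partial y / \partial x$ from the implicit function theorem and compares a Newton solve started from the predictor with one started from scratch.

```julia
# Toy implicit relation (an assumption for illustration): F(x, y) = y^3 + y - x = 0.
F(x, y) = y^3 + y - x
dFdx(x, y) = -1.0
dFdy(x, y) = 3y^2 + 1

# Newton's method for y given x, started at y0.
# Returns the root and the number of Newton updates that were needed.
function solve_y(x, y0; tol = 1e-12, maxiter = 100)
    y = float(y0)
    for i in 1:maxiter
        f = F(x, y)
        abs(f) ≤ tol && return y, i - 1
        y -= f / dFdy(x, y)
    end
    error("Newton iteration did not converge")
end

x1 = 2.0
y1, _ = solve_y(x1, 0.0)              # cold start; the exact root is y = 1

Δ = 0.1
x2 = x1 + Δ

# Implicit function theorem: dy/dx = -(dF/dy)^(-1) * dF/dx, evaluated at (x1, y1).
dydx = -dFdx(x1, y1) / dFdy(x1, y1)
y2_guess = y1 + dydx * Δ              # first-order predictor for y(x2)

y2_warm, iters_warm = solve_y(x2, y2_guess)
y2_cold, iters_cold = solve_y(x2, 0.0)

println("warm start: ", iters_warm, " updates, cold start: ", iters_cold, " updates")
```

Both solves arrive at the same root. The point of the sketch is only that the predictor starts the iteration close to it, which matters more when each evaluation of $F$ is expensive or the solver is fragile far from the solution.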

Ideally, "users" like Turing.jl and DynamicHMC.jl should be able to ignore the details of these things and just carry on doing HMC/NUTS/etc with minimal changes.

Proposal: allow coordinates to be opaque

I propose an addition to the API composed of 3 functions, with the fallbacks

lift(ℓ, x::AbstractVector) = x
unlift(ℓ, x::AbstractVector) = x
translate(ℓ, x::AbstractVector, Δ::AbstractVector) = x .+ Δ

Specifically,

  1. "users" would call lift when generating random points for starting Markov chains, and in similar situations; otherwise they would use translate,
  2. similarly, unlift would be called when the coordinates themselves are needed (e.g. for turn statistics),
  3. leapfrog and RWMH steps would use translate,
  4. otherwise, the result of lift and the x arguments of logdensity, logdensity_and_gradient, translate, and unlift are allowed to be opaque objects, not necessarily an ::AbstractVector of real numbers. Nevertheless, logdensity_and_gradient should provide a valid gradient of x -> logdensity(ℓ, lift(ℓ, x)); how that is done is up to the implementation.
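
The following sketch shows how the pieces could fit together under this proposal. Everything except the three fallbacks is hypothetical: `ImplicitProblem`, `Lifted`, `residual`, `newton`, `rwmh_step`, and the toy log density are made up for illustration, and the functions are defined locally rather than as part of LogDensityProblems.jl. The opaque coordinate caches $y(x)$ next to $x$, so that translate can warm-start the solver. `logdensity_and_gradient` is omitted to keep the sketch short.

```julia
# Proposed fallbacks, as in the proposal (not part of any released API).
lift(ℓ, x::AbstractVector) = x
unlift(ℓ, x::AbstractVector) = x
translate(ℓ, x::AbstractVector, Δ::AbstractVector) = x .+ Δ

# Hypothetical problem in which y(x) is defined by y^3 + y - sum(x) = 0.
struct ImplicitProblem end

# Opaque coordinate: x together with the cached solution y(x).
struct Lifted
    x::Vector{Float64}
    y::Float64
end

residual(x, y) = y^3 + y - sum(x)

# Newton's method for y given x, started at the supplied guess.
function newton(x, y; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        r = residual(x, y)
        abs(r) ≤ tol && return y
        y -= r / (3y^2 + 1)
    end
    error("Newton iteration did not converge")
end

# lift solves from scratch; unlift discards the cached solution.
lift(::ImplicitProblem, x::AbstractVector) = Lifted(collect(Float64, x), newton(x, 0.0))
unlift(::ImplicitProblem, p::Lifted) = p.x

# translate moves x and warm-starts the solver with a first-order predictor.
function translate(::ImplicitProblem, p::Lifted, Δ::AbstractVector)
    x = p.x .+ Δ
    # Here dy/dx_i = 1 / (3y^2 + 1) for every i.
    y_guess = p.y + sum(Δ) / (3 * p.y^2 + 1)
    Lifted(x, newton(x, y_guess))
end

# Toy log density (an assumption): standard normal in x plus a quadratic penalty on y.
logdensity(::ImplicitProblem, p::Lifted) = -0.5 * sum(abs2, p.x) - 0.5 * p.y^2

# One random-walk Metropolis step that never looks inside the coordinate.
# Δ is the proposed move and u a uniform(0, 1) draw, passed in for reproducibility.
function rwmh_step(ℓ, p, Δ, u)
    q = translate(ℓ, p, Δ)
    log(u) < logdensity(ℓ, q) - logdensity(ℓ, p) ? q : p
end

ℓ = ImplicitProblem()
p0 = lift(ℓ, [1.0, 1.0])                    # starting point, solved from scratch
p1 = rwmh_step(ℓ, p0, [0.05, -0.02], 0.5)   # sampler step on the opaque object
x1 = unlift(ℓ, p1)                          # plain coordinates for reporting
```

Note that `rwmh_step` would work unchanged with a problem that only relies on the fallbacks, since for plain vectors translate reduces to `x .+ Δ`.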

Bikeshedding of names is appreciated :wink:, as are alternative API suggestions.

How this meshes with AD

This is a bit tricky and I don't yet have a good API in mind. Related work is in

  • https://github.com/gdalle/ImplicitDifferentiation.jl
  • https://github.com/JuliaNonconvex/NonconvexUtils.jl
  • https://github.com/tpapp/ImplicitDifferentiables.jl
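
For reference, and independently of any particular AD package, the gradient that logdensity_and_gradient has to return follows from the chain rule and the implicit function theorem. This is a sketch that assumes $\partial F / \partial y$ is invertible at the solution and that gradients are column vectors:

$$ \nabla_x\, \ell(d \mid x, y(x)) = \frac{\partial \ell}{\partial x} + \left( \frac{\partial y}{\partial x} \right)^\top \frac{\partial \ell}{\partial y}, \qquad \frac{\partial y}{\partial x} = -\left( \frac{\partial F}{\partial y} \right)^{-1} \frac{\partial F}{\partial x} $$

The same Jacobian $\partial y / \partial x$ that appears here is the one used for the initial guess $\hat{y}_2$ in the motivation, so an implementation could plausibly share it between translate and the gradient.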

tpapp · Sep 05 '22 12:09