ParameterHandling.jl icon indicating copy to clipboard operation
ParameterHandling.jl copied to clipboard

let `positive_definite` return a PDMat

Open st-- opened this issue 4 years ago • 16 comments

This would allow for avoiding a Cholesky each time e.g. when parametrising the covariance of an MvNormal. Would you be willing to accept the PDMats.jl dependency?

st-- avatar Oct 05 '21 14:10 st--

Good point. I don't generally tend to make a lot of use of PDMats.jl, so I'm not overly keen on making positive_definite give me a PDMat.

One alternative would be to make use of the deferred function, so you could write something like

my_pdmat = deferred(PDMat, positive_definite(S))

where S is your happens-to-be-positive-definite matrix, and then when you call value on my_pdmat you'll get a PDMat.

It's not quite as clean as returning a PDMat, but would it satisfy your use-case?

edit: fix typo

willtebbutt avatar Oct 05 '21 15:10 willtebbutt

(One thing I would definitely be up for is improving the docstring to suggest a strategy like this, or whatever other strategy seems useful)

willtebbutt avatar Oct 05 '21 15:10 willtebbutt

I don't generally tend to make a lot of use of PDMats.jl

Whenever you use an MvNormal, you do:

using Distributions, PDMats
@assert MvNormal(ones(1,1)).Σ isa PDMats.PDMat

and generally if you want to handle a positive-definite matrix, PDMat would be the right choice, would you not agree with that?

What I want to avoid is the unnecessary additional cholesky (which happens implicitly when you construct the PDMat from a full Matrix). It is possible to construct a PDMat directly from the lower-triangular Cholesky factor, and you can then pass that as the covariance matrix to MvNormal, which is what I believe should happen. (But this requires positive_definite to not already multiply out L*L'.)

NB- it would be even better if Distributions.jl wouldn't convert it back to a full Matrix upon calling cov... :roll_eyes: But that is easier to work around in the downstream code (e.g. ApproximateGPs) than it is to work around positive_definite multiplying out already.

st-- avatar Oct 06 '21 14:10 st--

Also, just wanted to point out that PDMats.jl is actually a very lighter-weight dependency - for sure compared to Bijectors.jl and its rat-tail of deps of deps! The only addition would be SuiteSparse from the stdlib.

st-- avatar Oct 06 '21 15:10 st--

Whenever you use an MvNormal, you do:

That's my trick -- I don't use MvNormal very much haha.

and generally if you want to handle a positive-definite matrix, PDMat would be the right choice, would you not agree with that?

It's definitely a choice, and I agree that it has the right kind of semantics, it's just the implementation that I've always taken issue with. Specifically the requirement that PDMat has that you must have both the Cholesky factorisation and the original matrix floating around, even if you construct it from just the Cholesky.

(But this requires positive_definite to not already multiply out L*L'.)

This is probably something that could be improved.

Also, just wanted to point out that PDMats.jl is actually a very lighter-weight dependency

True -- I had never noticed that before. I'd be fine having the additional dep. I wonder whether we should just implement methods directly to handle the types in that package then? i.e. implement flatten etc for PDMat, PDiagMat etc.

willtebbutt avatar Oct 06 '21 15:10 willtebbutt

I'd still prefer to simply have positive_definite return a PDMat, instead of duplicating all the code of positive_definite - at least I thought @theogf added it in #32 so that we could use in ApproximateGPs etc., where we directly use it to parametrize the MvNormal...

st-- avatar Oct 27 '21 14:10 st--

@st-- I can see the appeal of this change. How about just having a different pdmat function that returns it. I think most of the helper functions created for positive_definite can be reused.

theogf avatar Oct 27 '21 14:10 theogf

One thing I really like about Julia is the in general very high composability of packages ... but then I find these little things so irritating :laughing: surely a positive_definite thing should just be a PDMat (and PDMat should just be so it doesn't do things that makes others like @willtebbutt not want to use it)...

st-- avatar Oct 27 '21 14:10 st--

I think a fairer thing to do, would be to just have lower_triangular output, which would be easy to pass as a PDMat or a Gram matrix and it is up to the user to decide what to do?

theogf avatar Oct 27 '21 16:10 theogf

I like lower_triangular! And then change positive_definite to actually return a PDMat? :)

st-- avatar Oct 27 '21 16:10 st--

If we do something like this, would it make sense to have a positive_lower_triangular function (rather than lower_triangular) which represents a lower triangular matrix with a positive diagonal? This would make implementing positive_definite as @st-- suggests rather straightforward, and is often what you want even if you're not passing the thing into a PDMat.

willtebbutt avatar Oct 27 '21 16:10 willtebbutt

Why would you care about a positive diagonal? The result would be the same no?

theogf avatar Oct 27 '21 18:10 theogf

Positive diagonal means it's a valid cholesky factor, so can just be plugged into stuff that expects that kind of constraint, such as a Cholesky (i.e. constructing it directly, without making it do the decomposition again)

willtebbutt avatar Oct 29 '21 08:10 willtebbutt

Positive diagonal means it's a valid cholesky factor, so can just be plugged into stuff that expects that kind of constraint, such as a Cholesky (i.e. constructing it directly, without making it do the decomposition again)

There are other, valid reasons for wanting the diagonal to be positively-constrained. But you can construct Cholesky with a triangular factor that has negative values on the diagonal.

st-- avatar Oct 29 '21 11:10 st--

There are other, valid reasons for wanting the diagonal to be positively-constrained. But you can construct Cholesky with a triangular factor that has negative values on the diagonal.

Sure, you could, but it wouldn't satisfy the assumptions that the Cholesky type makes about its factors field. For example, the logdet implementation assumes that the diagonal is positive.

willtebbutt avatar Oct 29 '21 11:10 willtebbutt

Ah ok, so it's more about how the Cholesky type is implemented! Thanks for the answer!

theogf avatar Oct 29 '21 11:10 theogf