
Error thrown when using `kldivergence` with infinite values

Open ParadaCarleton opened this issue 3 years ago • 3 comments

```julia
julia> kldivergence(TDist(1), Normal(0, 1))
ERROR: DomainError with -0.9999999999999964:
integrand produced NaN in the interval (-1.0, -0.9999999999999929)
```

It's clear why this happens (the divergence is infinite), but I think it would be a good idea to find some way to handle cases like this and return Inf.
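For reference, the blow-up is visible directly in the integrand of the quadrature fallback: the Cauchy density (TDist(1)) only decays like 1/(πx²), while the log-ratio against the standard normal grows like x²/2, so their product levels off at 1/(2π) ≈ 0.159 instead of vanishing. A quick sketch, assuming only that Distributions is loaded:

```julia
using Distributions

p, q = TDist(1), Normal(0, 1)

# Integrand of KL(p ‖ q): p(x) * (logpdf(p, x) - logpdf(q, x)).
integrand(x) = pdf(p, x) * (logpdf(p, x) - logpdf(q, x))

# The tail tends to the constant 1/(2π) ≈ 0.159 rather than decaying,
# so the integral over the real line diverges.
for x in (10.0, 100.0, 1000.0)
    println(x, " => ", integrand(x))
end
```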

ParadaCarleton avatar Nov 29 '21 00:11 ParadaCarleton

It seems the KL divergence is finite and there's even a simpler form, also for the multivariate generalization: https://rpubs.com/FJRubio/DKLtn

The problem is probably the same as in https://github.com/JuliaMath/QuadGK.jl/issues/38. Maybe applying the suggestion and adjusting rtol fixes the issue?

In general, keep in mind that QuadGK is just used as a fallback in the univariate case when no closed-form expression is implemented. I'm actually happy that it does not just return an incorrect result here - even though the error is a bit confusing (it confused me, at least, when I encountered it in some other setting). Maybe QuadGK could throw a more descriptive error in this case (and in the one described in the linked issue)?

devmotion avatar Nov 29 '21 01:11 devmotion

> It seems the KL divergence is finite and there's even a simpler form, also for the multivariate generalization: https://rpubs.com/FJRubio/DKLtn
>
> The problem is probably the same as in JuliaMath/QuadGK.jl#38. Maybe applying the suggestion and adjusting rtol fixes the issue?
>
> In general, keep in mind that QuadGK is just used as a fallback in the univariate case if no closed form expressions are implemented. I'm actually happy that it does not just return an incorrect result here - even though the error is a bit confusing (confused me at least when I encountered it in some other setting). Maybe QuadGK could throw a more descriptive error in this case (and the one described in the linked issue)?

The divergence is finite in one direction: kldivergence(Normal(0, 1), TDist(1)) accurately returns 0.25924453248886237. It's the other direction that's infinite and throws an error (a normal is always an infinitely bad approximation of a Cauchy distribution, but a Cauchy distribution can serve as an OK approximation of a normal).
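The asymmetry shows up in the tails of the two integrands: under the Normal base measure, the Gaussian density crushes the growing log-ratio, which is why the quadrature fallback converges in that direction. A sketch, again assuming Distributions is loaded:

```julia
using Distributions

n, c = Normal(0, 1), TDist(1)

# Integrand of KL(n ‖ c): n(x) * (logpdf(n, x) - logpdf(c, x)).
# The exp(-x²/2) factor dominates the log-ratio, so the tail vanishes
# and the integral is finite (≈ 0.2592, as quoted above).
tail(x) = pdf(n, x) * (logpdf(n, x) - logpdf(c, x))

println(tail(5.0))   # tiny in magnitude (order 1e-5)
println(tail(10.0))  # effectively zero
```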

ParadaCarleton avatar Nov 29 '21 16:11 ParadaCarleton

Ah, you're right of course; I completely missed that you were evaluating it in the reverse order.

The other, more general comments still stand, though. As the linked issue (and my personal experience) shows, the error can occur due to numerical issues even when the integral is finite. Hence I think it's still safer to throw an error than to return a possibly incorrect value such as Inf in these cases.
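For callers who would rather get Inf than an exception, one caller-side option is to catch the DomainError themselves. This wrapper is purely hypothetical and not part of Distributions.jl; it trades the safety discussed above for convenience, and will also map spurious numerical failures (where the true value is finite) to Inf:

```julia
using Distributions

# Hypothetical convenience wrapper: return Inf when the quadrature
# fallback fails with a DomainError, rethrow anything else.
function kldivergence_or_inf(p, q)
    try
        kldivergence(p, q)
    catch e
        e isa DomainError ? Inf : rethrow()
    end
end
```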

devmotion avatar Nov 29 '21 17:11 devmotion