Distributions.jl

Implicit broadcasting of `logpdf` is confusing

Open • mhauru opened this issue 8 months ago • 1 comment

Today I tripped over this:

julia> logpdf(Normal(), [0.0, 1.0, 2.0])
3-element Vector{Float64}:
 -0.9189385332046728
 -1.4189385332046727
 -2.9189385332046727

I expected this to error because of a dimension mismatch, but instead it broadcasts implicitly. I find this very confusing from a semantic point of view: I'm asking to evaluate the logpdf of a univariate distribution on a multidimensional value, which should be an error. If I want to broadcast, I can do so trivially:

julia> logpdf.(Normal(), [0.0, 1.0, 2.0])
3-element Vector{Float64}:
 -0.9189385332046728
 -1.4189385332046727
 -2.9189385332046727

The same confusion of course applies to multidimensional cases like

julia> dist = MvNormal(fill(1.0, 3));

julia> data = randn(3,2)
3×2 Matrix{Float64}:
  1.12869   0.610905
 -1.85984   0.782174
 -0.289514  0.854857

julia> logpdf(dist, data)
2-element Vector{Float64}:
 -5.165200520671484
 -3.6147061729216485

The way to do this with explicit broadcasting is a tiny bit more verbose, but being explicit makes it far less ambiguous:

julia> logpdf.((dist,), eachcol(data))
2-element Vector{Float64}:
 -5.165200520671484
 -3.6147061729216485

I have a related confusion regarding

julia> loglikelihood(dist, data)
-8.779906693593134

but happy to consider that a distinct question.

mhauru • Apr 17 '25 13:04

> Today I tripped over this:

This use of logpdf is deprecated: https://github.com/JuliaStats/Distributions.jl/blob/efff906e2e6aad180d0be6dcaa9c98d3c398510d/src/deprecates.jl#L39
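
For reference, a minimal sketch of the explicit spellings that avoid the array method of `logpdf` for univariate distributions. The variable names (`d`, `xs`, `buf`) are only illustrative, and the closed-form check assumes the standard normal from the example above:

```julia
using Distributions

d = Normal()
xs = [0.0, 1.0, 2.0]

# Explicit broadcasting: one log-density per element of xs.
lp = logpdf.(d, xs)

# In-place variant: write the results into a preallocated buffer.
buf = similar(xs)
buf .= logpdf.(d, xs)

# Sanity check against the closed form of the standard normal log-density,
# log p(x) = -x^2/2 - log(2π)/2.
closed_form = [-x^2 / 2 - log(2π) / 2 for x in xs]
@assert lp ≈ closed_form
```

Both forms make the elementwise intent visible at the call site, which is the point raised in the original post.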

> The same confusion of course applies to multidimensional cases like

This is not deprecated (yet). The main difference from the univariate case is that the efficiency gains from vectorization are typically larger.
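
As a rough illustration, the sketch below checks that the matrix method and an explicit per-column evaluation return the same values. It assumes the distribution in the example is a zero-mean, unit-covariance normal (consistent with the printed log-densities) and constructs it as `MvNormal(zeros(3), I)`. The data values are the rounded numbers from the REPL output, and the names `batched` and `percol` are made up for this example. It says nothing about timing, which would need a separate benchmark:

```julia
using Distributions, LinearAlgebra

# Standard 3-dimensional normal; I is the identity from LinearAlgebra.
dist = MvNormal(zeros(3), I)

# Each column is one sample (rounded values copied from the issue).
data = [ 1.12869   0.610905
        -1.85984   0.782174
        -0.289514  0.854857]

# One call on the whole matrix: returns one log-density per column.
batched = logpdf(dist, data)

# One call per column, with the iteration spelled out.
percol = [logpdf(dist, col) for col in eachcol(data)]

@assert batched ≈ percol
```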

> I have a related confusion regarding

This is intended. loglikelihood returns the log-likelihood of the samples, which is always a single scalar value. logpdf returns the log-density of each sample.
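
A small sketch of that relationship, assuming independent samples stored as columns and the same standard-normal setup as in the example (the data are the rounded values from the issue, and `per_sample` and `total` are illustrative names):

```julia
using Distributions, LinearAlgebra

dist = MvNormal(zeros(3), I)
data = [ 1.12869   0.610905
        -1.85984   0.782174
        -0.289514  0.854857]

# logpdf: one log-density per sample (per column).
per_sample = [logpdf(dist, col) for col in eachcol(data)]

# loglikelihood: a single scalar for the whole set of samples.
total = loglikelihood(dist, data)
@assert total ≈ sum(per_sample)

# The same holds for a univariate distribution and a vector of samples.
xs = [0.0, 1.0, 2.0]
@assert loglikelihood(Normal(), xs) ≈ sum(logpdf.(Normal(), xs))
```

So `loglikelihood` can be read as the sum of the per-sample `logpdf` values, which is why it always collapses to a scalar.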

devmotion • Apr 17 '25 14:04