Implicit broadcasting of `logpdf` is confusing
Today I tripped over this:
julia> logpdf(Normal(), [0.0, 1.0, 2.0])
3-element Vector{Float64}:
-0.9189385332046728
-1.4189385332046727
-2.9189385332046727
I expected this to error because of the dimension mismatch, but instead it broadcasts implicitly. I find this very confusing semantically: I'm asking for the logpdf of a univariate distribution at a multidimensional value, which should be an error. If I want to broadcast, I can do so trivially:
julia> logpdf.(Normal(), [0.0, 1.0, 2.0])
3-element Vector{Float64}:
-0.9189385332046728
-1.4189385332046727
-2.9189385332046727
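To make explicit what the broadcast computes, here is a minimal sketch that compares it against the closed-form log-density of the standard normal, -x²/2 - log(2π)/2. The variable names `xs`, `broadcasted` and `closed_form` are only illustrative.

```julia
using Distributions

xs = [0.0, 1.0, 2.0]

# Explicit broadcast: one scalar logpdf call per element of xs.
broadcasted = logpdf.(Normal(), xs)

# Closed-form log-density of the standard normal, evaluated elementwise.
closed_form = [-x^2 / 2 - log(2π) / 2 for x in xs]

# The two agree up to floating-point rounding.
@assert broadcasted ≈ closed_form
```

Each entry is a scalar evaluation, which is the reading the dot syntax makes visible at the call site.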
The same confusion of course applies to multidimensional cases like
julia> dist = MvNormal(fill(1.0, 3));
julia> data = randn(3,2)
3×2 Matrix{Float64}:
1.12869 0.610905
-1.85984 0.782174
-0.289514 0.854857
julia> logpdf(dist, data)
2-element Vector{Float64}:
-5.165200520671484
-3.6147061729216485
Doing this with explicit broadcasting is slightly more verbose, but being explicit makes it far less ambiguous:
julia> logpdf.((dist,), eachcol(data))
2-element Vector{Float64}:
-5.165200520671484
-3.6147061729216485
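For comparison, a few equivalent ways to spell the per-column evaluation are sketched below. This assumes a zero-mean, identity-covariance `MvNormal` built with the `(mean, Diagonal)` constructor and a small hand-written `data` matrix instead of the random one above; the `via_*` names are only illustrative.

```julia
using Distributions, LinearAlgebra

# Assumed stand-in for the distribution above: zero mean, identity covariance.
dist = MvNormal(zeros(3), Diagonal(ones(3)))

# 3×2 matrix: two samples stored as columns (fixed values, not random).
data = [1.0 0.5; -2.0 0.75; -0.25 1.0]

# Wrap the distribution in a 1-tuple so broadcasting treats it as a scalar.
via_tuple = logpdf.((dist,), eachcol(data))

# Ref is another common way to shield an argument from broadcasting.
via_ref = logpdf.(Ref(dist), eachcol(data))

# map avoids the question of how the distribution broadcasts altogether.
via_map = map(col -> logpdf(dist, col), eachcol(data))

@assert via_tuple ≈ via_ref
@assert via_ref ≈ via_map
```

All three return one log-density per column; which spelling to prefer is mostly a matter of taste.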
I have a related confusion regarding
julia> loglikelihood(dist, data)
-8.779906693593134
but I'm happy to treat that as a separate question.
> Today I tripped over this:
This use of logpdf is deprecated: https://github.com/JuliaStats/Distributions.jl/blob/efff906e2e6aad180d0be6dcaa9c98d3c398510d/src/deprecates.jl#L39
> The same confusion of course applies to multidimensional cases like
This is not deprecated (yet). The main difference from the univariate case is that the efficiency gains from vectorization are typically larger.
> I have a related confusion regarding
This is intended. loglikelihood returns the log-likelihood of the samples, which is always a single scalar value. logpdf returns the log-density of each sample.
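The relationship between the two functions can be checked directly: for samples stored as columns, the log-likelihood is the sum of the per-sample log-densities. The sketch below assumes an identity-covariance `MvNormal` and a small fixed `data` matrix; `per_sample` and `total` are illustrative names.

```julia
using Distributions, LinearAlgebra

# Assumed example distribution: zero mean, identity covariance.
dist = MvNormal(zeros(3), Diagonal(ones(3)))

# 3×2 matrix: two samples stored as columns.
data = [1.0 0.5; -2.0 0.75; -0.25 1.0]

# One log-density per sample (column).
per_sample = [logpdf(dist, col) for col in eachcol(data)]

# A single scalar for the whole data set.
total = loglikelihood(dist, data)

@assert total ≈ sum(per_sample)
```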