Add distribution axes
As discussed in #734, this PR adds `axes(::Sampleable{<:ArraylikeVariate})` to the API. It also makes the following changes:
- `axes` now returns axes from a parameter array when reasonable
- `rand` now initializes the container using `similar` and the distribution's axes
- `mean`, `var`, `cov`, `cor`, `minimum`, and `maximum`, when implemented, now use the distribution's axes (or are rewritten so that they already preserve the parameter array's axes)
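The `rand` change can be pictured with a toy type. This is only a sketch, not the code in this PR: `ToyMvNormal` and `toy_rand` are hypothetical names, and it uses nothing beyond Base and the `Random` standard library. It shows `axes` being taken from the parameter array and the sample container being allocated with `similar` on those axes.

```julia
using Random

# Hypothetical stand-in for a multivariate normal with identity covariance.
struct ToyMvNormal{V<:AbstractVector{<:Real}}
    μ::V
end

# Report the axes of the parameter array instead of assuming 1:n.
Base.axes(d::ToyMvNormal) = axes(d.μ)

# Allocate the sample with `similar`, so container type and axes follow μ.
function toy_rand(rng::AbstractRNG, d::ToyMvNormal)
    x = similar(d.μ, Float64, axes(d))
    randn!(rng, x)   # fill with standard normal draws
    x .+= d.μ        # shift by the mean
    return x
end

d = ToyMvNormal([0.5, -1.0, 2.0])
x = toy_rand(MersenneTwister(1), d)
```

With a plain `Vector` as the parameter this returns an ordinary `Vector{Float64}`. With an array type that customizes `similar` and `axes`, the same pattern should hand back that array type, as in the DimensionalData example in this PR.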
For non-1-based indexed arrays (e.g. OffsetArrays), these changes cause some methods to fail that previously succeeded; this seems to be due to many methods in LinearAlgebra requiring 1-based indexing.
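One way such failures can arise: code that assumes 1-based indexing often guards its inputs with the internal helper `Base.require_one_based_indexing`, which throws an `ArgumentError` for arrays whose axes do not start at 1. The sketch below uses a hypothetical `ZeroBased` wrapper in place of OffsetArrays so that it runs with Base alone. It does not identify which LinearAlgebra methods are affected by this PR.

```julia
# Hypothetical minimal vector type whose indices start at 0
# (a stand-in for an OffsetArray).
struct ZeroBased{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::ZeroBased) = size(v.data)
Base.axes(v::ZeroBased) = (Base.IdentityUnitRange(0:length(v.data) - 1),)
Base.getindex(v::ZeroBased, i::Int) = v.data[i + 1]

v = ZeroBased([1.0, 2.0, 3.0])

# The guard rejects `v` because its first index is 0, not 1.
guard_failed = try
    Base.require_one_based_indexing(v)
    false
catch err
    err isa ArgumentError
end
```

Before this PR, results were allocated with default 1-based axes regardless of the parameter array, so a guard like this would not have seen offset axes on the result.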
Example
Here's an example of the new behavior using DimensionalData.DimArray, which has its own custom axis type:
julia> using DimensionalData, Distributions, LinearAlgebra
julia> x = rand(X([:a, :b, :c]))
┌ 3-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────── dims ┐
↓ X Categorical{Symbol} [:a, …, :c] ForwardOrdered
└──────────────────────────────────────────────────────┘
:a 0.871622
:b 0.180058
:c 0.840337
julia> MvNormal(x, I) |> rand
┌ 3-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────── dims ┐
↓ X Categorical{Symbol} [:a, …, :c] ForwardOrdered
└──────────────────────────────────────────────────────┘
:a 1.05573
:b -0.729217
:c 0.332449
julia> MvNormal(x, I) |> mean
┌ 3-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────── dims ┐
↓ X Categorical{Symbol} [:a, …, :c] ForwardOrdered
└──────────────────────────────────────────────────────┘
:a 0.871622
:b 0.180058
:c 0.840337
julia> Dirichlet(x) |> rand
┌ 3-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────── dims ┐
↓ X Categorical{Symbol} [:a, …, :c] ForwardOrdered
└──────────────────────────────────────────────────────┘
:a 0.620296
:b 0.000925956
:c 0.378779
julia> Dirichlet(x) |> cov
┌ 3×3 DimArray{Float64, 2} ┐
├──────────────────────────┴───────────────────── dims ┐
↓ X Categorical{Symbol} [:a, …, :c] ForwardOrdered,
→ X Categorical{Symbol} [:a, …, :c] ForwardOrdered
└──────────────────────────────────────────────────────┘
↓ → :a :b :c
:a 0.0859104 -0.0151596 -0.0707508
:b -0.0151596 0.0297752 -0.0146155
:c -0.0707508 -0.0146155 0.0853663
julia> product_distribution(fill(TDist(3.0), X(2), Y(3))) |> rand
┌ 2×3 DimArray{Float64, 2} ┐
├──────────────────── dims ┤
↓ X, → Y
└──────────────────────────┘
-1.46186 -0.796371 -1.00161
-0.523147 0.824003 0.759503
To-Do
- [x] Implement `axes` for distributions
- [x] Use `axes` in `rand`
- [x] Use `axes` in estimates
- [ ] Add tests with custom axes using DimensionalData
Here are some points for discussion.
There are some distributions (e.g. LKJ) for which there is no parameter array. To support custom axes, we'd need to add an axes field to these dists, which could be considered breaking.
There are other distributions like VonMisesFisher that still only support Vector parameter arrays. Supporting arrays with custom axes would require relaxing that type constraint; would that be breaking?
For more complete support, a PR to PDMats is probably needed to ensure that axes(::AbstractPDMat) forwards axes from the wrapped arrays. And ScalMat would need to accept user-provided axes.
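To make the PDMats point concrete, here is a rough sketch of the two ideas using toy types. `ToyPDMat` and `ToyScalMat` are hypothetical and are not the PDMats types; their fields and constructors are assumptions chosen for illustration only.

```julia
# Hypothetical wrapper standing in for a positive-definite matrix type.
struct ToyPDMat{M<:AbstractMatrix{<:Real}}
    mat::M
end

# Forward axes from the wrapped array rather than rebuilding them from the dimension.
Base.axes(a::ToyPDMat) = axes(a.mat)

# Hypothetical scaled identity: there is no wrapped array,
# so the axes have to be stored explicitly.
struct ToyScalMat{T<:Real,A<:AbstractUnitRange}
    value::T
    ax::A
end

# Dimension-only constructor falls back to 1-based axes.
ToyScalMat(dim::Integer, value::Real) = ToyScalMat(value, Base.OneTo(dim))

Base.axes(a::ToyScalMat) = (a.ax, a.ax)

p = ToyPDMat([2.0 0.0; 0.0 1.0])
s_default = ToyScalMat(3, 2.0)     # axes default to Base.OneTo(3)
s_custom  = ToyScalMat(2.0, 0:2)   # user-provided axes
```

Storing the axes adds a field to the scaled-identity type, which raises the same breaking-change question as the `axes` field discussed for LKJ above.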
This seems nice but not as important as #1905, so I think we should get #1905 in first.
Codecov Report
:x: Patch coverage is 94.73684% with 4 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 86.35%. Comparing base (bbdd4f1) to head (46654f3).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/genericrand.jl | 80.00% | 2 Missing :warning: |
| src/multivariate/dirichlet.jl | 83.33% | 1 Missing :warning: |
| src/multivariates.jl | 0.00% | 1 Missing :warning: |
Additional details and impacted files

| Coverage Diff | master | #2009 | +/- |
|---|---|---|---|
| Coverage | 86.35% | 86.35% | |
| Files | 146 | 146 | |
| Lines | 8782 | 8790 | +8 |
| Hits | 7584 | 7591 | +7 |
| Misses | 1198 | 1199 | +1 |
> This seems nice but not as important as #1905, so I think we should get #1905 in first.
Makes sense. What is still missing in #1905?
A review.