Distributions.jl
Add Wrapped distribution wrapper
Adds a Wrapped distribution that wraps an original distribution around some interval. Optionally a parameter k is used to indicate that it should be multiply-wrapped, i.e. that the resulting wrapped distribution should have a k-fold periodicity in the interval.
Fixes #1716 and #1715
This version of Wrapped allows for any range and periodicity to be specified as fields, but it comes with the trade-off that for distributions like Wrapped Cauchy, where we have algorithms to fit it well, we're not able to because the usual fit(::Type{<:Distribution}, x) interface doesn't allow specifying the upper and lower bounds or k. It would be nice if there were an alternate interface for wrapper distributions like these.
Codecov Report
Patch coverage has no change and project coverage change: -5.14 :warning:
Comparison is base (ef42afb) 85.89% compared to head (3da8e84) 80.76%.
Additional details and impacted files
@@ Coverage Diff @@
## master #1724 +/- ##
==========================================
- Coverage 85.89% 80.76% -5.14%
==========================================
Files 139 144 +5
Lines 8389 7122 -1267
==========================================
- Hits 7206 5752 -1454
- Misses 1183 1370 +187
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Distributions.jl | 100.00% <ø> (ø) | |
| src/wrapped.jl | 0.00% <0.00%> (ø) | |
| src/wrapped/cauchy.jl | 0.00% <0.00%> (ø) | |
| src/wrapped/exponential.jl | 0.00% <0.00%> (ø) | |
| src/wrapped/normal.jl | 0.00% <0.00%> (ø) | |
#1665 also adds a WrappedCauchy distribution. This implementation is a little more general because it also allows k-wrapping and intervals other than those of length 2π.
Before I spend more time polishing this and adding a test suite, a few questions:
- Is the generalization from intervals of length 2π to arbitrary intervals welcome?
- Is the generalization to build distributions with k-fold symmetry welcome?
- For the Wrapped Normal, Wrapped Cauchy, and Wrapped Exponential, is this approach of overloading the functions for Wrapped preferred, or is it preferred to define specially named distributions as in #1665?
- How to define fit methods? (see https://github.com/JuliaStats/Distributions.jl/pull/1724#issuecomment-1553668551)
@devmotion @ararslan
I should preface this by saying that I know almost nothing about wrapped distributions, so any opinions expressed here are not strongly held and may be uninformed.
Is the generalization from intervals of length 2π to arbitrary intervals welcome?
I don't see why not. ¯\_(ツ)_/¯ I assume that's something that's not uncommon in practice and/or would be difficult to achieve with a simple data transformation or similar?
Is the generalization to build distributions with k-fold symmetry welcome?
I don't see why not. ¯\_(ツ)_/¯ Same general question though.
For the Wrapped Normal, Wrapped Cauchy, and Wrapped Exponential, is this approach of overloading the functions for Wrapped preferred, or is it preferred to define specially named distributions as in #1665?
Defining methods for the Wrapped wrapper has prior art with e.g. Truncated. In fact, there used to be a TruncatedNormal that was deprecated in favor of Truncated{Normal}. So I think your approach here is more consistent and extensible.
How to define fit methods?
Not all distributions have fit (or rather, fit_mle) methods, notably including Truncated-wrapped distributions, so I personally think it would be okay to punt on a decision for now and add it at a later time after some more extended design discussion.
This version of Wrapped allows for any range and periodicity to be specified as fields, but it comes with the trade-off that for distributions like Wrapped Cauchy, where we have algorithms to fit it well, we're not able to because the usual fit(::Type{<:Distribution}, x) interface doesn't allow specifying the upper and lower bounds or k.
Could k be made a type parameter, similar to the dimensionality parameter for Array? Then the fit method could be something like fit(Wrapped{Cauchy,1}, x). That doesn't solve the bounds though.
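To make the type-parameter idea concrete, here is a minimal sketch that runs without Distributions.jl. Every name in it (ToyCauchy, WrappedSketch, periodicity, fit_sketch) is hypothetical and not part of the PR. The estimator is a simple moment estimator swapped in for illustration: it relies on the first trigonometric moment of a Cauchy(μ, γ) wrapped onto the circle being exp(iμ - γ), and it hard-codes the bounds [-π, π), which is exactly the part the type parameter does not solve.

```julia
# Toy stand-in for Cauchy so the sketch runs without Distributions.jl.
struct ToyCauchy
    μ::Float64  # location
    γ::Float64  # scale
end

# Hypothetical wrapper: the periodicity K lives in the type, the bounds in fields.
struct WrappedSketch{D,K}
    unwrapped::D
    lower::Float64
    upper::Float64
end

# K can be read from an instance or from the type alone.
periodicity(::WrappedSketch{D,K}) where {D,K} = K
periodicity(::Type{WrappedSketch{D,K}}) where {D,K} = K

# Moment estimator for K = 1 on [-π, π) (an assumption of this sketch):
# μ is the angle and γ the negative log-modulus of the first trigonometric moment.
function fit_sketch(::Type{WrappedSketch{ToyCauchy,1}}, x::AbstractVector{<:Real})
    m = sum(cis, x) / length(x)
    return WrappedSketch{ToyCauchy,1}(ToyCauchy(angle(m), -log(abs(m))), -π, π)
end

d = fit_sketch(WrappedSketch{ToyCauchy,1}, [0.0, 0.2])
```

Because K is part of the type, the call carries the periodicity without extra arguments, in the same way Array{T,N} carries its dimensionality.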
Is the generalization from intervals of length 2π to arbitrary intervals welcome?
I don't see why not. ¯\_(ツ)_/¯ I assume that's something that's not uncommon in practice and/or would be difficult to achieve with a simple data transformation or similar?
It's certainly useful. e.g. one might want to use [0, 2π) or [-π, π), or one might want to use degrees or days of the year. To support discrete distributions like wrapped Poisson it becomes necessary.
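As a rough illustration of why arbitrary bounds matter, the mapping of a value onto a half-open interval [l, u) can be sketched with mod. The helper name wrap_to_interval is hypothetical and not taken from the PR; the examples cover degrees and days of the year, the latter with integer arithmetic as a discrete case would need.

```julia
# Hypothetical helper: map x onto the half-open interval [l, u).
wrap_to_interval(x, l, u) = l + mod(x - l, u - l)

wrap_to_interval(370.0, 0.0, 360.0)      # degrees on [0, 360): 10.0
wrap_to_interval(-190.0, -180.0, 180.0)  # degrees on [-180, 180): 170.0
wrap_to_interval(366, 1, 366)            # day 366 of a 365-day year: 1
wrap_to_interval(0, 1, 366)              # the day before day 1: 365
```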
Is the generalization to build distributions with k-fold symmetry welcome?
I don't see why not. ¯\_(ツ)_/¯ Same general question though.
I did some searching, and the only mention I can find of k-times wrapping is in Directional Statistics by Mardia and Jupp, where they give no references for papers that use it. I suspect the most likely version to use besides k=1 is k=2, which turns any circular distribution into an axial one.
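To show what k=2 does, here is a sketch under an assumed definition of k-fold wrapping, which may differ from the PR's implementation: wrap the density with period (u - l)/k, then divide by k so that it still integrates to 1 over [l, u). The names normal_pdf and kwrapped_pdf are hypothetical, and the infinite sum is truncated to a finite number of terms.

```julia
# Hand-written normal density so the sketch needs no packages.
normal_pdf(x; μ=0.0, σ=1.0) = exp(-((x - μ) / σ)^2 / 2) / (σ * sqrt(2π))

# Assumed definition of k-fold wrapping (not confirmed by the PR):
# sum shifted copies of f spaced (u - l) / k apart, then divide by k
# so that the result integrates to 1 over the full interval [l, u).
function kwrapped_pdf(f, θ, l, u; k=1, nterms=50)
    period = (u - l) / k
    y = l + mod(θ - l, period)  # reduce θ to one period starting at l
    return sum(f(y + n * period) for n in -nterms:nterms) / k
end

f = x -> normal_pdf(x; μ=0.0, σ=0.5)
lo, hi = -Float64(π), Float64(π)

# With k = 2 the density is axial: θ and θ + π get (numerically) the same value.
p1 = kwrapped_pdf(f, 0.3, lo, hi; k=2)
p2 = kwrapped_pdf(f, 0.3 + π, lo, hi; k=2)
```

Under this definition, k=1 recovers the ordinary wrapped density on [l, u).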
Could k be made a type parameter, similar to the dimensionality parameter for Array? Then the fit method could be something like fit(Wrapped{Cauchy,1}, x). That doesn't solve the bounds though.
Yes, I think this is a reasonable solution. For bounds, I propose the fallback wrapped(d::ContinuousDistribution) = wrapped(d, -π, π) (taking care not to promote types unnecessarily; too bad there's no -π irrational). Then if one has data with a different period than 2π, they can scale it before fitting. Not perfect, but it works.
If we want Wrapped{<:Cauchy} and Wrapped{<:Normal} to be like VonMises, where the support is [μ-π, μ+π), we can change the default, e.g. wrapped(d::Normal) = wrapped(d, d.μ-π, d.μ+π).
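The promotion concern can be sketched as follows. In Julia, -π evaluates to a Float64, so a literal -π lower bound would widen a Float32 distribution. The names ToyNormal, default_bounds, and centered_bounds are hypothetical stand-ins, not the PR's API; the idea is to convert π to the distribution's own element type before negating.

```julia
# Toy stand-in for Normal, parameterized by element type.
struct ToyNormal{T<:Real}
    μ::T
    σ::T
end

# Fallback bounds [-π, π) in the requested float type: convert first, then negate.
default_bounds(::Type{T}) where {T<:AbstractFloat} = (-T(π), T(π))

# VonMises-like bounds [μ - π, μ + π), kept in the distribution's element type.
centered_bounds(d::ToyNormal{T}) where {T<:AbstractFloat} = (d.μ - T(π), d.μ + T(π))

typeof(-π)                                 # Float64, hence the promotion concern
default_bounds(Float32)                    # a Tuple{Float32, Float32}
centered_bounds(ToyNormal(1.0f0, 0.5f0))   # stays in Float32
```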