Distributions.jl
                                
                                
                                
                                    Distributions.jl copied to clipboard
                            
                            
                            
                        Add Unicode alias `\oplus` for `convolve`
This PR
- ~~simplifies the implementation of 
convolve(::MvNormal, ::MvNormal),~~ - adds a Unicode alias 
⊕, - ~~adds documentation for 
convolveand⊕(Preview: https://juliastats.org/Distributions.jl/previews/PR1445/convolution/).~~ 
It seems convolve is difficult to discover currently, probably due to the missing documentation. Moreover, I (and apparently some users: https://github.com/JuliaStats/Distributions.jl/issues/1444) think the syntax could be a bit "nicer". The ⊕ was suggested and discussed as an alias for convolve before (e.g. https://github.com/JuliaStats/Distributions.jl/issues/142#issuecomment-506470526 and https://github.com/JuliaStats/Distributions.jl/pull/919#issuecomment-510064620), and I think it is a reasonable choice. Concerns against + were e.g. that it might look like adding two distributions (https://github.com/JuliaStats/Distributions.jl/issues/307#issuecomment-429484788) and does not indicate whether the random variables are independent or not (https://github.com/JuliaStats/Distributions.jl/issues/307#issuecomment-429488430).
Codecov Report
Merging #1445 (302ec60) into master (16ee99b) will not change coverage. The diff coverage is
n/a.
@@           Coverage Diff           @@
##           master    #1445   +/-   ##
=======================================
  Coverage   84.42%   84.42%           
=======================================
  Files         124      124           
  Lines        7456     7456           
=======================================
  Hits         6295     6295           
  Misses       1161     1161           
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/convolution.jl | 100.00% <ø> (ø) | 
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 16ee99b...302ec60. Read the comment docs.
FWIW in MeasureTheory we're trying to stick to categorical interpretation of \otimes and \oplus, so the latter will (I think) be a disjoint union. In any case, it seems worth avoiding to have \otimes mean one thing and \oplus something completely unrelated. We're moving toward \otimes for product measure, in part to avoid confusion with Distributions's implicit lifting of * and + for affine transformations.
I have no real stake in Distributions. But if you want things to be compatible, I hope you'll consider this.
I think there are probably different considerations in MeasureTheory/Base since distribution objects are viewed differently there. Distributions.jl does not work with measures but random variables when transforming distributions, as the package itself does not work with measures (see also https://github.com/JuliaStats/Distributions.jl/issues/1438). Hence I think the first question is: does the notation \oplus make sense for the sum of independent random variables? In my opinion, it does, as I tried to explain in https://github.com/JuliaStats/Distributions.jl/issues/1444#issuecomment-983010605.
Side remarks unrelated to this PR:
This PR does not deal with \otimes and it can only be considered as alias for product_distribution(::Tuple) when the ProductDistribution PR is merged. In any case, to me it would seem consistent with \oplus in this PR and I don't think it would lead to any inconsistencies (see also https://github.com/JuliaStats/Distributions.jl/issues/1444#issuecomment-983010605).
in part to avoid confusion with Distributions's implicit lifting of * and + for affine transformations.
As I said before, due to the interpretation of distributions as their corresponding random variables in these expressions there is no implicit lifting. Scalars are just scalars, no implicit Dirac measures.
Do we use oplus in MeasureTheory? I had oplus as convolution in https://github.com/mschauer/GaussianDistributions.jl/blob/c358ae9cd074e5d7a29a68271f0032dceabc432e/src/GaussianDistributions.jl#L106
We've discussed (here) the possibility of using ⊗ and ⊕ to mean product measure and superposition, but @ablaom pointed out that it's useful for ⊕ to be the coproduct, which is probably disjoint union of measures (still need to confirm this).
Following this "categorical imperative" :wink: is my personal preference. But as a broader guideline, I'd suggest that whenever * and + are both defined on a type, they should follow the usual distributive law, and similarly for ⊗ and ⊕. Violating this just seems to be asking for trouble.
I say this because I recall discussion of ⊗ being used in Distributions to denote product measure. But I may be misremembering that.
It's astonishing that every other PR in Distributions ends up with a general discussion of at least partially unrelated things :smile: Discussions are very important but I think ideally we should keep discussions focused and in the right place since otherwise I fear even small improvements will take a long time.
I don't think this PR is the right place to discuss things like e.g. \otimes. It has not even been discussed properly among maintainers of Distributions yet and additionally it is not even possible to define it properly currently, as I wrote above.
I still think some of the more general misunderstanding in the comments above is caused by the different design and interpretation of such operations in Distributions and MeasureTheory: MeasureTheory consistently deals with distributions as measures whereas Distributions does not use this measure-theoretic interpretation. Therefore e.g. in operations with + and * the distributions are viewed as their corresponding random variables. This is not my special interpretation, this is how the notation was motivated and designed: see e.g. https://github.com/JuliaStats/Distributions.jl/issues/307#issuecomment-429491374 and https://github.com/JuliaStats/Distributions.jl/issues/307#issuecomment-429488430
Test errors are unrelated, known, and caused by the RNG changes in Julia 1.7. Probably can be fixed (IMO ideally in a separate PR) by choosing a different seed or relaxing some tolerances.
I've tried in good faith to contribute meaningfully to discussions in several repos, and you've repeatedly rebuked me that "this isn't relevant" or "this isn't the place", or "this is redundant". I'll try to stay out of your way, just let me know if you'd value my input at some point.
It's nothing personal from my side, and hence I tried to formulate it as a general comment and to resolve some possible misunderstandings. It's just a general observation that in my experience sometimes slows down development unnecessarily and makes it more difficult to read up on old discussions (if they are scattered in different PRs and issues). I have to blame myself sometimes, e.g., if I bring up additional suggestions or observations that are not needed - and hence not completely relevant - for the PR at hand :slightly_smiling_face:
I would intuitively expect + to be the convolution if we're working with the random variables; it feels a bit weird to have one symbol here and another for AffineDistribution.
I think I personally would be fine with \oplus for both, although I'm a bit concerned about confusion with MeasureTheory.jl.
As mentioned above, the general opinion was to not overload + for this use case.
Concerns against + were e.g. that it might look like adding two distributions (#307 (comment)) and does not indicate whether the random variables are independent or not (#307 (comment)).
I found this PR a bit randomly but here is another idea:
How about giving a useful error message instead of defining a new unicode operator.
So if someone does Normal() + Normal(), they get an error message like
Addition of distributions is ambiguous, if you meant to add two random variables use the `convolve` function
                                    
                                    
                                    
                                
I like this suggestion 👍 However, it seems orthogonal to this PR, doesn't it? Such an informative error could be added regardless of whether a Unicode alias for convolve exists or not.
That's true! Just thought this PR could be turned around since there is still no agreement. Btw, looking quickly at Wikipedia, it looks like the correct symbol for convolution is '∗' (\ast) (I can definitely see how annoylingly similar to '*' it is though)
Btw, looking quickly at Wikipedia, it looks like the correct symbol for convolution is '∗' (\ast)\n(I can definitely see how annoylingly similar to '*' it is though)
That's what I did in https://github.com/JuliaStats/Distributions.jl/pull/1455😅
To keep this issue from tearing our community apart, I propose a compromise: circled ast :p
(Could actually work, though? It's an asterisk, but the circle visually distinguishes it. Or alternatively the boxed asterisk \boxast.)
Actually, it looks like there's already an alternate notation for convolution -- since it can be thought of as a tensor product:
