Distributions.jl icon indicating copy to clipboard operation
Distributions.jl copied to clipboard

RFC: Initial lightweight support for keyword argument constructors of `Arcsine`, `Beta`, `BetaPrime`, and `Normal`

Open devmotion opened this issue 4 years ago • 3 comments

I am aware of https://github.com/JuliaStats/Distributions.jl/pull/823 but I think maybe it is more promising to start with a slightly simpler approach and add keyword argument constructors step by step.

This PR tries not to cover all distributions but is limited to Arcsine, Beta, BetaPrime, and Normal (the first three continuous univariate distributions in alphabetical order and Normal due to its popularity). Support for other distributions would have to be added in subsequent PRs.

The approach in this PR is also a bit simpler as it does not try to dispatch on keyword arguments or fit distributions based on e.g. a given mean and standard deviation (I'm not sure if this is a good idea anyway since it seems to increase complexity for developers and users and would/could be supported by fit anyway). Instead it just adds the constructors

Arcsine(; a::Real=zero(b), b::Real=1.0, check_args::Bool=true)
Beta(; α::Real=1.0, β::Real=α, check_args::Bool=true)
BetaPrime(; α::Real=1.0, β::Real=α, check_args::Bool=true)
Normal(; μ::Real=0.0, σ::Real=one(μ), check_args::Bool=true)

Additionally (up for discussion), the PR adds ASCII alternatives alpha, beta, mu, and sigma. If both Unicode and ASCII alternatives are specified, the Unicode arguments have higher precedence. I'm not sure though if the support for users/editors without Unicode support is worth the increased code complexity.

The PR also ensures that check_args is passed around correctly, currently it is only supported by Arcsine(::T, ::T) where {T<:Real}, Beta(::T, ::T) where {T<:Real} etc. but not the other constructors.

devmotion avatar Oct 11 '21 16:10 devmotion

Codecov Report

Merging #1405 (6fdc296) into master (27fb31c) will increase coverage by 0.01%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1405      +/-   ##
==========================================
+ Coverage   83.05%   83.06%   +0.01%     
==========================================
  Files         117      117              
  Lines        6756     6756              
==========================================
+ Hits         5611     5612       +1     
+ Misses       1145     1144       -1     
Impacted Files Coverage Δ
src/univariate/continuous/arcsine.jl 88.57% <100.00%> (ø)
src/univariate/continuous/beta.jl 71.30% <100.00%> (+0.86%) :arrow_up:
src/univariate/continuous/betaprime.jl 92.50% <100.00%> (ø)
src/univariate/continuous/normal.jl 98.87% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 27fb31c...6fdc296. Read the comment docs.

codecov-commenter avatar Oct 11 '21 16:10 codecov-commenter

I'd like to say, I definitely like this! Thanks for the hard work David.

(I'm not sure if this is a good idea anyway since it seems to increase complexity for developers and users and would/could be supported by fit anyway.)

Perhaps we should add a fit_mom function, then? fit_mom(dist::Distribution, data::Array) could fit based on the empirical moments, calculating them directly, while fit_mom(dist::Distribution, mean, variance, ...) would return a version of dist with the specified moments.

ParadaCarleton avatar Dec 09 '21 19:12 ParadaCarleton