ExpectationMaximization.jl

Interface compatibility with Distributions.jl?

Open lrnv opened this issue 1 year ago • 5 comments

Hey,

Thanks for this great addition to the ecosystem!

In Distributions.jl, the first argument to the fit_mle function is expected to be the distribution type, not an instance of that type:

julia> fit_mle(Gamma,rand(1000))
Gamma{Float64}(α=1.604973623956157, θ=0.3097630701718876)

julia> fit_mle(Gamma,rand(100))
Gamma{Float64}(α=1.6985042802071995, θ=0.31065614192888746)

julia> fit_mle(Gamma(1,1),rand(100))
ERROR: MethodError: no method matching fit_mle(::Gamma{Float64}, ::Vector{Float64})
Closest candidates are:
  fit_mle(::Type{<:LogNormal}, ::AbstractArray{T}) where T<:Real at C:\Users\lrnv\.julia\packages\Distributions\bQ6Gj\src\univariate\continuous\lognormal.jl:163
  fit_mle(::Type{<:Weibull}, ::AbstractArray{<:Real}; alpha0, maxiter, tol) at C:\Users\lrnv\.julia\packages\Distributions\bQ6Gj\src\univariate\continuous\weibull.jl:145
  fit_mle(::Type{<:Beta}, ::AbstractArray{T}; maxiter, tol) where T<:Real at C:\Users\lrnv\.julia\packages\Distributions\bQ6Gj\src\univariate\continuous\beta.jl:217
  ...
Stacktrace:
 [1] top-level scope
   @ REPL[14]:1

julia> 

This is not really a problem for you, as you are free to overload this function however you want, and your interface actually makes a lot of sense since your algorithm exploits the initial guesses. But would it be possible to add methods following the Distributions.jl convention, maybe with automatic guesses? I have fit_mle bindings in Copulas.jl that assume this convention, and thus do not work directly with your package :(

Edit: I was trying to write a code example of what I would like, but I saw that mixture types do not include their component types... More specifically, I would like to be able to type:

fit_mle(MixtureModel{Gamma,Gamma,Normal},data)

instead of

fit_mle(MixtureModel([Gamma(),Gamma(),Normal()],[1/3 1/3 1/3]),data)

Would that be possible?

It would allow composability, as I am currently using:

using Copulas, Distributions, ExpectationMaximization, Random
X₁ = MixtureModel([Gamma(2,3), LogNormal(1,1)],[1/2,1/2])
X₂ = Pareto()
X₃ = LogNormal(0,1)
C = ClaytonCopula(3,0.7) # A 3-variate Clayton copula with θ = 0.7
D = SklarDist(C,(X₁,X₂,X₃)) # The final distribution

# This generates a (3,1000)-sized dataset from the multivariate distribution D
simu = rand(D,1000)

D̂ = fit(SklarDist{FrankCopula,Tuple{Gamma,Normal,LogNormal}}, simu) # works
# But how can I specify that I want a mixture for one of the variables?

which, under the hood, calls fit_mle(Marginal_Type,marginal_data) on each marginal.
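
To make the "automatic guesses" idea a bit more concrete, here is a rough sketch of what I have in mind. The helper name initial_mixture is hypothetical and exists in neither package, and the chunking heuristic is only an assumption about what a reasonable default guess could look like: it sorts the sample, splits it into one contiguous chunk per component, fits each component type on its chunk with the type-based fit_mle from Distributions.jl, and starts from equal weights.

```julia
using Distributions

# Hypothetical helper (not part of Distributions.jl or ExpectationMaximization.jl):
# build an initial mixture guess from a tuple of component types.
# Assumes the data lie in the support of every component type
# (e.g. strictly positive values for Gamma and LogNormal).
function initial_mixture(component_types, data::AbstractVector{<:Real})
    K = length(component_types)
    sorted = sort(data)
    n = length(sorted)
    # Boundaries of K contiguous chunks of the sorted sample
    bounds = round.(Int, range(0, n; length = K + 1))
    # Fit each component type on its own chunk with the type-based fit_mle
    comps = [fit_mle(T, sorted[bounds[k]+1:bounds[k+1]])
             for (k, T) in enumerate(component_types)]
    # Equal weights as a starting point
    return MixtureModel(comps, fill(1 / K, K))
end

data = vcat(rand(Gamma(2, 3), 500), rand(LogNormal(3, 0.2), 500))
guess = initial_mixture((Gamma, LogNormal), data)

# The guess could then be refined with the instance-based method, e.g.
# fitted = fit_mle(guess, data)   # requires `using ExpectationMaximization`
```

A type-based method on the mixture could then simply build such a guess and forward it to the existing instance-based fit_mle, so the EM code itself would not need to change.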

lrnv, Mar 22 '23 17:03