Distributions.jl
Distributions.jl copied to clipboard
Limited Support for ForwardDiff on CDFs
While the normal distribution supports ForwardDiff (has cd
julia> using Distributions, ForwardDiff
julia> n = Normal(0.2, 0.5);
julia> b = Beta(3, 4);
julia> g = Gamma(2, 6);
julia> x = rand(4); print(x')
[0.24138 0.63222 0.059794 0.153202]
julia> ForwardDiff.gradient(x -> sum(cdf(n, x)), x)'
1×4 RowVector{Float64,Array{Float64,1}}:
0.795157 0.54913 0.767124 0.794397
julia> pdf(n, x)'
1×4 RowVector{Float64,Array{Float64,1}}:
0.795157 0.54913 0.767124 0.794397
julia> ForwardDiff.gradient(x -> sum(cdf(b, x)), x)'
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] betacdf at /home/chris/.julia/v0.6/StatsFuns/src/rmath.jl:62 [inlined]
[2] cdf at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:313 [inlined]
[3] _cdf!(::Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}, ::Distributions.Beta{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:194
[4] cdf(::Distributions.Beta{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:205
[5] vector_mode_gradient at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:94 [inlined]
[6] gradient(::##3#4, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4,Array{ForwardDiff.Dual{ForwardDiff.Tag{##3#4,0xb046287d533b082d},Float64,4},1}}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:19
[7] gradient(::##3#4, ::Array{Float64,1}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:18
julia> pdf(b,x)'
1×4 RowVector{Float64,Array{Float64,1}}:
1.52625 1.19303 0.178294 0.855103
julia> ForwardDiff.gradient(x -> sum(cdf(g, x)), x)'
ERROR: MethodError: Cannot `convert` an object of type ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4} to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] gammacdf at /home/chris/.julia/v0.6/StatsFuns/src/rmath.jl:62 [inlined]
[2] cdf at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:313 [inlined]
[3] _cdf!(::Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}, ::Distributions.Gamma{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:194
[4] cdf(::Distributions.Gamma{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}) at /home/chris/.julia/v0.6/Distributions/src/univariates.jl:205
[5] vector_mode_gradient at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:94 [inlined]
[6] gradient(::##5#6, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4,Array{ForwardDiff.Dual{ForwardDiff.Tag{##5#6,0xb046287d533b082d},Float64,4},1}}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:19
[7] gradient(::##5#6, ::Array{Float64,1}) at /home/chris/.julia/v0.6/ForwardDiff/src/gradient.jl:18
julia> pdf(g,x)'
1×4 RowVector{Float64,Array{Float64,1}}:
0.00644062 0.0158053 0.00164447 0.00414832
I take this as a sign that I should just work out the analytical derivatives in my application, but they are not always clear given more complicated functions of the cdf.
Status
- [x] Normal
- [ ] Gamma (
gamma_incinSpecialFunctions) - [ ] Beta (
beta_incinSpecialFunction)
The limitation here is that cdf, like any function ForwardDiff works on, must be able to accept any subtype of Real as an argument. Both the Beta and Gamma examples you give use rmath for the calculation, which only accepts Float64.
This will eventually be resolved when we remove the dependency on rmath, but in the meantime, you'll either need to implement the derivative function by hand or implement the cdf in pure Julia, which ForwardDiff should be able to handle.
It is probably worth mentioning that the Gamma cdf is a slightly complicated function and the Beta cdf is an extremely complicated function which would require a lot of work and skills to reimplement in Julia.
A much simpler answer would then be to simply define appropriate diffrules: http://www.juliadiff.org/DiffBase.jl/latest/diffrule_api.html This sounds like the way to go if one wants to support ForwardDiff on CDFs. Even if one were to implement slightly (let alone extremely) complicated functions, I would guess the simplicity of the pdfs would make calling them directly would perform better.
The normal part is fixed by #875
Two comments:
- We should probably just have a diff rule for
cdfthat returns thepdf - The more tricky case that is usually hit when fitting models is the case when you'd have to take the derivative with respect to the parameters. The limitations there (mentioned in the top post) are no longer in the RMath dependency but in
SpecialFunctions'beta_incandgamma_incwhich needDiffRulesdefinitions.
2.
SpecialFunctions'beta_incandgamma_incwhich needDiffRulesdefinitions.
The current design of DiffRules does not support defining rules for beta_inc and gamma_inc since they return tuples. Support for gamma_inc was added to ForwardDiff directly in https://github.com/JuliaDiff/ForwardDiff.jl/pull/587 but currently only derivatives with respect to the second argument are implemented as derivatives with respect to the first argument would require (regularized) hypergeometric functions.
Is Hypergeometricfunctions.jl not mature enough to be a dependency for ForwardDiff.jl?
Otherwise the following does the job for derivatives of beta_inc with respect to a and b and could be implemented in ForwardDiff similarly to https://github.com/JuliaDiff/ForwardDiff.jl/pull/587?
using SpecialFunctions, HypergeometricFunctions
function beta_inc_deriv(a, b, z, wrt)
if wrt == :a
return (log(z) - polygamma(0, a) + polygamma(0, a + b))*beta_inc(a, b, z)[1] -
(gamma(a)*gamma(a+b)/gamma(b))*z^a*pFq((a, a, 1-b), (a+1, a+1), z)/(gamma(a+1)^2)
elseif wrt == :b
return (-log(1-z) + polygamma(0, b) - polygamma(0, a + b))*beta_inc(b, a, 1-z)[1] +
(gamma(b)*gamma(a+b)/gamma(a))*(1-z)^b*pFq((b, b, 1-a), (b+1, b+1), 1-z)/(gamma(b+1)^2)
else
error("wrt must be :a or :b")
end
end