GLM.jl
Deviance NaN from first iteration onwards with Gamma distributed GLM with LogLink
I am trying to fit a Gamma-distributed GLM with a LogLink function, but the deviance and diff.dev. are both NaN from the first iteration. I have added the data files below (hopefully that worked correctly; they are pretty small):
Reproducible example using the data above:
using DelimitedFiles, GLM
X = readdlm("x.txt")
y = readdlm("y.txt")
y = reshape(y, 1000) # otherwise y is a 1000 × 1 Matrix, not a Vector
glm(X, y, Gamma(), LogLink(), maxiter=5, verbose = true)
The output I get:
Iteration: 1, deviance: NaN, diff.dev.:NaN
Iteration: 2, deviance: NaN, diff.dev.:NaN
Iteration: 3, deviance: NaN, diff.dev.:NaN
Iteration: 4, deviance: NaN, diff.dev.:NaN
Iteration: 5, deviance: NaN, diff.dev.:NaN
Running this for any number of iterations leads to the error:
failure to converge after 5 iterations.
_fit!(m::GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, Gamma{Float64}, LogLink}, GLM.DensePredChol{Float64, Cholesky{Float64, Matrix{Float64}}}}, verbose::Bool, maxiter::Int64, minstepfac::Float64, atol::Float64, rtol::Float64, start::Nothing) at glmfit.jl:339
#fit!#12 at glmfit.jl:372 [inlined]
fit! at glmfit.jl:352 [inlined]
fit(::Type{GeneralizedLinearModel}, X::Matrix{Float64}, y::Vector{Float64}, d::Gamma{Float64}, l::LogLink; dofit::Bool, wts::Vector{Float64}, offset::Vector{Float64}, fitargs::Base.Iterators.Pairs{Symbol, Integer, Tuple{Symbol, Symbol}, NamedTuple{(:maxiter, :verbose), Tuple{Int64, Bool}}}) at glmfit.jl:468
(::StatsBase.var"#fit##kw")(::NamedTuple{(:maxiter, :verbose), Tuple{Int64, Bool}}, ::typeof(fit), ::Type{GeneralizedLinearModel}, X::Matrix{Float64}, y::Vector{Float64}, d::Gamma{Float64}, l::LogLink) at glmfit.jl:462
glm(::Matrix{Float64}, ::Vector{Float64}, ::Gamma{Float64}, ::Vararg{Any, N} where N; kwargs::Base.Iterators.Pairs{Symbol, Integer, Tuple{Symbol, Symbol}, NamedTuple{(:maxiter, :verbose), Tuple{Int64, Bool}}}) at glmfit.jl:484
(::GLM.var"#glm##kw")(::NamedTuple{(:maxiter, :verbose), Tuple{Int64, Bool}}, ::typeof(glm), ::Matrix{Float64}, ::Vector{Float64}, ::Gamma{Float64}, ::Vararg{Any, N} where N) at glmfit.jl:484
top-level scope at sandbox.jl:229
eval at boot.jl:360 [inlined]
I will try to dig into this and narrow it down a bit further, but I am not very familiar with GLMs, so any help would be much appreciated!
The NaNs seem to be introduced in updateμ!(r::GlmResp{V,D,L}) where {V<:FPVector,D,L}. Specifically, line 105, μi, dμdη = inverselink(L(), η[i]), introduces NaNs if η is too large.
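To make the failure mode concrete, here is a minimal sketch (just the arithmetic, not the package's actual code path): for a LogLink, inverselink is essentially exp, and once the linear predictor overflows, the Gamma unit deviance degenerates to NaN:

η = 1000.0
μ = exp(η)                        # Inf: exp overflows for large η
y = 2.0
log(y / μ)                        # -Inf, since y / μ == 0.0
(y - μ) / μ                       # NaN, since -Inf / Inf is NaN
-2 * (log(y / μ) - (y - μ) / μ)   # NaN: the Gamma unit deviance is poisoned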
I think the issue is just that it does exp (the inverse of log) and the values are too large, leading to Infs. I presume I'll be able to fix this by scaling my variables. The behaviour of looping through all the iterations and then throwing an error saying "failure to converge after x iterations" could probably be clearer, though.
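For anyone hitting the same thing, this is roughly what the scaling workaround could look like (a sketch only; it assumes every column of X is a continuous predictor, so an intercept or dummy column would need to be left unscaled):

using Statistics

# Standardise each column so the linear predictor Xβ stays in a range
# where exp() does not overflow during the IRLS iterations.
Xs = (X .- mean(X, dims=1)) ./ std(X, dims=1)
glm(Xs, y, Gamma(), LogLink(), maxiter=30, verbose=true)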
Hi there,
I also face a similar problem with a Bernoulli/LogitLink model. It seems the NaNs originate in the delbeta! method, at least in mul!(p.delbeta, transpose(scr), r).
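For what it's worth, a quick sanity check on the inputs can rule out non-finite data, which is a common way for NaNs to reach that mul! call (this snippet assumes X and y hold your design matrix and response):

any(!isfinite, X) && @warn "X contains non-finite values"
any(!isfinite, y) && @warn "y contains non-finite values"
all(v -> 0 <= v <= 1, y) || @warn "Bernoulli responses must lie in [0, 1]"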
Do you have any idea what could be causing the problem? Happy to provide more information. I can probably also provide the data to reproduce the problem if that helps; if so, please let me know how best to do this.
Thanks!