GLMNet.jl
`cv.meanloss` differs from `cv$cvm` in R
In trying to get the cross-validation output from glmnet in R and GLMNet.jl to conform, I find that the losses differ even when everything else (lambda sequence, fold ids) is the same in both. As a result, `argmin(cv.meanloss)` (Julia) can differ from `which.min(cv$cvm)` (R), and the discrepancy sometimes matters for which lambda gets selected. What is the source of the difference?
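For context on why a small loss difference can change the selected model: CV loss curves are typically flat near their minimum, so a perturbation far smaller than the losses themselves can move the argmin by one or more lambda steps. A minimal illustration with made-up numbers (not taken from either package):

```julia
# Hypothetical CV loss curve, flat near its minimum.
loss = [1.00, 0.90, 0.8501, 0.8500, 0.8502, 0.90]

# A perturbation of order 1e-4 -- far smaller than the losses themselves --
# is enough to move the minimizing index by one step.
perturbed = loss .+ [0.0, 0.0, 0.0002, 0.0, -0.0004, 0.0]

argmin(loss)       # 4
argmin(perturbed)  # 5
```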
Example:
R

```r
require(glmnet)

data <- iris
foldid <- rep(1:10, nrow(data) / 10)   # 1..10 repeated 15 times (n = 150)
x <- model.matrix(data = data,
                  ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
cvl <- cv.glmnet(y = data$Species, x = x,
                 family = "multinomial", alignment = "fraction", foldid = foldid)
round(cvl$lambda, 8)
cvl$cvm
# 2.1972246 2.0531159 1.9324868 ...
```
julia

```julia
using Pkg
Pkg.add(["RDatasets", "GLMNet", "GLM"])
using RDatasets, GLMNet, GLM

iris = dataset("datasets", "iris")
fml = @formula(Species ~ SepalLength + SepalWidth + PetalLength + PetalWidth)
x = ModelMatrix(ModelFrame(fml, iris)).m
foldid = repeat(1:10, Int(size(iris, 1) / 10))  # same fold ids as in R
cvl = glmnetcv(x, iris.Species; folds = foldid)
cvl.lambda'
cvl.meanloss
# 2.1955639962247964
# 2.0530748153377423
# 1.9324668652650965
# ...
```
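One concrete check on the first value. With `foldid = repeat(1:10, 15)` on the class-ordered iris rows, every fold holds out 5 observations of each species, so each training set is balanced (45 per class). If the largest lambda shrinks all slopes to zero (as glmnet's lambda_max does), the intercept-only multinomial fit predicts p = 1/3 for every class, and the held-out deviance is exactly 2·log(3) in every fold, under either per-fold or pooled averaging. A sketch of that arithmetic:

```julia
# Balanced folds: each training set has 45 observations of each of 3 species,
# so the intercept-only (null) multinomial fit predicts p = 1/3 per class.
p_null = 45 / 135

# Multinomial deviance of the null model on any held-out observation:
dev = -2 * log(p_null)        # 2*log(3) = 2.1972245773362196

round(dev, digits = 7)        # 2.1972246
```

This matches R's first cvm value (2.1972246) to printed precision, but not GLMNet.jl's first meanloss value (2.1955639...), which suggests either that the first point on the Julia path is not at lambda_max, or that the held-out loss is computed under a different convention. I have not confirmed which.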