A possible bug in group coefficients for group Lasso
I believe I have found a possible bug in the group coefficient constraint for penalty="grLasso" in grpreg. The penalty vignette states "the coefficients within a group will either all equal zero or none will equal zero". This constraint appears to be violated in a data example I came across.
To reproduce this potential bug please execute:
#load data example from github
#prepare data in grpreg-accepted format
y <- ifelse(y == levels(y)[2], 1, 0)
X <- stats::model.matrix(y~., data = data.frame(y=y, X, check.names = TRUE))[, -1, drop=FALSE]
group <- rep(1:57, each = 3)
#run grpreg
library(grpreg)
fit <- grpreg(X, y, group = group, penalty = "grLasso", family = "binomial")
#examine coefficients
#      0.2199        0.2133       0.2069       0.2008       0.1948     0.189       0.1834       0.1779    0.1726
# X15c      0 -2.139903e-17 8.559611e-17 1.069951e-16 2.139903e-16 0.0000000 5.991728e-16 3.423845e-16 0.0000000
# X15g      0  3.998654e-02 7.887908e-02 1.167650e-01 1.537286e-01 0.1898477 2.251939e-01 2.598331e-01 0.2938264
# X15t      0  9.878343e-02 1.948222e-01 2.882890e-01 3.793487e-01 0.4681499 5.548270e-01 6.395019e-01 0.7222856
It is probably a numerical problem: in the first row of the output (X15c), some coefficients are very close to 0. For the lambdas 0.189 and 0.1726, however, the X15c coefficient is not merely close to 0 but exactly 0, while X15g and X15t are nonzero, which breaks the above-mentioned constraint.
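For anyone who wants to check this systematically, here is a minimal sketch of a helper that flags groups whose coefficients are partly zero and partly nonzero at a given lambda. The function name mixed_groups is hypothetical (it is not part of grpreg), and the toy matrix below only copies three columns from the output above, with the three X15 rows assumed to form one group labelled 15 for illustration.

```r
# Hypothetical helper (not part of grpreg): for each group and each lambda,
# report whether the group contains both exact zeros and nonzero coefficients.
# beta:  numeric matrix, one row per predictor (no intercept), one column per lambda
# group: vector of group labels, one per row of beta
mixed_groups <- function(beta, group) {
  stopifnot(is.matrix(beta), nrow(beta) == length(group))
  n_zero <- rowsum((beta == 0) * 1, group)            # exact zeros per group and lambda
  size <- as.vector(rowsum(rep(1, length(group)), group))  # group sizes, same row order
  n_zero > 0 & n_zero < size                          # TRUE where the group is mixed
}

# Toy example: three columns taken from the output above
# (lambda = 0.2199, 0.2133 and 0.189).
beta <- rbind(
  X15c = c(0, -2.139903e-17, 0),
  X15g = c(0,  3.998654e-02, 0.1898477),
  X15t = c(0,  9.878343e-02, 0.4681499)
)
colnames(beta) <- c("0.2199", "0.2133", "0.189")

# The group label 15 is an assumption made for this illustration.
flags <- mixed_groups(beta, group = c(15, 15, 15))
print(flags)  # only the 0.189 column is flagged
```

On a fitted object, something like mixed_groups(fit$beta[-1, , drop = FALSE], group) might work, assuming fit$beta stores the coefficient matrix with the intercept in its first row. I have not verified that against this data set.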
Thank you in advance for having a look into it,
Szymon

PS. The data I used is a subset of the Promoter dataset, isolated for the purpose of reproducing this behavior.