mlr3learners
give glmnet the $importance slot
because then it could be used in combination with FilterEmbedded in mlr3featsel for feature selection in order of L1 inclusion. Importance could be the (approximate) lambda value at which a feature is first included, which can easily be calculated from the model.
Some code I used for something similar (using the "old" mlr). This only gets the order in which the features are introduced; I think the approximate lambda value would be more informative.
```r
# Orders features by the order in which they are introduced when decreasing
# the shrinkage penalty in L1-regularized (lasso) regression.
library(mlr)     # "old" mlr: getTaskData(), getTaskType(), getTaskNFeats(), ...
library(glmnet)

slfun <- function(task, nselect, alpha = 1, ...) {
  xy <- getTaskData(task, target.extra = TRUE)
  if (getTaskType(task) == "regr") {
    family <- "gaussian"
  } else {
    family <- if (length(levels(xy$target)) > 2) "multinomial" else "binomial"
  }
  fit <- glmnet(x = as.matrix(xy$data), y = xy$target, alpha = alpha,
    lambda.min.ratio = 1e-4, family = family)
  # Walk along the lambda path (columns of the coefficient matrix) and record
  # each feature index the first time its coefficient becomes nonzero.
  # Note: for family = "multinomial", fit$beta is a list of per-class
  # matrices, so this loop only works in the single-matrix case.
  captured <- integer(0)
  for (col in seq_len(ncol(fit$beta))) {
    curcols <- which(fit$beta[, col] != 0)
    newcols <- setdiff(curcols, captured)
    captured <- c(captured, newcols)
  }
  # Features that never enter the model are appended last.
  captured <- c(captured, setdiff(seq_len(getTaskNFeats(task)), captured))
  # Invert the permutation so earlier-introduced features get higher scores.
  res <- -order(captured)
  names(res) <- getTaskFeatureNames(task)
  res
}
```
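The approximate first-inclusion lambda mentioned above could be extracted along these lines. This is only a sketch: `first_inclusion_lambda` is a hypothetical helper name, and it operates on a plain coefficient matrix shaped like `fit$beta` (features in rows, lambda steps in columns) together with the corresponding `fit$lambda` sequence, so it is shown here on a toy matrix rather than an actual glmnet fit.

```r
# Hypothetical helper: for each feature (row of a fit$beta-like matrix),
# return the largest lambda at which its coefficient is nonzero, i.e. the
# (approximate) lambda of first inclusion along the decreasing lambda path.
# Features that never enter the model get 0.
first_inclusion_lambda <- function(beta, lambda) {
  apply(beta != 0, 1, function(active) {
    if (any(active)) max(lambda[active]) else 0
  })
}

# Toy example: 3 features over a decreasing lambda sequence of length 4.
beta <- rbind(
  f1 = c(0.0, 0.2, 0.3, 0.4),  # enters at step 2
  f2 = c(0.1, 0.2, 0.2, 0.3),  # enters at step 1
  f3 = c(0.0, 0.0, 0.0, 0.1)   # enters at step 4
)
lambda <- c(1.0, 0.5, 0.25, 0.125)
first_inclusion_lambda(beta, lambda)
# f1: 0.5, f2: 1.0, f3: 0.125
```

Using the lambda value directly (instead of the negated introduction rank) would make the importance scores comparable in scale across refits on the same data.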
@mb706 Like this https://github.com/mlr-org/mlr3learners/commit/7594ad7b3f5f281d4d6cef1cfd8d0119e9c52f7a ?
Do you know how we can handle multiclass tasks? For multinomial models we get a different beta matrix for each target class, and the positions at which the features are introduced vary between them.
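One possible aggregation, sketched below under the assumption that the per-class coefficient matrices share the same lambda grid: treat a feature as introduced at the first lambda step where it is nonzero in *any* class, i.e. take the elementwise OR of the class-wise activity patterns before extracting the order. The function name `introduction_order_multiclass` is made up for illustration, and whether any-class entry is the right semantics is exactly the open question here.

```r
# Sketch: combine a list of per-class coefficient matrices (features x lambda
# steps, as in the multinomial fit$beta) into one introduction order.
introduction_order_multiclass <- function(beta_list) {
  # A feature counts as active at a step if it is nonzero in any class.
  active <- Reduce(`|`, lapply(beta_list, function(b) as.matrix(b) != 0))
  captured <- integer(0)
  for (col in seq_len(ncol(active))) {
    newcols <- setdiff(which(active[, col]), captured)
    captured <- c(captured, newcols)
  }
  captured
}

# Toy example with two classes and three features: feature 2 enters first
# (class 1, step 1), feature 3 enters via class 2 at step 1, feature 1 last.
b1 <- rbind(c(0, 0.5, 0.5), c(0.3, 0.3, 0.3), c(0, 0, 0))
b2 <- rbind(c(0, 0, 0), c(0, 0, 0), c(0.2, 0.2, 0.2))
introduction_order_multiclass(list(b1, b2))
```

Alternatives would be averaging the per-class introduction positions or taking the maximum first-inclusion lambda over classes.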