give glmnet the $importance slot

Open mb706 opened this issue 5 years ago • 2 comments

This would let glmnet be used in combination with FilterEmbedded in mlr3featsel, so that features are selected in order of L1 inclusion. The importance could be the (approximate) lambda value at which a feature is first included, which is easy to calculate from the fitted model.

mb706 Jul 14 '19 17:07

Some code I used for something similar (using "old" mlr). This only gets the order in which the features are introduced; I think the approximate lambda value would be more informative.

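library(glmnet)  # glmnet()
library(mlr)     # getTaskData(), getTaskType(), getTaskNFeats(), getTaskFeatureNames()

# Orders features by when they enter the model as shrinkage decreases in L1 regression.
# Features that enter earlier get a higher (less negative) score.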
slfun <- function(task, nselect, alpha = 1, ...) {
  xy <- getTaskData(task, target.extra = TRUE)
  if (getTaskType(task) == "regr") {
    family <- "gaussian"
  } else {
    family <- if (length(levels(xy$target)) > 2) "multinomial" else "binomial"
  }
  fit <- glmnet(x = as.matrix(xy$data), y = xy$target, alpha = alpha,
    lambda.min.ratio = 1e-4, family = family)
  captured <- integer(0)
  for (col in seq_len(ncol(fit$beta))) {
    curcols <- which(fit$beta[, col] != 0)
    newcols <- setdiff(curcols, captured)
    captured <- c(captured, newcols)
  }
  captured <- c(captured, setdiff(seq_len(getTaskNFeats(task)), captured))
  res <- -order(captured)
  names(res) <- getTaskFeatureNames(task)
  res
}
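
The function above only returns the order of introduction. A minimal sketch of the lambda-based variant could look like the following. `first_inclusion_lambda` is a hypothetical helper, not part of glmnet or mlr3, and it assumes the coefficient path is available as a plain matrix with one row per feature and one column per lambda value. Features that never enter the model get an importance of 0, which is an arbitrary choice made here.

```r
# Hypothetical helper (not part of glmnet or mlr3): for each feature, return the
# largest lambda on the path at which its coefficient is non-zero.
# beta:   numeric matrix, one row per feature, one column per lambda value
# lambda: numeric vector of penalties, same length as ncol(beta)
first_inclusion_lambda <- function(beta, lambda) {
  stopifnot(ncol(beta) == length(lambda))
  nonzero <- beta != 0
  imp <- apply(nonzero, 1, function(active) {
    # Assumption: a feature that is never selected gets importance 0.
    if (any(active)) max(lambda[active]) else 0
  })
  sort(imp, decreasing = TRUE)
}

# Toy coefficient path with three lambda values, decreasing as on a glmnet path.
beta <- rbind(
  x1 = c(0, 0.5, 0.8),
  x2 = c(0, 0, 0.3),
  x3 = c(0, 0, 0)
)
lambda <- c(1, 0.5, 0.1)
imp <- first_inclusion_lambda(beta, lambda)
print(imp)
```

With a fitted single-response glmnet model this would presumably be called as `first_inclusion_lambda(as.matrix(fit$beta), fit$lambda)`. Taking the maximum rather than the first non-zero column keeps the result well defined if a feature leaves the model and re-enters later on the path.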

mb706 Jul 14 '19 18:07

@mb706 Like this https://github.com/mlr-org/mlr3learners/commit/7594ad7b3f5f281d4d6cef1cfd8d0119e9c52f7a ? Do you know how we can handle multiclass tasks? We get a different beta matrix for each target class, and the positions at which the features are introduced vary between them.
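
One possible way to combine the per-class matrices, offered only as a sketch and not as what the linked commit does: treat a feature as included as soon as it is non-zero for any class, then take the largest lambda at which that happens. `multiclass_inclusion_lambda` is a hypothetical helper, and it assumes all class matrices share the same dimensions and row order.

```r
# Hypothetical helper: combine per-class coefficient paths into one score per feature.
# beta_list: list of numeric matrices (features x lambda steps), one per class,
#            all with the same dimensions and row order
# lambda:    numeric vector of penalties, one per column
multiclass_inclusion_lambda <- function(beta_list, lambda) {
  # A feature counts as active at a step if any class uses it there.
  active <- Reduce(`|`, lapply(beta_list, function(b) b != 0))
  imp <- apply(active, 1, function(a) {
    # Assumption: a feature that is never selected gets importance 0.
    if (any(a)) max(lambda[a]) else 0
  })
  sort(imp, decreasing = TRUE)
}

# Toy example with two classes and two features.
beta_list <- list(
  a = rbind(x1 = c(0, 0.4, 0.6), x2 = c(0, 0, 0)),
  b = rbind(x1 = c(0, 0, -0.2), x2 = c(0, 0, 0.1))
)
lambda <- c(1, 0.5, 0.1)
mc_imp <- multiclass_inclusion_lambda(beta_list, lambda)
print(mc_imp)
```

An alternative that avoids the aggregation question is to fit with `type.multinomial = "grouped"` in glmnet, which applies a grouped penalty so that a feature enters or leaves for all classes together.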

be-marc Dec 10 '19 21:12