mlr3filters

add Gaussian Covariance filter

Open MislavSag opened this issue 2 years ago • 6 comments

This adds a Gaussian covariance 'filter' from the gausscov package: https://cran.r-project.org/web/packages/gausscov/gausscov.pdf.

I am getting an error when running the tests and examples, but I can't figure out why.

Error in .__Param__assert(self = self, private = private, super = super, :
Assertion on 'x' failed: Element 1 is not >= 1.

It seems like my filter is not in the mlr_filters list.

The filter works as expected when I instantiate the class directly and run calculate and score.
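For context, filters shipped with mlr3filters are registered in the `mlr_filters` dictionary, which is where `flt()` looks them up. The following is a minimal sketch of registering a custom filter, not the code from this PR: `FilterToyVar` and the key `"toy_var"` are hypothetical names used only for illustration, and the variance score is a stand-in for a real filter method.

```r
library(R6)
library(mlr3)
library(mlr3filters)

# Hypothetical toy filter (illustration only): scores each feature by its variance.
FilterToyVar = R6Class("FilterToyVar",
  inherit = mlr3filters::Filter,
  public = list(
    initialize = function() {
      super$initialize(
        id = "toy_var",
        task_types = c("classif", "regr"),
        feature_types = c("integer", "numeric"),
        label = "Toy variance filter"
      )
    }
  ),
  private = list(
    .calculate = function(task, nfeat) {
      # named numeric vector: one score per feature
      vapply(task$data(cols = task$feature_names), stats::var, numeric(1))
    }
  )
)

# Register the class under a key so that flt("toy_var") can construct it.
mlr_filters$add("toy_var", FilterToyVar)

f = flt("toy_var")
f$calculate(tsk("mtcars"))
head(f$scores)
```

If the `add()` call is missing (or is not run when the package is loaded), the class can still be instantiated with `$new()`, but lookups by key will not find it.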

I am not sure whether classification tasks are supported, so I added regr only. I can try to contact the author.

Missing values are not allowed.

MislavSag avatar Jan 31 '23 12:01 MislavSag

I have just checked the examples in the gausscov package, and there is one with a binary covariate. So it works for classification too, but the target variable has to be a matrix, not a factor. I can add a classif example after you review the initial PR.
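A minimal base-R sketch of that conversion, assuming (as described above) that gausscov wants the target as a numeric matrix. The factor `y_factor` is made-up example data.

```r
# Hypothetical binary target stored as a factor (levels sort to "neg", "pos")
y_factor = factor(c("neg", "pos", "pos", "neg"))

# as.integer() returns the level codes (1 for "neg", 2 for "pos");
# as.matrix() turns the vector into a one-column matrix
y = as.matrix(as.integer(y_factor))

dim(y)
```

Note that the codes start at 1, not 0. If a 0/1 coding is wanted, subtracting `1L` before calling `as.matrix()` would give that.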

MislavSag avatar Jan 31 '23 13:01 MislavSag

Sorry for not responding here (I did not see it).

Are you still interested in contributing this filter?

sebffischer avatar Oct 13 '23 12:10 sebffischer

Yes. I will send the latest version of the pipe. I think I have changed something since the PR.

Is there anything I should add to the current commit?

MislavSag avatar Oct 13 '23 16:10 MislavSag

When I run the test from the pull request, I get a lot of NA values. Can you explain why this happens?

sebffischer avatar Oct 20 '23 09:10 sebffischer

I can't make a new PR for some reason, but can you try this code:

FilterGausscovF1st = R6::R6Class(
  "FilterGausscovF1st",
  inherit = mlr3filters::Filter,

  public = list(

    #' @description Create a GaussCov object.
    initialize = function() {
      # paradox:: prefixes make the snippet runnable without attaching paradox
      param_set = paradox::ps(
        p0   = paradox::p_dbl(lower = 0, upper = 1, default = 0.01),
        kmn  = paradox::p_int(lower = 0, default = 0),
        kmx  = paradox::p_int(lower = 0, default = 0),
        mx   = paradox::p_int(lower = 1, default = 21),
        kex  = paradox::p_int(lower = 0, default = 0),
        sub  = paradox::p_lgl(default = TRUE),
        inr  = paradox::p_lgl(default = TRUE),
        xinr = paradox::p_lgl(default = FALSE),
        qq   = paradox::p_int(lower = 0, default = 0)
      )

      super$initialize(
        id = "gausscov_f1st",
        task_types = c("classif", "regr"),
        param_set = param_set,
        feature_types = c("integer", "numeric"),
        packages = "gausscov",
        label = "Gauss Covariance f1st",
        man = "mlr3filters::mlr_filters_gausscov_f1st"
      )
    }
  ),

  private = list(
    .calculate = function(task, nfeat) {
      # debug
      # pv = list(
      #   p0   = 0.01,
      #   kmn  = 0,
      #   kmx  = 0,
      #   mx   = 21,
      #   kex  = 0,
      #   sub  = TRUE,
      #   inr  = TRUE,
      #   xinr = FALSE,
      #   qq   = 0
      # )

      # placeholder scores (-1) named by feature; features not selected below keep -1
      scores = rep(-1, length(task$feature_names))
      scores = mlr3misc::set_names(scores, task$feature_names)

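      # calculate gausscov pvalues
      pv = self$param_set$values
      x = as.matrix(task$data(cols = task$feature_names))
      # the target is passed as a one-column matrix; factor targets become integer level codes
      if (task$task_type == "classif") {
        y = as.matrix(as.integer(task$truth()))
      } else {
        y = as.matrix(task$truth())
      }
      res = mlr3misc::invoke(gausscov::f1st, y = y, x = x, .args = pv)
      res_1 = res[[1]]
      # drop rows whose first column is 0, then score each remaining feature
      # (indexed by column 1) with the absolute value of column 4
      res_1 = res_1[res_1[, 1] != 0, , drop = FALSE]
      scores[res_1[, 1]] = abs(res_1[, 4])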

      # save scores to ./gausscov_f1 as an .rds file (side effect on disk)
      dir_name = "./gausscov_f1"
      if (!dir.exists(dir_name)) {
        dir.create(dir_name)
      }
      # random 15-digit suffix so repeated runs do not overwrite each other
      random_id = paste0(sample(0:9, 15, replace = TRUE), collapse = "")
      file_name = paste0("gausscov_f1-", task$id, "-", random_id, ".rds")
      file_name = file.path(dir_name, file_name)
      saveRDS(scores, file_name)

      sort(scores, decreasing = TRUE)
    }
  )
)
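The score-filling step in `.calculate` can be tried in isolation with base R. This is only a sketch: the matrix `res_1` below is a hypothetical stand-in for `res[[1]]`, with made-up numbers, and the meaning of its columns (index in column 1, score source in column 4) is taken from how the code above uses them, not from the gausscov documentation.

```r
feature_names = c("a", "b", "c", "d")

# every feature starts with the placeholder score -1
scores = rep(-1, length(feature_names))
names(scores) = feature_names

# Hypothetical result matrix: column 1 = feature index (0 in the first row),
# column 4 = value whose absolute value becomes the score
res_1 = matrix(c(
  0,  1.5, 0.001, 0.002,
  3,  0.8, 0.010, 0.020,
  1, -0.4, 0.030, 0.040
), ncol = 4, byrow = TRUE)

# drop the row with index 0, then fill in scores for the selected features
res_1 = res_1[res_1[, 1] != 0, , drop = FALSE]
scores[res_1[, 1]] = abs(res_1[, 4])

ranked = sort(scores, decreasing = TRUE)
ranked
```

In this sketch, features that never appear in column 1 (`b` and `d`) keep the placeholder -1 instead of a missing value.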

MislavSag avatar Jan 19 '24 12:01 MislavSag

You can't make a new PR from your main branch because you already have a PR open. You could e.g. make a new branch in your fork and then create a new PR.

sebffischer avatar Jan 19 '24 14:01 sebffischer