`Lrnr_rpart$train()` works, but as part of stack fails

Open kmishra9 opened this issue 5 years ago • 5 comments

Hey there,

So as the title indicates, I had a stack with a bunch of learners running within the delayed framework:

```r
stack = make_learner(
    Stack,
    lrnr_glm,
    lrnr_randomForest,
    lrnr_xgboost,
    lrnr_xgboost_limited,
    lrnr_rpart,
    lrnr_svm,
    lrnr_solnp,
    lrnr_earth
)
```

[...]

```r
scheduled_super_learner = Scheduler$new(
    delayed_object = delayed_learner_train(learner = super_learner, task = train_task),
    job_type = FutureJob,
    nworkers = cpus_logical,
    verbose = TRUE
)

model_13 = scheduled_super_learner$compute()
```

and got this error:

```
Error in order(results$index) : argument 1 is not a vector
In addition: There were 11 warnings (use warnings() to see them)
Failed on predict
Error in self$compute_step() : 
  Error in order(results$index) : argument 1 is not a vector

updating chain from ready to running
run:1 ready:0 workers:12
updating chain from running to resolved
Failed on chain
Error in self$compute_step() : Error in self$compute_step() : 
  Error in order(results$index) : argument 1 is not a vector
```

Removing lrnr_rpart from the stack works, but using lrnr_rpart on the train_task directly also appears to work. 🤷‍♂

No worries if this is unhelpful, vague, or just irrelevant. I'm trying to provide feedback when I run into bugs, in case it helps the package mature! For now, I simply removed lrnr_rpart from the stack and continued on my way.

Big fan of the sl3 framework thus far!

kmishra9 • Aug 09 '19 17:08

Thanks. I'll try to reproduce!

jeremyrcoyle • Aug 09 '19 18:08

Hi @kmishra9, sorry for the long delay. I tried to reproduce the issue with rpart in stacks as follows:

```r
# try to reproduce https://github.com/tlverse/sl3/issues/230
library(sl3)
library(testthat)
library(rpart)

# define test dataset: predict mpg from the remaining mtcars columns
data(mtcars)
task <- sl3_Task$new(mtcars, covariates = c(
  "cyl", "disp", "hp", "drat", "wt", "qsec",
  "vs", "am", "gear", "carb"
), outcome = "mpg")

# stack rpart together with a simple mean learner
lrnr_rpart <- Lrnr_rpart$new()
lrnr_mean <- Lrnr_mean$new()
stack <- Stack$new(lrnr_rpart, lrnr_mean)

# train the stack, then predict on the training task
stack_fit <- stack$train(task)
preds <- stack_fit$predict()  # named `preds` to avoid shadowing stats::predict
```

However, I wasn't able to reproduce it. I understand that sharing one may not be possible due to private data or other concerns, but I think I'll need an MRE (minimal reproducible example) in order to identify the issue here. Until then, I'll close this issue.

jeremyrcoyle • Jan 16 '20 19:01

For categorical data, this is because Lrnr_rpart needs to call pack_predictions on its predictions. Will fix ASAP.
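
For readers following along, here is a minimal base R sketch of the idea, under stated assumptions. It does not use or reproduce sl3's internals: `pack_rows()` is a hypothetical helper invented for illustration, and the data are made up. It only shows why matrix-shaped categorical predictions are convenient to "pack" into one entry per observation, and how base R's `order()` produces the message from the traceback when the column it is given does not exist.

```r
# Illustration in base R only; none of this is sl3's actual code.

# A categorical learner predicts one probability per class for each row,
# so its predictions form a matrix rather than a plain vector.
class_probs <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.6, 0.3,
    0.2, 0.2, 0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("4", "6", "8"))
)

# Hypothetical helper: "pack" each row into one list element so the
# predictions fit in a single column with one entry per observation.
pack_rows <- function(pred_matrix) {
  lapply(seq_len(nrow(pred_matrix)), function(i) pred_matrix[i, ])
}
packed <- pack_rows(class_probs)

# With an index stored alongside the packed predictions, reordering works.
results <- list(index = c(3L, 1L, 2L), preds = packed)
reordered <- results$preds[order(results$index)]

# If `index` is missing, `$` returns NULL, and order(NULL) raises the
# same message that appears in the traceback.
broken <- list(preds = class_probs)
msg <- tryCatch(order(broken$index), error = function(e) conditionMessage(e))
print(msg)
```

Whether a missing `index` is the exact path taken inside sl3 is an assumption here; the sketch only shows that `order()` on a `NULL` argument fails with this message.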

jeremyrcoyle • Oct 14 '20 20:10

Seems like this is affecting Lrnr_ranger as well

jeremyrcoyle • Oct 14 '20 20:10

Has this been resolved @jeremyrcoyle?

nhejazi • Mar 02 '21 02:03