tmle3
"NA not permitted in response" when training models
My outcome is right-censored. The tmle3 package had been working correctly, but it recently stopped being able to train the models.
Every learner I try now fails with a message that NAs are not accepted. Previously the models trained correctly, with no errors or warnings. For example, when I try to train a random forest learner I get the following error:
Error in private$.train(subsetted_task, trained_sublearners) :
All learners in stack have failed
In addition: Warning message:
In private$.train(subsetted_task, trained_sublearners) :
Lrnr_randomForest_500_TRUE_5_100_25 failed with message: Error in (function (x, y = NULL, xtest = NULL, ytest = NULL, ntree = 500, : NA not permitted in response.
. It will be removed from the stack
Failed on Stack
Error in self$compute_step() :
Error in private$.train(subsetted_task, trained_sublearners) :
All learners in stack have failed
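Since the message indicates that the learner received an outcome vector containing NA, one quick sanity check is to count the missing values per column and cross-tabulate missing Y against the censoring indicator. The sketch below uses a small made-up data frame (`toy`, a hypothetical stand-in for `O_data`) and assumes that Delta = 1 means the outcome was observed; that coding is an assumption, not something confirmed in this thread.

```r
# Hypothetical stand-in for O_data; all values are made up for illustration.
toy <- data.frame(
  clinical_age = c(61, 54, 70, 48, 66, 59),
  A            = c(1, 0, 1, 0, 1, 0),
  Delta        = c(1, 1, 0, 1, 0, 1),  # assumed coding: 1 = outcome observed
  Y            = c(2.3, 1.8, NA, 2.9, NA, 1.1)
)

# Number of missing values in each column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Cross-tabulate missing outcomes against the censoring indicator.
na_by_delta <- table(Y_missing = is.na(toy$Y), Delta = toy$Delta)
print(na_by_delta)

# TRUE when Y is missing exactly for the rows with Delta == 0.
consistent <- all(is.na(toy$Y) == (toy$Delta == 0))
print(consistent)
```

Running the same three checks on the real `O_data` would show whether NAs appear only in Y and whether they line up with Delta.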
Here is the code:
node_list <- list(
  W = colnames(O_data)[!colnames(O_data) %in% c("Y", "A", "Delta")],
  A = "A",
  Y = "Y"
)
#' #' 1.1. Missingness
#' processed <- process_missing(O_data, node_list)
#' O_data <- processed$data
#' node_list <- processed$node_list
#' 2. Create a "Spec" Object
ate_spec <- tmle_ATE(
  treatment_level = 0,
  control_level = 1
)
lrnr_rf <- make_learner(Lrnr_randomForest, mtry = 100, max_nodes = 25, ntree = 500)
sl_Y <- Lrnr_sl$new(
  learners = lrnr_rf
)
sl_Delta <- Lrnr_sl$new(
  learners = lrnr_rf
)
sl_A <- Lrnr_sl$new(
  learners = lrnr_rf
)
learner_list <- list(A = sl_A, delta_Y = sl_Delta, Y = sl_Y)
#' 4. Initial Likelihood
tmle_task <- ate_spec$make_tmle_task(data = O_data, node_list = node_list)
initial_likelihood <- ate_spec$make_initial_likelihood(tmle_task, learner_list) # the error occurs here
print(initial_likelihood)
I have the latest version of the tmle3 repository installed.
I would be grateful for any help.
Your code appears correct, and there's no obvious reason it shouldn't work. Can you provide some sample data so that I can try to replicate the issue?
Thanks for your answer.
The sample data has the following structure:
- Y: outcome with NAs
- Delta: censoring
- A: treatment
- W: covariates, in columns with the prefixes "clinical_" and "mrna_"
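For replication purposes, a data set with this structure could be simulated roughly as follows. This is only a sketch: the covariate names (`clinical_age`, `clinical_stage`, `mrna_gene1`, `mrna_gene2`), the sample size, the data-generating model, and the coding Delta = 1 for an observed outcome are all assumptions made for illustration, and the real data has many more covariates.

```r
set.seed(1)
n <- 200

# Hypothetical covariates; the real data has many more columns per prefix.
W <- data.frame(
  clinical_age   = rnorm(n, mean = 60, sd = 10),
  clinical_stage = rbinom(n, size = 1, prob = 0.4),
  mrna_gene1     = rnorm(n),
  mrna_gene2     = rnorm(n)
)

# Binary treatment whose probability depends on one covariate.
A <- rbinom(n, size = 1, prob = plogis(0.3 * W$mrna_gene1))

# Censoring indicator (assumed coding: 1 = outcome observed, 0 = censored).
Delta <- rbinom(n, size = 1, prob = 0.8)

# Outcome, set to NA wherever the observation is censored.
Y <- 1 + 0.5 * A + 0.2 * W$mrna_gene2 + rnorm(n)
Y[Delta == 0] <- NA

O_data_sim <- data.frame(W, A = A, Delta = Delta, Y = Y)

# Same node list construction as in the original post.
node_list_sim <- list(
  W = colnames(O_data_sim)[!colnames(O_data_sim) %in% c("Y", "A", "Delta")],
  A = "A",
  Y = "Y"
)
str(O_data_sim)
```

Passing `O_data_sim` and `node_list_sim` through the code from the original post should show whether the error depends on the real data or only on the presence of NAs in Y.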
Regards.