rpart
rpart copied to clipboard
Survival trees outcome
Hi, I am fitting survival trees with rpart but I have some doubts concerning the resulting outcome. I understood that the node outcome (yval) should be a measure of relative risk.
However I don't understand how this value is computed and its interpretation. Moreover, using survival times following an Exponential distribution (simulated data), I don't understand how to get the correspondent parameter lambda from the node outcome.
Could you please help me? Thank you.
Here below an example of R code:
library(simsurv)
library(survival)
library(rpart)
library(rpart.plot)
maxtime <- 3 #length of the follow-up
#********Parameter for Exponential times********
lamb1 <- 1.5 #lambda for Exponential time node 1 if X1=1
lamb2 <- 0.8 #lambda for Exponential time node 2 if X1=0
N <- 2000
seed <- 1511
set.seed(seed)
##########################
# Data generating process
X1 <- factor(rbinom(N,1,0.5) )
x1_1 <- X1==1 #node 1 lamb1
n_x1_1 <- sum(x1_1)
n_x1_0 <- N - n_x1_1
simdata <- as.data.frame(matrix(NA, nrow=N, ncol=2))
simdata[x1_1,] <- simsurv(dist="exponential",lambda=lamb1,x=as.data.frame(1:n_x1_1),
maxt=maxtime,seed=seed)[,2:3]
simdata[!x1_1,] <- simsurv(dist="exponential",lambda=lamb2,x=as.data.frame(1:n_x1_0),
maxt=maxtime,seed=seed)[,2:3]
simdata <- cbind(simdata, X1)
colnames(simdata) <- c("eventtime","status","X1")
##########################
# Tree estimation
formula <- Surv(simdata$eventtime, simdata$status)~X1
survtree_rpart <- rpart(formula,data=simdata,
control=rpart.control(usesurrogate = 2, minsplit = 20, xval=10))
bestcp <- survtree_rpart$cptable[which.min(survtree_rpart$cptable[,"xerror"]),"CP"]
pruned_survtree_rpart <- prune(survtree_rpart,cp=bestcp)
rpart.plot(pruned_survtree_rpart, type=5, tweak=1.5, gap=0.02, branch.lty=2)