rstanarm
stan_surv doesn't correctly interpret NAs with type = "interval2" survival data
Summary:
stan_surv doesn't correctly interpret the convention of using NAs instead of infinite values with type = "interval2" survival data.
Description:
As per ?Surv, when type = "interval2" is specified, "Infinite values can be represented either by actual infinity (Inf) or NA. The second form has proven to be the more useful one."
However, stan_surv seems to interpret only the first form (infinite values) correctly.
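For reference, a minimal sketch of the convention quoted from ?Surv, using only the survival package. The three bounds are made-up illustrative values (not taken from the mice data), and the variable names are assumptions for this example. It checks that Surv() itself assigns the same censoring status whether the open-ended side of an interval is given as an infinite value or as NA.

```r
library(survival)

# Illustrative bounds: one left-censored, one right-censored,
# one interval-censored observation (values are invented).
lower_inf <- c(-Inf, 2, 5)
upper_inf <- c(3, Inf, 8)

# The same observations, with NA marking the open-ended side.
lower_na <- c(NA, 2, 5)
upper_na <- c(3, NA, 8)

s_inf <- Surv(lower_inf, upper_inf, type = "interval2")
s_na  <- Surv(lower_na,  upper_na,  type = "interval2")

# Status codes documented in ?Surv for interval data:
# 0 = right censored, 2 = left censored, 3 = interval censored.
status_inf <- unclass(s_inf)[, "status"]
status_na  <- unclass(s_na)[, "status"]

# TRUE when both encodings yield the same censoring pattern.
same_status <- isTRUE(all.equal(status_inf, status_na))
```

If same_status is TRUE here, the difference reported in this issue arises inside stan_surv rather than in Surv().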
Output from the reproducible example below. First form (infinite values):
stan_surv
baseline hazard: M-splines on hazard scale
formula: Surv(l, u, type = "interval2") ~ grp
observations: 144
events: 0 (0%)
left censored: 62 (43.1%)
right censored: 82 (56.9%)
delayed entry: no
Second form (NAs):
stan_surv
baseline hazard: M-splines on hazard scale
formula: Surv(l, u, type = "interval2") ~ grp
observations: 82
events: 0 (0%)
right censored: 82 (100%)
delayed entry: no
Reproducible Steps:
library(rstanarm)
library(tidyverse)

stan_surv(Surv(l, u, type = "interval2") ~ grp,
          data = mice, chains = 1, refresh = 0, iter = 600)

mice2 <-
  mice %>%
  mutate(l = replace(l, is.infinite(l), NA_real_),
         u = replace(u, is.infinite(l), NA_real_))

stan_surv(Surv(l, u, type = "interval2") ~ grp,
          data = mice2, chains = 1, refresh = 0, iter = 600)
RStanARM Version:
2.21.2
R Version:
3.6.3
Operating System:
OS X 10.14.6
Hi @anddis. Thanks for reporting this! I've not dug into it in any detail, but I'm guessing the NAs need to be handled (converted to Inf?) here:
https://github.com/stan-dev/rstanarm/blob/ff0a22b90cd70828616f73c39fa54ebb68522040/R/stan_surv.R#L2142
If it is corrected there, then I think it should flow through correctly. Surv() is only used for the initial formula parsing and is then converted to separate time and status variables; the actual Surv object itself isn't used anywhere in the later estimation code.
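Until that is handled inside stan_surv, a user-side workaround along the lines suggested above might look like the sketch below. The helper na_to_inf() is hypothetical (it is not part of rstanarm or survival), and the default column names l and u are taken from the reproducible example. It assumes that every NA in those two columns marks an open-ended interval, so it should not be used if NA can also mean a genuinely missing bound.

```r
# Hypothetical helper, not part of rstanarm: map the NA convention of
# type = "interval2" onto explicit infinities before fitting.
# Assumes every NA in the bound columns marks an open-ended interval.
na_to_inf <- function(data, lower = "l", upper = "u") {
  data[[lower]][is.na(data[[lower]])] <- -Inf  # left censored: (-Inf, u)
  data[[upper]][is.na(data[[upper]])] <- Inf   # right censored: (l, Inf)
  data
}

# Small invented data frame to show the effect.
toy <- data.frame(l = c(NA, 2, 5), u = c(3, NA, 8))
toy_inf <- na_to_inf(toy)
```

The converted data frame could then be passed to stan_surv in place of the NA-coded one, which corresponds to the first form in the report.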
Whether I use NA or Inf (-Inf), stan_surv stops with the following error:
Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity
It seems like stan_surv needs actual initial values for sampling.