
stan_surv doesn't interpret correctly NAs with type="interval2" survival data

Open anddis opened this issue 4 years ago • 2 comments

Summary:

stan_surv doesn't correctly interpret the convention of using NAs in place of infinite values with type = "interval2" survival data.

Description:

As per ?Surv, when type = "interval2" is specified, "Infinite values can be represented either by actual infinity (Inf) or NA. The second form has proven to be the more useful one."
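As a quick check of the equivalence described in ?Surv, the sketch below builds the same three observations under both conventions and compares the censoring codes stored in the resulting Surv objects. The toy bounds are made up for illustration and are not taken from the mice data; only the survival package is assumed.

```r
library(survival)

# Toy bounds (illustrative only): row 1 is left censored,
# row 2 is right censored, row 3 is interval censored.
lower_inf <- c(-Inf, 2, 3)
upper_inf <- c(1, Inf, 5)

# The same observations with NA standing in for the infinite bounds.
lower_na <- c(NA, 2, 3)
upper_na <- c(1, NA, 5)

s_inf <- Surv(lower_inf, upper_inf, type = "interval2")
s_na  <- Surv(lower_na, upper_na, type = "interval2")

# Status codes documented in ?Surv for interval-censored data:
# 0 = right censored, 1 = event, 2 = left censored, 3 = interval censored.
# unclass() exposes the underlying matrix so the column can be extracted directly.
status_inf <- unclass(s_inf)[, "status"]
status_na  <- unclass(s_na)[, "status"]

# Both conventions should yield the same censoring codes.
print(all(status_inf == status_na))
```

If Surv() treats the two forms identically, any difference in the stan_surv summaries would have to arise after the Surv() call, in how the bounds are subsequently processed.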

However, stan_surv seems to interpret only the first form (infinite values) correctly.

Output from the reproducible example below, first form (infinite values):

stan_surv
 baseline hazard: M-splines on hazard scale
 formula:         Surv(l, u, type = "interval2") ~ grp
 observations:    144
 events:          0 (0%)
 left censored:   62 (43.1%)
 right censored:  82 (56.9%)
 delayed entry:   no

Second form (NAs):

stan_surv
 baseline hazard: M-splines on hazard scale
 formula:         Surv(l, u, type = "interval2") ~ grp
 observations:    82
 events:          0 (0%)
 right censored:  82 (100%)
 delayed entry:   no

Reproducible Steps:

library(rstanarm)
library(tidyverse)

stan_surv(Surv(l, u, type = "interval2") ~ grp, 
          data = mice, chains = 1, refresh = 0, iter = 600)

mice2 <- 
  mice %>% 
  mutate(l = replace(l, is.infinite(l), NA_real_),
         u = replace(u, is.infinite(l), NA_real_))

stan_surv(Surv(l, u, type = "interval2") ~ grp,
          data = mice2, chains = 1, refresh = 0, iter = 600)

RStanARM Version:

2.21.2

R Version:

3.6.3

Operating System:

OS X 10.14.6

anddis, Nov 21 '20 12:11

Hi @anddis. Thanks for reporting this! I haven't dug into it in any detail, but I'm guessing the NAs need to be handled (converted to Inf?) here:

https://github.com/stan-dev/rstanarm/blob/ff0a22b90cd70828616f73c39fa54ebb68522040/R/stan_surv.R#L2142

If it is corrected there, then I think it should flow through correctly. Surv() is only used for the initial formula parsing, and its output is then converted to separate time and status variables; the actual Surv object itself isn't used anywhere in the later estimation code.
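Until the NAs are handled inside stan_surv, one possible user-side workaround is to recode missing bounds as infinite before fitting, since the infinite form is the one reported to be classified as expected in the original post. The sketch below is only an illustration: `na_to_inf_bounds` is a hypothetical helper, not part of rstanarm, and the column names `l` and `u` are assumed to match the reproducible example.

```r
# Hypothetical helper (not part of rstanarm): recode missing interval bounds
# as infinite values, i.e. the first of the two conventions in ?Surv.
na_to_inf_bounds <- function(data, lower = "l", upper = "u") {
  lo <- data[[lower]]
  up <- data[[upper]]

  # A row with both bounds missing carries no information about the event time.
  if (any(is.na(lo) & is.na(up))) {
    stop("Some rows have both bounds missing; remove them before recoding.")
  }

  lo[is.na(lo)] <- -Inf  # missing lower bound: left censored
  up[is.na(up)] <- Inf   # missing upper bound: right censored

  data[[lower]] <- lo
  data[[upper]] <- up
  data
}

# Toy data for illustration only (not the mice data from the issue).
toy <- data.frame(l = c(NA, 2, 3), u = c(1, NA, 5), grp = c("a", "b", "a"))
toy_inf <- na_to_inf_bounds(toy)
print(toy_inf)
```

The recoded data frame could then be passed to stan_surv in place of the NA-coded one, for example `stan_surv(Surv(l, u, type = "interval2") ~ grp, data = na_to_inf_bounds(mice2), ...)`. Whether this avoids every problem reported in this thread has not been verified.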

sambrilleman, Nov 23 '20 22:11

Whether I use NA or Inf (-Inf), stan_surv fails with the following error:

Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity

It seems like stan_surv needs actual initial values for sampling.

imba-ang, Jul 14 '21 13:07