flexsurv-dev allow conditional sampling in 'sim.fmsm()' ?

Hi,

this is related to #150 - I am trying to sample forward from the predictive distribution of an individual given current state and sojourn time in that state for a multistate model.

sim.fmsm() only seems to support sampling conditional on last state but cannot accept the respective sojourn time ('start' paramter in simulate.flexsurvreg()).

Would this be feasible to add to sim.fmsm()? I suspect the critical lines are around

https://github.com/chjackson/flexsurv-dev/blob/96dfe2ea0a360d4a79996e3b390d7cd27e0ab265/R/mstate.R#L862

Jan 20 '23 12:01 kkmann

@danielinteractive, you might find this interesting too.

Jan 20 '23 12:01 kkmann

Here is a suggestion (admittedly super inefficient) of how this could look like. One of the many problems is that it is super slow to not preallocate the return vecors. Maybe chunked pre-allocation could work. Ideally, the functionality could be integrated in the sim.fmsm() function though.

library(flexsurv)

mod_nobos_bos <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==1),
                             data = bosms3, dist = "weibull")
mod_nobos_death <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==2),
                               data = bosms3, dist = "weibull")
mod_bos_death <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==3),
                             data = bosms3, dist = "weibull")

tmat <- rbind(c(NA, 1, 2), c(NA, NA, 3), c(NA, NA, NA))

mdl <- fmsm(mod_nobos_bos, mod_nobos_death, mod_bos_death, trans = tmat)

sim <- function(mdl, newdata = data.frame(..null = NA), start = 0, nsim = 1L) {
  as.numeric(simulate(mdl, newdata = newdata, nsim = nsim)[1, 1:nsim])
}

simulate_fmsm <- function(mdl, state, newdata = data.frame(..null = NA),
                          start = 0, tmax = Inf) {
  tmat <- attr(mdl, "trans")
  terminal_states <- as.integer(which(rowSums(!is.na(tmat)) == 0))
  from <- integer()
  to <- integer()
  t <- numeric()
  tt <- 0
  while (!(state %in% terminal_states) & (tt < tmax)) {
    # sample next state
    from <- c(from, state)
    next_states <- as.integer(which(!is.na(tmat[state, ])))
    next_trans_ids <- tmat[state, next_states]
    # sample from transitions
    times <- numeric(length(next_states))
    for (i in seq_along(next_trans_ids)) {
      trans <- next_trans_ids[i]
      times[i] <- sim(mdl[[i]], newdata = newdata, start = start, nsim = 1L)
    }
    state <- next_states[which.min(times)]
    to <- c(to, state)
    tt <- tt + min(times)
    t <- c(t, tt)
    start <- sqrt(.Machine$double.eps) # future states are censored at 0
  }
  # combine vectors into a data frame and return
  res <- cbind(from, to, t)
  return(res)
}

simulate_fmsm(mdl, state = 1)

Jan 20 '23 13:01 kkmann

sim.fmsm simulates the entire pathway through a multistate model though. Do I understand correctly that you want to simulate data where each individual spends a time period of at least "start" in every one of their states, so that their sojourn times follow standard parametric distributions that are left-truncated by this time? That seems quite specialised. What is the application?

Jan 20 '23 14:01 chjackson

No, only the current state ('start' in sim.fmsm) sojourn would be needed. Imagine simulating forward for each individual at an interim analysis in a CT setting. For simulate.flexsurvreg() this is possible via the start parameter (see #150) but sim.fmsm() seems to use a slightly different mechanism for sampling (or I missed the point) and I couldn't find a way to specify the time already spent in the starting state. Also, the 'start' paramter has a different meaning in sim.fmsm(). I think my example code achieves what I want (going directly via simulate.flexsurvreg()) but to me the sim.fmsm() function would be the natural place for this functionality.

Edit: also interesting link to #38.

Jan 20 '23 15:01 kkmann

flexsurv-dev flexsurv-dev copied to clipboard

allow conditional sampling in 'sim.fmsm()' ?

flexsurv-dev
flexsurv-dev copied to clipboard