Survival models
I'm opening this issue so there is a forum in which to discuss progress on getting survival models into rstanarm.
I've started a new survival branch here. It currently has a `stan_surv` modelling function, calling a `surv.stan` model file. The modelling function returns an object of class `stansurv` that inherits from `stanreg`.
The current task list looks something like:

- [x] `print` method for stansurv
- [x] `summary` method for stansurv
- [x] `prior_summary` method for stansurv
- [ ] `basehaz_summary` method for stansurv
- [x] `posterior_survfit` method for stansurv
- [x] `log_lik`, `loo`, and `waic` methods for stansurv
- [x] `plot` method that includes plot type for baseline hazard
- [ ] add option for rcs splines for the log hazard (i.e. instead of B-splines)
- [ ] get the piecewise constant baseline hazard from `stan_jm` working with `stan_surv`
- [ ] attempt to implement a constrained log-likelihood by just shifting the hazard (setting it equal to 1E-8 or similar) when the hazard strays negative, see section 2.2 of this paper
- [x] add functionality for time-dependent effects (i.e. non-proportional hazards) -- should this use the survival package's `tt` approach, or something else (e.g. a `tde()` special function for use in the formula)?
- [x] add AFT models
- [ ] add semi-parametric model (possibly like this paper)
- [ ] add frailty models
Thanks @sambrilleman for setting this up. At this stage I am trying to make sense of your code / approach. So some questions follow every now and then. Once this is clarified I will move on and try to contribute :)
You mention that for the `ms` approach

> A closed form solution is available for both the hazard and cumulative hazard so this approach should be relatively fast.

as well as

> The default locations for the internal knots, as well as the basis terms for the splines, are calculated with respect to log time.
I see how one can get a closed form solution if the splines work directly on the time scale, but if you use the log time scale, wouldn't you need to evaluate an integral whose integrand is the product of a weighted sum of the M-splines and exp(integration variable), with lower integration bound $-\infty$ and upper bound $\log{t}$? I am not aware of a closed form solution like this for M-splines... Maybe I am overlooking something here.
Glad you both are working on this!
> Thanks @sambrilleman for setting this up. At this stage I am trying to make sense of your code / approach. So some questions follow every now and then. Once this is clarified I will move on and try to contribute :)
@ermeel86 Sounds good!
> I see how one can get a closed form solution if the splines work directly on the time scale, but if you use the log time scale wouldn't you need to evaluate an integral with an integrand that is the product of a weighted sum of each of the M-splines and exp(integration-variable), with lower integration bound being -\infty and upper one being \log{t}.
Hmm, maybe you are right (and it is safer to just stick with t, rather than log(t)). But I had the impression (without writing down the math) that transforming the time scale would be somewhat irrelevant from the perspective of integrating the basis terms. The I-splines are still just evaluated as the integral of the M-splines, but with respect to whatever transformation of time you choose. So, if you choose log(time) as the time scale for the splines, then you are modelling h(log(t)) using M-splines, and you are modelling H(log(t)) using I-splines...
(But I may well be missing something here...!)
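Since the question is whether the closed form survives the log-time transformation, here is a small numerical check of @ermeel86's point (illustrative Python, not rstanarm code; `f` is a stand-in positive function rather than an actual M-spline basis):

```python
import numpy as np
from scipy.integrate import quad

# Stand-ins (assumptions of this sketch): f plays the role of a weighted sum
# of M-spline basis terms evaluated in log time; F is its antiderivative,
# i.e. the I-spline analogue on the log-time scale.
f = lambda x: 1.0 + x ** 2
F = lambda x: x + x ** 3 / 3.0

t0, t1 = 1.0, 2.0

# cumulative hazard between t0 and t1 when h(t) = f(log t), computed directly
H_direct, _ = quad(lambda u: f(np.log(u)), t0, t1)

# the same integral after substituting u = exp(v): an exp(v) Jacobian appears
H_subst, _ = quad(lambda v: f(v) * np.exp(v), np.log(t0), np.log(t1))

# naive answer: just evaluate the antiderivative of f at log t (no Jacobian)
H_naive = F(np.log(t1)) - F(np.log(t0))

print(H_direct, H_subst, H_naive)
```

The first two agree while the third does not, which matches the concern above: on the log-time scale the integrand picks up an exp() Jacobian, so the I-spline antiderivative alone is no longer the cumulative hazard with respect to t.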

@bgoodri @jgabry I'm hoping you can maybe help me with some methodology and implementation.
For `stan_surv` I am formulating a regression model defined on the hazard scale that allows for either time-fixed or time-varying coefficients. In survival analysis, the time-varying coefficients are often referred to as "non-proportional hazards", since the effect of the covariate on the hazard is allowed to vary over time.
I am formulating the time-varying coefficient (say beta_p(t)) as a B-spline function, as shown in Equation 4 of the attached PDF. But I want to penalise the spline coefficients to avoid overfitting. I see that in `stan_gamm` you use a hierarchical smoothing prior for the spline coefficients, but they are not penalised; rather, the coefficients are treated as independent of one another (i.e. each spline coefficient has a univariate normal prior).

But since I am using B-splines (whereas I think `stan_gamm` uses thin plate regression splines) I want to introduce a penalty matrix.

Following Lang and Brezger, I think I can do this via the prior distribution defined in Equation 5 of the attached PDF. If I am correct, that prior distribution for the B-spline coefficients basically equates to a multivariate normal distribution with precision matrix equal to the r-th difference penalty matrix scaled by the smoothing SD. The smoothing SD is then given some hyperprior chosen by the user (e.g. exponential, half-normal, half-t or half-Cauchy -- i.e. the equivalent of `prior_smooth` in `stan_gamm`).
My question is: I am looking for advice on the implementation of the MVN prior for the B-spline coefficients. You can see my current Stan code here. It currently shows a univariate normal prior for each of the B-spline coefficients, and a hyperprior for the smoothing SD (i.e. the same as used in `stan_gamm`). But the commented-out lines (i.e. here) show my initial attempt at an MVN prior with a penalty matrix for the coefficients. That code is probably wrong, and in any case I want to be able to reparameterise the MVN prior using Matt's trick, but I don't know how to do that for a precision-based MVN distribution.

Any chance you have advice/direction/anything to add on this? If not, a more general question might be: "is it possible to apply Matt's trick to an MVN distribution parameterised in terms of its precision matrix?"
Thanks in advance! notes_for_Ben_and_Jonah.pdf
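For reference, the r-th difference penalty matrix from the Lang and Brezger construction can be sketched in a few lines (illustrative Python; `penalty_matrix` is a hypothetical helper, not part of rstanarm):

```python
import numpy as np

def penalty_matrix(K, r=2):
    """r-th difference penalty matrix K_r = D_r' D_r for K B-spline
    coefficients, where D_r is the r-th order difference operator."""
    D = np.diff(np.eye(K), n=r, axis=0)   # (K - r) x K difference operator
    return D.T @ D

Q = penalty_matrix(6, r=2)
# K_r is rank-deficient (rank K - r), which is why the implied prior
# on the coefficients is improper:
print(np.linalg.matrix_rank(Q))   # 4
```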
The short answer is that with `stan_gamm4`, the design matrix for the smooth function has been transformed by an SVD so that its columns are orthogonal and hence the independent normal priors on the coefficients are arguably appropriate. This involves the `diagonalize = TRUE` argument to `mgcv::jagam`: https://github.com/stan-dev/rstanarm/blob/master/R/stan_gamm4.R#L182
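One way to see why independent normal priors become appropriate after such a reparameterisation (toy Python sketch, not the actual mgcv/jagam code): if the penalty matrix is S = R'R, then writing the coefficients as b = R^{-1} a turns the penalty b'Sb into a plain sum of squares a'a, so independent normal priors on a are the penalised prior in disguise.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5
D = np.diff(np.eye(K), n=1, axis=0)
S = D.T @ D + 1e-2 * np.eye(K)        # toy penalty, ridged to full rank

# factor S = R'R via its symmetric eigendecomposition
w, V = np.linalg.eigh(S)
R = np.diag(np.sqrt(w)) @ V.T

b = rng.normal(size=K)
a = R @ b                             # reparameterised coefficients

# the penalty in the original parameterisation is a plain sum of squares in a
print(b @ S @ b, a @ a)
```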
On Mon, Oct 8, 2018 at 3:20 AM Sam Brilleman [email protected] wrote:
@bgoodri https://github.com/bgoodri @jgabry https://github.com/jgabry I'm hoping you can maybe help me with some methodology and implementation.
For stan_surv I am formulating a regression model defined on the hazard scale, that allows for either time-fixed or time-varying coefficients. In survival analysis, the time-varying coefficients are often referred to as "non-proportional hazards", since the effect of the covariate on the hazard is allowed to vary over time.
I am formulating the time-varying coefficient (say beta_p(t)) as a B-spline function, as shown in Equation 4 of the attached PDF. But I want to penalise the spline coefficients to avoid overfitting. I see that in stan_gamm you use a hierarchical smoothing prior for the spline coefficients but they are not penalised, rather, the coefficients are treated as independent of one another (i.e. each spline coefficient has a univariate normal prior).
But since I am using B-splines (whereas I think stan_gamm uses thin plate regression splines) I want to introduce a penalty matrix.
Following Lang and Brezger https://www.jstor.org/stable/1391151, I think I can do this via the prior distribution defined in Equation 5 of the attached PDF. If I am correct, that prior distribution for the B-spline coefficients basically equates to a multivariate normal distribution with precision matrix equal to the r-th difference penalty matrix scaled by the smoothing SD. The smoothing SD is then given some hyper-prior as chosen by the user (e.g. exponential, half-normal, half-t or half-Cauchy -- i.e. the equivalent of prior_smooth in stan_gamm).
My question is: I am looking for advice on the implementation of the MVN prior for the B-spline coefficients. You can see my current Stan code here https://github.com/stan-dev/rstanarm/blob/feature-interval-censored-jm/src/stan_files/surv.stan#L248. It currently shows a univariate normal prior for each of the B-spline coefficients, and a hyperprior for the smoothing SD (i.e. the same as used in stan_gamm). But, the commented out lines (i.e. here https://github.com/stan-dev/rstanarm/blob/feature-interval-censored-jm/src/stan_files/surv.stan#L251) show my initial attempt at a MVN prior with penalty matrix for the coefficients. But that code is probably wrong, and in any case I want to be able to reparameterise the MVN prior using Matt's trick, but don't know how to do that for a precision-based MVN distribution.
Any chance you have advice/direction/anything to add on this? If not, a more general question might be "is it possible to apply Matt's trick to a MVN distribution parameterised in terms of it's precision matrix?"
Thanks in advance! notes_for_Ben_and_Jonah.pdf https://github.com/stan-dev/rstanarm/files/2454965/notes_for_Ben_and_Jonah.pdf
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stan-dev/rstanarm/issues/280#issuecomment-427740798, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqr0zZm_vhBSMQzsoOQHcx9Z-m--bks5uivzUgaJpZM4T7CES .
@bgoodri Thanks, that makes sense.
But just to elaborate: suppose I wish to give the coefficients an MVN prior with a penalty matrix, is it possible to specify a non-centered parameterisation for that?
The Stan manual shows an MVN non-centered reparameterisation as:

```stan
data {
  int<lower=2> K;
  vector[K] mu;
  cov_matrix[K] Sigma;
  ...
transformed data {
  matrix[K, K] L;
  L = cholesky_decompose(Sigma);
}
parameters {
  vector[K] alpha;
  ...
transformed parameters {
  vector[K] beta;
  beta = mu + L * alpha;
}
model {
  alpha ~ normal(0, 1);
  // implies: beta ~ multi_normal(mu, Sigma)
  ...
```
but I think what I need is a non-centered reparameterisation of `beta ~ multi_normal_prec(mu, smooth_sd * Tau)`, not `beta ~ multi_normal(mu, Sigma)`. In theory, is that simple to specify?
The stochastic representation of the multivariate normal is in terms of the Cholesky factor of the covariance matrix. I guess you could do something with the inverse of a Cholesky factor of a precision matrix, but that would require that you solve a potentially large triangular system at every leapfrog step.
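For concreteness, the algebra behind that suggestion can be checked with a NumPy sketch (a toy SPD precision matrix, not the spline penalty itself): with Q = L L', setting beta = mu + L^{-T} alpha for alpha ~ N(0, I) gives beta ~ MVN(mu, Q^{-1}), at the cost of the triangular solve Ben mentions.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.default_rng(1)
K = 4
A = rng.normal(size=(K, K))
Q = A @ A.T + K * np.eye(K)            # toy SPD precision matrix
mu = np.zeros(K)

L = cholesky(Q, lower=True)            # Q = L @ L.T

# implied transform: beta = mu + L^{-T} alpha, here as an explicit matrix
M = solve_triangular(L.T, np.eye(K), lower=False)   # L^{-T}

# covariance of beta is L^{-T} L^{-1} = (L L')^{-1} = Q^{-1}
cov_beta = M @ M.T
print(np.allclose(cov_beta, np.linalg.inv(Q)))      # True
```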
Hi, could we do a quick update of all the milestones that are currently accomplished? I'd be able to add the `plot` method and the `posterior_survfit` method if they are not ticked yet.
@sambrilleman good idea to use RW priors for the spline coefficients to model the time-dependent effects. In general I think we should also offer/implement an RW prior for the M-/I-splines in the baseline hazard/cumulative baseline hazard (better an RW prior on $\log{\gamma}$, to ensure non-negativity). What do you think?
@ermeel86 Thanks, I would love your advice on this! I had tried to implement the prior distribution proposed in Lang and Brezger's "Bayesian P-splines" paper, but got completely confused as to the appropriate Stan implementation. But am I correct in thinking that it is somewhat equivalent to an RW prior? (I've not really used RW priors before)

If an RW prior for the B-spline coefficients is the way to go, then what are your thoughts on this document? In the last section, entitled "Location of Knots and the Choice of Priors", Milad discusses smoothing B-splines via an RW prior, and the implementation seems to be much more straightforward than what I was attempting when messing around with the Lang and Brezger approach. The code that Milad provides is basically:
```stan
transformed parameters {
  row_vector[num_basis] a;
  vector[num_data] Y_hat;
  a[1] = a_raw[1];
  for (i in 2:num_basis)
    a[i] = a[i-1] + a_raw[i]*tau;
  Y_hat = a0*to_vector(X) + to_vector(a*B);
}
```
where the vector of B-spline coefficients is `a` and the smoothing parameter (sd) is `tau`. That would be super easy to incorporate into `stan_surv`. Do you feel that would be an appropriate implementation?
And as you mention, a random walk for $\log{\gamma}$ should also be easy enough.
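For what it's worth, the equivalence between Milad's loop and an RW1 prior can be checked numerically (NumPy sketch; standard-normal `a_raw` and smoothing SD `tau` are assumptions of the illustration, mirroring the Stan code above):

```python
import numpy as np

rng = np.random.default_rng(2)
K, tau = 8, 0.5
a_raw = rng.normal(size=K)

# loop version, as in the Stan code:
#   a[1] = a_raw[1];  a[i] = a[i-1] + a_raw[i] * tau
a = np.empty(K)
a[0] = a_raw[0]
for i in range(1, K):
    a[i] = a[i - 1] + a_raw[i] * tau

# vectorised view: first coefficient plus a scaled cumulative sum
a_vec = a_raw[0] + tau * np.concatenate([[0.0], np.cumsum(a_raw[1:])])

print(np.allclose(a, a_vec))                       # True
# the increments are tau * a_raw[i], i.e. a[i] - a[i-1] ~ normal(0, tau),
# which is exactly the RW1 prior in non-centred form
print(np.allclose(np.diff(a), tau * a_raw[1:]))    # True
```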
@csetraynor Thanks, I think I've pretty much sorted a `plot.stansurv` method for the baseline hazard and the time-dependent hazard ratio, and otherwise the method calls `NextMethod` (i.e. `plot.stanreg`), which seems to work for posterior densities, traceplots, etc. And I've gone and ticked off a few other things on that list, including `posterior_survfit`. Most are working to some degree, but buggy atm.

To make matters worse, I've muddled branches a bit, because I wanted to incorporate interval censoring in both `stan_surv` and `stan_jm`, and it seems to be working for the former, but not for the latter. Argh. So I'm now working on this branch, which is quite a bit ahead of the survival branch, but has errors for `stan_jm`. I'll try to tidy some of this stuff up and get a clean-ish branch with up-to-date changes, but I think it might be a bit difficult to work out how we can coordinate working on such aspects at the same time.
> I had tried to implement the prior distribution as proposed in Lang and Brezger's "Bayesian P-splines" paper, but got completely confused as to the appropriate Stan implementation. But am I correct in thinking that it is somewhat equivalent to an RW prior? (I've not really used RW priors before)
See this (adapted from "Bayesian Regression Modeling with INLA" by Wang et al.):

This applies for the lag-1 case. I think that one can recover Eq. 5 from the Lang and Brezger paper by taking the limit $\rho\rightarrow 1$. This would then also explain why they say

> the prior (5) is improper.

We could also add a hyperprior for $\rho$ over $[-1,1]$. So from an analytic perspective I think the two approaches are equivalent; computationally I am not sure. Maybe @bgoodri can help out here?
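A quick numeric check of the $\rho \rightarrow 1$ limit (toy construction in Python, not package code): with unit innovation variance, the stationary AR(1) precision matrix is tridiagonal with diagonal (1, 1 + rho^2, ..., 1 + rho^2, 1) and off-diagonal -rho, and at rho = 1 it coincides with the first-difference penalty D'D, which is singular, matching "the prior (5) is improper".

```python
import numpy as np

def ar1_precision(K, rho):
    """Precision matrix of a stationary AR(1) with unit innovation variance."""
    Q = np.diag(np.r_[1.0, np.full(K - 2, 1.0 + rho ** 2), 1.0])
    Q += np.diag(np.full(K - 1, -rho), k=1)
    Q += np.diag(np.full(K - 1, -rho), k=-1)
    return Q

K = 6
D = np.diff(np.eye(K), n=1, axis=0)        # first-difference operator

# at rho = 1 the AR(1) precision equals the RW1 penalty D'D ...
print(np.allclose(ar1_precision(K, 1.0), D.T @ D))       # True
# ... which has rank K - 1, i.e. the limiting prior is improper
print(np.linalg.matrix_rank(ar1_precision(K, 1.0)))      # 5
```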
Most of the time (i.e. all of the time as far as rstanarm is concerned) you would want to do the multivariate version of this rather than the marginal & conditionals version. But you would want to write a specialized function that does B' * Q * B for a tridiagonal Q and computes the log-determinant of a tridiagonal Q. Alternatively, you can post-multiply the inverse of the Cholesky factor of that Q by a row-vector of coefficients to non-center the betas.
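The two tridiagonal operations mentioned here can each be done in O(K). A Python sketch with hypothetical helpers (not existing rstanarm/Stan code); the log-determinant uses the standard continuant recurrence f_i = d_i f_{i-1} - e_{i-1}^2 f_{i-2}:

```python
import numpy as np

def tridiag_quad_form(d, e, b):
    """b' Q b for tridiagonal Q with diagonal d and off-diagonal e, in O(K)."""
    return np.sum(d * b ** 2) + 2.0 * np.sum(e * b[:-1] * b[1:])

def tridiag_logdet(d, e):
    """log det of a positive-definite tridiagonal matrix via continuants."""
    f_prev, f = 1.0, d[0]
    for i in range(1, len(d)):
        f_prev, f = f, d[i] * f - e[i - 1] ** 2 * f_prev
    return np.log(f)

rng = np.random.default_rng(3)
K = 6
d = np.full(K, 2.5)                   # diagonally dominant => positive definite
e = np.full(K - 1, -1.0)
Q = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
b = rng.normal(size=K)

print(np.isclose(tridiag_quad_form(d, e, b), b @ Q @ b))            # True
print(np.isclose(tridiag_logdet(d, e), np.linalg.slogdet(Q)[1]))    # True
```

(For large K the continuant recurrence can over/underflow; a production version would work on the log scale or use the Cholesky factor.)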
> but I think what I need is a non-centered reparameterisation of beta ~ multi_normal_prec(mu, smooth_sd * Tau), not beta ~ multi_normal(mu, Sigma). In theory, is that simple to specify?
@sambrilleman couldn't we use the methods described in algorithms 2 and 3 of https://arxiv.org/abs/1801.03791 here? Especially the sparse back substitution algorithm mentioned there?
It isn't so complicated: If you can construct the precision matrix, you can use the inverse of its Cholesky factor in an uncentered parameterization of the coefficients. A similar thing came up in https://andrewgelman.com/2016/09/02/two-weird-tricks-for-fast-conditional-autoregressive-models-in-stan/#comment-302825
There you say

> However, this triangular solve is an O(N^3) operation, so it is slow for large N (which could easily be much larger than in the Scottish Lip Cancer example).
So if this is the case, the sparse back substitution algorithm from the above reference is asymptotically more efficient computationally, since it runs in linear time. However, I think you have to do the inverse calculation only once and a matrix multiplication at every leapfrog step, whereas the approach I refer to above would do only a linear operation at every leapfrog step, right?
Yeah, you can write your own code to do the bidiagonal solve.
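A hand-rolled bidiagonal solve is just forward substitution over the two diagonals (Python sketch with a hypothetical helper; the Cholesky factor of a tridiagonal random-walk precision matrix is lower bidiagonal, so this runs in O(K)):

```python
import numpy as np

def bidiag_solve(d_main, d_sub, y):
    """Solve L x = y where L is lower bidiagonal with main diagonal d_main
    and subdiagonal d_sub; plain forward substitution, O(K)."""
    x = np.empty_like(y)
    x[0] = y[0] / d_main[0]
    for i in range(1, len(y)):
        x[i] = (y[i] - d_sub[i - 1] * x[i - 1]) / d_main[i]
    return x

rng = np.random.default_rng(4)
K = 6
d_main = rng.uniform(1.0, 2.0, size=K)
d_sub = rng.uniform(-0.5, 0.5, size=K - 1)
L = np.diag(d_main) + np.diag(d_sub, k=-1)
y = rng.normal(size=K)

x = bidiag_solve(d_main, d_sub, y)
print(np.allclose(x, np.linalg.solve(L, y)))   # True
```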
@ermeel86 @bgoodri Just to follow up on this. I've opened a pull request with a first working version of `stan_surv` that I hope is ready for release. I've just implemented the time-dependent effects with a random walk prior using the conditional specification as described in Milad's case study. You'll find the Stan code here. It seems to be working, but I'm happy for someone to implement the fancier multivariate version (discussed above) if you want!
Hi All, I apologise if this is not the correct place to ask, but I have been using stan_surv and am having difficulty with hierarchical modelling. The model is a simple time-to-event analysis with a single predictor and a grouping factor with three levels, where each level represents a different cancer.
I keep getting errors such as

```
Error in scale(PRS_Score) | Cancer : operations are possible only for numeric, logical or complex types
```

when the grouping variable is a factor variable with three levels.
When I change the group variable to a numeric I get this error message:

```
Error: Cannot deal with empty interaction levels found in columns: PRS_Score | CancerTRUE
```
The model is not complex:

```r
Mod <- stan_surv(formula = Surv(y, failed) ~ PRS_Score + (PRS_Score | Cancer), data = comb, basehaz = "exp")
```
But it seems to work fine if I try and run it as a `stan_glmer` ignoring the time variable:

```r
fit4 <- stan_glmer(failed ~ PRS_Score + (PRS_Score - 1 | Cancer), data = comb, family = binomial(link = "logit"), cores = 4, iter = 500)
```
and `stan_surv` works fine when I don't include the hierarchical parameter.
Just wondering if I am doing anything wrong here that I can work on.
Thank you all in advance!
Hi Karl

I've not yet implemented hierarchical parameters for `stan_surv`. Sorry that the error message was so useless; I will fix that.

Hierarchical survival models will happen eventually. But I ideally wanted to get a version of `stan_surv` out on CRAN and road-tested a bit before adding too many features.
Ahhhhh ok no problem! It is already really very good! Thank you. Is there any chance you know of any workarounds I may use in the meantime? Again though, thank you!
@karlsmithbyrne If you want to go the Bayesian route with a proportional hazards model, then I'm not sure there are many alternatives. I think brms currently fits exponential and Weibull accelerated failure time (AFT) models with group-specific parameters. I see that you were trying to use an exponential baseline hazard with `stan_surv` in the example above. The exponential or Weibull proportional hazards model is mathematically equivalent to an exponential or Weibull AFT model, just reparameterised. This is true without group-specific parameters anyway -- I'm not sure if it holds true with group-specific parameters -- maybe someone else (@ermeel86 @statwonk) can help clarify whether that is the case. So, one option is to just use brms to fit an exponential/Weibull AFT model, and if you want hazard ratios then you can obtain them using a one-to-one transformation of the AFT coefficients.
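The Weibull PH/AFT equivalence is easy to verify numerically (toy parameter values; the baseline parameterisation S0(t) = exp(-lambda t^shape) is an assumption of this illustration, not necessarily the one used by brms or stan_surv):

```python
import numpy as np

shape, lam = 1.7, 0.4       # Weibull shape and scale-like rate
b_aft = 0.8                 # AFT coefficient (log acceleration factor)
b_ph = -shape * b_aft       # implied PH coefficient (log hazard ratio)
x = 1.0                     # covariate value
t = np.linspace(0.1, 5.0, 50)

S0 = lambda t: np.exp(-lam * t ** shape)        # baseline survival
S_ph = S0(t) ** np.exp(x * b_ph)                # proportional hazards form
S_aft = S0(t * np.exp(-x * b_aft))              # accelerated failure time form

# the two survival curves coincide exactly when b_ph = -shape * b_aft
print(np.allclose(S_ph, S_aft))   # True
```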
There may be other packages I'm not aware of that fit Bayesian PH models with random effects -- you could check the CRAN task view.
In any case, with only three levels for your grouping factor `Cancer`, you are going to have very little information with which to estimate the variance parameter for the distribution of the `Cancer`-specific slope parameters. Perhaps you would be better off treating it as a fixed effect and modelling the differences between cancers via an interaction term, i.e. using a model like:

```r
Mod <- stan_surv(formula = Surv(y, failed) ~ PRS_Score + PRS_Score * Cancer, data = comb, basehaz = "exp")
```
Although I'm not familiar with the research question etc.
Hi guys.
Apologies if this is not the correct place to ask this, but I thought I would post about the survival models in this open Issue rather than starting a new one.
Firstly, I would like to say the preprint on arXiv, "Bayesian Survival Analysis Using the rstanarm R Package", is excellent. I would recommend possibly putting some code up on how to install the specific feature branch though, as I don't think it's intuitive how to download that version of the GitHub rstanarm package (to get the `stan_surv()` functions) from your preprint (`devtools::install_github('stan-dev/rstanarm@feature/frailty-models')`).
Secondly, I am implementing a random effects model and am trying to create population-level predictions (i.e. averaging over the random effects but having different predictions for each group-level effect).
In this vignette, it seems that I want to set `dynamic = FALSE` in `posterior_survfit()`, but when I create dummy random effect variables that are not currently used, I get an error that my factor has new levels.

When I supply levels of the random factor that are already present, the code works, and I think having `dynamic = FALSE` still only conditions on the group-level effects and marginalises over individual-level estimates.

Just checking that (a) my interpretation is correct and (b) highlighting that you cannot specify new levels in this branch atm when using `posterior_survfit()`.
Many thanks and super impressed with the models and the vignettes and the preprint.
Hey @padpadpadpad... so sorry it has taken me this long to get back to you!!

> I would recommend possibly putting some code up on how to install the specific feature branch though, as I don't think it's intuitive how to download that version of the GitHub rstanarm package (get the stan_surv() functions) from your preprint (devtools::install_github('stan-dev/rstanarm@feature/frailty-models')).

That is a great idea, I will aim to do that soon. However, the more up-to-date branch is in fact `feature/survival` -- it includes everything on `feature/frailty-models` as well as a few additional more recent commits. I'd suggest you install `feature/survival` (sorry for the confusion!).
> In this vignette, it seems that I want to set dynamic = FALSE in posterior_survfit()

That vignette is specific to a certain class of survival models -- joint models for longitudinal and survival data. The `posterior_survfit()` method for `stan_jm` models has the `dynamic` argument, but there is no such argument for the `posterior_survfit()` method for `stan_surv` models. After you install the `feature/survival` branch, check out the function signatures under `?posterior_survfit` and you should see that only the method for `stanjm` uses the `dynamic` argument (since it requires conditioning on longitudinal data).
> but when I create dummy random effect variables that are not currently used, I get an error that my factor has new levels.
From memory, you should be able to provide new levels. If the `newdata` in `posterior_survfit.stansurv()` has random effect factor levels that were in the original data then it should condition on them, and if they were not in the original data then it should marginalise over them. I'm not sure why you are getting the error. Perhaps install the `feature/survival` branch and then try running the following example, which demonstrates conditioning on: (i) a `site` with a positive frailty (i.e. higher hazard, worse survival), (ii) a `site` with a negative frailty (i.e. lower hazard, better survival), and (iii) a new `site` that was not in the original data, so the predictions are marginalised over the random effects. All have `trt = 0` so that the random effect factor levels are the only thing that differs across the prediction covariate data:
```r
mod <- stan_surv(
  formula = Surv(eventtime, status) ~ trt + (1 | site),
  data = frail[1:40,],
  basehaz = "exp",
  chains = 1,
  refresh = 0,
  iter = 600,
  seed = 123)

# (i) prediction data for an existing site with a positive random effect
nd1 = frail[1,]

# (ii) prediction data for an existing site with a negative random effect
nd2 = frail[12,]

# (iii) prediction data for a new site (random effects will be marginalised over)
nd3 = nd1
nd3[['id']] = 999
nd3[['site']] = 9999

# predictions
ps1 = posterior_survfit(mod, nd1)
ps2 = posterior_survfit(mod, nd2)
ps3 = posterior_survfit(mod, nd3)
head(ps1)
head(ps2)
head(ps3)
```
The example should show that, as expected, `ps1` has worse survival, `ps2` has better survival, and `ps3` is in between (essentially a random intercept = 0) but with a much wider confidence interval due to the uncertainty from marginalising over the random intercepts.
If that example doesn't work for you or doesn't help clarify, then perhaps open a separate GitHub issue and I can try to debug it for you!
Hey @sambrilleman thank you so much for your response.
I have installed from the survival branch now but cannot get the man pages to install (see this issue). This means I cannot look at `?posterior_survfit`, unfortunately.
I did manage to get the code working though, so that's great. In `ps3`, we are predicting for a new level of the random effect and marginalising over the random effect, but is there also a way to not include the random effects in the prediction and instead use only the fixed/treatment effects (akin to `re.form = NA` in `posterior_predict`)? So to get predictions over the treatment effects but not the random effects?
Many thanks Dan
> I did manage to get the code working though so that's great. In ps3, we are predicting for a new level of the random effect and marginalising over the random effect, but is there also a way to not include the random effects in the prediction and instead use only the fixed/treatment effects (akin to re.form = NA in posterior_predict). So to get predictions over the treatment effects but not the random effects?
hmm, sorry I don't think I implemented that! :-( I guess what you are suggesting is akin to a new group/cluster/factor level for the random effect, but you know with absolute certainty that their random effect value is 0. So I'd almost argue that such a prediction isn't a realistic acknowledgement of the uncertainty in the model...? But maybe I am wrong...
Hi Sam
These are valid points for the prediction of new individuals. I took the approach of standardising over all individuals within each treatment. But it meant I had to do a separate call to `posterior_survfit()` for each treatment, instead of a single call to `posterior_survfit()` which standardises over each individual for each covariate.
This approach seems to work for what I want. Thank you for responding and being so prompt with your responses.
Sweet, glad you found a solution that seems to get you what you need!
Yeah, you can even do causal-inference-type things where you standardise over all the individuals (i.e. from both treatments), first assuming they have treatment fixed to 0 and then treatment fixed to 1, if that's what you want, e.g. something like:
```r
mod <- stan_surv(
  formula = Surv(eventtime, status) ~ trt + (1 | site),
  data = frail,
  basehaz = "exp",
  chains = 1,
  refresh = 0,
  iter = 600,
  seed = 123)

# create two prediction datasets each based on the original data
frail_0 = frail
frail_1 = frail

# fix the covariate value in the two prediction datasets
frail_0['trt'] = 0
frail_1['trt'] = 1

# predict using all ids and sites from the original data, but fixing the covariate value
ps_0 = posterior_survfit(mod, frail_0, standardise = TRUE, times = 0)
ps_1 = posterior_survfit(mod, frail_1, standardise = TRUE, times = 0)

plot(ps_0)
plot(ps_1)
```
but perhaps reduce `frail_0` and `frail_1` to one row per `site` (since the original data would have multiple rows per site, which would carry through as multiple rows in the prediction data and essentially end up as weightings in the standardised predictions, I guess).
Or whatever makes sense for the type of comparisons you want to make!
Hi all and thanks for making survival analysis with rstanarm a reality!
I encountered a small bug when working through and extending the examples in the paper. In section 6.6 we fit a simple multilevel model. Elsewhere in the paper it is mentioned that `ranef()` should return the random effect parameter estimates, but it does not seem to work:
```r
mod_randint <-
  rstanarm::stan_surv(
    formula = Surv(eventtime, status) ~ trt + (1 | site),
    data = frail,
    basehaz = "exp"
  )
rstanarm::ranef(mod_randint)
#> Error: This method is for stan_glmer and stan_lmer models only.
```
I am using Linux and the rstanarm survival branch (version 2.19.1).
EDIT: After digging a bit deeper, it seems that `ranef.stanreg` calls `.glmer_check`, which in turn calls `is.mer`, which fails because the model object lacks the correct class and is also missing the `glmod` component, whatever that is.
By the way, what is the preferred way to post issues related to the survival functionality in rstanarm? This issue, a separate issue, or something else?
And while I'm at it, do you have any estimate on when you could be merging the feature/survival branch and getting the functionality to CRAN?