rms
rms copied to clipboard
Predicting from cph model with categorical variables interacting with strat term
There seems to be an issue with making predictions from a model that contains interactions between a strat
term and a categorical factor. To demonstrate, I'm going to use the example from the cph
help file.
library(rms)
#example from cph help page
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
#> Error in label(age) <- "Age": could not find function "label<-"
sex <- factor(sample(c('Male','Female'), n,
rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
#> Error in label(dt) <- "Follow-up Time": could not find function "label<-"
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
#> Error in UseMethod("units<-"): no applicable method for 'units<-' applied to an object of class "c('double', 'numeric')"
dd <- datadist(age, sex)
#> Error in datadist(age, sex): could not find function "datadist"
options(datadist='dd')
S <- Surv(dt,e)
#> Error in Surv(dt, e): could not find function "Surv"
# categorize age to demonstrate issue
age2 = cut2(age, g = 3)
#> Error in cut2(age, g = 3): could not find function "cut2"
# fit model using age categories
f = cph(S ~ age2*strat(sex), x = TRUE)
#> Error in cph(S ~ age2 * strat(sex), x = TRUE): could not find function "cph"
Created on 2021-07-22 by the reprex package (v1.0.0)
Just to demonstrate the issue I'm having, I've created quantile-based age categories instead of using continuous age
, then fit a model that has age category as a predictor which is interacted with sex
as a stratification factor.
Now, when I try to make a prediction from this model I get an error.
#try to make prediction from model object f
predict(f, newdata = data.frame(age2 = "[43.6,54.6)", sex="Male"))
#> Error in matxv(X, coeff, kint = kint): columns in a (5) must be <= length of b (4)
Created on 2021-07-22 by the reprex package (v1.0.0)
This seems odds, but asking instead for the X matrix for those predictor values I think we see why the issue arises.
#try to get X matrix for object f
predict(f, newdata = data.frame(age2 = "[43.6,54.6)", sex="Male"), type = "x")
#> age2[43.6,54.6) age2[54.6,98.9] age2[11.3,43.6):strat(sex)Male
#> 1 1 0 0
#> age2[43.6,54.6):strat(sex)Male age2[54.6,98.9]:strat(sex)Male
#> 1 1 0
#> attr(,"strata")
#> [1] sex=Male
#> Levels: sex=Female sex=Male
Created on 2021-07-22 by the reprex package (v1.0.0)
It looks like what's happening is that an extra column is sneaking into X called age2[11.3,43.6):strat(sex)Male
which wouldn't be represented by a beta coeficient in the Cox PH model.
Please provide a reproducible example that does not throw any errors before predict()
, i.e., where you properly require(rms)
or library(rms)
at the beginning of the code R can find label<-
and units<-
.
Was so excited to see reprex working I missed the errors. Sorry! Updated now with fully working code.
library(rms)
#> Loading required package: Hmisc
#> Loading required package: lattice
#> Loading required package: survival
#> Loading required package: Formula
#> Loading required package: ggplot2
#>
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#>
#> format.pval, units
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
#example from cph help page
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n,
rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
S <- Surv(dt,e)
# categorize age to demonstrate issue
age2 = cut2(age, g = 3)
# fit model using age categories
f = cph(S ~ age2*strat(sex), x = TRUE)
#try to make prediction from model object f
predict(f, newdata = data.frame(age2 = "[43.6,54.6)", sex="Male"))
#> Error in matxv(X, coeff, kint = kint): columns in a (5) must be <= length of b (4)
#try to get X matrix for object f
predict(f, newdata = data.frame(age2 = "[43.6,54.6)", sex="Male"), type = "x")
#> age2[43.6,54.6) age2[54.6,98.9] age2[11.3,43.6):strat(sex)Male
#> 1 1 0 0
#> age2[43.6,54.6):strat(sex)Male age2[54.6,98.9]:strat(sex)Male
#> 1 1 0
#> attr(,"strata")
#> [1] sex=Male
#> Levels: sex=Female sex=Male
Created on 2021-09-08 by the reprex package (v1.0.0)