Feature request: support for non-syntactic names in `survfit`

Open: mattsecrest opened this issue 1 year ago • 1 comment

I wonder whether non-syntactic names could be supported consistently. It can be confusing that they work on the LHS of a Surv() formula but not on the RHS. The example below uses survival 3.5.7.

library(survival)
library(tibble)

df <- tibble(
  os_months = abs(rnorm(100, 12, .5)),
  os_event = rbinom(100, 1, .5),
  `OS event non-syntactic` = os_event,
  group = sample(c("group 1", "group 2"), 100, replace = TRUE),
  `group non-syntactic` = group
)

# This works
survfit(
  Surv(os_months, os_event) ~ group,
  data = df
)

# This also works
survfit(
  Surv(os_months, `OS event non-syntactic`) ~ group,
  data = df
)

# This does not work
survfit(
  Surv(os_months, os_event) ~ `group non-syntactic`,
  data = df
)

Alternatively, a clearer error message when non-syntactic names are used would be helpful. The current one is:

Error in `[.data.frame`(mf, ll) : undefined columns selected

mattsecrest avatar Aug 29 '23 00:08 mattsecrest

First of all, I have very little sympathy for non-syntactic names. It's along the lines of my argument that "A_very_long_file_name_is_not_a_substitute_for_documentation". Second, and more importantly, I do all the formula processing via calls to the standard model.frame() function within R; if those fail, I'm not about to fix it. Third, I have a lot of other things for survival in the queue, a couple of which are actual bugs (they give a wrong answer).

In this case a traceback shows that it is the strata() function which fails. Perhaps you would like to figure it out and submit a patch?

therneau avatar Oct 11 '23 22:10 therneau
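
A possible workaround, not suggested by either participant above: rename the columns to syntactic names before fitting, so that the right-hand side of the formula never contains a backticked name. The sketch below uses base R's `make.names()`; the data frame and the variable name `df_syntactic` are illustrative assumptions, and the result is not verified against every survival version.

```r
library(survival)

set.seed(1)  # only to make the illustrative data reproducible

# Illustrative data with one non-syntactic column name (assumed, mirrors the report)
df <- data.frame(
  os_months = abs(rnorm(100, 12, .5)),
  os_event  = rbinom(100, 1, .5)
)
df[["group non-syntactic"]] <- sample(c("group 1", "group 2"), 100, replace = TRUE)

# Copy the data and convert every column name to a syntactic one.
# make.names() replaces invalid characters (here, spaces) with "."
df_syntactic <- df
names(df_syntactic) <- make.names(names(df_syntactic), unique = TRUE)

# The grouping column is now `group.non.syntactic`, usable without backticks
fit <- survfit(
  Surv(os_months, os_event) ~ group.non.syntactic,
  data = df_syntactic
)
```

This changes the stratum labels in the printed output (they carry the renamed column), so the original names would need to be restored by hand for reporting.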