prep_data date column recognition is limited

Open michaellevy opened this issue 6 years ago • 0 comments

Date recognition works for Catalyst data, where every date or datetime column name ends with DTS, but otherwise I don't think there's a way to declare datetime columns or get them noticed. That feels pretty limiting if use beyond Catalyst is desirable. POSIX time columns (POSIXlt in the first example) throw an error, and character timestamps get removed as all-unique:

```r
library(healthcareai)
pima_diabetes$admit_timestamp <- as.POSIXlt(rnorm(nrow(pima_diabetes), sd = 3600 * 24 * 365), origin = Sys.time())
prep_data(pima_diabetes, patient_id, outcome = diabetes)
#> Error in model.frame.default(formula, data): invalid type (list) for variable 'admit_timestamp'
```

```r
library(healthcareai)
pima_diabetes$admit_timestamp <- as.character(as.POSIXlt(rnorm(nrow(pima_diabetes), sd = 3600 * 24 * 365), origin = Sys.time()))
prep_data(pima_diabetes, patient_id, outcome = diabetes)
#> Warning in find_columns_to_ignore(d, c(rlang::quo_name(outcome), ignored)):
#> The following column(s) have a unique value for every row so will be
#> ignored: admit_timestamp
#> Warning in prep_data(pima_diabetes, patient_id, outcome = diabetes): The
#> following variable(s) look a lot like identifiers: They are character-type
#> and have a unique value on every row. They will be ignored: admit_timestamp
#> Training new data prep recipe...
```
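
A possible interim workaround, sketched in base R only: the `invalid type (list)` error is consistent with `POSIXlt` being stored as a list of components, so converting to `POSIXct` and deriving plain numeric features by hand sidesteps both the list type and the all-unique character column. The helper name `add_datetime_features` and the `_year` / `_month` / `_dow` suffixes are hypothetical, the toy data frame stands in for `pima_diabetes`, and whether `prep_data` then handles the derived columns sensibly has not been verified here.

```r
# Hypothetical helper, not part of healthcareai: expands one datetime column
# into numeric features and drops the original column.
add_datetime_features <- function(d, col) {
  # as.POSIXct() accepts POSIXlt, POSIXct, or parseable character timestamps
  lt <- as.POSIXlt(as.POSIXct(d[[col]]))
  d[[paste0(col, "_year")]]  <- as.integer(lt$year + 1900)  # $year counts from 1900
  d[[paste0(col, "_month")]] <- as.integer(lt$mon + 1)      # $mon is 0-based
  d[[paste0(col, "_dow")]]   <- as.integer(lt$wday)         # 0 = Sunday ... 6 = Saturday
  d[[col]] <- NULL
  d
}

# Toy data standing in for pima_diabetes (assumption: any data frame works)
set.seed(1)
d <- data.frame(patient_id = 1:5)
d$admit_timestamp <- as.POSIXlt(rnorm(5, sd = 3600 * 24 * 365), origin = Sys.time())

typeof(d$admit_timestamp)  # "list", the type that model.frame() rejects

d2 <- add_datetime_features(d, "admit_timestamp")
sapply(d2, class)          # every remaining column is integer
```

This only illustrates what manual handling could look like; a proper fix would presumably let users declare date columns to `prep_data` directly instead of relying on the DTS suffix.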

michaellevy • Sep 06 '18 05:09