comorbidity
Variable labels disappearing after deriving scores
Possibly related to #39 - in the code below I can see the variable labels and access them through the variable.labels attribute of the dataframe:
library(comorbidity)
set.seed(1)
x <- data.frame(
id = sample(1:15, size = 200, replace = TRUE),
code = sample_diag(200),
stringsAsFactors = FALSE
)
# Charlson score based on ICD-10 diagnostic codes:
x1 <- comorbidity(x = x, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE)
attributes(x1)
However, if I also append the score to the data frame (since I'm interested in both the scores and the underlying comorbidities), I lose the variable.labels attribute (using tidyverse since it's in my workflow):
library(comorbidity)
library(tidyverse)
set.seed(1)
x <- data.frame(
id = sample(1:15, size = 200, replace = TRUE),
code = sample_diag(200),
stringsAsFactors = FALSE
)
# Charlson score based on ICD-10 diagnostic codes:
x1 <- comorbidity(x = x, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE) %>%
score(x = ., weights = "charlson", assign0 = FALSE)
attributes(x1)
This seems to be a result of applying the variable labels as an attribute of the dataframe, rather than of the variable. But this is harder to work around now that mapping and scoring are distinct functions.
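To make the distinction above concrete, here is a minimal base R sketch of moving data-frame-level labels onto the individual columns. The data frame is a toy stand-in for comorbidity() output, and the per-column attribute name "label" is an assumption borrowed from the convention used by packages such as haven; it is not something {comorbidity} sets itself. Whether column-level labels survive a particular pipeline is not tested here.

```r
# Toy stand-in for the output of comorbidity(); the values are made up.
x1 <- data.frame(id = 1:3, ami = c(0, 1, 0), chf = c(1, 0, 0))
attr(x1, "variable.labels") <- c(
  "ID", "Myocardial infarction", "Congestive heart failure"
)

# Copy each data-frame-level label onto the matching column.
# The attribute name "label" is an assumed convention, not a {comorbidity} feature.
labels <- attr(x1, "variable.labels")
for (i in seq_along(labels)) {
  attr(x1[[i]], "label") <- labels[i]
}

# The label now lives on the column itself.
attr(x1$ami, "label")
```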
Hi, x1 here should not have any names, as it's just the score column? The example code above does not add the score to x:
library(comorbidity)
#> This is {comorbidity} version 1.0.0.
#> A lot has changed since the last release on CRAN, please check-out breaking changes here:
#> -> https://ellessenne.github.io/comorbidity/articles/C-changes.html
library(tidyverse)
set.seed(1)
x <- data.frame(
id = sample(1:15, size = 200, replace = TRUE),
code = sample_diag(200),
stringsAsFactors = FALSE
)
# Charlson score based on ICD-10 diagnostic codes:
x1 <- comorbidity(x = x, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE) %>%
score(x = ., weights = "charlson", assign0 = FALSE)
attributes(x1)
#> $map
#> [1] "charlson_icd10_quan"
#>
#> $weights
#> [1] "charlson"
x1
#> [1] 2 6 0 2 0 0 4 0 3 2 0 0 3 0 2
#> attr(,"map")
#> [1] "charlson_icd10_quan"
#> attr(,"weights")
#> [1] "charlson"
Created on 2022-03-01 by the reprex package (v2.0.1)
Sorry, I made a typo in my MWE. I've also tried it on another computer and cannot replicate the issue there, so there is clearly a clash somewhere. I'll dig further.
In the meantime here's the proper MWE:
library(comorbidity)
library(tidyverse)
set.seed(1)
x <- data.frame(
id = sample(1:15, size = 200, replace = TRUE),
code = sample_diag(200),
stringsAsFactors = FALSE
)
# Charlson score based on ICD-10 diagnostic codes:
x1 <- comorbidity(x = x, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE)
x2 <- x1 %>%
mutate(score = score(x = ., weights = "charlson", assign0 = FALSE))
attributes(x1)
attributes(x2)
Update - it looks like it's linked to updating dplyr from 1.0.6 to 1.0.8 - sorry! No idea why it's happening though.
If you fancy trying it yourself:
library(devtools)
install_version("dplyr", version = "1.0.6", repos = "http://cran.us.r-project.org")
library(tidyverse)
library(comorbidity)
#> This is {comorbidity} version 1.0.0.
#> A lot has changed since the last release on CRAN, please check-out breaking changes here:
#> -> https://ellessenne.github.io/comorbidity/articles/C-changes.html
set.seed(1)
x <- data.frame(
id = sample(1:15, size = 200, replace = TRUE),
code = sample_diag(200),
stringsAsFactors = FALSE
)
# Charlson score based on ICD-10 diagnostic codes:
x1 <- comorbidity(x = x, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE)
x2 <- x1 %>%
mutate(score = score(x = ., weights = "charlson", assign0 = FALSE)) %>%
rename_with(.fn = ~paste0(., "_cci"), -"id")
attributes(x1)
attributes(x2)
Thanks, this seems to be an issue related to {dplyr}, as the following does not drop the attributes:
library(comorbidity)
set.seed(1)
x <- data.frame(
id = sample(1:15, size = 200, replace = TRUE),
code = sample_diag(200),
stringsAsFactors = FALSE
)
x1 <- comorbidity(x = x, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE)
x1$score <- score(x = x1, weights = "charlson", assign0 = FALSE)
attributes(x1)
#> $names
#> [1] "id" "ami" "chf" "pvd" "cevd" "dementia"
#> [7] "copd" "rheumd" "pud" "mld" "diab" "diabwc"
#> [13] "hp" "rend" "canc" "msld" "metacanc" "aids"
#> [19] "score"
#>
#> $row.names
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#>
#> $variable.labels
#> [1] "ID"
#> [2] "Myocardial infarction"
#> [3] "Congestive heart failure"
#> [4] "Peripheral vascular disease"
#> [5] "Cerebrovascular disease"
#> [6] "Dementia"
#> [7] "Chronic obstructive pulmonary disease"
#> [8] "Rheumatoid disease"
#> [9] "Peptic ulcer disease"
#> [10] "Mild liver disease"
#> [11] "Diabetes without chronic complications"
#> [12] "Diabetes with chronic complications"
#> [13] "Hemiplegia or paraplegia"
#> [14] "Renal disease"
#> [15] "Cancer (any malignancy)"
#> [16] "Moderate or severe liver disease"
#> [17] "Metastatic solid tumour"
#> [18] "AIDS/HIV"
#>
#> $map
#> [1] "charlson_icd10_quan"
#>
#> $class
#> [1] "comorbidity" "data.frame"
Created on 2022-03-02 by the reprex package (v2.0.1)
This has already been reported (I think); see e.g. https://github.com/tidyverse/dplyr/issues/6100 and https://github.com/tidyverse/dplyr/pull/6102. My understanding is that a fix is planned for {dplyr} 1.1.0.
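Until then, one possible workaround is to copy the attributes back by hand after the pipeline. The sketch below uses base R only; restore_attrs() is a hypothetical helper (not part of {comorbidity} or {dplyr}), the data frame is a toy stand-in for comorbidity() output, and the attribute loss is simulated rather than produced by dplyr.

```r
# Hypothetical helper: copy selected data-frame-level attributes
# from `original` onto `modified` and return the result.
restore_attrs <- function(modified, original,
                          attr_names = c("variable.labels", "map")) {
  for (a in attr_names) {
    attr(modified, a) <- attr(original, a)
  }
  modified
}

# Toy stand-in for the output of comorbidity(); the values are made up.
x1 <- data.frame(id = 1:3, ami = c(0, 1, 0), chf = c(1, 0, 0))
attr(x1, "variable.labels") <- c(
  "ID", "Myocardial infarction", "Congestive heart failure"
)
attr(x1, "map") <- "charlson_icd10_quan"

# Simulate a step that adds a score column and drops the custom attributes.
x2 <- x1
x2$score <- c(1, 1, 0)  # toy scores, not computed by score()
attr(x2, "variable.labels") <- NULL
attr(x2, "map") <- NULL

# Put the attributes back.
x2 <- restore_attrs(x2, x1)
attr(x2, "variable.labels")
```

With the objects from the MWE above, the same call would be x2 <- restore_attrs(x2, x1) after the mutate() step. Note that the restored labels do not include an entry for the new score column.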