labelled
Setting value labels for several variables at once (preferably using dplyr::across)
Inspired by this question (https://stackoverflow.com/questions/73818355/how-to-recode-values-in-haven-labelled-vectors-in-r), I'm wondering how one could set value labels for several variables at once. Apparently dplyr::across doesn't work, but since labelled vectors are a close relative of haven/tidyverse, would it be possible to support set_value_labels inside dplyr::mutate? Or at least to allow a tidyselect selection of variables in set_value_labels?
reprex:
x <- structure(list(q0015_0001 = structure(c(3, 5, NA, 3, 1, 2, NA, NA, 3, 4, 2, NA, 2, 2, 4, NA,
4, 3, 3, 3, 3, 2, NA, NA, 2), label = "Menu Options/Variety", format.spss = "F8.2", labels =
c(`Very Dissatisfied` = 1, Dissatisfied = 2, Neutral = 3, Satisfied = 4, `Very Satisfied` = 5),
class = c("haven_labelled", "vctrs_vctr", "double")), q0015_0002 = structure(c(4, 4, NA, 5, 3, 3,
NA, NA, 3, 4, 2, NA, 5, 2, 4, NA, 4, 3, 4, 4, 4, 4, NA, NA, 2), label = "Cleanliness", format.spss
= "F8.2", labels = c(`Very Dissatisfied` = 1, Dissatisfied = 2, Neutral = 3, Satisfied = 4,
`Very Satisfied` = 5), class = c("haven_labelled", "vctrs_vctr", "double")), q0015_0003 =
structure(c(2, 2, NA, 3, 1, 2, NA, NA, 3, 4, 3, NA, 4, 3, 4, NA, 3, 2, 4, 4, 2, 2, NA, NA, 1),
label = "Taste and Quality of Food", format.spss = "F8.2", labels = c(`Very Dissatisfied` = 1,
Dissatisfied = 2, Neutral = 3, Satisfied = 4, `Very Satisfied` = 5), class = c("haven_labelled",
"vctrs_vctr", "double"))), row.names = c(NA, -25L), class = c("tbl_df", "tbl", "data.frame"),
label = "File created by user")
Doesn't work:
library(labelled)
library(tidyverse)
x |>
  mutate(across(starts_with("q0015"),
                ~dplyr::recode(., `1` = -2, `2` = -1, `3` = 0, `4` = 1, `5` = 2))) |>
  mutate(across(starts_with("q0015"),
                ~set_value_labels(., c("Very Dissatisfied" = -2, "Dissatisfied" = -1, "Neutral" = 0,
                                       "Satisfied" = 1, "Very Satisfied" = 2))))
I also tried different variants of purrr::map, with no success. Or is there possibly another relatively easy way to set labels for several variables? I remember that for VARIABLE labels I used a named list of variables and their new labels, but I'm not sure how that would look for VALUE labels, because each element/variable would have several value labels.
UPDATE:
Got it working with this ugly chunk of code, but wondering if there could be an easier solution:
x |>
  mutate(across(starts_with("q0015"),
                ~dplyr::recode(., `1` = -2, `2` = -1, `3` = 0, `4` = 1, `5` = 2))) |>
  set_value_labels(.labels = rep(list(c("Very Dissatisfied" = -2,
                                        "Dissatisfied" = -1,
                                        "Neutral" = 0,
                                        "Satisfied" = 1,
                                        "Very Satisfied" = 2)),
                                 x |>
                                   select(starts_with("q0015")) |>
                                   ncol()) |>
                     setNames(nm = x |>
                                select(starts_with("q0015")) |>
                                names()))
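The named list that `.labels` receives in the call above can also be built once, up front, which keeps the pipeline shorter. A minimal sketch in base R, assuming a small stand-in data frame `df` and the hypothetical names `satisfaction_labels`, `target_vars` and `labels_by_var` (none of these come from the package):

```r
# Stand-in data frame with the same column-name pattern as the reprex
# (hypothetical data, for illustration only).
df <- data.frame(
  q0015_0001 = c(-2, 0, 2),
  q0015_0002 = c(-1, 1, NA),
  other      = c(1, 2, 3)
)

satisfaction_labels <- c(
  "Very Dissatisfied" = -2,
  "Dissatisfied" = -1,
  "Neutral" = 0,
  "Satisfied" = 1,
  "Very Satisfied" = 2
)

# Select the target columns by name prefix.
target_vars <- grep("^q0015", names(df), value = TRUE)

# One copy of the label vector per target column, named after that column.
labels_by_var <- setNames(
  rep(list(satisfaction_labels), length(target_vars)),
  target_vars
)
```

`labels_by_var` has the same shape as the list built inline above, so it could be passed as `set_value_labels(.labels = labels_by_var)`.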
set_value_labels() is designed to be applied to a data frame, not to a vector, so it cannot be used within mutate() or across().
The easiest way is probably to write your own function and call it in across():
x <- structure(list(
q0015_0001 = structure(c(
3, 5, NA, 3, 1, 2, NA, NA, 3, 4, 2, NA, 2, 2, 4, NA,
4, 3, 3, 3, 3, 2, NA, NA, 2
),
label = "Menu Options/Variety", format.spss = "F8.2", labels =
c(`Very Dissatisfied` = 1, Dissatisfied = 2, Neutral = 3, Satisfied = 4, `Very Satisfied` = 5),
class = c("haven_labelled", "vctrs_vctr", "double")
), q0015_0002 = structure(c(
4, 4, NA, 5, 3, 3,
NA, NA, 3, 4, 2, NA, 5, 2, 4, NA, 4, 3, 4, 4, 4, 4, NA, NA, 2
),
label = "Cleanliness",
format.spss = "F8.2", labels = c(`Very Dissatisfied` = 1, Dissatisfied = 2, Neutral = 3, Satisfied = 4, `Very Satisfied` = 5),
class = c("haven_labelled", "vctrs_vctr", "double")
), q0015_0003 =
structure(c(2, 2, NA, 3, 1, 2, NA, NA, 3, 4, 3, NA, 4, 3, 4, NA, 3, 2, 4, 4, 2, 2, NA, NA, 1),
label = "Taste and Quality of Food", format.spss = "F8.2", labels = c(
`Very Dissatisfied` = 1,
Dissatisfied = 2, Neutral = 3, Satisfied = 4, `Very Satisfied` = 5
), class = c(
"haven_labelled",
"vctrs_vctr", "double"
)
)
),
row.names = c(NA, -25L), class = c("tbl_df", "tbl", "data.frame"),
label = "File created by user"
)
x
#> # A tibble: 25 × 3
#> q0015_0001 q0015_0002 q0015_0003
#> <hvn_lbll> <hvn_lbll> <hvn_lbll>
#> 1 3 4 2
#> 2 5 4 2
#> 3 NA NA NA
#> 4 3 5 3
#> 5 1 3 1
#> 6 2 3 2
#> 7 NA NA NA
#> 8 NA NA NA
#> 9 3 3 3
#> 10 4 4 4
#> # … with 15 more rows
library(labelled)
library(tidyverse)
recode_satisfaction <- function(v) {
  v <- v |>
    dplyr::recode(`1` = -2, `2` = -1, `3` = 0, `4` = 1, `5` = 2)
  val_labels(v) <- c("Very Dissatisfied" = -2, "Dissatisfied" = -1, "Neutral" = 0, "Satisfied" = 1, "Very Satisfied" = 2)
  v
}
x |>
  mutate(across(starts_with("q0015"), recode_satisfaction))
#> # A tibble: 25 × 3
#> q0015_0001 q0015_0002 q0015_0003
#> <dbl+lbl> <dbl+lbl> <dbl+lbl>
#> 1 0 [Neutral] 1 [Satisfied] -1 [Dissatisfied]
#> 2 2 [Very Satisfied] 1 [Satisfied] -1 [Dissatisfied]
#> 3 NA NA NA
#> 4 0 [Neutral] 2 [Very Satisfied] 0 [Neutral]
#> 5 -2 [Very Dissatisfied] 0 [Neutral] -2 [Very Dissatisfied]
#> 6 -1 [Dissatisfied] 0 [Neutral] -1 [Dissatisfied]
#> 7 NA NA NA
#> 8 NA NA NA
#> 9 0 [Neutral] 0 [Neutral] 0 [Neutral]
#> 10 1 [Satisfied] 1 [Satisfied] 1 [Satisfied]
#> # … with 15 more rows
Created on 2022-09-23 with reprex v2.0.2
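The same pattern can be split so that only the labelling step is wrapped, leaving the recoding separate. A sketch, assuming a hypothetical wrapper `with_value_labels()` (not part of labelled) around the `val_labels<-` setter used in the answer, applied here to plain numeric stand-in columns rather than the original data:

```r
library(dplyr)
library(labelled)

# Hypothetical helper, not part of labelled: wraps the vector-level
# `val_labels<-` setter so that it can be called inside across().
with_value_labels <- function(v, labels) {
  val_labels(v) <- labels
  v
}

satisfaction_labels <- c(
  "Very Dissatisfied" = -2,
  "Dissatisfied" = -1,
  "Neutral" = 0,
  "Satisfied" = 1,
  "Very Satisfied" = 2
)

# Stand-in data, already recoded to the -2..2 scale (illustration only).
df <- tibble(
  q0015_0001 = c(-2, 0, 2, NA),
  q0015_0002 = c(-1, 1, NA, 2)
)

labelled_df <- df |>
  mutate(across(starts_with("q0015"), \(v) with_value_labels(v, satisfaction_labels)))

# Inspect the value labels attached to one of the columns.
val_labels(labelled_df$q0015_0001)
```

Keeping the labels in one named vector means the recoding and the labelling can be changed independently.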
Fair enough, that would work. It's an interesting question conceptually, though. I see why VARIABLE labels only make sense at the data frame level. For value labels, though, I think being able to apply them at the column level (and consequently across several columns) could have some value. If adding such a feature to the package is a potential future option, that would be great. If not, I think your custom function workaround makes sense.
I will explore the possibility of allowing set_value_labels() to be applied to a vector.
At this stage, I do not want to overcomplicate the list of functions in the package.
You may have a look at #127, which extends set_value_labels() and similar verbs to vectors.
Thank you! At first glance, this works nicely:
Using the reprex from the initial post:
# Using the set_value_labels from #127
test_new <- x |>
  mutate(across(starts_with("q0015"),
                ~dplyr::recode(., `1` = -2, `2` = -1, `3` = 0, `4` = 1, `5` = 2))) |>
  mutate(across(everything(),
                ~set_value_labels(., .labels = c("Very Dissatisfied" = -2, "Dissatisfied" = -1, "Neutral" = 0,
                                                 "Satisfied" = 1, "Very Satisfied" = 2))))
# Using set_value_labels from the current CRAN version
test_old <- x |>
  mutate(across(starts_with("q0015"),
                ~dplyr::recode(., `1` = -2, `2` = -1, `3` = 0, `4` = 1, `5` = 2))) |>
  labelled::set_value_labels(.labels = rep(list(c("Very Dissatisfied" = -2,
                                                  "Dissatisfied" = -1,
                                                  "Neutral" = 0,
                                                  "Satisfied" = 1,
                                                  "Very Satisfied" = 2)),
                                           x |>
                                             select(starts_with("q0015")) |>
                                             ncol()) |>
                               setNames(nm = x |>
                                          select(starts_with("q0015")) |>
                                          names()))
identical(test_new, test_old)
[1] TRUE