vctrs

C implementations of `vec_case_when()` and `vec_case_match()`

Open DavisVaughan opened this issue 3 years ago • 2 comments

And possibly vec_if_else() because it would be nice for, say, ggplot2 to be able to use this.

Consider whether we can figure out some kind of 1:1 interface that doesn't always require a list for values and haystacks, so that it can nicely replace plyr::mapvalues(): https://github.com/tidyverse/dplyr/issues/7027. The list approach is very powerful and general because it allows for 1:m and m:1 replacements, but it is not always needed.
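To make the 1:1 idea concrete, here is a minimal base R sketch of what such an interface could look like. `map_values_sketch()` and its arguments `from`, `to`, and `default` are hypothetical names chosen for illustration; they are not part of vctrs, dplyr, or plyr, and the sketch makes no claim about how a C implementation would work.

```r
# Hypothetical helper, for illustration only (not a vctrs or dplyr function).
# `from` and `to` are parallel vectors: from[i] is replaced by to[i].
map_values_sketch <- function(x, from, to, default = NA) {
  stopifnot(length(from) == length(to), length(default) == 1)
  idx <- match(x, from)           # position of each element of x in `from`, NA if absent
  out <- to[idx]                  # look up the replacement for every match
  out[is.na(idx)] <- default      # unmatched elements get the scalar default
  out
}

map_values_sketch(c("x", "y", "z"), from = c("xx", "y"), to = c("x", "yy"))
#> [1] NA   "yy" NA
```

Unlike plyr::mapvalues(), which leaves unmatched values unchanged, this sketch fills them with `default`, mirroring the `.default = NA_character_` usage elsewhere in this thread.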

DavisVaughan avatar Aug 22 '22 14:08 DavisVaughan

Is this still planned? I saw this was the proposed solution for replacing splicing with dplyr::recode(). Using recode() is slowing down my code because of the lifecycle checks, so I wondered whether I could rely on a faster vctrs implementation sometime in the future.

Cf. https://github.com/tidyverse/dplyr/issues/6623#issuecomment-1362887413

Recreating the formulas programmatically can be a bit expensive, as the benchmarks show:

# manually created
a_formula <- c("xx" ~ "x", "y" ~ "yy")

dplyr::case_match(
  c("x", "y", "z"),
  "zz" ~ "a",
  !!!a_formula,
  .default = NA_character_
)
#> [1] NA   "yy" NA

# What I have 
a_list <- c("xx" = "x", "y" = "yy")
dplyr::recode(
  c("x", "y", "z"),
  !!!a_list,
  .default = NA_character_
)
#> [1] NA   "yy" NA

# programmatically recreated
a_formula_from_list <- purrr::map2(
  names(a_list),
  unname(a_list),
  rlang::new_formula
)
dplyr::case_match(
  c("x", "y", "z"),
  "zz" ~ "a",
  !!!a_formula_from_list,
  .default = NA_character_
)
#> [1] NA   "yy" NA

bench::mark(
  recode = dplyr::recode(
    c("x", "y", "z"),
    !!!a_list,
    .default = NA_character_
  ),
  casematch_program = {
    a_formula_from_list <- purrr::map2(
      names(a_list),
      unname(a_list),
      rlang::new_formula
    )
    dplyr::case_match(
      c("x", "y", "z"),
      "zz" ~ "a",
      !!!a_formula_from_list,
      .default = NA_character_
    )
  },
  casematch_regular = dplyr::case_match(
    c("x", "y", "z"),
    "zz" ~ "a",
    !!!a_formula,
    .default = NA_character_
  )
)
#> # A tibble: 3 × 6
#>   expression            min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>       <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 recode             802µs    888µs      988.        0B     8.43
#> 2 casematch_program  366µs    385µs     2355.     1.3KB    10.4 
#> 3 casematch_regular  295µs    311µs     2860.    1.05KB    10.4
# programmatically recreating the values can become expensive

Created on 2024-05-07 with reprex v2.1.0

olivroy avatar May 08 '24 00:05 olivroy

Unfortunately, it is taking us longer than expected to find time for a vctrs release, but this is definitely still something I want to add. I think a lot of people (particularly ggplot2) would like a low-level, type-stable vec_if_else() that doesn't need dplyr.
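For context on what "type stable" buys here, a small base R sketch follows. `if_else_sketch()` is a hypothetical helper written only for illustration; it is not the vctrs implementation, and it assumes `true` and `false` already share the same class rather than doing any vctrs-style type coercion.

```r
# Base R's ifelse() drops attributes, so the Date class is lost.
d1 <- as.Date("2024-01-01")
d2 <- as.Date("2024-01-02")
class(ifelse(c(TRUE, FALSE), d1, d2))
#> [1] "numeric"

# Hypothetical helper, for illustration only (not the vctrs implementation).
if_else_sketch <- function(condition, true, false) {
  stopifnot(is.logical(condition), identical(class(true), class(false)))
  n <- length(condition)
  true <- rep(true, length.out = n)    # rep() keeps the class, e.g. for Dates
  false <- rep(false, length.out = n)
  out <- false                         # start from `false` so its type is preserved
  pick <- which(condition)             # indices where condition is TRUE (NA dropped)
  out[pick] <- true[pick]
  out[is.na(condition)] <- NA          # a missing condition gives a missing result
  out
}

res <- if_else_sketch(c(TRUE, FALSE, NA), d1, d2)
class(res)
#> [1] "Date"
```

The sketch recycles `true` and `false` with rep(), which is looser than the strict size rules one would expect from a vctrs function.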

DavisVaughan avatar May 14 '24 19:05 DavisVaughan

In fact, @DavisVaughan, I saw some discussion on the R mailing lists suggesting that a new version of the "if.else" function, along the lines of dplyr::if_else(), would be of value in base R.

jrosell avatar Sep 17 '25 14:09 jrosell

@jrosell we've actually got vec_if_else() in https://github.com/r-lib/vctrs/pull/2030 as of last week

The "atomic" path might be interesting to base R. It is hyperoptimized and absurdly fast and memory efficient compared to base R's current approach.

The "generic" path is pretty vctrs specific and the base R fallback would be different.

data.table's implementation is faster because it uses multiple threads in some cases. But note the character vector benchmark where we are faster than them. That's a case where they can't use multiple threads, which I think suggests that our implementation is a bit faster in general on a single thread.

DavisVaughan avatar Sep 17 '25 15:09 DavisVaughan

The link to the discussion: https://stat.ethz.ch/pipermail/r-devel/2025-July/084096.html

jrosell avatar Sep 17 '25 17:09 jrosell

Closed by https://github.com/r-lib/vctrs/pull/2024 and https://github.com/r-lib/vctrs/pull/2027

DavisVaughan avatar Oct 07 '25 17:10 DavisVaughan