vctrs icon indicating copy to clipboard operation
vctrs copied to clipboard

Early exit for only `TRUE` in `vec_slice()`

Open mgirlich opened this issue 2 years ago • 1 comments

Useful e.g. in vec_slice(x, !is.na(x))

library(rlang)
library(vctrs)

n <- 10e6
x <- 0L + 1:n
idx <- vec_rep(TRUE, n)

bench::mark(
  false_n = vec_slice(x, idx),
  check = FALSE,
  min_iterations = 20
)
# Before
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 false_n      29.6ms   30.6ms      28.9    76.3MB     30.3

# After
#> # A tibble: 1 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 false_n      7.39ms   7.91ms      121.    3.73KB        0

Created on 2022-07-18 by the reprex package (v2.0.1)

mgirlich avatar Jul 18 '22 12:07 mgirlich

I'm not quite sure we'd want to optimize for this case yet. I don't love early exits like these in general because they add one more special code path that we have to keep in mind whenever we make any changes in the future (even if the performance is nice)

I think a better pattern for your example is

if (vec_any_missing(x)) {
  x <- vec_slice(x, !vec_equal_na(x))
}

Then you don't need the early exit because you only do "work" when you know there is at least 1 missing value

DavisVaughan avatar Jul 18 '22 12:07 DavisVaughan

Closing for the reasons explained in https://github.com/r-lib/vctrs/pull/1636#issuecomment-1252300105

lionel- avatar Sep 20 '22 12:09 lionel-