vctrs icon indicating copy to clipboard operation
vctrs copied to clipboard

fast path for `vec_slice(x, <vector of only TRUE>)`?

Open mgirlich opened this issue 3 years ago • 3 comments

I'm not sure how often this is actually relevant in real life but it might be worth to have a "fast path" in vec_slice() when the i is a logical vector where every element is TRUE.

vec_slice2 <- function(x, i) {
  if (is.logical(i) && all(i)) {
    return(x)
  }
  
  vec_slice(x, i)
}

This would avoid copying all of x and therefore be much faster and memory friendly. At first it might seem unlikely to have all TRUE but I can see two likely scenarios:

  • remove "edge case" values like NA, e.g. vec_slice(out, !vec_equal_na(vals)),
  • looping over medium sized elements of a list and filtering each element, e.g.
library(purrr)
library(vctrs)

map(
  list(1:10, 11:20, 21:30),
  ~ vec_slice(.x, .x <= 28)
)

Having a quick look at the source code the check might be relatively cheap to do in lgl_as_location() resp. in r_lgl_which().

mgirlich avatar Jun 22 '21 12:06 mgirlich

Your second bullet point is similar to what list_compact() would do right?

DavisVaughan avatar Jun 22 '21 13:06 DavisVaughan

I added an example to the second bullet point. I really meant looping over a list and slicing every element of a list under some (rare) condition. Though, now that I think about it I'm not sure how much more efficient this is as the result is a new list anyway.

mgirlich avatar Jun 22 '21 14:06 mgirlich

I think this makes sense, and the x[!is.na(x)] example is convincing.

Given the modularity of the internals, it might be best to implement this via the compact subscripts. The C-level vec_as_location() would return a compact sequence of the form 1:n that vec_slice() could then recognise to exit early. Since vec_as_location() is exposed on the R side, this would have to be implemented via ALTREP.

lionel- avatar Jun 22 '21 14:06 lionel-

Implemented in #1408 but we decided to close.

lionel- avatar Oct 03 '22 10:10 lionel-