WIP: tibble_reconstruct(), tibble_row_slice(), tibble_col_modify()

Open krlmlr opened this issue 3 years ago • 1 comments

For #890.

This PR uses the new functions tibble_reconstruct(), tibble_row_slice() and tibble_col_modify(), which are (almost) identical to the data frame methods of the corresponding dplyr_*() generics. I hope I have caught all the places where these operations should happen.

From there I see two paths:

  1. Call the dplyr methods from tibble if they are available.
  2. Define new tibble_*() generics that will be called from dplyr's default method.

It seems that dplyr_col_modify() is inefficient, because its assignment loop looks up each column by name on every iteration. In tibble, we already know the numeric indices, because we check thoroughly beforehand. I will try to gauge the performance impact so that we can make an informed decision.
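To make the name-versus-index point concrete, here is a minimal base R sketch. Both helpers (modify_by_name() and modify_by_index()) are hypothetical illustrations and are not the actual dplyr or tibble code; the sketch only shows that assigning by precomputed position gives the same result as assigning by name, while resolving the names once up front.

```r
# Hypothetical helper: assign columns by name, so each iteration
# has to resolve the name against names(data).
modify_by_name <- function(data, cols) {
  for (nm in names(cols)) {
    data[[nm]] <- cols[[nm]]
  }
  data
}

# Hypothetical helper: assign columns by position, assuming the
# caller has already validated and resolved the indices.
modify_by_index <- function(data, idx, cols) {
  for (i in seq_along(idx)) {
    data[[idx[[i]]]] <- cols[[i]]
  }
  data
}

df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
new_cols <- list(b = c(10, 20, 30), c = c("x", "y", "z"))

by_name <- modify_by_name(df, new_cols)

# Resolve the names to positions once, outside the assignment loop.
idx <- match(names(new_cols), names(df))
by_index <- modify_by_index(df, idx, new_cols)

# Both routes should produce the same data frame.
print(identical(by_name, by_index))
```

Whether the saved lookups matter in practice depends on the number of columns and on the other work done per assignment, which is what a benchmark would need to show.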

@jennybc @DavisVaughan @lionel-: Is this something that should be discussed in a tidyup?

krlmlr, Jul 31 '21 16:07

Unfortunately, this is very slow: I see a slowdown by a factor of 3 for [<- and by up to 5 for [[<-. How much slowdown is acceptable if we add this hook?

https://rpubs.com/krlmlr/tibble-reconstruct-benchmark

krlmlr, Aug 01 '21 03:08