vctrs icon indicating copy to clipboard operation
vctrs copied to clipboard

Implement `vec_get()`

Open DavisVaughan opened this issue 4 years ago • 4 comments

  • Uses vec_slice() tools, except for non-shaped lists, where we actually extract out the inner elements.
  • Names are not kept. These include atomic names, data frame row names, and matrix row names
  • Data frames and matrices are extracted row wise
  • Uses vec_as_position(), which slows things down a good bit right now because it is not written in C
  • df_slice() was altered so that the restoration / slicing of row names happens outside of that function. This way we can use it in vec_get() and correctly drop row names.
library(vctrs)

vec_get <- vctrs:::vec_get

vec_get(1, 1)
#> [1] 1

# no names
vec_get(c(a = 1), 1)
#> [1] 1

# no row names, but it is row wise
df <- data.frame(x = 1:2, y = c("a", "b"), row.names = c("r1", "r2"))

vec_get(df, 1)
#>   x y
#> 1 1 a

# list extraction is recursive
lst <- as.list(df)
lst
#> $x
#> [1] 1 2
#> 
#> $y
#> [1] a b
#> Levels: a b

vec_get(lst, 1)
#> [1] 1 2

vec_get(lst, 2)
#> [1] a b
#> Levels: a b

# falls back to `[[` for S3 objects with no proxy
vec_get(factor(c("x", "y", "z")), 1)
#> [1] x
#> Levels: x y z

# row wise on matrices, no row names, but keeps other names
mat <- matrix(1:2, nrow = 2, ncol = 2, dimnames = list(c("r1", "r2"), c("c1", "c2")))
mat
#>    c1 c2
#> r1  1  1
#> r2  2  2

vec_get(mat, 2)
#>      c1 c2
#> [1,]  2  2

Created on 2019-10-19 by the reprex package (v0.3.0)

DavisVaughan avatar Oct 19 '19 20:10 DavisVaughan

~Not quite right because extra attributes should not be kept~

> vctrs::vec_get(structure(1, extra = "x", names = "x"), 1)
[1] 1
attr(,"extra")
[1] "x"

I take it back, this behavior is right. It should only strip names

DavisVaughan avatar Dec 02 '19 16:12 DavisVaughan

With vec_get() and the extension, vec_chop2(), it looks like this dplyr section where columns are sliced could be reduced to:

https://github.com/tidyverse/dplyr/blob/057b453428f0139dd013ee4db247d1a1ca51032f/R/tbl-df.r#L38-L49

resolve_chunks <- if (inherits(data, "rowwise_df")) {
  function(index) vec_chop2(.subset2(data, index), rows)
} else {
  function(index) vec_chop(.subset2(data, index), rows)
}

DavisVaughan avatar Jan 17 '20 14:01 DavisVaughan

Should rename to vec_slice2(), and add vec_chop2()

DavisVaughan avatar Jan 17 '20 14:01 DavisVaughan

Idea from @lionel-: vec_chop() always keeps names on the inner vectors, vec_chop2() moves the names from the inner vectors to the outer list. This is consistent with vec_slice(), which keeps names, and vec_slice2() which should drop them from the resulting vector.

Even if indices were provided in vec_chop2() we could still keep names on the outer list because each index will be restricted to only select 1 element. So we could have vec_chop2(x, indices = list(1, 1, 2)) for example, but not vec_chop2(x, indices = list(c(1, 1), 2))

DavisVaughan avatar May 28 '20 16:05 DavisVaughan

Will be vec_slice2(), like in #1228

DavisVaughan avatar Sep 13 '22 16:09 DavisVaughan