vctrs
`list_ungroup()` to invert `vec_group_loc()`?
It is currently slightly awkward to "undo" a `vec_group_loc()` or `vec_split()` call. You can do it with `list_unchop()`, but it requires splitting the key first.
library(vctrs)
set.seed(123)
x <- sample(1:4, size = 20, replace = TRUE)
x
#> [1] 3 3 3 2 3 2 2 2 3 1 4 2 2 1 2 3 4 1 3 3
# Or `vec_split()` potentially, if we are splitting by something else
locs <- vec_group_loc(x)
locs
#> key loc
#> 1 3 1, 2, 3, 5, 9, 16, 19, 20
#> 2 2 4, 6, 7, 8, 12, 13, 15
#> 3 1 10, 14, 18
#> 4 4 11, 17
# Chopping here is awkward
list_unchop(vec_chop(locs$key), indices = locs$loc)
#> [1] 3 3 3 2 3 2 2 2 3 1 4 2 2 1 2 3 4 1 3 3
We could reintroduce `vec_unchop(<vector>, <list-of-indices>)` to do this, but I think the "missing piece" is really a way to flatten out that `loc` column from a list of location vectors that point into the original `x` to a single location vector that points into the new `key`.
# Should be fairly fast to build this at the C level?
# Probably some checks on `x` to make sure every element is an integer vector
# and that no element exceeds `sum(list_sizes(x))`. May also want to remove
# `0` values ahead of time?
list_ungroup <- function(x) {
  out <- vec_init(integer(), n = sum(list_sizes(x)))
  for (i in seq_along(x)) {
    out <- vec_assign(out, x[[i]], i)
  }
  out
}
list_ungroup(locs$loc)
#> [1] 1 1 1 2 1 2 2 2 1 3 4 2 2 3 2 1 4 3 1 1
vec_slice(locs$key, list_ungroup(locs$loc))
#> [1] 3 3 3 2 3 2 2 2 3 1 4 2 2 1 2 3 4 1 3 3
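For comparison, here is a rough loop-free sketch of the same idea at the R level, with one possible version of the input checks mentioned in the comments above. The name `list_ungroup_vectorized()` is hypothetical and not part of vctrs, and the "every location appears exactly once" rule is an assumption on my part; it is stricter than just dropping `0` values.

```r
library(vctrs)

# Hypothetical helper (not a vctrs function): flatten a list of location
# vectors into one integer vector of group positions.
list_ungroup_vectorized <- function(x) {
  sizes <- list_sizes(x)
  n <- sum(sizes)

  # Combine all locations into a single integer vector
  locs <- list_unchop(x, ptype = integer())

  # Assumed validity rule: each location in 1..n appears exactly once
  if (anyNA(locs) || any(locs < 1L | locs > n) || anyDuplicated(locs) > 0L) {
    stop("`x` must contain each location from 1 to `sum(list_sizes(x))` exactly once.")
  }

  # Element `locs[[j]]` of the output gets the index of the group it came from
  out <- integer(n)
  out[locs] <- vec_rep_each(seq_along(x), times = sizes)
  out
}

x <- c(3L, 3L, 2L, 3L, 1L, 2L)
locs <- vec_group_loc(x)

ids <- list_ungroup_vectorized(locs$loc)
ids
#> [1] 1 1 2 1 3 2

vec_slice(locs$key, ids)
#> [1] 3 3 2 3 1 2
```

This still allocates the combined `locs` vector, so a C-level implementation would presumably be cheaper, but it shows that the operation boils down to a single scatter assignment.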
@DavisVaughan `list_ungroup` seems very specifically about reversing `vec_group_loc`. ~~What if instead of trying to reverse `vec_group_loc` with a new function, a third column was built into the result of `vec_group_loc` which could be flattened directly (e.g. `id` to match the terminology of `vec_group_id`)? (This would also resolve what I was looking for in #1857).~~
I don't think there's a sensible way to include `vec_group_id` in the `vec_group_loc` data frame since the structure is inherently different.
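To illustrate the structural difference, here is a small sketch on toy data of my own (not from the original post). `vec_group_loc()` returns one row per group, while `vec_group_id()` returns one element per observation, so the ids cannot sit alongside `key` and `loc` as an ordinary column. As far as I can tell, the flattened `loc` column carries the same values as the ids, since both number groups by first appearance.

```r
library(vctrs)

x <- c(3L, 3L, 2L, 3L, 1L, 2L)

locs <- vec_group_loc(x)
ids <- vec_group_id(x)

# One row per group versus one element per observation
nrow(locs)
#> [1] 3
length(ids)
#> [1] 6

# Flatten `loc` by scattering each group's row index to its locations
flat <- integer(length(x))
flat[list_unchop(locs$loc)] <- vec_rep_each(seq_len(nrow(locs)), times = list_sizes(locs$loc))

# The flattened locations line up with the group ids
all(flat == ids)
#> [1] TRUE
```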