vctrs
vctrs copied to clipboard
Why is `vec_assign()` for lists much faster than `vec_slice<-`?
vec_assign()
needs less memory and is way faster for lists than vec_slice()
library(rlang)
library(vctrs)
x_org <- as.list(0L + 1:100e3)
x2 <- 0L + 1:1000L
bench::mark(
vec_slice = {
x <- x_org
vec_slice(x, 1:1000) <- list(x2)
x
},
vec_assign = {
x <- x_org
x <- vec_assign(x, 1:1000, list(x2))
x
},
iterations = 20
)
#> # A tibble: 2 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 vec_slice 1.95ms 2.09ms 469. 1.57MB 0
#> 2 vec_assign 870.63µs 1.16ms 867. 798.34KB 0
x_org <- 0L + 1:10e6
bench::mark(
vec_slice = {
y <- x_org
vec_slice(y, 1:1000) <- 2L
y
},
vec_assign = {
x <- x_org
x <- vec_assign(x, 1:1000, 2L)
x
},
iterations = 20
)
#> # A tibble: 2 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 vec_slice 4.99ms 5.22ms 157. 38.2MB 84.6
#> 2 vec_assign 4.75ms 4.98ms 198. 38.2MB 198.
Created on 2021-06-25 by the reprex package (v2.0.0)
I've fought with this before, and I'm pretty sure it is an R limitation with how custom <-
functions work. It is very difficult to show this, but I'm pretty sure its the internal way <-
functions are handled that makes the extra copy, and isn't related to anything we are doing. Because of some interaction with RStudio's viewer pane, you have to wrap up calls to profmem()
in a function, but here is one way to see this:
library(rlang)
library(vctrs)
fn_slice_assign <- function(x, i, value) {
profmem::profmem(vec_slice(x, i) <- value)
}
fn_assign <- function(x, i, value) {
profmem::profmem(vec_assign(x, i, value))
}
x <- as.list(0L + 1:100e3)
i <- 1:1000 + 0L
value <- list(-1L)
fn_slice_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_slice(x, i) <- value
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 800048 <internal>
#> 2 alloc 8048 vec_slice<-()
#> 3 alloc 800048 vec_slice<-()
#> total 1608144
fn_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_assign(x, i, value)
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 8048 vec_assign()
#> 2 alloc 800048 vec_assign()
#> total 808096
x <- 0L + 1:100e3
i <- 1:1000 + 0L
value <- -1L
fn_slice_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_slice(x, i) <- value
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 4048 vec_slice<-()
#> 2 alloc 400048 vec_slice<-()
#> total 404096
fn_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_assign(x, i, value)
#>
#> Memory allocations:
#> Number of 'new page' entries not displayed: 1
#> what bytes calls
#> 2 alloc 4048 vec_assign()
#> 3 alloc 400048 vec_assign()
#> total 404096
See the <internal>
copy? That's R copying the input list before we even get access to it in vec_slice<-
. I think the reason that this doesn't happen with simpler atomic types has something to do with ALTREP shallow duplication, which if I remember right lists currently can't do.
My default instinct these days is to always use vec_assign()
so I don't have to think about fighting this
Thanks for the explanation! I
So, is there any real use case for vec_slice<-
? If not, it might make sense to deprecate it.
Otherwise, a note in the documentation of vec_slice()
would be great.
Good catch, need to avoid using in tibble too.
I think the reason that this doesn't happen with simpler atomic types has something to do with ALTREP shallow duplication, which if I remember right lists currently can't do.
I think you're right. The terminology is a bit confusing because shallow duplication of lists already exists, but unlike ALTREP shallow duplication which creates a wrapper around another SEXP, shallow duplication of lists causes a copy of the whole pointer array.
However I think we could do better. The reference count inside vec_slice<-
is 2. There is one reference for the object in the calling environment (the fresh shadow duplicate) and another in vec_slice<-
's own execution environment. Since we own the latter, I think we can ignore that reference and go ahead with mutation. For atomic vectors we would have to duplicate the vector pointed to by the altrep wrapper as we do now. For lists however we could mutate without duplication since they have just been copied in the process and are only referenced in our calling environment. This would solve the performance difference.
@lionel- you mean vec_slice<-
, right?
oh yes
I don't think the reference count inside vec_slice<-
is always 2. In the somewhat rare case that it is called like vec_slice<-(x, i, value)
then we don't get that copy.
Can we differentiate between these two cases?
library(rlang)
library(vctrs)
fn_slice_assign <- function(x, i, value) {
profmem::profmem(vec_slice(x, i) <- value)
}
fn_slice_assign2 <- function(x, i, value) {
profmem::profmem(`vec_slice<-`(x, i, value))
}
x <- as.list(0L + 1:100e3)
i <- 1:1000 + 0L
value <- list(-1L)
fn_slice_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_slice(x, i) <- value
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 800048 <internal>
#> 2 alloc 8048 vec_slice<-()
#> 3 alloc 800048 vec_slice<-()
#> total 1608144
fn_slice_assign2(x, i, value)
#> Rprofmem memory profiling of:
#> `vec_slice<-`(x, i, value)
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 8048 vec_slice<-()
#> 2 alloc 800048 vec_slice<-()
#> total 808096
Created on 2021-08-20 by the reprex package (v2.0.0.9000)
I don't think the reference count inside vec_slice<- is always 2.
I see a refcount of 2:
`foo<-` <- function(x, value) .Internal(inspect(x))
x <- list(NULL)
foo(x) <- NULL
#> @7fee125cb380 19 VECSXP g0c1 [REF(2)] (len=1, tl=0)
#> @7fee1600d4e0 00 NILSXP g1c0 [MARK,REF(65535)]
x <- list(NULL)
`foo<-`(x, NULL)
#> @7fee125ac7c0 19 VECSXP g0c1 [REF(2)] (len=1, tl=0)
#> @7fee1600d4e0 00 NILSXP g1c0 [MARK,REF(65535)]
This make sense since we have the reference in the calling environment and the reference in vec_slice<-
's exec env. I think the profmem result you're seeing is because the replacement logic of the REPL isn't called.
But your point stands, if we see a refcount of 2 in that case, we technically shouldn't mutate. However I think this sort of usage for replacement functions is frown upon by R core. We could just say it's unsupported and vec_assign()
should be used. I think having the side effect in direct calls is reasonable because there is a <-
in the name of the function.
Perhaps you could tell the difference between the two usages by checking for the *tmp*
binding in the caller env, though that may not be a 100% sure heuristic. oh actually we can check for the sys.call()
, it looks like this:
`foo<-`(`*tmp*`, value = NULL)
I’m with @mgirlich — given the complexity of vec_slice<-
why not just deprecate in favour of vec_assign
? It’s not like the interface is that much worse.
In @mgirlich's example, there is exactly one copy that must be made in all cases (a fixed vec_slice<-
and vec_assign()
). That's because there are two references to the vector in the global environment to avoid initialisation to be counted in the benchmark. However the more common case is to initialise and then assign directly.
x <- vec_init(int(), 3)
for (i in 1:3) {
vec_slice(x, i) <- i
}
In this case there will be only one reference and we can avoid multiple expensive copies by using the replacement function.
I think we should fix the replacement function and add a warning to vec_assign()
that it should not generally be used. Since vctrs code can't be compiled, vec_assign()
will never be able to be as efficient as its replacement variant.
So you think we can imitate the efficient [<-
behavior when inside a function doing repeated assignments like this? That would be cool.
library(vctrs)
library(rlang)
test <- function() {
x <- vec_init(int(), 1e6)
for (i in 1:3) {
vec_slice(x, i) <- i
}
x
}
test2 <- function() {
x <- vec_init(int(), 1e6)
for (i in 1:3) {
x[i] <- i
}
x
}
profmem::profmem(test())
#> Rprofmem memory profiling of:
#> test()
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 4000048 test() -> vec_init()
#> 2 alloc 4000048 test() -> vec_slice<-()
#> 3 alloc 4000048 test() -> vec_slice<-()
#> 4 alloc 4000048 test() -> vec_slice<-()
#> total 16000192
profmem::profmem(test2())
#> Rprofmem memory profiling of:
#> test2()
#>
#> Memory allocations:
#> what bytes calls
#> 1 alloc 4000048 test2() -> vec_init()
#> total 4000048
Created on 2021-08-23 by the reprex package (v2.0.0.9000)
Could help in pivot_wider()
(this is in a loop)
https://github.com/tidyverse/tidyr/blob/2f735ad361e9be40f58549ab4787a9b68efba552/R/pivot-wide.R#L259-L266
I think so. IIUC vec_slice<-
will always get a REF(2) input, at least in recent R versions. Either because R copied it before calling the replacement function, or because the input is a REF(1).