vctrs Why is `vec_assign()` for lists much faster than `vec

vec_assign() needs less memory and is way faster for lists than vec_slice()

library(rlang)
library(vctrs)

x_org <- as.list(0L + 1:100e3)
x2 <- 0L + 1:1000L

bench::mark(
  vec_slice = {
    x <- x_org
    vec_slice(x, 1:1000) <- list(x2)
    x
  },
  vec_assign = {
    x <- x_org
    x <- vec_assign(x, 1:1000, list(x2))
    x
  },
  iterations = 20
)
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 vec_slice    1.95ms   2.09ms      469.    1.57MB        0
#> 2 vec_assign 870.63µs   1.16ms      867.  798.34KB        0

x_org <- 0L + 1:10e6
bench::mark(
  vec_slice = {
    y <- x_org
    vec_slice(y, 1:1000) <- 2L
    y
  },
  vec_assign = {
    x <- x_org
    x <- vec_assign(x, 1:1000, 2L)
    x
  },
  iterations = 20
)
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 vec_slice    4.99ms   5.22ms      157.    38.2MB     84.6
#> 2 vec_assign   4.75ms   4.98ms      198.    38.2MB    198.

^{Created on 2021-06-25 by the reprex package (v2.0.0)}

Jun 25 '21 07:06 mgirlich

I've fought with this before, and I'm pretty sure it is an R limitation with how custom <- functions work. It is very difficult to show this, but I'm pretty sure its the internal way <- functions are handled that makes the extra copy, and isn't related to anything we are doing. Because of some interaction with RStudio's viewer pane, you have to wrap up calls to profmem() in a function, but here is one way to see this:

library(rlang)
library(vctrs)

fn_slice_assign <- function(x, i, value) {
  profmem::profmem(vec_slice(x, i) <- value)
}
fn_assign <- function(x, i, value) {
  profmem::profmem(vec_assign(x, i, value))
}

x <- as.list(0L + 1:100e3)
i <- 1:1000 + 0L
value <- list(-1L)

fn_slice_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_slice(x, i) <- value
#> 
#> Memory allocations:
#>        what   bytes         calls
#> 1     alloc  800048    <internal>
#> 2     alloc    8048 vec_slice<-()
#> 3     alloc  800048 vec_slice<-()
#> total       1608144

fn_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_assign(x, i, value)
#> 
#> Memory allocations:
#>        what  bytes        calls
#> 1     alloc   8048 vec_assign()
#> 2     alloc 800048 vec_assign()
#> total       808096

x <- 0L + 1:100e3
i <- 1:1000 + 0L
value <- -1L

fn_slice_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_slice(x, i) <- value
#> 
#> Memory allocations:
#>        what  bytes         calls
#> 1     alloc   4048 vec_slice<-()
#> 2     alloc 400048 vec_slice<-()
#> total       404096
fn_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_assign(x, i, value)
#> 
#> Memory allocations:
#> Number of 'new page' entries not displayed: 1
#>        what  bytes        calls
#> 2     alloc   4048 vec_assign()
#> 3     alloc 400048 vec_assign()
#> total       404096

See the <internal> copy? That's R copying the input list before we even get access to it in vec_slice<-. I think the reason that this doesn't happen with simpler atomic types has something to do with ALTREP shallow duplication, which if I remember right lists currently can't do.

My default instinct these days is to always use vec_assign() so I don't have to think about fighting this

Jun 28 '21 13:06 DavisVaughan

Thanks for the explanation! I

So, is there any real use case for vec_slice<-? If not, it might make sense to deprecate it. Otherwise, a note in the documentation of vec_slice() would be great.

Jun 29 '21 05:06 mgirlich

Good catch, need to avoid using in tibble too.

Jul 20 '21 08:07 krlmlr

I think the reason that this doesn't happen with simpler atomic types has something to do with ALTREP shallow duplication, which if I remember right lists currently can't do.

I think you're right. The terminology is a bit confusing because shallow duplication of lists already exists, but unlike ALTREP shallow duplication which creates a wrapper around another SEXP, shallow duplication of lists causes a copy of the whole pointer array.

However I think we could do better. The reference count inside vec_slice<- is 2. There is one reference for the object in the calling environment (the fresh shadow duplicate) and another in vec_slice<-'s own execution environment. Since we own the latter, I think we can ignore that reference and go ahead with mutation. For atomic vectors we would have to duplicate the vector pointed to by the altrep wrapper as we do now. For lists however we could mutate without duplication since they have just been copied in the process and are only referenced in our calling environment. This would solve the performance difference.

Aug 20 '21 10:08 lionel-

@lionel- you mean vec_slice<-, right?

Aug 20 '21 11:08 hadley

oh yes

Aug 20 '21 12:08 lionel-

I don't think the reference count inside vec_slice<- is always 2. In the somewhat rare case that it is called like vec_slice<-(x, i, value) then we don't get that copy.

Can we differentiate between these two cases?

library(rlang)
library(vctrs)

fn_slice_assign <- function(x, i, value) {
  profmem::profmem(vec_slice(x, i) <- value)
}
fn_slice_assign2 <- function(x, i, value) {
  profmem::profmem(`vec_slice<-`(x, i, value))
}

x <- as.list(0L + 1:100e3)
i <- 1:1000 + 0L
value <- list(-1L)

fn_slice_assign(x, i, value)
#> Rprofmem memory profiling of:
#> vec_slice(x, i) <- value
#> 
#> Memory allocations:
#>        what   bytes         calls
#> 1     alloc  800048    <internal>
#> 2     alloc    8048 vec_slice<-()
#> 3     alloc  800048 vec_slice<-()
#> total       1608144

fn_slice_assign2(x, i, value)
#> Rprofmem memory profiling of:
#> `vec_slice<-`(x, i, value)
#> 
#> Memory allocations:
#>        what  bytes         calls
#> 1     alloc   8048 vec_slice<-()
#> 2     alloc 800048 vec_slice<-()
#> total       808096

^{Created on 2021-08-20 by the reprex package (v2.0.0.9000)}

Aug 20 '21 16:08 DavisVaughan

I don't think the reference count inside vec_slice<- is always 2.

I see a refcount of 2:

`foo<-` <- function(x, value) .Internal(inspect(x))

x <- list(NULL)
foo(x) <- NULL
#> @7fee125cb380 19 VECSXP g0c1 [REF(2)] (len=1, tl=0)
#>   @7fee1600d4e0 00 NILSXP g1c0 [MARK,REF(65535)]

x <- list(NULL)
`foo<-`(x, NULL)
#> @7fee125ac7c0 19 VECSXP g0c1 [REF(2)] (len=1, tl=0)
#>   @7fee1600d4e0 00 NILSXP g1c0 [MARK,REF(65535)]

This make sense since we have the reference in the calling environment and the reference in vec_slice<-'s exec env. I think the profmem result you're seeing is because the replacement logic of the REPL isn't called.

But your point stands, if we see a refcount of 2 in that case, we technically shouldn't mutate. However I think this sort of usage for replacement functions is frown upon by R core. We could just say it's unsupported and vec_assign() should be used. I think having the side effect in direct calls is reasonable because there is a <- in the name of the function.

Perhaps you could tell the difference between the two usages by checking for the *tmp* binding in the caller env, though that may not be a 100% sure heuristic. oh actually we can check for the sys.call(), it looks like this:

`foo<-`(`*tmp*`, value = NULL)

Aug 21 '21 04:08 lionel-

I’m with @mgirlich — given the complexity of vec_slice<- why not just deprecate in favour of vec_assign? It’s not like the interface is that much worse.

Aug 22 '21 15:08 hadley

In @mgirlich's example, there is exactly one copy that must be made in all cases (a fixed vec_slice<- and vec_assign()). That's because there are two references to the vector in the global environment to avoid initialisation to be counted in the benchmark. However the more common case is to initialise and then assign directly.

x <- vec_init(int(), 3)
for (i in 1:3) {
  vec_slice(x, i) <- i
}

In this case there will be only one reference and we can avoid multiple expensive copies by using the replacement function.

I think we should fix the replacement function and add a warning to vec_assign() that it should not generally be used. Since vctrs code can't be compiled, vec_assign() will never be able to be as efficient as its replacement variant.

Aug 23 '21 05:08 lionel-

So you think we can imitate the efficient [<- behavior when inside a function doing repeated assignments like this? That would be cool.

library(vctrs)
library(rlang)

test <- function() {
  x <- vec_init(int(), 1e6)
  for (i in 1:3) {
    vec_slice(x, i) <- i
  }
  x
}

test2 <- function() {
  x <- vec_init(int(), 1e6)
  for (i in 1:3) {
    x[i] <- i
  }
  x
}

profmem::profmem(test())
#> Rprofmem memory profiling of:
#> test()
#> 
#> Memory allocations:
#>        what    bytes                   calls
#> 1     alloc  4000048    test() -> vec_init()
#> 2     alloc  4000048 test() -> vec_slice<-()
#> 3     alloc  4000048 test() -> vec_slice<-()
#> 4     alloc  4000048 test() -> vec_slice<-()
#> total       16000192

profmem::profmem(test2())
#> Rprofmem memory profiling of:
#> test2()
#> 
#> Memory allocations:
#>        what   bytes                 calls
#> 1     alloc 4000048 test2() -> vec_init()
#> total       4000048

^{Created on 2021-08-23 by the reprex package (v2.0.0.9000)}

Could help in pivot_wider() (this is in a loop) https://github.com/tidyverse/tidyr/blob/2f735ad361e9be40f58549ab4787a9b68efba552/R/pivot-wide.R#L259-L266

Aug 23 '21 14:08 DavisVaughan

I think so. IIUC vec_slice<- will always get a REF(2) input, at least in recent R versions. Either because R copied it before calling the replacement function, or because the input is a REF(1).

Aug 23 '21 14:08 lionel-

vctrs
vctrs copied to clipboard

Why is `vec_assign()` for lists much faster than `vec_slice<-`?

vctrs vctrs copied to clipboard

Why is `vec_assign()` for lists much faster than `vec_slice<-`?

vctrs
vctrs copied to clipboard