Modifying large matrix field is slow in R6
A large-ish matrix is part of an R6 object as a public or private field, and an update_matrix method allows updating a certain cell of the matrix. However, this operation is very slow compared to updating a normal matrix object outside R6, on the order of milliseconds instead of nanoseconds. Reprex (for this example, the matrix is called NBA, and has customers x features):
library(bench)
library(ggplot2)
library(tidyr)
# Create NBA matrix, sparse
no_customers <- 1e6
no_features <- 30
NBA_matrix <-
  matrix(
    sample(c(rep(0, 1000), 1), size = no_customers * no_features, replace = TRUE),
    nrow = no_customers,
    ncol = no_features
  )
# Create NBA_like R6 object with matrix
library(R6)
NBA_lite <- R6Class("NBA_lite", class = FALSE, portable = FALSE, cloneable = FALSE,
  public = list(
    mm = NULL,
    initialize = function(input_matrix) self$mm <- input_matrix,
    get_matrix = function() self$mm,
    modify = function(row, col, value) self$mm[row,col] <- value
  )
)
new_NBA_lite <- NBA_lite$new(input_matrix = NBA_matrix)
# Benchmark modifying single value, matrix vs R6 field
results <- bench::mark(matrix = NBA_matrix[234123, 10] <- 2,
                       R6_method = new_NBA_lite$modify(row = 234123, col = 10, value = 2),
                       R6_field = new_NBA_lite$mm[234123, 10] <- 2)
expression | median | n_gc |
---|---|---|
NBA_matrix[234123, 10] <- 2 | 779ns | 0 |
new_NBA_lite$modify(row = 234123, col = 10, value = 2) | 126ms | 1 |
new_NBA_lite$mm[234123, 10] <- 2 | 127ms | 1 |
I'm wondering about this overhead of > 100ms. Looking at the R6 performance vignette, there seems to be something else at play here. Does garbage collection add this overhead, and is it necessary?
Thanks in advance
Since you're using portable=F, you can actually speed it up by using mm[row,col] <<- value instead of self$mm[row,col] <- value.
For example, here's a modified version of your code:
library(bench)
# Create NBA matrix, sparse
no_customers <- 1e6
no_features <- 30
NBA_matrix <-
  matrix(
    sample(c(rep(0, 1000), 1), size = no_customers * no_features, replace = TRUE),
    nrow = no_customers,
    ncol = no_features
  )
# Create NBA_like R6 object with matrix
library(R6)
NBA_lite <- R6Class("NBA_lite", class = FALSE, portable = FALSE, cloneable = FALSE,
  public = list(
    mm = NULL,
    initialize = function(input_matrix) self$mm <- input_matrix,
    get_matrix = function() self$mm,
    modify = function(row, col, value) self$mm[row,col] <- value,
    modify2 = function(row, col, value) mm[row,col] <<- value
  )
)
new_NBA_lite <- NBA_lite$new(input_matrix = NBA_matrix)
results <- bench::mark(matrix = NBA_matrix[234123, 10] <- 2,
                       R6_method = new_NBA_lite$modify(row = 234123, col = 10, value = 2),
                       R6_method2 = new_NBA_lite$modify2(row = 234123, col = 10, value = 2),
                       R6_field = new_NBA_lite$mm[234123, 10] <- 2)
results
Here's the result:
# A tibble: 4 x 13
expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
<bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
1 matrix 689ns 816ns 1117403. 229MB 0 10000 0 8.95ms
2 R6_method 36.32ms 41.57ms 24.2 229MB 40.3 3 5 124.1ms
3 R6_method2 1.27µs 1.66µs 574883. 0B 0 10000 0 17.39ms
4 R6_field 37.5ms 43.55ms 22.1 229MB 27.7 4 5 180.79ms
# … with 4 more variables: result <list>, memory <list>, time <list>, gc <list>
The reason this speeds things up is how the <- and <<- operators work in R.
When you do something like self$x <- y, that actually gets turned into something like this:
`*tmp*` <- self
self <- "$<-"(`*tmp*`, x, value = y)
rm(`*tmp*`)
This creates *tmp*, which initially points to the same object in memory as self. However, when the replacement in the second line modifies a large field (as in self$mm[row,col] <- value), R treats the field as shared, makes a copy of it, and modifies the copy. The old version of the object needs to be GC'd (garbage collected) later, and that takes time. On the other hand, when you use x <<- y, that replaces x directly in place, without making a copy.
See here for more info about subset assignment: https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Subset-assignment
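As a minimal sketch of the expansion described in that manual section, the snippet below performs one subset assignment directly and then spells out the same steps by hand on a second vector. The variable names x and z are only illustrative; the `*tmp*` name and the `[<-` replacement function are the ones the R Language Definition uses.

```r
# Direct subset assignment.
x <- c(1, 2, 3)
x[2] <- 10

# The same assignment spelled out by hand, following the expansion
# given in the R Language Definition.
z <- c(1, 2, 3)
`*tmp*` <- z
z <- `[<-`(`*tmp*`, 2, value = 10)
rm(`*tmp*`)

# Both forms leave the vector in the same state.
stopifnot(identical(x, z))
print(z)
```

The hand-written form makes the temporary binding visible: while `*tmp*` exists, the data is reachable under a second name.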
Here's another example that illustrates the difference:
local({
  self <- environment()
  x <- 0
  y <- 0
  bench::mark(
    self$x <- x + 1,
    y <<- y + 1,
    iterations = 1e5
  )
})
#> # A tibble: 2 x 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
#> <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#> 1 self$x <- x + 1 490ns 595ns 1457224. 0B 14.6 99999 1 68.6ms
#> 2 y <<- y + 1 130ns 145ns 5800715. 0B 0 100000 0 17.2ms
#> # … with 4 more variables: result <list>, memory <list>, time <list>, gc <list>
I think that, in your case, the performance penalty is even greater since you're using double subset assignment, with both $ and [.
If there's only a single reference to a vector and you change a value inside of it, R will modify the data structure in place. However, if there are multiple references to it, R needs to make a copy when doing the assignment. The *tmp* step that R performs adds another reference to the object, so a copy gets made. The copies take time to create, and they take time to GC later on.
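The copy-on-modify rule can be seen in a small base R sketch. The names a and b are only illustrative, and tracemem() reports copies only when R was built with memory profiling, so that call is guarded with capabilities("profmem").

```r
a <- c(1, 2, 3)
b <- a   # a and b now refer to the same vector in memory

# Where supported, ask R to report when this vector gets duplicated.
if (capabilities("profmem")) invisible(tracemem(a))

# Two references exist, so R copies the vector before changing b.
b[1] <- 100

if (capabilities("profmem")) untracemem(a)

# a is untouched; only the copy bound to b was modified.
stopifnot(identical(a, c(1, 2, 3)))
stopifnot(identical(b, c(100, 2, 3)))
```

With a three-element vector the copy is cheap; with the 1e6 x 30 matrix from the reprex, every such copy duplicates the whole matrix.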
I'm leaving this issue open so that it reminds us to document this behavior.
Hi Winston, Thank you very much for providing this insight into both R6 and R subset assignment as a whole - it works well on my end in a larger system as a whole. You might just have vindicated using R in production for my company ;) Best regards,
@johanneswaage Great to hear!
For future reference, here's a comparison of:
- Using self vs. <<- in the assignment
- Setting a single value vs. using subset (indexed) assignment
The short story is that there's a very small cost to using self when assigning to a single value, but when doing subset assignment, it can be expensive if the object is large (and a copy of the whole thing needs to be made).
new_obj <- function() {
  self <- environment()
  scalar <- 0
  vector <- 1:1e6
  list(
    set_scalar_noself = function(x) {
      scalar <<- x
    },
    set_scalar_self = function(x) {
      self$scalar <- x
    },
    set_vector_noself = function(i, x) {
      vector[i] <<- x
    },
    set_vector_self = function(i, x) {
      self$vector[i] <- x
    }
  )
}
obj <- new_obj()
microbenchmark::microbenchmark(
  obj$set_scalar_noself(12345),
  obj$set_scalar_self(12345),
  obj$set_vector_noself(1000, 12345),
  obj$set_vector_self(1000, 12345)
)
#> Unit: nanoseconds
#> expr min lq mean median uq max neval
#> obj$set_scalar_noself(12345) 520 669.5 1771.58 955 2608.0 11003 100
#> obj$set_scalar_self(12345) 977 1309.5 2727.64 1990 3440.5 8831 100
#> obj$set_vector_noself(1000, 12345) 1058 1321.0 2591.27 2040 3004.5 11394 100
#> obj$set_vector_self(1000, 12345) 991117 1326866.5 3412537.55 2477184 4884621.5 10733476 100
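The two setter styles differ in cost, not in result. The sketch below is a cut-down variant of the closure object above, with a hypothetical get() accessor and a field renamed to values (both are additions for illustration, not part of the original benchmark). It checks that both styles leave the object in the same state.

```r
# Hypothetical, smaller variant of new_obj() with a getter added.
new_small_obj <- function() {
  self <- environment()
  values <- numeric(10)
  list(
    set_noself = function(i, x) values[i] <<- x,      # superassignment
    set_self   = function(i, x) self$values[i] <- x,  # assignment via self
    get        = function() values
  )
}

a <- new_small_obj()
b <- new_small_obj()
a$set_noself(3, 42)
b$set_self(3, 42)

# Same final state either way; only the work done to get there differs.
stopifnot(identical(a$get(), b$get()))
```

Since the outcome is the same, switching a hot-path setter from the self form to the <<- form should be a safe change in non-portable classes or closure-based objects like this one.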