
strange performance issue, non-linear increase in runtime

Open o- opened this issue 5 years ago • 3 comments

The execution times of the following very simple program increase linearly in GNU R, but explode in FastR. On my machine the 10 results for GNU R range from 0.04s to 0.4s, whereas on FastR they are 4s, 19s, 44s, and then a timeout.

execute <- function(n) {
  x <- list()
  for (i in 1:n) {
    x[[i]] <- i
  }
}
i = 100000
while (i <= 1000000) {
  print(system.time(execute(i)))
  i = i+100000
}

o-, Mar 06 '20 08:03

Hello,

Thank you for reporting this issue. The problem is x[[i]] <- i, which re-allocates the list on every iteration (growing it by one element). Somehow GNU-R handles this much faster. We will look into how to fix this. One possibility is to grow the list by more than one element when we detect such a code pattern. For now, you can work around this by pre-allocating the list yourself (which I would recommend even for GNU-R), like this:

execute <- function(n) {
  x <- vector(mode = "list", length = n)
  for (i in 1:n) {
    x[[i]] <- i
  }
}
i = 100000
while (i <= 1000000) {
  print(system.time(execute(i)))
  i = i+100000
}

With this R code, I get the following on FastR:

   user  system elapsed 
  0.080   0.003   0.084 
   user  system elapsed 
  0.070   0.007   0.078 
   user  system elapsed 
  0.070  -0.008   0.066 
   user  system elapsed 
  0.000   0.009   0.030 
   user  system elapsed 
  0.010  -0.003   0.008 
   user  system elapsed 
  0.010  -0.002   0.008 
   user  system elapsed 
   0.01    0.00    0.01 
   user  system elapsed 
  0.000   0.012   0.012 
   user  system elapsed 
  0.010   0.003   0.012 
   user  system elapsed 
  0.020  -0.006   0.014 

For comparison, this is GNU-R with the same R code on the same machine:

   user  system elapsed 
  0.029   0.004   0.033 
   user  system elapsed 
  0.056   0.000   0.056 
   user  system elapsed 
  0.123   0.000   0.123 
   user  system elapsed 
  0.108   0.012   0.120 
   user  system elapsed 
   0.13    0.00    0.13 
   user  system elapsed 
  0.175   0.013   0.187 
   user  system elapsed 
  0.271   0.000   0.270 
   user  system elapsed 
  0.191   0.028   0.219 
   user  system elapsed 
  0.340   0.008   0.348 
   user  system elapsed 
  0.378   0.023   0.402 
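
For code that cannot know n up front, the "grow by more than one element" idea mentioned above can also be applied by hand. The following is only a minimal sketch: execute_doubling and the starting capacity of 16 are hypothetical choices for illustration, not how FastR or GNU-R work internally, and it has not been timed on FastR.

```r
# Hypothetical helper: fill a list of unknown final size while
# re-allocating only when the current capacity is exhausted.
execute_doubling <- function(n) {
  capacity <- 16L                                # arbitrary starting capacity (assumption)
  x <- vector(mode = "list", length = capacity)
  for (i in seq_len(n)) {
    if (i > capacity) {
      capacity <- capacity * 2L                  # double instead of growing by one
      length(x) <- capacity                      # single re-allocation, padded with NULL
    }
    x[[i]] <- i
  }
  length(x) <- n                                 # trim the unused slots
  x
}
```

With doubling, the number of re-allocations grows with log2(n) rather than with n, so the total copying work stays proportional to n.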

steve-s, Mar 25 '20 09:03

> For now, you can fix this by pre-allocating this list yourself (which is what I would recommend doing even for GNU-R) like this:

Thanks for the suggestion. The code snippet is not actually something I use, and I am aware that growing by one is probably at fault because it constantly re-allocates the vector (though I don't know how to debug this in FastR to know for sure).
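
One rough way to gather evidence without a profiler is to estimate how runtime scales with n. The sketch below fits a line to log(time) against log(n); growth_exponent and measure_exponent are hypothetical helper names, and this only indicates the scaling, not its cause. A slope near 1 means linear growth, while a slope near 2 is what one would expect if every append copies the whole list. The 4s, 19s, 44s reported above are roughly in a 1:4:9 ratio, which points toward the latter.

```r
# The function under test, as posted in this issue.
execute <- function(n) {
  x <- list()
  for (i in 1:n) {
    x[[i]] <- i
  }
}

# Hypothetical helper: estimate k in time ~ n^k via a log-log least-squares fit.
growth_exponent <- function(sizes, seconds) {
  keep <- seconds > 0                  # log(0) is -Inf, so drop zero timings
  if (sum(keep) < 2) return(NA_real_)  # need at least two points for a slope
  fit <- lm(log(seconds[keep]) ~ log(sizes[keep]))
  unname(coef(fit)[2])
}

# Hypothetical helper: time f at each size and return the fitted exponent.
measure_exponent <- function(f, sizes) {
  seconds <- vapply(sizes,
                    function(n) system.time(f(n))[["elapsed"]],
                    numeric(1))
  growth_exponent(sizes, seconds)
}
```

For example, measure_exponent(execute, seq(100000, 300000, by = 100000)) would cover the three sizes that finished on FastR in the original report.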

I just found it to be a fairly striking performance oddity and wanted to let you know. Are you generally interested in such snippets?

o-, Mar 25 '20 13:03

> though I don't know how to debug this in FastR to know for sure

Actually, we have a really great tool for this: VisualVM supports CPU and memory tracing of GraalVM languages like FastR.

[Screenshot: VisualVM memory tracing]

> I just found it to be a fairly striking performance oddity and wanted to let you know. are you generally interested in such snippets?

Yes, it is very useful. I was not aware that GNU-R is probably optimized for this code pattern. We must match all such optimizations, because people will use these code patterns if they work fine on GNU-R.

steve-s, Mar 25 '20 14:03