Wishlist-for-R
WISH: Rprof() Improvements
Background
Rprof() dumps the R call stack at timed intervals into a text file. The text file can then be processed to reconstruct/estimate how much time was spent under different call stacks.
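As a rough illustration of that processing step, the sketch below tallies identical call stacks from a few hand-written lines in the default Rprof() output layout (a `sample.interval=` header in microseconds, then one sample per line with the innermost call first). The sample lines are made up for illustration, and the header may look different when memory or line profiling is enabled.

```r
# Hypothetical excerpt of an Rprof() output file (not real measurements)
prof_lines <- c(
  'sample.interval=20000',
  '"runif" "FUN" "lapply" ',
  '"runif" "FUN" "lapply" ',
  '"sample.int" "sample" "FUN" "lapply" '
)

# The header gives the sampling interval in microseconds
interval_sec <- as.numeric(
  sub("sample.interval=", "", prof_lines[1], fixed = TRUE)
) / 1e6

# Every remaining line is one sampled call stack, innermost call first
stacks <- trimws(prof_lines[-1])

# Estimated time per distinct stack = number of samples * interval
stack_counts <- table(stacks)
est_time <- setNames(as.vector(stack_counts) * interval_sec, names(stack_counts))
print(est_time)
```

summaryRprof() performs a similar aggregation per function rather than per full stack.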
Possible Improvements
Provide Function Disambiguation
For example, in the following:
> my_fun2 <- function(x) {sample(x); x}
> my_fun <- function(x) {runif(x); x}
> Rprof()
> x <- lapply(rep(1e6, 20), my_fun)
> y <- lapply(rep(1e6, 20), my_fun2)
> Rprof(NULL)
> summaryRprof('Rprof.out')
$by.self
             self.time self.pct total.time total.pct
"runif"           0.96    59.26       0.96     59.26
"sample.int"      0.64    39.51       0.64     39.51
"sample"          0.02     1.23       0.66     40.74

$by.total
             total.time total.pct self.time self.pct
"FUN"              1.62    100.00      0.00     0.00
"lapply"           1.62    100.00      0.00     0.00
"runif"            0.96     59.26      0.96    59.26
"sample"           0.66     40.74      0.02     1.23
"sample.int"       0.64     39.51      0.64    39.51

$sample.interval
[1] 0.02

$sampling.time
[1] 1.62
The symbol FUN refers to both my_fun and my_fun2, so the profile cannot tell them apart. An option to provide a uniquely identifying string would be very useful. For example, since byte-compilation is now enabled by default, we could reference those functions by bytecode address:
> my_fun2
function(x) {sample(x); x}
<bytecode: 0x7fcb8a2d4b10>
> my_fun
function(x) {runif(x); x}
<bytecode: 0x7fcb88491520>
For functions that are part of packages, ideally this could be used to look up their original symbol (the method for doing so is TBD).
This should greatly improve the usefulness of graph-based profile views.
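To make the idea concrete, here is a minimal sketch of how a bytecode address could serve as an identifier and be mapped back to a symbol. Rprof() does not emit such identifiers today; `bytecode_id()` and `lookup_symbol()` are hypothetical helpers that simply scrape the `<bytecode: ...>` line from the printed function, and the address is only meaningful within a single R session.

```r
library(compiler)

# Compile explicitly so the bytecode exists without waiting for the JIT
my_fun  <- cmpfun(function(x) {runif(x); x})
my_fun2 <- cmpfun(function(x) {sample(x); x})

# Hypothetical helper: extract the address from the "<bytecode: 0x...>" line
# that print() shows for a byte-compiled closure
bytecode_id <- function(f) {
  printed <- capture.output(print(f))
  hit <- grep("^<bytecode: ", printed, value = TRUE)
  if (length(hit) == 0L) return(NA_character_)
  sub("^<bytecode: (.*)>$", "\\1", hit[1])
}

# Table of identifiers for the functions we care about
ids <- c(my_fun = bytecode_id(my_fun), my_fun2 = bytecode_id(my_fun2))

# Hypothetical reverse lookup: identifier -> symbol
lookup_symbol <- function(id) names(ids)[match(id, ids)]

print(ids)
print(lookup_symbol(ids[["my_fun2"]]))
```

If profile samples carried these identifiers in place of FUN, a post-processing step along these lines could relabel each frame with the function it actually refers to.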
Better Support for Fast Function Looping
One strategy, as implemented in treeprof, is to loop an expression repeatedly so that, through random sampling, even very fast expressions can be comprehensively profiled.
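The looping strategy can be sketched roughly as follows. This is not treeprof's actual implementation; `profile_looped()` is a hypothetical helper, and the repetition count and sampling interval are arbitrary choices for illustration.

```r
# Hypothetical helper: evaluate a fast expression many times under Rprof()
# so that the sampling profiler has a chance to observe it
profile_looped <- function(expr, n_reps = 1e4, interval = 0.01) {
  expr <- substitute(expr)          # capture the expression unevaluated
  env <- parent.frame()             # evaluate it where the caller lives
  prof_file <- tempfile(fileext = ".out")
  Rprof(prof_file, interval = interval)
  on.exit(Rprof(NULL), add = TRUE)  # stop profiling even if expr errors
  # Note: eval() adds its own frames to every sampled stack
  for (i in seq_len(n_reps)) eval(expr, env)
  prof_file                         # path to the raw profile
}

prof_file <- profile_looped(sort(runif(100)), n_reps = 2000)
# summaryRprof(prof_file) can then be used, provided samples were captured
```

The raw file contains samples from all iterations mixed together, with nothing to indicate where one iteration ends and the next begins.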
It would be very useful to be able to mark the beginning or end of each loop (e.g. for, *pply, etc.) with some arbitrary text written to the file. Linear displays of profiles would then become much more useful, as repeated code evaluations could be easily aggregated.
The main problem right now is there is no way to guarantee that a particular stack state gets dumped, so there is no way to insert an artificial marker.
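If such markers existed, aggregation could look something like the sketch below. Everything about the marker is an assumption: Rprof() writes no marker lines today, `#loop-start` is an invented token, and the sample lines are made up for illustration.

```r
# HYPOTHETICAL profile with an invented "#loop-start" marker per iteration
prof_lines <- c(
  '#loop-start',
  '"runif" "my_fun" ',
  '"my_fun" ',
  '#loop-start',
  '"runif" "my_fun" ',
  '#loop-start',
  '"runif" "my_fun" ',
  '"runif" "my_fun" '
)

is_marker <- prof_lines == "#loop-start"

# Running count of markers gives the iteration each line belongs to
iteration <- cumsum(is_marker)

samples <- trimws(prof_lines[!is_marker])
iter_of_sample <- iteration[!is_marker]

# Number of samples captured in each iteration
samples_per_iter <- table(iter_of_sample)

# Per-stack totals, aggregated across all iterations
stack_totals <- table(samples)

print(samples_per_iter)
print(stack_totals)
```

With iteration boundaries known, a linear display could collapse the repeated evaluations into one aggregated view instead of showing each repetition separately.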