merge.xts (aka cbind) seems very slow

Open DarrenCook opened this issue 10 years ago • 1 comments

For an xts with 770K rows, this takes about 40s (at 100% CPU, though memory use does not grow):

lag(x$High, k = -(0:12))

It is implemented as:

return(do.call("merge.xts", lapply(k, lag.xts, x = x, na.pad = na.pad, ...)))

If I run just the lapply call, it takes 0.023s. The merge.xts call then takes 73.7s (*).

Does that seem reasonable, or could merge.xts have a bug?

*: not sure why running the function components separately took twice as long - I seem to have got the same result.
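
For anyone wanting to reproduce the split timing, here is a minimal sketch. It assumes a small synthetic single-column series in place of the original 770K-row data (which is not shown in this thread), so the absolute numbers will differ; only the pattern of timing the lapply step and the merge step separately is the point.

```r
library(xts)

# Synthetic stand-in for the original data (assumption: the real object is
# not shown in this thread, so the size and values here are made up)
n <- 1e4
x <- .xts(cbind(High = as.numeric(seq_len(n))), seq_len(n))
k <- -(0:12)

# Step 1: build the list of lagged series, one per value of k
t_lag <- system.time(
  lagged <- lapply(k, lag.xts, x = x$High, na.pad = TRUE)
)

# Step 2: merge the list into a single multi-column object
t_merge <- system.time(
  merged <- do.call(merge, lagged)
)

print(t_lag[["elapsed"]])
print(t_merge[["elapsed"]])
```

Timing each step with its own system.time call makes it clear which of the two dominates on a given xts version.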

DarrenCook, Jul 23 '15 15:07

This is expected. merge.xts is recursive, so it's adding each column one at a time. You can work around it in this case, since you know the index is the same for all the objects you're merging.

x <- .xts(1:1e5, 1:1e5)
xl <- lapply(1:12, lag.xts, x=x)
system.time(xlm <- do.call(merge, xl))
#   user  system elapsed 
#  4.344   0.000   4.342 

# call cbind on the matrix coredata, and create an xts object from the result
ml <- lapply(xl, coredata)
system.time(mlm <- xts(do.call(cbind, ml), index(xl[[1]])))
#   user  system elapsed 
#  0.008   0.000   0.008

dimnames(xlm) <- NULL  # remove colnames added by merge.xts
# check coredata, since attributes may be in different order
identical(coredata(xlm), coredata(mlm))
# [1] TRUE
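
The workaround above can also be wrapped in a small helper. This is only a sketch: `cbind_same_index` is a hypothetical name, not an xts function, and the guard assumes that "same index" means `identical()` index vectors. It stops with an error rather than silently misaligning rows when that assumption does not hold.

```r
library(xts)

# Hypothetical helper (not part of xts): column-bind xts objects that are
# known to share one index, skipping merge's index alignment entirely.
cbind_same_index <- function(xts_list) {
  idx <- index(xts_list[[1]])
  # Guard: the shortcut is only valid when every object has the same index
  same <- vapply(xts_list, function(obj) identical(index(obj), idx), logical(1))
  stopifnot(all(same))
  # Bind the raw matrices, then rebuild a single xts on the shared index
  xts(do.call(cbind, lapply(xts_list, coredata)), order.by = idx)
}

x <- .xts(1:1e5, 1:1e5)
xl <- lapply(1:12, lag.xts, x = x)
fast <- cbind_same_index(xl)
dim(fast)
```

If the objects do not share an index, merge is still the right tool, since it aligns rows by time.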

joshuaulrich, Jul 23 '15 16:07

This was fixed in d2ab0f799cf0bf9fdad240c6522214fd9db6b19c, as part of #248. It's now well under 100ms for me.

R$ library(xts)
   x <- .xts(1:1e5, 1:1e5)
   xl <- lapply(1:12, lag.xts, x=x)
   system.time(xlm <- do.call(merge, xl))
   user  system elapsed
  0.013   0.017   0.029

joshuaulrich, Nov 03 '22 22:11