Parallelize txnsim()
Prompted by this blog post referencing the txnsim() function, we started chatting again about parallelizing it, so I'm creating an issue for the work with some notes from the discussion... mostly pointers from Brian.
txnsim constructs random trades... the time spent is actually in the generation of the random draws... the actual P&L calculations in blotter are all vectorized and fairly trivial
so the time in txnsim is actually in constructing the draws
you would gain some advantage in breaking that up for a large simulation, e.g. have each core work on 1/ncores of the draws, but then you would probably still have blotter mark them all the way it does now
we use lapply to build the list of trades
could switch to parLapply
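
A rough sketch of that swap, outside of blotter itself; generate_trades() and n_reps are stand-ins for whatever txnsim actually builds, not its real internals. Note that parLapply() already hands each worker roughly 1/ncores of the elements, which is the "each core works on part of the draws" point above.

```r
library(parallel)

## stand-in for constructing one set of random trade draws (not txnsim internals)
generate_trades <- function(i) rnorm(1e5)
n_reps <- 1000

cl <- makeCluster(max(1L, detectCores() - 1L))
clusterExport(cl, varlist = "generate_trades")

## serial version:   trade_list <- lapply(seq_len(n_reps), generate_trades)
## parallel version: parLapply() splits the indices into one chunk per worker
trade_list <- parLapply(cl, seq_len(n_reps), generate_trades)

stopCluster(cl)
```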
as much as I prefer the flexibility of foreach, it probably doesn't make sense to rewrite the lapply as a foreach loop, though that would work too... it is a more complicated refactoring
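
For reference, the same loop written with foreach/%dopar%, again with illustrative stand-ins; this is the "more complicated refactoring" because it pulls in foreach plus a registered backend such as doParallel.

```r
library(foreach)
library(doParallel)

registerDoParallel(cores = max(1L, parallel::detectCores() - 1L))

## each iteration builds one set of random trade draws (stand-in body)
trade_list <- foreach(i = seq_len(1000)) %dopar% {
  rnorm(1e5)
}

stopImplicitCluster()
```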
we also call replicate on the inner loops... that would actually be a better place to refactor as a foreach loop
hah! Richard McElreath already has a parallel replicate in the code for Statistical Rethinking
or could use mclapply, which would only use one core on Windows, but that would serve them right for running an operating system that doesn't support forking (of processes)
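
A hedged sketch of what a parallel replicate() can look like, in the spirit of McElreath's version: capture the expression the same way base replicate() does and hand the iterations to mclapply(). The mc_replicate() name and the Windows fallback to one core are illustrative choices here, not anything in blotter.

```r
library(parallel)

## drop-in "parallel replicate": same expression-capturing trick as base
## replicate(), but the iterations run in forked workers via mclapply()
mc_replicate <- function(n, expr,
                         mc.cores = if (.Platform$OS.type == "windows") 1L
                                    else max(1L, detectCores() - 1L)) {
  simplify2array(
    mclapply(seq_len(n),
             eval.parent(substitute(function(i, ...) expr)),
             mc.cores = mc.cores)
  )
}

## usage: replace replicate(n, <draw>) on the inner loop with mc_replicate()
RNGkind("L'Ecuyer-CMRG")   # reproducible RNG streams across forked workers
set.seed(42)
draws <- mc_replicate(8, mean(rnorm(1e5)))
```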
Roger Peng's book 'R Programming for Data Science' also talks about parallel replicate(). The book is worth owning in print, but the whole thing is online here:
https://bookdown.org/rdpeng/rprogdatascience/