Draft: parallelize match computation in pycbc_brute_bank by shrinking multiple templates
Opening this here for discussion. I'd like to parallelize pycbc_brute_bank further wherever possible. The idea is to shrink multiple waveforms at once, as many as there are parallel processes.
However, it isn't as fast as I expected; it's actually much slower than the serial computation. I suspect this is due to resource contention among the processes.
I'd also like to explore the consequences of returning from inside one of the multiprocessing pools.
@yi-fan-wang I'm not sure this approach will work. I think the most straightforward option is simply to parallelize over the proposals themselves and assume that within each proposal set there isn't much overlap.
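A minimal sketch of parallelizing over the proposals, to make the suggestion concrete. It is not pycbc code: `toy_match`, `best_match`, `filter_batch`, and the `pool_factory` parameter are hypothetical names, and templates are reduced to plain floats. Each worker scores one proposal against the whole current bank, and proposals in the same batch are never compared with each other, which is exactly the "not much overlap" assumption.

```python
import multiprocessing
from functools import partial


def toy_match(a, b):
    """Hypothetical stand-in for an expensive match; 1.0 means identical."""
    return 1.0 / (1.0 + abs(a - b))


def best_match(bank, proposal):
    """Highest match of one proposal against every template in the bank."""
    return max((toy_match(proposal, t) for t in bank), default=0.0)


def filter_batch(bank, batch, threshold=0.5, processes=2,
                 pool_factory=multiprocessing.Pool):
    """Return the proposals in `batch` that the current bank does not cover.

    One task per proposal. Proposals within the batch are NOT checked
    against each other (assumed to overlap little).
    """
    with pool_factory(processes) as pool:
        scores = pool.map(partial(best_match, bank), batch)
    return [p for p, s in zip(batch, scores) if s < threshold]
```

Each task here is a full proposal-versus-bank comparison, so the work per task is larger than when a single match computation is split up, which should make the pool overhead relatively less important.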
@yi-fan-wang should this be closed?
Leaving some notes here for future development: for some unknown reason the parallelization doesn't work as I wanted. It may be that opening and closing the multiprocessing pool too frequently costs too much time. So it may be worth trying to do everything inside a single multiprocessing pool.
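A rough sketch of the single-pool idea, under stated assumptions: `toy_match`, `max_match`, `build_bank`, and `pool_factory` are hypothetical names rather than anything in pycbc, and templates are plain floats. The pool is created once and reused for every proposal instead of being opened and closed per match computation.

```python
import multiprocessing


def toy_match(pair):
    """Hypothetical stand-in for an expensive match; 1.0 means identical."""
    proposal, template = pair
    return 1.0 / (1.0 + abs(proposal - template))


def max_match(pool, proposal, bank):
    """Best match of one proposal against the bank, computed via `pool.map`."""
    if not bank:
        return 0.0
    return max(pool.map(toy_match, [(proposal, t) for t in bank]))


def build_bank(proposals, threshold=0.5, processes=2,
               pool_factory=multiprocessing.Pool):
    """Accept a proposal only if no template already in the bank matches it."""
    bank = []
    # The pool is opened once here and shared by every iteration below.
    with pool_factory(processes) as pool:
        for proposal in proposals:
            if max_match(pool, proposal, bank) < threshold:
                bank.append(proposal)
    return bank
```

This only removes the cost of repeatedly starting and stopping worker processes. Arguments and results are still pickled on every `map` call, so it may not be enough on its own if the slowdown comes from data transfer or resource contention.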