Optimizing calls to time_delay_from_earth_center
The call to time_delay_from_earth_center in pycbc_multi_inspiral is inserted in a triple loop on ifos, sky positions and time slides, but the time delay depends only on the IFO and the sky position.
Specifically, https://github.com/gwastro/pycbc/blob/master/bin/pycbc_multi_inspiral#L223-L224 are executed N_IFO * N_skypoints * N_slides times rather than N_IFO * N_skypoints times. This needs to be handled differently so that we end up with N_IFO * N_skypoints calls only and the timeslides are folded in separately.
This one is for Sebastian Gomez Lopez (@sebastiangomezlopez)
@sebastiangomezlopez, can you please compare https://github.com/pannarale/pycbc/tree/time_delay_optimization to the current code in terms of
- the plot with run time vs number of slides, and
- the profiling of the specific job you have for the current code?
Before the change vs after the change for a single point search! Thanks, @sebastiangomezlopez
It may be beneficial (in terms of memory and/or computing time, though maybe not code clarity) to switch from a nested dict data structure to a 3-dimensional Numpy array for storing quantites that depend on (detector, position, slide).
If you do, I suggest writing a comment that explains the intent of each dimension of the 3D array clearly.
We will try that and add a third curve to the performance plot.
It should be possible to call time_delay_from_earth_center once and have it compute N_IFO * N_skypoints quantities. I think it could take vectorized input (might need some poking).
If that doesn't work, this could be cythonized, but I think it should be able to take vectorized inputs.
... But it would be good to post profile graphs here (I highly recommend gprof2dot (https://github.com/jrfonseca/gprof2dot) for making these).
This has been fixed in #4710.