pycbc

Cache calls to FrOpen(XX).cache to avoid calculate_psd being incredibly slow

Open spxiwh opened this issue 1 year ago • 3 comments

PROBLEM BEING SOLVED HERE: The calculate_psd jobs are normally run with a large list (a week of data) of frame files. Every time a PSD segment is calculated, a call to the from_cli frame reading code is made. As this only reads frame files at the specific times, this shouldn't be an issue, right? Well, not quite: we don't assert the LVK frame file naming convention, so we explicitly check what times each frame file covers. That means we need to do a quick file stat operation on every frame file, for every PSD segment. If you're doing an OSG run, you're likely pointing to frame files in a CVMFS location, which basically means that there's also a transfer process going on in the background, as we need the entire frame file before we can read parts of it.

HOW ARE WE SOLVING IT: Cache the output of the FrStreamOpen command. calculate_psd is split over multiple cores, and so this is cached for each of the multiprocessing processes, but that's not really a problem. Having each process do this O(50) times for all the frame files in a chunk over CVMFS is a problem.
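
To illustrate the idea (this is only a sketch, not the code in this PR): the per-file scan can be memoized so that each process pays for it once, no matter how many PSD segments it handles. Everything below is hypothetical. `scan_frame_files` stands in for the expensive FrStreamOpen-style step and returns made-up time spans, and `functools.lru_cache` is used here simply as one convenient way to cache in Python.

```python
from functools import lru_cache

# Counts how many times the expensive per-file scan actually runs.
scan_calls = 0


def scan_frame_files(frame_paths):
    """Hypothetical stand-in for the expensive step: inspecting every
    frame file to find out which times it covers."""
    global scan_calls
    scan_calls += 1
    # Placeholder result: give each file a made-up 4096 s (start, end) span.
    return {path: (i * 4096, (i + 1) * 4096)
            for i, path in enumerate(frame_paths)}


@lru_cache(maxsize=None)
def cached_scan(frame_paths):
    """Memoized wrapper. The argument must be hashable, so pass a tuple."""
    return scan_frame_files(frame_paths)


def files_for_segment(frame_paths, start, end):
    """Return the files overlapping [start, end), reusing the cached scan."""
    spans = cached_scan(tuple(frame_paths))
    return [p for p, (s, e) in spans.items() if s < end and e > start]


# 50 PSD segments over the same list of 100 (made-up) frame files.
frames = [f"frame_{i}.gwf" for i in range(100)]
for seg in range(50):
    files_for_segment(frames, seg * 512, (seg + 1) * 512)
```

In this sketch the scan runs once for all 50 segments instead of 50 times. The cache lives in process memory, so each multiprocessing worker builds its own copy, which matches the per-process caching described above.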

I want to run this in a CVMFS workflow and have had some issues with authentication etc., which should now be resolved, but this is up for review anyway in case there are comments or objections.

spxiwh, Jun 27 '23 20:06

@spxiwh The CC issues look pretty minor, so they should be straightforward to address. Go ahead and merge when you are happy.

ahnitz, Jun 27 '23 21:06

Anything holding this up? (i.e. could/should it be merged before the new release?)

GarethCabournDavies, Oct 12 '23 10:10

I was seeing some weird failures when running with this. I don't think they were actually caused by this, but I've not been able to verify this at scale.

.... It is probably more of an issue when this code is being run with frame files that are not on a fast disk (i.e. running on older data).

In any case, I've not verified this in large-scale runs, as I would like to before merging this.

spxiwh, Oct 12 '23 10:10