pycbc
pycbc copied to clipboard
Moving em_progenitors.py and ns_sequence data from tmpltbank to new neutron_stars package
The functions within tmpltbank/em_progenitors.py serve purposes beyond template banks, for instance, they are used quite integrally in pygrb where the remnant mass function is used to separate potentially EM-bright injections from EM-dim injections. Here, I have moved the remnant_mass function to conversions.py so it may be used more broadly, and moved the rest of the em_progenitors.py functions to the new ns_functions.py (name subject to change) within a new package under the name neutron_stars. This package also now houses the ns_sequence data once stored in tmpltbank since it is more relevant to the functions stored here. The aim of this change is to allow for these functions to be used more broadly under a more appropriately named package.
@a-r-williamson @SamuelH-97 @pannarale @jakeb245 @nathan.ormsby Hi Samuel (and possibly Andrew). Jacob Buchanan and Nathan Ormsby would like to help with the injection part of the PyGRB workflow if possible. We've taken a look through this PR and the linked PR (not all the code in detail), and would like to know if there is a specific issue or problem we could tackle. It looks like the major problem/issue is editing the code to make it mergeable with the upstream branch? If you can just give us one thing to do, we'll start diving into the code and take it on!
Starting to look at this again. It seems codeclimate has picked up a few genuine issues, for example there is a reference to an undefined variable, and some docstrings have backslashes which are being interpreted as escape sequences. Could someone take a look at those things?
Starting to look at this again. It seems codeclimate has picked up a few genuine issues, for example there is a reference to an undefined variable, and some docstrings have backslashes which are being interpreted as escape sequences. Could someone take a look at those things?
Thanks for the heads up. The undefined variable was an error on my part in a recent change and has bee resolved.
As for the backslashes, they seem to be a part of the equation formatting for automated documentation. I'm unsure if he has the time to check these but perhaps @a-r-williamson could provide some wisdom here. I'm somewhat unfamiliar with these changes to the docstrings and how they may affect both the everyday running of the code and the documentation
@SamuelH-97 For the docstrings I think you just need to add the "r" prefix: https://developer.lsst.io/python/numpydoc.html#docstrings-must-be-delimited-by-triple-double-quotes
which should not mess up the latex. In this case I think it's okay as there's nothing special in here, but try and use \tan
and you'll probably spot the problem!!
Or double the backslashes, \
-> \\
I took another deep look through, and left some trivial comments and a more substantial one. I think those will be my last comments.
@spxiwh recommends moving the large pickle file to the pycbc-config repository. I agree the file is a bit large to keep here. Will look into that.
Some notes for posterity.
Francesco and I spent some time looking into why the ISSO interpolant is needed, since it adds quite a bit of complexity. We reproduced an earlier test from Andrew, which indeed indicates that generating the injections with the exact calculation takes ~10 min, versus just a few seconds using the interpolant. This, however, seems to contradict my own timing of the ISSO solver, which indicates that the exact calculation takes just about half a millisecond for a single point, and I struggle to believe the interpolant would be significantly faster than that.
Bottom line: I propose we modify this PR to only include the exact ISSO calculation, and leave adding the interpolant as a smaller PR later, if the extra time is found to be really unacceptable. Meanwhile, I will do some profiling to understand where the time is coming from, and see if we can find a solution that does not involve carrying around a ~1 Mb precalculated file.
I have been poking at the ISSO code a bit more, so here are a few more observations.
First, I found that PG_ISSO_solver()
is already quite fast, however the RectBivariateSpline
is faster by orders of magnitude. Since pycbc_create_injections
seems to invoke the constraint function millions and millions of times, using the spline does indeed reduce the execution time from several minutes (or even almost one hour with 100000 injections) to just a few seconds. So using the spline is indeed beneficial, though probably not immediately essential, as I am told that PyGRB creates the injections as a Condor job anyway.
Next, I spent some time looking for a way to make the spline pickle file significantly smaller by making the interpolation grid coarser. In doing so, I realized that PG_ISSO_solver()
does not produce a sensible result when the spin magnitude is exactly 1:
This problem seems to be greatly reduced by removing the bracketing used in the third root finding in
PG_ISSO_solver()
, although the curve I obtain does still have a weird kink. I will not request that this issue is fixed in this PR, but we should fix it soon after this is merged.
Anyway, after making that change, I found that I can reduce the spline pickle file to ~15 Kb only, and still obtain an interpolation better than 1e-5 almost everywhere (note that the spline proposed in the PR does not satisfy the condition that the error is below 1e-5 everywhere, contrary to what the docstring of generate_isso_bivariate_interp()
claims).
Conclusion of this second note: I still propose we modify this PR to only include the exact ISSO calculation (I will look into this myself next). In one or two followup PRs, we will then address the high-spin problem in PG_ISSO_solver()
and bring in a smaller version of the interpolating spline.
It seems I cannot open a PR against this PR, but the change I propose is on this branch: https://github.com/titodalcanton/pycbc/tree/embright_remove_isso_interp.