Finalizing eScience contributions
Since our time is running out, we thought it would be useful to have an overview of our remaining PRs, separated into essentials and optionals.
I think our main priority should be to get the essentials merged and have a tag from which to start the final runs for the paper.
Essentials
| name | PR | purpose | status |
|---|---|---|---|
| Fk refactor | #1936 | ~3.5x speedup | merged |
| Parallel hyperoptimization with MongoDB | #1921 | parallelisation in trials -> ~3.5x speedup | merged |
| Hyperopt loss | #1726 | implementation of multiple losses | merged |
| Make weight initialization reproducible | #1923 | testing, consistency | merged |
| Hyperopt runcard | #1986 | (not an eScience contribution, but essential to start runs, so thought I'd add it here) | merged |
Optional, if time allows
| name | PR | purpose | status |
|---|---|---|---|
| Avoid idle GPU | #1939 | ~25% speedup | ready for review |
| Avoiding duplicated computations by having a single observable model | #1855 | ~30% speedup | needs feedback |
| Implementation of hyperopt model selection | #1976 | automate, integrate final selection | in progress |
I would honestly skip all the optionals (other than the reproducible weight initialisation, maybe) and focus instead on polishing the stuff that's already there*. For instance, the problems with tf 2.16 / python 3.12 will only grow as keras 3 becomes the standard. It is important to make sure that things like multidense / multireplica fits are robust and that we don't "lose them" when keras 3.1 comes out.
(and, as with anything that touches the under-the-hood tensorflow, like all the Meta-whatever stuff, it will have many chances to break)
*actually, I would say the weight initialization is part of this polishing
This is now finished; the only item still to be merged (though otherwise complete) is https://github.com/NNPDF/nnpdf/pull/1976