Finalizing eScience contributions

Since our time is running out, we thought it would be useful to have an overview of our remaining PRs, separated into essentials and optionals.

I think our main priority should be to get the essentials merged and have a tag from which to start the final runs for the paper.

Essentials

name	PR	purpose	status
Fk refactor	#1936	~3.5x speedup	merged
Parallel hyperoptimization with MongoDB	#1921	parallelisation in trials -> ~3.5x speedup	merged
Hyperopt loss	#1726	implementation of multiple losses	merged
Make weight initialization reproducible	#1923	testing, consistency	merged
hyperopt runcard	#1986	(not an eScience contribution but essential to start runs so thought I'd add it here)	merged

Optional, if time allows

name	PR	purpose	status
Avoid idle gpu	#1939	~25% speedup	ready for review
Avoiding duplicated computations by having a single observable model	#1855	~30% speedup	needs feedback
Implementation of hyperopt model selection	#1976	automate, integrate final selection	in progress

Mar 04 '24 09:03 APJansen

I would honestly skip all the optionals (other than the reproducible weight initialisation, maybe) and focus instead on polishing the stuff that's already there*. For instance, the problems with tf 2.16 / python 3.12 will only grow as keras 3 becomes the standard. Making sure that things like multidense / multireplica fits are robust, and that we don't "lose them" when keras 3.1 comes out, is important.

(and, as with anything that touches the under-the-hood tensorflow, like all the Meta-whatever stuff, it will have many chances to break)

*actually, I would say the weight initialization is part of this polishing

Mar 04 '24 21:03 scarlehoff

This is now finished; only one item, itself otherwise complete, is still to be merged: https://github.com/NNPDF/nnpdf/pull/1976

Jul 25 '24 15:07 scarlehoff
