talos
Gracefully dealing with parameter combinations that cause errors during training
1) It would be nice if Talos could add
a way to handle errors that appear only for certain parameter combinations. Sometimes a particular hyperparameter combination leads to an error that cannot be avoided (such as ResourceExhaustedError when the model does not fit in memory).
For example, in the model function that I pass to talos.Scan,
I would have the following code:
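    import tensorflow as tf

    try:
        history = model.fit(...)
    except tf.errors.ResourceExhaustedError:
        # this parameter combination ran out of memory; mark the round as failed
        history = None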
Alternatively (and IMHO even nicer), Talos itself could catch errors raised during training, so the user's model function would not need its own try/except block.
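To make the second option concrete, here is a minimal sketch of what scan-side error handling could look like. It is not Talos code: `scan_with_error_handling`, `skip_errors`, and `toy_model` are hypothetical names, the loop is a plain grid search, and `MemoryError` stands in for ResourceExhaustedError so the example runs without TensorFlow.

```python
import itertools


def scan_with_error_handling(model_fn, params, skip_errors=(MemoryError,)):
    """Run model_fn on every parameter combination, skipping those that raise.

    Hypothetical helper, not part of Talos. `skip_errors` is a tuple of
    exception types that mark a round as failed instead of aborting the scan.
    """
    keys = list(params)
    results = []
    for values in itertools.product(*(params[k] for k in keys)):
        combo = dict(zip(keys, values))
        try:
            history = model_fn(combo)
            error = None
        except skip_errors as exc:
            # Record the failure and move on to the next combination.
            history = None
            error = repr(exc)
        results.append({"params": combo, "history": history, "error": error})
    return results


def toy_model(p):
    # Stand-in for a real model function: large settings "run out of memory".
    if p["batch_size"] * p["units"] > 4096:
        raise MemoryError("out of memory")
    return {"loss": [1.0 / p["units"]]}


results = scan_with_error_handling(
    toy_model, {"batch_size": [32, 256], "units": [8, 64]}
)
failed = [r for r in results if r["error"] is not None]
```

Exceptions that are not listed in `skip_errors` still propagate, so genuine bugs in the model function would not be silently swallowed.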
2) Once implemented, I can see how this feature will
make it possible to scan a parameter space that includes some combinations that lead to errors
3) I believe this feature is
- [ ] critically important
- [ ] must have
- [x] nice to have
4) Given the chance, I'd be happy to make a PR for this feature
- [ ] definitely
- [x] possibly
- [ ] unlikely
I would need pointers for how best to implement this:
- should the try/except block be part of the user's model function or part of talos.Scan?
- what should the history object look like if an error occurred?
- where in the code would I need to add error handling in order to catch problems with empty history objects?
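As one possible answer to the second and third questions, here is a rough sketch of a placeholder history and a summary step that tolerates it. Everything in it is an assumption for discussion: `FailedHistory` and `summarize_round` are hypothetical names, and the only thing borrowed from Keras is that a History object exposes a `.history` dict mapping metric names to per-epoch lists.

```python
import math
from types import SimpleNamespace


class FailedHistory:
    """Hypothetical placeholder that mimics the `.history` dict of a Keras History."""

    def __init__(self, error):
        self.history = {}   # no epochs were completed
        self.error = error  # the exception that stopped training


def summarize_round(history, metric="val_loss"):
    """Build one result row, tolerating None or an empty history."""
    values = getattr(history, "history", {}).get(metric, [])
    if not values:
        error = getattr(history, "error", None)
        return {
            "status": "failed",
            metric: float("nan"),
            "error": None if error is None else repr(error),
        }
    # Use the value from the last completed epoch.
    return {"status": "ok", metric: values[-1], "error": None}


# A successful round (stand-in for a real Keras History) and a failed one.
ok_row = summarize_round(SimpleNamespace(history={"val_loss": [0.9, 0.5]}))
failed_row = summarize_round(FailedHistory(MemoryError("out of memory")))
none_row = summarize_round(None)
```

Reporting NaN for the metric keeps failed rounds in the results table without letting them win when results are sorted by score, and the `status` column makes them easy to filter out.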
Welcome to Talos community! Thanks so much for creating your first issue :)