Gracefully dealing with parameter combinations that cause errors during training

Open thawn opened this issue 5 years ago • 1 comments

1) It would be nice if Talos could add

a way to handle errors that appear only for certain parameter combinations. Some hyperparameter combinations lead to an error that cannot be avoided (such as ResourceExhaustedError).

For example, in the model function that I pass to talos.Scan, I would have the following code:

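import tensorflow as tf

try:
    history = model.fit(...)
except tf.errors.ResourceExhaustedError:
    # This parameter combination ran out of memory; mark it as failed.
    history = None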

Alternatively (and IMHO even nicer), Talos itself could catch errors that occur during training.
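To make the second option concrete, here is a minimal sketch of what catching errors inside the scan loop could look like. It does not use Talos at all: `scan_with_error_handling`, `toy_model`, and the result layout are hypothetical names made up for illustration, and `MemoryError` stands in for ResourceExhaustedError so the example runs without TensorFlow.

```python
import itertools


def scan_with_error_handling(model_fn, param_grid, catch=(Exception,)):
    """Run model_fn on every parameter combination.

    Failures of the types listed in `catch` are recorded instead of
    aborting the whole scan. (Hypothetical helper, not part of Talos.)
    """
    keys = list(param_grid)
    results = []
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        try:
            outcome = model_fn(params)
            error = None
        except catch as exc:
            # Keep going; remember which combination failed and why.
            outcome = None
            error = repr(exc)
        results.append({"params": params, "outcome": outcome, "error": error})
    return results


def toy_model(params):
    # Stand-in for training: pretend large batches exhaust memory.
    if params["batch_size"] > 64:
        raise MemoryError("simulated out-of-memory")
    return {"loss": 1.0 / params["batch_size"]}


grid = {"batch_size": [32, 128], "lr": [0.1, 0.01]}
results = scan_with_error_handling(toy_model, grid, catch=(MemoryError,))
failed = [r["params"] for r in results if r["error"] is not None]
```

Restricting `catch` to the expected error types means that genuine bugs in the model function still surface as exceptions instead of being silently recorded as failed runs.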

2) Once implemented, I can see how this feature will

make it possible to scan a parameter space that includes combinations that lead to errors.

3) I believe this feature is

  • [ ] critically important
  • [ ] must have
  • [x] nice to have

4) Given the chance, I'd be happy to make a PR for this feature

  • [ ] definitely
  • [x] possibly
  • [ ] unlikely

I would need pointers for how best to implement this:

  • Should the try/except block be part of the user's model function or part of talos.Scan?
  • What should the history object look like if an error occurred?
  • Where in the code would I need to add error handling in order to catch problems with empty history objects?
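As a starting point for discussing the first two questions, here is a rough sketch of the user-side variant: a decorator that wraps the model function and returns a placeholder in place of the history when training fails. `tolerate` and `FailedHistory` are hypothetical names, not Talos API. The sketch assumes the usual model-function shape of `(x_train, y_train, x_val, y_val, params)` returning `history, model`, and whether Talos would accept an empty `.history` dict is exactly the open question. `MemoryError` again stands in for ResourceExhaustedError.

```python
import functools


class FailedHistory:
    """Hypothetical placeholder with an empty `.history` dict, like a
    Keras History object that recorded no epochs."""

    def __init__(self, error):
        self.history = {}
        self.error = error


def tolerate(*exceptions):
    """Decorator factory: turn the listed exceptions into a failed result."""

    def decorator(model_fn):
        @functools.wraps(model_fn)
        def wrapper(x_train, y_train, x_val, y_val, params):
            try:
                return model_fn(x_train, y_train, x_val, y_val, params)
            except exceptions as exc:
                # No trained model is available for this combination.
                return FailedHistory(exc), None

        return wrapper

    return decorator


@tolerate(MemoryError)
def model_fn(x_train, y_train, x_val, y_val, params):
    # Stand-in for building and fitting a real model.
    if params["batch_size"] > 64:
        raise MemoryError("simulated out-of-memory")
    return "history", "model"


out, model = model_fn(None, None, None, None, {"batch_size": 128})
```

Keeping the original exception on the placeholder would let the scan report why a combination failed, and not only that it failed.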

thawn • Dec 03 '19 09:12

Welcome to Talos community! Thanks so much for creating your first issue :)

github-actions[bot] • Dec 03 '19 09:12