rlberry
Stopping criterion utility
Implement a tool to stop the algorithm when some value (in the writer) goes above some threshold. Typically this is for either early stopping or for stopping after a certain number of episodes.
Should we modify the signature of the fit method to be something like def fit(self, budget=100, stop_callback: Callable[[Writer], bool])?
I think there is no simple way to solve this PR. Each agent has its own implementation of the fit method, so the early stopping criterion would have to be handled in each agent...
My idea was the following:
- Have an optional argument in fit that defaults to None; as you said, it can be something like stop_callback: None or Callable[[Writer], bool]. This is optional: users can implement an agent without using it, the same way users can use the budget or not (see ValueIteration for instance).
- Have a prototype callback function that anyone can use and that is well documented.
- Have the function implemented in all rlberry agents to show how it is done.
This should be sufficient. We don't need something automatic; we only need to make it simple to use and have it implemented in the rlberry agents.
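To make the proposal above concrete, here is a minimal sketch of how such a callback could plug into a fit loop. Everything in it is an assumption for illustration: the Writer class is a hypothetical stand-in (not rlberry's actual writer), and ToyAgent, threshold_callback, and the "episode_rewards" tag are made-up names. Only the budget and stop_callback arguments come from the discussion.

```python
from typing import Callable, Dict, List, Optional


class Writer:
    """Hypothetical stand-in for an agent's writer: stores scalar logs by tag."""

    def __init__(self) -> None:
        self.data: Dict[str, List[float]] = {}

    def add_scalar(self, tag: str, value: float) -> None:
        self.data.setdefault(tag, []).append(value)

    def last(self, tag: str) -> Optional[float]:
        """Return the most recent value logged under `tag`, or None."""
        values = self.data.get(tag)
        return values[-1] if values else None


StopCallback = Callable[[Writer], bool]


def threshold_callback(tag: str, threshold: float) -> StopCallback:
    """Build a callback that returns True once the latest `tag` value exceeds `threshold`."""

    def callback(writer: Writer) -> bool:
        value = writer.last(tag)
        return value is not None and value > threshold

    return callback


class ToyAgent:
    """Hypothetical agent; each real agent would add the same check to its own fit."""

    def __init__(self) -> None:
        self.writer = Writer()
        self.episodes_run = 0

    def fit(self, budget: int = 100, stop_callback: Optional[StopCallback] = None) -> None:
        for episode in range(budget):
            reward = float(episode)  # placeholder for a real episode rollout
            self.writer.add_scalar("episode_rewards", reward)
            self.episodes_run += 1
            # The callback is optional: with None, fit simply exhausts the budget.
            if stop_callback is not None and stop_callback(self.writer):
                break


agent = ToyAgent()
agent.fit(budget=100, stop_callback=threshold_callback("episode_rewards", 9.5))
```

Stopping after a fixed number of episodes would fit the same shape: the callback would count the entries logged under a tag instead of comparing the latest value to a threshold.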