ElegantRL
Conditions to stop training when target return is reached
Hello, I am testing ElegantRL on different environments, but I could not find any examples of a completed training run. I have been running the LunarLander example for a few days with the original parameters (attached). How can I tell when the model is fully trained? Is there a minimum number of episodes that must exceed the target return? If so, where can I specify it? Or does training run indefinitely until I stop it manually?
While the original example is still running, I tried changing some parameters to see whether training would finish, or whether there are any statistics besides the ones shown in the log. I set the target return to 2 and the eval times to 2 as well. The model surpassed the target return, but it kept training even after several episodes with avgR above 2.
Could you please explain how the stop condition works and what the best practices are for evaluating model performance? Thank you very much.
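
To make the question concrete, here is a minimal sketch of the stop rule I expected. This is not ElegantRL's code: `should_stop`, `eval_returns`, `max_steps`, and `min_evals` are hypothetical names I made up for illustration, and I am assuming that the target is compared against the mean of the most recent evaluation returns.

```python
from statistics import mean


def should_stop(eval_returns, target_return, total_steps, max_steps, min_evals=1):
    """Hypothetical stop rule (an assumption, not ElegantRL's actual logic).

    Stop when the step budget is used up, or when the mean of the last
    `min_evals` evaluation returns reaches `target_return`.
    """
    # Hard limit: never train past the step budget.
    if total_steps >= max_steps:
        return True
    # Not enough evaluations yet to judge the policy.
    if len(eval_returns) < min_evals:
        return False
    # Average the most recent evaluations to reduce the effect of one lucky episode.
    recent = eval_returns[-min_evals:]
    return mean(recent) >= target_return


# Example with my test settings: target return 2, averaged over 2 evaluations.
print(should_stop([1.0, 2.5, 3.0], target_return=2, total_steps=100, max_steps=1000, min_evals=2))
```

With my settings I expected a rule like this to end training once two evaluations averaged above 2, which is not what I observed.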