agents
Feature Request: entropy_regularization Schedule or Function
Hello! I'm working with a PPOClipAgent in an environment where we want to tune the entropy_regularization parameter during training, either on a schedule or with a custom function that responds to training metrics.
Currently, the parameter must be a constant scalar (e.g. a float), and manually editing the private _entropy_regularization attribute only works in eager execution.
So, as far as I can tell, the only way to get this functionality with graph execution is to completely re-initialize the agent on every change.
If this is a semi-common use case, could we consider adding functionality similar to how the various Keras optimizers (e.g. Adam) handle learning_rate, and allow the following inputs for the entropy_regularization parameter: float, Tensor, function, or something like LearningRateSchedule?
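To make the request concrete, here is a minimal sketch of the dispatch I have in mind, written in plain Python with no TensorFlow dependency. The names LinearDecay and resolve_entropy_regularization are hypothetical and are not part of TF-Agents or Keras. A real implementation would presumably accept Tensors and a LearningRateSchedule-like object and evaluate them against the agent's train step counter.

```python
from typing import Callable, Union


class LinearDecay:
    """Hypothetical schedule: maps a training step to a coefficient.

    Stands in for something like a Keras LearningRateSchedule.
    """

    def __init__(self, initial: float, final: float, decay_steps: int):
        self.initial = initial
        self.final = final
        self.decay_steps = decay_steps

    def __call__(self, step: int) -> float:
        # Clamp the step to [0, decay_steps], then interpolate linearly.
        frac = min(max(step, 0), self.decay_steps) / self.decay_steps
        return self.initial + frac * (self.final - self.initial)


EntropyRegularization = Union[float, Callable[[], float], LinearDecay]


def resolve_entropy_regularization(value: EntropyRegularization, step: int) -> float:
    """Return the coefficient to use at `step` (hypothetical helper).

    - schedule object: called with the current step
    - zero-argument callable: called with no arguments
    - anything else: treated as a constant scalar
    """
    if isinstance(value, LinearDecay):
        return float(value(step))
    if callable(value):
        return float(value())
    return float(value)
```

With this, a constant such as 0.01, a function such as lambda: current_coefficient, and a schedule such as LinearDecay(0.1, 0.0, 100) could all be passed through the same argument, and the loss would call the helper each time it is computed instead of reading a fixed value.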
This is something I am happy to put cycles into with some guidance (as I have not contributed to this project before). Thanks!
Yes, I think this makes sense. Please make a pull request if you are interested.