Feature Request: entropy_regularization Schedule or Function

Open • egordon opened this issue 2 years ago • 1 comment

Hello! I'm personally working with a PPOClipAgent in an environment where we want to tune the entropy_regularization parameter during training, either on a schedule, or with a custom function in response to some training metrics.

Currently, the parameter must be a constant scalar (e.g. a float), and manually editing the private _entropy_regularization attribute only works in eager execution.

So, as far as I can tell, the only way to get this functionality with graph execution is to completely re-initialize the agent on every change.

If this is a semi-common use case, could we consider adding functionality similar to the various Keras optimizers (e.g. Adam) and allow the following possible inputs for the entropy_regularization parameter: float, Tensor, function, or something like LearningRateSchedule?
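To make the request concrete, here is a rough, framework-free sketch of the dispatch I have in mind. None of these names exist in TF-Agents: `LinearDecaySchedule` and `resolve_entropy_regularization` are hypothetical, and a real implementation would presumably operate on tensors and the agent's train step counter rather than plain Python numbers.

```python
class LinearDecaySchedule:
    """Hypothetical schedule: linear decay from `initial` to `final`."""

    def __init__(self, initial, final, decay_steps):
        self.initial = initial
        self.final = final
        self.decay_steps = decay_steps

    def __call__(self, step):
        # Clamp progress to [0, 1] so the value holds at `final` once decay ends.
        progress = min(max(step / self.decay_steps, 0.0), 1.0)
        return self.initial + (self.final - self.initial) * progress


def resolve_entropy_regularization(value, step):
    """Return the coefficient to use at `step` (hypothetical helper).

    Accepts a constant number, a zero-argument callable, or a schedule
    object that is called with the current training step.
    """
    if isinstance(value, LinearDecaySchedule):
        return float(value(step))
    if callable(value):
        # Zero-argument callable, e.g. one driven by external training metrics.
        return float(value())
    return float(value)
```

The agent would call something like `resolve_entropy_regularization(self._entropy_regularization, step)` each time it computes the entropy loss instead of reading a fixed float, so existing code that passes a constant would keep working unchanged.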

This is something I am happy to put cycles into with some guidance (as I have not contributed to this project before). Thanks!

egordon • Mar 02 '22 04:03

Yes, I think this makes sense. Please make a pull request if you are interested.

kuanghuei • Mar 31 '22 17:03