Layer freezing
Adding some useful utility functions for transfer learning
@matthiasplappert can we put this through?
Hey, thanks for this contribution.
One thing I'm wondering is whether this could instead simply be done on the Keras model. It seems a bit odd to make this part of the agent API, since it is quite specific.
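For illustration, here is a minimal sketch of what freezing directly on the model could look like. The helper name `freeze_layers` is hypothetical and not part of Keras or keras-rl; it only relies on the model exposing a `.layers` list whose items carry a `trainable` flag, as Keras models do. A stand-in object replaces the real model so the snippet runs without Keras installed.

```python
from types import SimpleNamespace


def freeze_layers(model, n_frozen):
    """Mark the first `n_frozen` layers of `model` as non-trainable.

    Hypothetical helper: works on anything exposing a `.layers` list whose
    items have a boolean `trainable` attribute, as Keras models do.
    """
    for layer in model.layers[:n_frozen]:
        layer.trainable = False
    return model


# Stand-in for a Keras model (assumption: three layers, all trainable).
model = SimpleNamespace(
    layers=[SimpleNamespace(name=f"dense_{i}", trainable=True) for i in range(3)]
)

freeze_layers(model, n_frozen=2)
print([(layer.name, layer.trainable) for layer in model.layers])
```

With a real Keras model, the model has to be compiled (again) after changing `trainable` for the change to take effect.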
@matthiasplappert I think some part of this can be moved to Keras itself, but some part of it has to live in keras-rl as well. For example, DDPG has two components, an actor and a critic, so freezing some layers in the agent would mean freezing them symmetrically in both of these networks.
Given this, the problem is as follows. If we put part of this layer-freezing mechanism into Keras, keras-rl will be tied to a specific version of Keras, which is not so nice. Besides, Keras is so big that I do not think a PR there has any chance of getting through in any realistic time.
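To make the symmetric freezing mentioned above concrete, here is a rough sketch, not the code from this PR. The helper `freeze_symmetrically` is hypothetical, and it assumes the layers to freeze sit at the same indices in both networks, which may not hold when the critic has an extra action input. Stand-in objects replace the real actor and critic models.

```python
from types import SimpleNamespace


def freeze_symmetrically(networks, layer_indices):
    """Freeze the layers at `layer_indices` in every network in `networks`.

    Hypothetical helper. Assumes each network exposes `.layers` and that
    the same indices refer to corresponding layers in all networks.
    """
    for net in networks:
        for idx in layer_indices:
            net.layers[idx].trainable = False


def make_net(prefix, n_layers):
    """Build a stand-in network with `n_layers` trainable layers."""
    return SimpleNamespace(
        layers=[
            SimpleNamespace(name=f"{prefix}_{i}", trainable=True)
            for i in range(n_layers)
        ]
    )


# Stand-ins for the DDPG actor and critic (layer counts are made up).
actor = make_net("actor", 3)
critic = make_net("critic", 4)

freeze_symmetrically([actor, critic], layer_indices=[0, 1])

for net in (actor, critic):
    print([(layer.name, layer.trainable) for layer in net.layers])
```

The point of doing this at the agent level is that a single call keeps the two networks consistent, instead of the user having to freeze each one separately.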
@ParthaEth Hey, thanks for the contribution! Can you please share an example of applying transfer learning? As far as I understand, you can train for a long time to get some weights, then load those weights and train for a while in a different environment to adapt them to that new environment.
But what are the things you have to take into account? I've read this, but it does not tell me anything useful: https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14787
Thanks!