
Layer freezing

Open ParthaEth opened this issue 7 years ago • 4 comments

Adding some useful utility functions for transfer learning

ParthaEth avatar Nov 07 '17 18:11 ParthaEth

@matthiasplappert can we put this through?

ParthaEth avatar Nov 09 '17 23:11 ParthaEth

Hey, thanks for this contribution.

One thing that I'm wondering is if this could instead simply be done on the Keras model? It seems a bit odd that this is now part of the agent API since it seems quite specific.

matthiasplappert avatar Nov 10 '17 12:11 matthiasplappert

@matthiasplappert I think some part of this can be moved to Keras itself, but some part of it has to live in keras-rl as well. For example, DDPG has two components, an actor and a critic, so freezing some layers in the agent would mean freezing them symmetrically in both of these networks.
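A minimal sketch of what symmetric freezing could look like, assuming the actor and critic use matching layer names for the parts to be frozen (that naming convention is an assumption, not something the PR confirms). The helper names `freeze_layers`, `freeze_actor_critic`, and `make_stub_model` are hypothetical; the stub objects only stand in for Keras models so the snippet runs without Keras installed. The one real Keras detail relied on is that each layer exposes a `name` and a settable `trainable` attribute.

```python
from types import SimpleNamespace


def freeze_layers(model, layer_names):
    """Mark the named layers of `model` as non-trainable; return the names found."""
    frozen = []
    for layer in model.layers:
        if layer.name in layer_names:
            layer.trainable = False
            frozen.append(layer.name)
    return frozen


def freeze_actor_critic(actor, critic, layer_names):
    """Freeze the same named layers in both DDPG networks (hypothetical helper)."""
    return freeze_layers(actor, layer_names), freeze_layers(critic, layer_names)


def make_stub_model(names):
    # Stand-in for a Keras model: an object with a `layers` list whose items
    # have `name` and `trainable` attributes. Not part of Keras or keras-rl.
    return SimpleNamespace(
        layers=[SimpleNamespace(name=n, trainable=True) for n in names]
    )


actor = make_stub_model(["shared_1", "shared_2", "action_out"])
critic = make_stub_model(["shared_1", "shared_2", "q_out"])
frozen_actor, frozen_critic = freeze_actor_critic(
    actor, critic, {"shared_1", "shared_2"}
)
```

With real Keras models in place of the stubs, the changed `trainable` flags generally take effect only after the model is compiled again, so the freezing step would need to happen before the agent compiles its networks.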

Given this, the problem is as follows. If we put some part of this layer-freezing mechanism into Keras, keras-rl will be tied to a specific version of Keras, which is not so nice. Besides, Keras is so big that I do not think a PR there has any chance of getting through in a realistic time.

ParthaEth avatar Nov 13 '17 22:11 ParthaEth

@ParthaEth Hey, thanks for the contribution! Can you please share an example of applying transfer learning? As far as I understand, you can train for a long time and get some weights, then load those weights and train for a while in a different environment to adapt the weights to that new environment.
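The workflow described above could be sketched roughly as follows. This assumes an agent object exposing `load_weights(filepath)` and `fit(env, nb_steps=...)`, which is how keras-rl agents are typically driven; `fine_tune`, `StubAgent`, the weights file name, and the environment label are all hypothetical and exist only so the snippet runs without keras-rl installed.

```python
def fine_tune(agent, weights_path, new_env, nb_steps):
    """Load previously trained weights, then continue training in `new_env`."""
    agent.load_weights(weights_path)
    agent.fit(new_env, nb_steps=nb_steps)
    return agent


class StubAgent:
    # Stand-in that only records calls; not a keras-rl class.
    def __init__(self):
        self.calls = []

    def load_weights(self, filepath):
        self.calls.append(("load_weights", filepath))

    def fit(self, env, nb_steps):
        self.calls.append(("fit", env, nb_steps))


# Hypothetical file name and environment label, for illustration only.
agent = fine_tune(StubAgent(), "source_env_weights.h5f", "target-env", nb_steps=10000)
```

With a real agent, it would need to be built and compiled with the same network architecture that produced the saved weights before `load_weights` is called.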

But what do you have to take into account? I've read this, but it did not tell me anything useful: https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14787

Thanks!

ghub-c avatar Apr 11 '18 09:04 ghub-c