Keeping encoder/decoder fixed
Hi OpenNMTers, could you point me to the section of the code that I can modify to keep either the encoder or the decoder fixed while updating the generator? For example, if I train from an existing model and want to keep the encoder fixed, updating only the decoder and generator, which section of the code should I start with?
Thanks a lot!
See #1418 and https://forum.opennmt.net/t/transformer-freezing-encoder-while-training-decoder/2723
@francoishernandez Thank you. I see these threads interest a lot of people, but has anyone committed the flags mentioned in them? Thanks!
It seems that people still don't have an exact solution for this. I wonder why it is said that setting requires_grad = False is not possible at the moment without disconnecting the encoder/decoder.
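For reference, here is a minimal sketch of the requires_grad = False approach in plain PyTorch. It assumes the usual OpenNMT-py model layout with `.encoder`, `.decoder`, and `.generator` attributes; the commented usage and optimizer lines are illustrative, not the library's actual training loop:

```python
import torch

def freeze_module(module: torch.nn.Module) -> None:
    """Exclude every parameter of `module` from gradient updates."""
    for param in module.parameters():
        param.requires_grad = False

# Hypothetical usage after the model has been built:
# freeze_module(model.encoder)
#
# Build the optimizer over trainable parameters only, so the frozen
# encoder weights are never touched by the updates:
# optimizer = torch.optim.Adam(
#     (p for p in model.parameters() if p.requires_grad), lr=2e-4)
```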
@i55code following #1895
No experiments have been made on this topic, at least on our side.
If you feel like contributing, you might want to have a look at the different parts of the model architecture and the building function (onmt.encoders.* / onmt.model_builder). You could try to add something similar to what is proposed in https://github.com/pytorch/fairseq/pull/1710, for instance.
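To make that concrete, below is a rough sketch of what such an option could look like in onmt.model_builder, in the spirit of the fairseq PR above. The `freeze_encoder` / `freeze_decoder` option names are hypothetical, not existing OpenNMT-py flags, and would also need to be registered in onmt.opts:

```python
def apply_freezing(model, model_opt):
    """Optionally freeze encoder and/or decoder parameters.

    `freeze_encoder` and `freeze_decoder` are hypothetical opts; they
    would have to be added to the option parser as well.
    """
    if getattr(model_opt, "freeze_encoder", False):
        for param in model.encoder.parameters():
            param.requires_grad = False
    if getattr(model_opt, "freeze_decoder", False):
        for param in model.decoder.parameters():
            param.requires_grad = False
    return model
```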
I also require freezing the encoder/decoder for one of my experiments. I'm currently adding the feature and plan to bundle it into a PR!