
Keeping encoder/decoder fixed

Open i55code opened this issue 4 years ago • 5 comments

Hi OpenNMTers, could you point me to the section of the code that I can modify to keep either the encoder or the decoder fixed while updating the generator? For example, if I train from an existing model and want to keep the encoder fixed, updating only the decoder and generator, which section of the code should I start with?

Thanks a lot!

i55code avatar Aug 31 '20 19:08 i55code

See #1418 and https://forum.opennmt.net/t/transformer-freezing-encoder-while-training-decoder/2723

francoishernandez avatar Sep 01 '20 08:09 francoishernandez

@francoishernandez Thank you. I see those threads interest a lot of people, but has anyone committed the flags mentioned there? Thanks!

i55code avatar Sep 01 '20 19:09 i55code

It seems that people still don't have an exact solution for doing this. I wonder why people say that setting requires_grad = False is not possible at the moment without disconnecting the encoder/decoder.
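For what it's worth, a minimal PyTorch sketch (toy modules, not the actual OpenNMT-py classes) suggests that setting `requires_grad = False` on encoder parameters does not "disconnect" anything: gradients still flow back through the frozen encoder to the loss, they just stop accumulating on the frozen weights.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for an encoder -> decoder pipeline.
encoder = nn.Linear(4, 4)
decoder = nn.Linear(4, 2)

# Freeze only the encoder's parameters.
for p in encoder.parameters():
    p.requires_grad = False

x = torch.randn(3, 4)
loss = decoder(encoder(x)).sum()
loss.backward()

# Frozen encoder accumulates no gradient; decoder still gets one.
print(encoder.weight.grad)              # None
print(decoder.weight.grad is not None)  # True
```

So the freezing itself is straightforward; the open question in this issue is where to wire an option for it into OpenNMT-py's training loop.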

atseng17 avatar Sep 03 '20 14:09 atseng17

@i55code, following #1895: no experiment has been made on this topic, at least on our side. If you feel like contributing, you might want to have a look at the different parts of the model architecture and the building function (onmt.encoders.* / onmt.model_builder). You could try adding something similar to what is proposed in https://github.com/pytorch/fairseq/pull/1710, for instance.
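As a rough sketch of what such a contribution could look like (toy modules below; the real ones are assembled in onmt.model_builder, and the `freeze_encoder` helper is hypothetical): freeze the chosen sub-module after the model is built, then construct the optimizer over trainable parameters only.

```python
import torch
import torch.nn as nn

# Toy encoder/decoder/generator model standing in for an OpenNMT-py model.
class ToyNMTModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)
        self.decoder = nn.Linear(8, 8)
        self.generator = nn.Linear(8, 4)

def freeze_encoder(model: nn.Module) -> None:
    """Hypothetical helper: stop gradient accumulation on encoder weights."""
    for p in model.encoder.parameters():
        p.requires_grad = False

model = ToyNMTModel()
freeze_encoder(model)

# Only decoder + generator parameters are handed to the optimizer,
# so the encoder stays fixed even with weight decay enabled.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1
)
```

In OpenNMT-py this would presumably be gated behind a training flag, in the same spirit as the fairseq PR linked above.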

francoishernandez avatar Oct 21 '20 08:10 francoishernandez

I also require freezing the encoder/decoder for one of my experiments. I'm currently adding the feature and planning to bundle into a PR!

kyduff avatar Jul 26 '22 16:07 kyduff