Re-train by using the checkpoint

Open Marilena263 opened this issue 7 years ago • 12 comments

Hi, I have a question. Right now, when we train the model, the weights are initialized to the same value (0.1 by default). How can we train the model so that it initializes the weights from the values in a checkpoint (--ckpt=/path/to/checkpoint/translate.ckpt)? In other words, how do I fine-tune the model on a new dataset?

Marilena263 avatar Dec 04 '17 15:12 Marilena263

@Marilena263

If you want to use the --ckpt for training, you need to make a small change here. You may want to use the load_model method for loading from --ckpt directly.

oahziur avatar Dec 08 '17 04:12 oahziur

Hi, if I stop my training and re-run from the latest checkpoint, the BLEU score on the dev/test sets increases. What is the reason for that? Should I update the random_seed parameter after each epoch?

d2sys avatar Dec 24 '17 12:12 d2sys

As far as I understand, loading the parameters from the last checkpoint is now the default behavior.

nbro avatar Jan 10 '18 23:01 nbro

Hi, I have a question regarding model re-training. I want to run incremental training on my trained German-English engine using NMT with subword BPE encoding. Can I update my vocab file with new words from the incremental training data? If yes, kindly let me know the process.

Should I append the new words at the end of the existing vocabulary file while running incremental training? Or should I sort the vocab file after appending the new words to it?

Sabyasachi18 avatar Jan 22 '18 10:01 Sabyasachi18
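
Nobody in the thread answers the append-versus-sort question, so the following is only a sketch of the "append" option, not a confirmed procedure. It assumes the vocabulary file holds one token per line and that a token's id is its line position, so appending leaves the ids of existing tokens unchanged while sorting would shift them. The function names are hypothetical. The sketch does not address whether a checkpoint trained with the smaller vocabulary can be restored once the vocabulary has grown.

```python
def read_vocab(path):
    """Read a vocabulary file with one token per line (assumed layout)."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.rstrip("\n")]


def append_new_tokens(existing_vocab, new_tokens):
    """Return existing_vocab followed by unseen tokens, in first-seen order.

    Existing tokens keep their positions, so their ids (line numbers) do not
    change. Duplicates inside new_tokens are added only once.
    """
    seen = set(existing_vocab)
    merged = list(existing_vocab)
    for token in new_tokens:
        if token not in seen:
            seen.add(token)
            merged.append(token)
    return merged


def write_vocab(path, vocab):
    """Write the vocabulary back out, one token per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for token in vocab:
            handle.write(token + "\n")
```

For example, merging `["<unk>", "<s>", "</s>", "der"]` with new tokens `["der", "Haus@@", "Haus@@"]` keeps the first four entries in place and adds `"Haus@@"` once at the end.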

Hi, I have a problem: to re-train with --ckpt directly, is the running command python -m nmt.nmt --ckpt? Thank you very much!

yapingzhao avatar Jun 04 '18 12:06 yapingzhao

@yapingzhao use the --ckpt flag, like --ckpt=/path/to/last/saved/ckpt. I also had to reset num_train_steps in hparams. Please also note --num_keep_ckpts, which can be important for re-training.

ArashHosseini avatar Sep 25 '18 08:09 ArashHosseini
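
As a rough illustration of the advice above, the sketch below picks the most recent checkpoint prefix in an output directory and assembles a resume command. It assumes checkpoints are saved with names like `translate.ckpt-<step>` plus an `.index` file alongside, which is how the path in the original question looks but is not confirmed in the thread. Both helper names are hypothetical, and any data or vocabulary flags used in the original run would still need to be appended.

```python
import re
from pathlib import Path

# Assumption: each checkpoint has an ".index" file whose name carries the
# prefix passed to --ckpt, e.g. translate.ckpt-12000.index.
_CKPT_INDEX = re.compile(r"^(?P<prefix>.+\.ckpt-(?P<step>\d+))\.index$")


def latest_checkpoint_prefix(out_dir):
    """Return the checkpoint prefix with the highest step in out_dir, or None."""
    best_step, best_prefix = -1, None
    for path in Path(out_dir).iterdir():
        match = _CKPT_INDEX.match(path.name)
        if match and int(match.group("step")) > best_step:
            best_step = int(match.group("step"))
            best_prefix = str(path.parent / match.group("prefix"))
    return best_prefix


def resume_command(ckpt_prefix, out_dir, num_train_steps):
    """Build an argument list that resumes training from ckpt_prefix.

    num_train_steps is the new total step count; it should be larger than the
    step of the checkpoint, otherwise there is nothing left to train.
    """
    return [
        "python", "-m", "nmt.nmt",
        "--ckpt=" + ckpt_prefix,
        "--out_dir=" + out_dir,
        "--num_train_steps=" + str(num_train_steps),
    ]
```

The returned list can be passed to `subprocess.run` or joined with spaces to inspect the command before running it.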

@Sabyasachi18 I was trying a similar approach to yours. Were you able to work it out? If yes, please share your process.

kbv71 avatar Jan 01 '19 14:01 kbv71

I'm using TensorFlow 1.0.1. I'm new to deep learning and am following the TensorFlow NMT tutorial. How can I re-run training from the last saved checkpoint? Is that the default in TensorFlow 1.0.1, or do I have to give it the path to the last saved ckpt? Kindly guide me. Thank you.

sahertariq07 avatar Feb 27 '19 05:02 sahertariq07

Hi @sahertariq07, use the --ckpt flag explicitly, like --ckpt=/path/to/last/saved/ckpt

ArashHosseini avatar Feb 27 '19 07:02 ArashHosseini

@ArashHosseini thank you very much for your response....

sahertariq07 avatar Feb 27 '19 10:02 sahertariq07

@sahertariq07, welcome

ArashHosseini avatar Feb 27 '19 11:02 ArashHosseini