
Training too slow

Open janenie opened this issue 7 years ago • 7 comments

Hi, is there any way to accelerate the code? I am training on data with only a 300-word vocabulary and 30k (3w) training instances with a maximum length of 50, but it takes almost an hour to finish one epoch.

What happened to this version of the code?

Thanks

janenie avatar Nov 22 '17 04:11 janenie

I am using a 100k vocabulary and 10 million training examples; it takes 32 hours to train 127k steps, reaching around 17 BLEU for English to Chinese. The batch size is set to 64.

1. Use a batch size as big as your GPU can support.
2. The hidden size is 1024 by default; you can reduce it to 800 or 512 if the GPU runs out of memory.
3. For machine translation, deeper models take longer to train and use more GPU memory, but the performance improvement is small. You can set the number of layers to 2.
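As a rough illustration of why reducing the hidden size helps with GPU memory: the parameter count of a single standard LSTM layer grows roughly quadratically with the number of units. This sketch uses the textbook formula 4·(h·(i+h)+h) for an LSTM with input size i and hidden size h; it is only an estimate, since actual memory also depends on activations, the attention mechanism, and the optimizer state.

```python
def lstm_params(hidden, input_size):
    # 4 gates, each with an input->hidden and hidden->hidden
    # weight matrix plus a bias vector
    return 4 * (hidden * (input_size + hidden) + hidden)

# Compare the suggested hidden sizes (assuming input size == hidden size)
for h in (512, 800, 1024):
    print(h, lstm_params(h, h))
```

Going from 1024 to 512 units cuts the per-layer weight count by roughly 4x, which is why it is the first knob to turn when you run out of GPU memory.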

Here is the command:

```
CUDA_VISIBLE_DEVICES=7 nohup python -m nmt.nmt --attention=normed_bahdanau --src=en --tgt=zh \
  --train_prefix=nmt_data_chinese/train --dev_prefix=nmt_data_chinese/dev \
  --test_prefix=nmt_data_chinese/test --out_dir=nmt_attention_model_big_pte_batch64 \
  --num_train_steps=4800000 --steps_per_stats=100 --num_layers=2 --num_units=800 \
  --dropout=0.5 --metrics=bleu --learning_rate=0.001 --optimizer=adam \
  --encoder_type=bi --batch_size=64 --attention_architecture=gnmt_v2 --src_max_len=25 \
  --subword_option=bpe --unit_type=layer_norm_lstm --vocab_prefix=nmt_data_chinese/vocabulary &
```

brightmart avatar Nov 22 '17 08:11 brightmart

I have a question: my training corpus is about 70,000 sentences. What dictionary (vocabulary) size is appropriate? Thank you.

yapingzhao avatar Apr 18 '18 08:04 yapingzhao

It is a small corpus. You can try 50k. You can also use a larger size if your CPU/GPU allows.



brightmart avatar Apr 18 '18 08:04 brightmart

I would like to ask: is 50k equivalent to 50,000 (dictionary size)? I'm a neural network beginner (smile). Thank you.

yapingzhao avatar Apr 18 '18 09:04 yapingzhao

hi,

50k = 50,000
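For context, capping the dictionary at 50k simply means keeping the 50,000 most frequent tokens and mapping everything else to an unknown token. A minimal sketch of how such a vocabulary could be built (the special tokens and the `build_vocab` helper are illustrative, not part of the nmt repo):

```python
from collections import Counter

def build_vocab(lines, size=50000, specials=("<unk>", "<s>", "</s>")):
    # Count whitespace-separated tokens across the corpus
    counts = Counter(tok for line in lines for tok in line.split())
    # Keep the most frequent tokens, reserving slots for special symbols
    kept = [tok for tok, _ in counts.most_common(size - len(specials))]
    return list(specials) + kept

# Tiny example: a 6-entry vocabulary from two "sentences"
vocab = build_vocab(["a b a c", "a b"], size=6)
print(vocab)
```

With subword segmentation (the `--subword_option=bpe` flag mentioned above), the same idea applies to subword units rather than whole words.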



brightmart avatar Apr 19 '18 05:04 brightmart

@brightmart What is the size of your dev set? Does having a bigger dev set make training take longer to complete?

vikaskumarjha9 avatar Dec 18 '18 00:12 vikaskumarjha9

If you have a big dev set, you can evaluate on just a subset of it during training.
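One way to do that is to sample a fixed subset of the parallel dev data once, up front. A hedged sketch (the `sample_parallel` helper is hypothetical; the key point is that the same random indices must be used on both sides so source/target sentence pairs stay aligned):

```python
import random

def sample_parallel(src_lines, tgt_lines, n, seed=42):
    # Pick the same random line indices on both sides so that
    # each sampled source sentence keeps its target translation.
    idx = sorted(random.Random(seed).sample(range(len(src_lines)), n))
    return [src_lines[i] for i in idx], [tgt_lines[i] for i in idx]

# Example: shrink a 1000-pair dev set down to 100 aligned pairs
src = ["en sentence %d" % i for i in range(1000)]
tgt = ["zh sentence %d" % i for i in range(1000)]
small_src, small_tgt = sample_parallel(src, tgt, 100)
```

Using a fixed seed keeps the subset stable across evaluations, so BLEU numbers from different checkpoints remain comparable.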

brightmart avatar Dec 18 '18 13:12 brightmart