attention-is-all-you-need-pytorch
Training is slow and inaccurate
I ran:
python train.py -data_pkl ./bpe_deen/bpe_vocab.pkl -train_path ./bpe_deen/deen-train -val_path ./bpe_deen/deen-val -log deen_bpe -embs_share_weight -proj_share_weight -label_smoothing -save_model trained -b 64 -warmup 128000 -epoch 400
However, training is slow and the accuracy stays low:
[ Epoch 306 ]
- (Training) ppl: 69.81372, accuracy: 39.721 %, elapse: 24.194 min
- (Validation) ppl: 501.22445, accuracy: 18.472 %, elapse: 0.088 min
[ Epoch 307 ]
- (Training) ppl: 69.74481, accuracy: 39.743 %, elapse: 24.189 min
- (Validation) ppl: 463.02458, accuracy: 19.354 %, elapse: 0.089 min
I reduced batch_size from 256 to 64 because of limited CUDA memory. Could this be the reason?
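Besides the batch size, the -warmup 128000 value in the command may be worth checking. The following is only a sketch: it assumes train.py follows the learning-rate schedule from the "Attention Is All You Need" paper, which is not confirmed in this thread. The function name noam_lr and the default d_model = 512 (the paper's base model size) are assumptions for illustration, not taken from the repository.

```python
def noam_lr(step, d_model=512, warmup=4000):
    """Learning rate at a given optimizer step under the paper's schedule.

    lr = d_model^-0.5 * min(step^-0.5, step * warmup^-1.5)
    Assumption: d_model=512 and this exact formula; the script may differ.
    """
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


# Compare a short warmup (the paper's 4000) with the 128000 used in the command.
for warmup in (4000, 128000):
    peak = noam_lr(warmup, warmup=warmup)   # the peak is reached at step == warmup
    early = noam_lr(1000, warmup=warmup)    # rate after 1000 updates
    print(f"warmup={warmup}: lr@1000={early:.2e}, peak={peak:.2e}")
```

Under these assumptions, a longer warmup lowers both the early learning rate and the peak, and the peak arrives later. A smaller batch also means each update sees fewer tokens, so the two settings interact. Whether either one explains the numbers above would need to be checked against the actual script.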
Hello, I ran into exactly the same issue. Did you find the cause?
I have the same issue too.