Squad v2 does not seem to train.

Open Matimath opened this issue 5 years ago • 2 comments

Hi, I have run into the following issue. I run SQuAD v2 with the following command:

python -m run_squad_v2
--albert_config_file=albert_base/albert_config.json
--output_dir=./outputs
--train_file=squad/train-v2.0.json
--predict_file=squad/dev-v2.0.json
--vocab_file=albert_base/30k-clean.vocab
--train_feature_file=train_feature_file.tf
--predict_feature_file=predict_feature_file.tf
--predict_feature_left_file=predict_left_feature_file.tf
--init_checkpoint=albert_base/model.ckpt-best.index
--spm_model_file=albert_base/30k-clean.model
--do_lower_case
--max_seq_length=384
--doc_stride=128
--max_query_length=64
--do_train
--do_predict
--train_batch_size=48
--predict_batch_size=8
--learning_rate=5e-5
--num_train_epochs=5.0
--warmup_proportion=.1
--save_checkpoints_steps=5000
--n_best_size=20
--max_answer_length=30

Unfortunately, the model does not seem to train. I get the following results:

exact = 15.269940200454814 f1 = 17.577793829645486 null_score_diff_threshold = 0.0 total = 11873 best f1 = 50.07159100480081

exact = 19.910721805777815 f1 = 21.955459178333086 null_score_diff_threshold = 0.0 total = 11873 best f1 = 50.07159100480081

exact = 18.05777815210983 f1 = 20.522545630117914 null_score_diff_threshold = 0.0 total = 11873 best f1 = 50.07159100480081

exact = 22.041607007495998 f1 = 24.285709999165526 null_score_diff_threshold = 0.0 total = 11873 best f1 = 50.07159100480081

exact = 50.07159100480081 f1 = 50.07159100480081 null_score_diff_threshold = 0.4828525483608246 total = 11873 best perf happened at step: 0
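A quick sanity check on the last line may help narrow this down. When `exact` and `f1` are identical and the best result is at step 0, the score may simply reflect answering "no answer" for every question. The sketch below converts the reported percentage back into a question count, using only the numbers printed above. The helper name `implied_count` is hypothetical, and the interpretation relies on an assumption not stated in this thread: that the SQuAD v2.0 dev set contains 5945 unanswerable questions.

```python
def implied_count(percent: float, total: int) -> int:
    """Convert a percentage score back into a whole number of questions."""
    return round(percent * total / 100)


# Values copied from the final results line above.
best_exact = 50.07159100480081
total = 11873

count = implied_count(best_exact, total)
print(count)

# Assumption: 5945 is the number of unanswerable dev questions.
# If that holds, the "best" score is what an all-"no answer" predictor gets.
print(count == 5945)
```

If the count matches the number of unanswerable questions, the threshold search is probably not finding any useful span predictions, which would fit a model whose pretrained weights were never loaded.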

Could you give me a hint about what the problem might be?

Matimath avatar Jan 24 '20 11:01 Matimath

I changed --init_checkpoint=albert_base/model.ckpt-best.index to --init_checkpoint=albert_base/model.ckpt-best, and it seems to work.
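The likely reason is that TensorFlow addresses a checkpoint by its prefix (`model.ckpt-best`), while `.index` and `.data-00000-of-00001` are individual files written under that prefix. Below is a minimal sketch of a guard that normalizes such a path before passing it to `--init_checkpoint`. The helper `checkpoint_prefix` is hypothetical and not part of the ALBERT repository.

```python
import re

# File suffixes TensorFlow appends to a checkpoint prefix on disk.
_CKPT_FILE_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path: str) -> str:
    """Return the checkpoint prefix for a path that may name one of its files."""
    return _CKPT_FILE_SUFFIX.sub("", path)


print(checkpoint_prefix("albert_base/model.ckpt-best.index"))
# prints "albert_base/model.ckpt-best"
```

A path that is already a prefix passes through unchanged, so the helper is safe to apply to any value given for the flag.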

penut85420 avatar Feb 06 '20 03:02 penut85420

Hi, can you show us your new results? Thanks!

urextra avatar Jul 08 '20 14:07 urextra