Error while running the pretrained model on MNLI
I used the following command to evaluate on MNLI with the pretrained model:
python run_classifier.py --do_train=False --do_eval=True --task_name=mnli_matched --data_dir=../MNLI/MNLI --output_dir=results --model_dir=model/xlnet_cased_L-24_H-1024_A-16 --uncased=False --spiece_model_file=model/xlnet_cased_L-24_H-1024_A-16/spiece.model --model_config_path=model/xlnet_cased_L-24_H-1024_A-16/xlnet_config.json --max_seq_length=128 --eval_batch_size=8 --num_hosts=1 --num_core_per_host=1 --eval_all_ckpt=False --is_regression=False
It throws the following error: NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Key model/classification_mnli_matched/logit/bias not found in checkpoint [[node save/RestoreV2 (defined at /home/demo/anaconda2/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py:1537) ]]
Am I using the correct command (TF version 1.13.1)? Thank you.
You need to set init_checkpoint to model/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt and model_dir to a new, separate folder.
Thank you for such a fast response.
I am using this command now:
python run_classifier.py --do_train=False --do_eval=True --task_name=mnli_matched --data_dir=../MNLI/MNLI --output_dir=results --init_checkpoint=model/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt --uncased=False --spiece_model_file=model/xlnet_cased_L-24_H-1024_A-16/spiece.model --model_config_path=model/xlnet_cased_L-24_H-1024_A-16/xlnet_config.json --max_seq_length=128 --eval_batch_size=8 --num_hosts=1 --num_core_per_host=1 --eval_all_ckpt=False --is_regression=False --model_dir=f_model
where f_model is a new directory. The help text for this argument reads "Directory for saving the finetuned model." Since I am only evaluating and not training, I assume this folder should remain empty.
Now I am getting the following error (as eval_results is empty):
Traceback (most recent call last):
File "run_classifier.py", line 855, in
My question: at line 759 the filenames are read from model_dir, which in my case is just an empty folder:
filenames = tf.gfile.ListDirectory(FLAGS.model_dir)
Let me know if something is wrong here.
I think I will read the code in detail to see if I am missing something, but just in case you happen to know the fix then that would be great (and quick). Thank you again.
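To illustrate why an empty model_dir leaves nothing to evaluate, here is a minimal sketch of a checkpoint scan over a directory listing. It is not the actual code from run_classifier.py: `find_checkpoints` is a hypothetical helper, it uses `os.listdir` in place of `tf.gfile.ListDirectory`, and it assumes TensorFlow-style file names such as `model.ckpt-1200.index`.

```python
import os
import tempfile


def find_checkpoints(model_dir):
    """Return (global_step, checkpoint_prefix) pairs found in model_dir, sorted by step.

    Assumes index files are named like `model.ckpt-1200.index`; a name
    without a trailing `-<step>` would raise ValueError in this sketch.
    """
    found = []
    for filename in os.listdir(model_dir):
        if filename.endswith(".index"):
            prefix = filename[: -len(".index")]   # e.g. "model.ckpt-1200"
            step = int(prefix.split("-")[-1])     # e.g. 1200
            found.append((step, os.path.join(model_dir, prefix)))
    return sorted(found)


# An empty directory yields no checkpoints, so there is nothing to evaluate.
empty_dir = tempfile.mkdtemp()
print(find_checkpoints(empty_dir))

# A directory holding one (fake, empty) index file yields one entry.
trained_dir = tempfile.mkdtemp()
open(os.path.join(trained_dir, "model.ckpt-1200.index"), "w").close()
print(find_checkpoints(trained_dir))
```

If the scan comes back empty, any results list built from it stays empty as well, which matches the symptom described above.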
You can't do eval without training because there are task-specific parameters (the output layer).
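One way to see this concretely is to compare the variable names stored in a checkpoint with the names the classifier graph asks for. The sketch below is illustrative only: `missing_keys` and `list_checkpoint_keys` are hypothetical helpers, and the sample variable names are made up for the example, except `model/classification_mnli_matched/logit/bias`, which comes from the error message earlier in this thread. Reading a real checkpoint is assumed to go through `tf.train.list_variables`.

```python
def missing_keys(graph_keys, checkpoint_keys):
    """Return the graph variable names that the checkpoint does not contain, sorted."""
    return sorted(set(graph_keys) - set(checkpoint_keys))


def list_checkpoint_keys(ckpt_path):
    """Read variable names from a checkpoint (requires TensorFlow to be installed)."""
    import tensorflow as tf  # imported lazily so the rest of the sketch runs without TF
    return [name for name, _shape in tf.train.list_variables(ckpt_path)]


# Illustrative names only; a real pretrained checkpoint holds many more variables.
pretrained = ["model/transformer/word_embedding/lookup_table"]
classifier_graph = pretrained + [
    "model/classification_mnli_matched/logit/kernel",
    "model/classification_mnli_matched/logit/bias",
]

# The task-specific output layer is in the graph but not in the pretrained checkpoint.
print(missing_keys(classifier_graph, pretrained))
```

With a real checkpoint, `list_checkpoint_keys("model/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt")` could replace the hand-written `pretrained` list.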
Ugh, I should have seen that. Thank you!
Well, I think it's possible, but it does not make much sense.
True. I think fine-tuning first makes sense too, and MNLI has a training set.
Hi @LeenaShekhar, I ran into the same issue. Have you solved it yet?
I am having the same issue too. I followed the suggestions above, but I get a DataLossError on the .ckpt-1200.data-00000-of-00001 file. Any suggestions?
I also encountered the same issue. It was solved by setting data_dir and output_dir to different paths.
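A quick pre-flight check along these lines could catch overlapping directories before launching the script. This is only a sketch: `shared_paths` is a hypothetical helper, not part of run_classifier.py, and it compares absolute paths without resolving symlinks.

```python
import os


def shared_paths(**dirs):
    """Return groups of flag names whose directories resolve to the same absolute path."""
    by_path = {}
    for flag, path in dirs.items():
        by_path.setdefault(os.path.abspath(path), []).append(flag)
    return [sorted(flags) for flags in by_path.values() if len(flags) > 1]


# Distinct directories, as in the command earlier in this thread: no overlap reported.
print(shared_paths(data_dir="../MNLI/MNLI", output_dir="results", model_dir="f_model"))

# data_dir and output_dir pointing at the same folder are reported together.
print(shared_paths(data_dir="results", output_dir="./results", model_dir="f_model"))
```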