xlnet Error while running the pretrained model on MNLI

Used the following command to run MNLI using the pretrained model: python run_classifier.py --do_train=False --do_eval=True --task_name=mnli_matched --data_dir=../MNLI/MNLI --output_dir=results --model_dir=model/xlnet_cased_L-24_H-1024_A-16 --uncased=False --spiece_model_file=model/xlnet_cased_L-24_H-1024_A-16/spiece.model --model_config_path=model/xlnet_cased_L-24_H-1024_A-16/xlnet_config.json --max_seq_length=128 --eval_batch_size=8 --num_hosts=1 --num_core_per_host=1 --eval_all_ckpt=False --is_regression=False

It throws the following error: NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key model/classification_mnli_matched/logit/bias not found in checkpoint [[node save/RestoreV2 (defined at /home/demo/anaconda2/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/estimator.py:1537) ]]

Am I using the correct command (TF version 1.13.1)? Thank you.

Jun 24 '19 23:06 LeenaShekhar

You need to set init_checkpoint to be model/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt and model_dir to be a new separate folder.

Jun 24 '19 23:06 kimiyoung

Thank you for such a fast response.

I am using this command now: python run_classifier.py --do_train=False --do_eval=True --task_name=mnli_matched --data_dir=../MNLI/MNLI --output_dir=results --init_checkpoint=model/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt --uncased=False --spiece_model_file=model/xlnet_cased_L-24_H-1024_A-16/spiece.model --model_config_path=model/xlnet_cased_L-24_H-1024_A-16/xlnet_config.json --max_seq_length=128 --eval_batch_size=8 --num_hosts=1 --num_core_per_host=1 --eval_all_ckpt=False --is_regression=False --model_dir=f_model

Here f_model is a new directory. The help text for this argument says "Directory for saving the finetuned model." Since I am only running evaluation and not training, my assumption is that this folder should remain empty.

Now I am getting the following error (as eval_results is empty):

Traceback (most recent call last):
  File "run_classifier.py", line 855, in <module>
    tf.app.run()
  File "/home/demo/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "run_classifier.py", line 797, in main
    for key, val in sorted(eval_results[0].items(), key=lambda x: x[0]):
IndexError: list index out of range

My question is about line 759, where the filenames are read from model_dir, which in my case is just an empty folder: filenames = tf.gfile.ListDirectory(FLAGS.model_dir)
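To make the failure mode concrete, here is a minimal sketch of how an empty model_dir can lead to that IndexError. It is not the code from run_classifier.py: list_checkpoints and evaluate_all are hypothetical stand-ins, and the assumption that a checkpoint is recognized by its .index file is mine.

```python
import os
import tempfile


def list_checkpoints(model_dir):
    """Return checkpoint prefixes in model_dir.

    Assumption: a checkpoint is identified by a file ending in ".index".
    """
    prefixes = []
    for name in sorted(os.listdir(model_dir)):
        if name.endswith(".index"):
            prefixes.append(os.path.join(model_dir, name[: -len(".index")]))
    return prefixes


def evaluate_all(model_dir):
    """Hypothetical stand-in for the eval loop: one result dict per checkpoint."""
    return [{"checkpoint": ckpt} for ckpt in list_checkpoints(model_dir)]


if __name__ == "__main__":
    # A freshly created directory, like f_model in the command above.
    empty_dir = tempfile.mkdtemp()
    eval_results = evaluate_all(empty_dir)
    try:
        eval_results[0]  # same access pattern as in the traceback
    except IndexError as err:
        print("No checkpoints found, so:", err)
```

With no checkpoint files in the directory, the list of results stays empty and indexing its first element fails, which would match the traceback.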

Let me know if something is wrong here.

I think I will read the code in detail to see if I am missing something, but just in case you happen to know the fix then that would be great (and quick). Thank you again.

Jun 25 '19 00:06 LeenaShekhar

You can't do eval without training because there are task-specific parameters (the output layer).
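As a rough illustration of why the restore fails, the sketch below compares the variable names a fine-tuning graph expects against the names stored in a checkpoint. The helper missing_keys and both name lists are hypothetical; only model/classification_mnli_matched/logit/bias comes from the error message in this thread. With TensorFlow installed, the checkpoint names could presumably be read with tf.train.list_variables, which returns (name, shape) pairs.

```python
def missing_keys(graph_vars, checkpoint_vars):
    """Return the graph variable names that the checkpoint cannot supply."""
    available = set(checkpoint_vars)
    return [name for name in graph_vars if name not in available]


# Illustrative names only. The pretrained checkpoint holds the shared
# transformer weights but no task-specific output layer.
PRETRAINED_CKPT_VARS = [
    "model/transformer/word_embedding/lookup_table",  # assumed name
]

# The classifier graph adds an output layer for the task on top.
CLASSIFIER_GRAPH_VARS = PRETRAINED_CKPT_VARS + [
    "model/classification_mnli_matched/logit/bias",  # from the error above
]

if __name__ == "__main__":
    for name in missing_keys(CLASSIFIER_GRAPH_VARS, PRETRAINED_CKPT_VARS):
        print("Key not found in checkpoint:", name)
```

The output-layer variables only exist in a checkpoint after fine-tuning has written one to model_dir.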

Jun 25 '19 00:06 kimiyoung

Ugh should have seen that. Thank you!

Jun 25 '19 00:06 LeenaShekhar

Well, I think it's possible, but it does not make much sense.

Jun 25 '19 00:06 kimiyoung

True. I think fine-tuning first makes sense too, and MNLI does come with a training set.

Jun 25 '19 00:06 LeenaShekhar

Hi @LeenaShekhar, I ran into the same issue. Have you solved it yet?

Jul 06 '19 13:07 yana-xuyan

I am having the same issue too. I followed the suggestions above, but I get a DataLossError on the .ckpt-1200.data-00000-of-00001 file. Any suggestions?

Sep 11 '19 18:09 descartesholland

I also encountered the same issue. It was solved by setting data_dir and output_dir to different paths.

Nov 23 '19 13:11 samueldaliu
