style-transfer-paraphrase
OSError: file style_paraphrase/saved_models/test_paraphrase/config.json not found
I tried training the paraphraser with gpt2 (small), since the large model would not fit on my 1080 Ti. Everything went fine until the last iteration, where I got the error below. The final checkpoint seems to have been saved successfully, but Python then tries to read style_paraphrase/saved_models/test_paraphrase/config.json, which was never created and does not exist. All the config.json files are inside their respective checkpoint folders.
12/03/2020 18:22:39 - INFO - __main__ - global_step = 21918, average loss = 1.8063476276852939
12/03/2020 18:22:40 - INFO - __main__ - Saving model checkpoint to style_paraphrase/saved_models/test_paraphrase/checkpoint-21918
Traceback (most recent call last):
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/transformers/configuration_utils.py", line 369, in get_config_dict
local_files_only=local_files_only,
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/transformers/file_utils.py", line 957, in cached_path
raise EnvironmentError("file {} not found".format(url_or_filename))
OSError: file style_paraphrase/saved_models/test_paraphrase/config.json not found
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "style_paraphrase/run_lm_finetuning.py", line 507, in <module>
main()
File "style_paraphrase/run_lm_finetuning.py", line 437, in main
tokenizer_class=tokenizer_class)
File "/home/ioannis/Desktop/style-transfer-paraphrase/style_paraphrase/utils.py", line 51, in init_gpt2_model
model = model_class.from_pretrained(checkpoint_dir)
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/transformers/modeling_utils.py", line 876, in from_pretrained
**kwargs,
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/transformers/configuration_utils.py", line 329, in from_pretrained
config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/transformers/configuration_utils.py", line 382, in get_config_dict
raise EnvironmentError(msg)
OSError: Can't load config for 'style_paraphrase/saved_models/test_paraphrase'. Make sure that:
- 'style_paraphrase/saved_models/test_paraphrase' is a correct model identifier listed on 'https://huggingface.co/models'
- or 'style_paraphrase/saved_models/test_paraphrase' is the correct path to a directory containing a config.json file
Traceback (most recent call last):
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
main()
File "/home/ioannis/anaconda3/envs/style-transfer-paraphrase/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ioannis/anaconda3/envs/style-transfer-paraphrase/bin/python', '-u', 'style_paraphrase/run_lm_finetuning.py', '--local_rank=0', '--output_dir=style_paraphrase/saved_models/test_paraphrase', '--model_type=gpt2', '--model_name_or_path=gpt2', '--data_dir=datasets/paranmt_filtered', '--do_train', '--save_steps', '500', '--logging_steps', '20', '--save_total_limit', '-1', '--evaluate_during_training', '--num_train_epochs', '3', '--gradient_accumulation_steps', '2', '--per_gpu_train_batch_size', '5', '--job_id', 'paraphraser_test', '--learning_rate', '5e-5', '--prefix_input_type', 'original', '--global_dense_feature_list', 'none', '--specific_style_train', '-1', '--optimizer', 'adam']' returned non-zero exit status 1.
Btw, I get the same problem with run_finetune_shakespeare_0.sh when training with gpt2 (small).
Thanks for reporting this! I will look more closely later in the day / tomorrow, but which HuggingFace transformers library version are you using?
Should be transformers==3.4.0, as in the requirements file. I installed everything in a fresh conda env with python==3.7.
Btw, I am looking forward to the directions for training the inverse model on custom data!
I just tried running it with GPT2-small, and I can see the config.json files. Could you share the set of files you see in your checkpoint folder?
I had the same issue when training my models. It seems there is a bug with the path in this line: when re-loading the model, args.output_dir is used instead of the output_dir that is defined a few lines above, so the load points to the parent folder of all the checkpoints instead of the folder with the last checkpoint. I haven't tested whether this fixes the problem, but I will try it on my next run on the cluster.
Just to follow up: changing the line mentioned above did fix the error. Just make sure that --do_eval is set and that you are not using do_delete_old. This way the best checkpoint (i.e. the one with the lowest validation perplexity) will be copied to the output dir / parent folder of all the checkpoints after training is finished.
@martiansideofthemoon Just curious, how could I also load gpt2-small as you did? It seems that it is not offered on the HuggingFace model hub.
@guanqun-yang you can just use gpt2 offered on HuggingFace (https://huggingface.co/gpt2) — the plain gpt2 checkpoint is the small model.