gpt-2-simple
Cannot convert to PyTorch using Hugging Face
Hugging Face requires "/path/to/gpt2/pretrained/weights", and I just don't understand what path I should enter here. It's not as though I haven't tried everything: the checkpoint folder, the folder inside it, and even the checkpoint file itself. But it just doesn't work.
!export OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights
!transformers-cli convert --model_type gpt2 \
--tf_checkpoint $OPENAI_GPT2_CHECKPOINT_PATH \
--pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
[--config OPENAI_GPT2_CONFIG] \
[--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]
Any help will be greatly appreciated.
Thanks!
Not sure how it works in the CLI, but in Python you could do:
config = transformers.GPT2Config.from_pretrained('gpt2')
model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)
Where index_path is the TensorFlow index checkpoint, so something like "checkpoint/run1/model-XXX.index".
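To avoid hard-coding the step number in "model-XXX.index", here is a minimal sketch that picks the newest model-*.index in the run folder. It assumes the standard gpt-2-simple checkpoint layout; the helper name latest_index_path is made up:

```python
import re
from pathlib import Path

def latest_index_path(ckpt_dir):
    """Return the model-XXX.index file with the highest step number."""
    candidates = []
    for p in Path(ckpt_dir).glob("model-*.index"):
        m = re.match(r"model-(\d+)\.index$", p.name)
        if m:
            candidates.append((int(m.group(1)), p))
    if not candidates:
        raise FileNotFoundError(f"no model-*.index under {ckpt_dir}")
    return str(max(candidates)[1])

# Usage (assumes transformers is installed and a finetuned run exists):
# index_path = latest_index_path("checkpoint/run1")
# model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)
```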
Docs on loading pretrained models
I managed to load the model, thanks to your advice. But I was struggling to load the Tokenizer. This seems to do the trick:
tokenizer = GPT2Tokenizer("checkpoint/run1/encoder.json", "checkpoint/run1/vocab.bpe")
print(tokenizer("Bxe3")['input_ids'])
but
tokenizer2 = GPT2Tokenizer.from_pretrained('gpt2')
print(tokenizer2("Bxe3")['input_ids'])
outputs the same input_ids [33, 27705, 18], which made me suspect something was wrong. But I guess I was mistaken, since I have a finetuned model, and finetuning does not change the tokenizer as far as I'm aware.
Anyway, maybe this helps someone.
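Since finetuning leaves the tokenizer untouched, one way to double-check is to compare the run's encoder.json against a vocab exported from the stock gpt2 tokenizer. A minimal sketch; the helper name and the stock-vocab path are assumptions:

```python
import json

def vocabs_match(path_a, path_b):
    """True if two vocab JSON files define identical token -> id maps."""
    with open(path_a, encoding="utf-8") as fa, open(path_b, encoding="utf-8") as fb:
        return json.load(fa) == json.load(fb)

# Usage: export the stock vocab first, e.g. via
#   tokenizer2.save_vocabulary("stock/")
# then compare:
# vocabs_match("checkpoint/run1/encoder.json", "stock/vocab.json")
```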
For those of you who get a tensor error, make sure you load the correct config.
config = transformers.GPT2Config.from_pretrained('/content/checkpoint/run1/hparams.json')
model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)
Hello! Has this been figured out? I tried to follow the same steps from here with this code:
config = transformers.GPT2Config.from_pretrained('/content/hparams.json')
tokenizer = transformers.GPT2Tokenizer("/content/encoder.json", "/content/vocab.bpe")
model = transformers.GPT2Model.from_pretrained('/content/model.index', from_tf=True, config=config)
But I'm seeing this error:
OpError: /content/model.data-00000-of-00001; No such file or directory
Has anybody seen this and know how to fix it?
Thanks!
Typically GPT2Model.from_pretrained should point to a folder rather than a specific file. Try
model = transformers.GPT2Model.from_pretrained('/content/', from_tf=True, config=config)
If that doesn't work, print your file structure in your content directory so we can debug.
Windows
Get-ChildItem -Directory -Recurse -Depth 2 | ForEach-Object {Write-Host $_.FullName}
Linux
tree -L 2
Or give us the code on how you downloaded the model
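If neither shell command is handy, the same listing can be done from the notebook itself in Python (depth limit of 2 to match the commands above; the helper name is made up):

```python
import os

def list_tree(root, max_depth=2):
    """Print directory and file names under root, down to max_depth levels."""
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirnames[:] = []  # don't descend any further
            continue
        indent = "  " * depth
        for name in dirnames + filenames:
            print(indent + name)

# list_tree("/content")
```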
Doing the above solved the issue from before but raised a new one: "IndexError: Read fewer bytes than requested".
I can't seem to find much information about this error, just a couple of Stack Overflow discussions that mention either insufficient memory or corrupted weight files. I didn't see Colab crash, and I'm unsure how to check for corrupted weight files. I could retrain and try with those files instead.
Have you seen that issue before?
As to how I downloaded the model, I just used what was provided in the gpt2-simple colab.
gpt2.copy_checkpoint_to_gdrive(run_name='run1')
Followed by downloading it to my local computer. So there is a chance something got corrupted along the way, but I figured I'd ask anyway just in case.
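One way to rule out corruption during the Drive copy and download is to hash each checkpoint file on both ends and compare the digests. A minimal sketch, assuming nothing beyond the standard library (the helper name is made up):

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """MD5 of a file, read in 1 MiB chunks so large weight files fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Run this in Colab and again locally; differing digests for the same
# file (e.g. the model-XXX.data-00000-of-00001 shard) mean the copy
# got corrupted somewhere along the way:
# print(file_md5("checkpoint/run1/model-XXX.data-00000-of-00001"))
```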
Many thanks for all the help so far!