gpt-2-simple

Cannot convert to pytorch using huggingface

Open designgrande opened this issue 4 years ago • 6 comments

Huggingface requires "/path/to/gpt2/pretrained/weights" and I just don't understand what path I should enter there. It's not that I haven't tried everything: the checkpoint folder, the folders inside it, and even the checkpoint file itself. Nothing works.

%env OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights

!transformers-cli convert --model_type gpt2 \
  --tf_checkpoint $OPENAI_GPT2_CHECKPOINT_PATH \
  --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
  [--config OPENAI_GPT2_CONFIG] \
  [--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]

Any help will be greatly appreciated.

Thanks!

designgrande avatar Mar 10 '20 07:03 designgrande

Not sure how it works in the CLI, but in Python you could do:

config = transformers.GPT2Config.from_pretrained('gpt2')
model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)

where index_path is the TensorFlow index checkpoint, so something like "checkpoint/run1/model-XXX.index".

Docs on loading pretrained models
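Putting those two lines together, a one-time conversion to a regular PyTorch checkpoint might look like the sketch below. This assumes transformers and TensorFlow are installed; the output folder name is a placeholder, and convert_to_pytorch is a hypothetical helper, not part of either library. The small prefix helper just strips the .index suffix, since TF checkpoints are addressed by their prefix:

```python
def checkpoint_prefix(index_path):
    """Return the TF checkpoint prefix for a .index file path.
    e.g. 'checkpoint/run1/model-XXX.index' -> 'checkpoint/run1/model-XXX'."""
    if index_path.endswith(".index"):
        return index_path[:-len(".index")]
    return index_path

def convert_to_pytorch(index_path, out_dir):
    """Load a gpt-2-simple TF checkpoint and save it in PyTorch format."""
    # Heavy imports kept inside the function; requires transformers + TF.
    import transformers
    config = transformers.GPT2Config.from_pretrained("gpt2")
    model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)
    model.save_pretrained(out_dir)  # writes config.json + pytorch_model.bin
```

After save_pretrained, the output folder can be loaded later with a plain from_pretrained(out_dir), no from_tf needed.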

apeguero1 avatar Mar 14 '20 14:03 apeguero1

I managed to load the model, thanks to your advice. But I was struggling to load the Tokenizer. This seems to do the trick:

tokenizer = GPT2Tokenizer("checkpoint/run1/encoder.json", "checkpoint/run1/vocab.bpe")
print(tokenizer("Bxe3")['input_ids'])

but

tokenizer2 = GPT2Tokenizer.from_pretrained('gpt2')
print(tokenizer2("Bxe3")['input_ids'])

outputs the same input_ids [33, 27705, 18], which made me suspect something was wrong. But that suspicion was unfounded: I have a finetuned model, and finetuning does not change the tokenizer as far as I'm aware.

Anyways maybe this helps someone.
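If you want more than a single spot check, a tiny helper can compare the two tokenizers over a handful of strings. This is a hypothetical utility, not part of transformers; it works with any callable that returns a dict containing 'input_ids':

```python
def tokenizer_disagreements(tok_a, tok_b, samples):
    """Return the samples on which two tokenizers produce different input_ids."""
    return [s for s in samples
            if tok_a(s)["input_ids"] != tok_b(s)["input_ids"]]

# Usage with the tokenizers from above (sample strings are arbitrary):
# diffs = tokenizer_disagreements(tokenizer, tokenizer2, ["Bxe3", "Nf6", "O-O"])
# print(diffs or "tokenizers agree on all samples")
```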

bablf avatar Mar 17 '21 16:03 bablf

Not sure how it works in the CLI, but in Python you could do:

config = transformers.GPT2Config.from_pretrained('gpt2')
model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)

where index_path is the TensorFlow index checkpoint, so something like "checkpoint/run1/model-XXX.index".

Docs on loading pretrained models

For those of you who get a tensor error, make sure you load the correct config:

config = transformers.GPT2Config.from_pretrained('/content/checkpoint/run1/hparams.json')
model = transformers.GPT2Model.from_pretrained(index_path, from_tf=True, config=config)
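If from_pretrained chokes on hparams.json directly (behavior varies across transformers versions), you can build the config by hand. This sketch assumes the standard GPT-2 hparams.json keys (n_vocab, n_ctx, n_embd, n_head, n_layer); the mapping to GPT2Config argument names is my reading of the docs, so double-check it against your version:

```python
import json

# Map the keys gpt-2-simple's hparams.json uses to GPT2Config argument names.
HPARAMS_TO_CONFIG = {
    "n_vocab": "vocab_size",
    "n_ctx": "n_ctx",
    "n_embd": "n_embd",
    "n_head": "n_head",
    "n_layer": "n_layer",
}

def map_hparams(hparams):
    """Translate a parsed hparams dict into GPT2Config keyword arguments."""
    return {HPARAMS_TO_CONFIG[k]: v for k, v in hparams.items() if k in HPARAMS_TO_CONFIG}

def config_kwargs_from_hparams(path):
    with open(path) as f:
        return map_hparams(json.load(f))

# Then:
# import transformers
# config = transformers.GPT2Config(**config_kwargs_from_hparams('/content/checkpoint/run1/hparams.json'))
```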

enochlev avatar Dec 30 '22 19:12 enochlev

Hello! Has this been figured out? I tried to follow the same approach from here with this code:

config = transformers.GPT2Config.from_pretrained('/content/hparams.json')
tokenizer = transformers.GPT2Tokenizer("/content/encoder.json", "/content/vocab.bpe")
model = transformers.GPT2Model.from_pretrained('/content/model.index', from_tf=True, config=config)

But I'm seeing this error:

OpError: /content/model.data-00000-of-00001; No such file or directory

Has anybody seen this and know how to fix it?

Thanks!

aenriquez27 avatar Jan 29 '24 02:01 aenriquez27

Typically, GPT2Model.from_pretrained should point to a folder rather than a specific file. Try model = transformers.GPT2Model.from_pretrained('/content', from_tf=True, config=config)

If that doesn't work, print your file structure in your content directory so we can debug.

Windows: Get-ChildItem -Directory -Recurse -Depth 2 | ForEach-Object {Write-Host $_.FullName}

Linux: tree -L 2

Or share the code you used to download the model.
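A cross-platform alternative using only the Python stdlib can both print the tree and flag the error above. A TF checkpoint is a set of sibling files (model-XXX.index, model-XXX.data-00000-of-00001, model-XXX.meta), and that OpError usually means the .data file is missing next to the .index. Both helpers below are my own sketch, not from the thread:

```python
import glob
import os

def print_tree(root, max_depth=2):
    """Print the directory tree under root, up to max_depth levels deep."""
    root = root.rstrip(os.sep)
    base_depth = root.count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.count(os.sep) - base_depth
        if depth > max_depth:
            dirnames[:] = []  # prune: don't descend further
            continue
        indent = "  " * depth
        print(indent + os.path.basename(dirpath) + "/")
        for name in filenames:
            print(indent + "  " + name)

def missing_checkpoint_files(index_path):
    """Given model-XXX.index, report which sibling checkpoint files are absent."""
    prefix = index_path[:-len(".index")]
    missing = []
    if not glob.glob(prefix + ".data-*"):
        missing.append(prefix + ".data-*")
    if not os.path.exists(prefix + ".meta"):
        missing.append(prefix + ".meta")
    return missing
```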

enochlev avatar Jan 29 '24 17:01 enochlev

Typically, GPT2Model.from_pretrained should point to a folder rather than a specific file. Try model = transformers.GPT2Model.from_pretrained('/content', from_tf=True, config=config)

If that doesn't work, print your file structure in your content directory so we can debug.

Windows: Get-ChildItem -Directory -Recurse -Depth 2 | ForEach-Object {Write-Host $_.FullName}

Linux: tree -L 2

Or share the code you used to download the model.

Doing the above solved the issue from before but raised a new one: "IndexError: Read fewer bytes than requested".

I can't seem to find much information about this error, just two Stack Overflow discussions that mention either insufficient memory or corrupted weight files. I didn't see Colab crash, and I'm unsure how to check for corrupted weight files. I could retrain and try with those new files instead.

Have you seen that issue before?
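One cheap way to rule out corruption during the Drive-to-local transfer (my suggestion, not from the thread): hash each file on Colab before downloading, then hash your local copies and compare; any mismatch identifies the file that got mangled in transit. Stdlib-only sketch:

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large checkpoints don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def checksum_dir(root):
    """Map relative file path -> sha256, for comparing two copies of a checkpoint."""
    return {
        os.path.relpath(os.path.join(dirpath, name), root):
            sha256_of(os.path.join(dirpath, name))
        for dirpath, _, filenames in os.walk(root)
        for name in filenames
    }

# Run checksum_dir('checkpoint/run1') on both machines and diff the two dicts.
```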

As to how I downloaded the model, I just used what was provided in the gpt-2-simple Colab:

gpt2.copy_checkpoint_to_gdrive(run_name='run1')

followed by downloading it to my local computer. So there's a chance something got corrupted along the way, but I figured I'd ask anyway just in case.

Many thanks for all the help so far!

aenriquez27 avatar Jan 31 '24 03:01 aenriquez27