How to load the GPT-2 model

Open strangeoptics opened this issue 2 years ago • 2 comments

Can you give an example of how to use the official GPT-2 model? I downloaded it successfully via https://raw.githubusercontent.com/openai/gpt-2/master/download_model.py, then moved model.ckpt.data-00000-of-00001 and renamed it to /out/ckpt.pt. But I got pickle errors when loading it.

python sample.py config\eval_gpt2.py

Traceback (most recent call last):
  File "d:\work\AI\nanoGPT\sample.py", line 35, in <module>
    checkpoint = torch.load(ckpt_path, map_location=device)
  File "c:\Users\orosa\anaconda3\envs\evn_nanogpt\lib\site-packages\torch\serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "c:\Users\orosa\anaconda3\envs\evn_nanogpt\lib\site-packages\torch\serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x03'.

strangeoptics avatar Jan 12 '23 19:01 strangeoptics

In sample.py you need to set init_from = 'gpt2'
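For context on the traceback above: the file downloaded by download_model.py is a TensorFlow checkpoint, and renaming it to ckpt.pt does not turn it into something torch.load can read, which is why the unpickler rejects the first byte ('\x03'). A rough way to check a file before loading it is sketched below. The helper name looks_like_torch_checkpoint is hypothetical, and the check is only a heuristic based on the first bytes of the file, not an official PyTorch API.

```python
import zipfile


def looks_like_torch_checkpoint(path):
    """Heuristic guess at whether `path` was written by torch.save.

    Hypothetical helper for illustration; it inspects the file layout
    only and never unpickles anything.
    """
    # Recent torch.save output is a zip archive.
    if zipfile.is_zipfile(path):
        return True
    # The older format is a raw pickle stream. Pickle protocol 2 and
    # later begin with the PROTO opcode, byte 0x80.
    with open(path, "rb") as f:
        return f.read(1) == b"\x80"
```

A file for which this returns False, such as a renamed TensorFlow data shard, will not load with torch.load regardless of its extension.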

pannous avatar Jan 30 '23 14:01 pannous

I do not think that this modification is required any longer: config/eval_gpt2.py already contains the line init_from = 'gpt2', which overrides the setting in sample.py.
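The override described above can be pictured with a small, self-contained sketch. This is a simplified illustration and not nanoGPT's actual code: the default values, the config_text string, and the apply_config helper are all assumptions made for the example. The idea is that the script defines defaults first, then runs the config file, so any assignment in the config file wins.

```python
# Defaults as they might appear at the top of a script (illustrative values).
settings = {"init_from": "resume", "out_dir": "out"}

# Stand-in for the contents of a config file such as config/eval_gpt2.py.
config_text = "init_from = 'gpt2'\n"


def apply_config(defaults, source):
    """Run config source code and let its assignments override the defaults."""
    merged = dict(defaults)      # copy so the defaults stay untouched
    exec(source, {}, merged)     # assignments in `source` land in `merged`
    return merged


effective = apply_config(settings, config_text)
print(effective["init_from"])  # prints: gpt2
```

With this ordering, passing the config file on the command line is enough, and sample.py itself does not need to be edited.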

This issue may therefore be closed now, I guess.

rozek avatar Mar 29 '23 09:03 rozek