[Chatllama] "GPT2 model is missing" error when running the tutorial
Hi all,
I am a beginner and have a small question. I followed the README and tried to run the training with the default config.yaml. However, I got an error that the GPT2 model does not exist in the models folder. Should I just place the GPT2 weight file there, or do I also need a model definition file (like model.py) for GPT2? Looking at this GitHub repository, only the model file for LLaMA (llama_model.py) exists, so I feel like I need to add a model.py for GPT2 as well. If possible, I would also like to know where I can get pretrained GPT2 weights compatible with this chatLLaMA. Thank you.
The error message is the following:
Warning, Impossible to load the model: ./models/gpt2-large.pt No previous checkpoint found.
Warning, Impossible to load the model: ./models/gpt2-large.pt No previous checkpoint found.
Start RL Training
Episode: 1 of 100, Timestep: 1 of 32
Traceback (most recent call last):
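For context, this is roughly how I was planning to produce that checkpoint file myself. It is only a sketch: I am assuming the trainer expects a plain PyTorch state dict at ./models/gpt2-large.pt, which may not be the format chatllama actually wants.

```python
# Sketch only: export Hugging Face gpt2-large weights to the path from the
# error message. Whether chatllama expects a raw state dict at this path is
# an assumption on my part.
import os

import torch
from transformers import GPT2LMHeadModel

os.makedirs("./models", exist_ok=True)

# Download the pretrained gpt2-large weights from the Hugging Face hub.
model = GPT2LMHeadModel.from_pretrained("gpt2-large")

# Save the raw state dict where the trainer looks for its checkpoint.
torch.save(model.state_dict(), "./models/gpt2-large.pt")
```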
Hi @TakafumiYano, can we have the full bug report? A new version with updated model loading should be out soon. By the way, if it can't load any model, it may be because you haven't performed any actor or reward training before RLHF. In any case it should load the default GPT model weights and return this warning without raising any error, so you should still be able to run the training.
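Simplified, the intended loading behavior looks roughly like the sketch below. This is not the actual chatllama code, and the assumption that the checkpoint is a plain state dict is mine, but it shows why the warning alone should not stop training:

```python
# Sketch of the intended fallback: if no checkpoint is found at the given
# path, warn and fall back to the default pretrained GPT-2 weights instead
# of raising. Simplified illustration, not the chatllama source.
import os

import torch
from transformers import GPT2LMHeadModel


def load_actor_model(checkpoint_path: str = "./models/gpt2-large.pt"):
    # Default pretrained weights from the Hugging Face hub.
    model = GPT2LMHeadModel.from_pretrained("gpt2-large")
    if os.path.exists(checkpoint_path):
        # A previous actor/reward training run saved a checkpoint: load it.
        state_dict = torch.load(checkpoint_path, map_location="cpu")
        model.load_state_dict(state_dict)
    else:
        # No checkpoint yet: keep the default weights and only warn.
        print(f"Warning, Impossible to load the model: {checkpoint_path} "
              "No previous checkpoint found.")
    return model
```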
Hi @PierpaoloSorbellini
Thank you for your reply. I will put together the full report and share it here. Thank you.
Takafumi.