
Struggling to train LLaMA on a single GPU with both PT v1 and v2


Hi, I love your code base and want to try training LLaMA on a single GPU. The code I use is here: https://github.com/juncongmoo/pyllama/blob/main/llama/model_single.py. However, I run into an error:

```
  self.tok_embeddings = nn.Embedding(params.vocab_size, params.dim)
File "/home/linh/anaconda3/envs/a/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 139, in __init__
  self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
RuntimeError: Trying to create tensor with negative dimension -1: [-1, 512]
```

Can you help me fix/test this code?

Thanks in advance. Linh

linhduongtuan avatar Mar 11 '23 03:03 linhduongtuan

Guess your torch version is too old?

mldevorg avatar Mar 11 '23 06:03 mldevorg

No. I tested both PT v1 and v2 (updated very recently).

linhduongtuan avatar Mar 11 '23 07:03 linhduongtuan
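For context on why the torch version does not matter here: in `model_single.py`, `ModelArgs` declares `vocab_size` with a sentinel default of `-1`, which the checkpoint-loading path normally overwrites from the tokenizer. An abridged sketch of the relevant fields (the `...` elides the remaining defaults, which may differ by revision):

```python
from dataclasses import dataclass

@dataclass
class ModelArgs:
    dim: int = 512        # matches the 512 in the error message
    vocab_size: int = -1  # sentinel; meant to be replaced by tokenizer.n_words
    ...                   # remaining fields elided

# Transformer.__init__ then effectively runs:
#   self.tok_embeddings = nn.Embedding(params.vocab_size, params.dim)
# With the sentinel left in place this is nn.Embedding(-1, 512), which raises
# "Trying to create tensor with negative dimension -1: [-1, 512]" on any
# recent torch version.
```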

@linhduongtuan Can you please post your environment info (OS, torch version, model file checksums)? I cannot reproduce your issue.

juncongmoo avatar Mar 12 '23 08:03 juncongmoo

@juncongmoo, I use PT v2 nightly (or PT 1.13) on Ubuntu 20.04 with CUDA 11.7 and LLaMA 7B. Instead of loading the model checkpoint, I want to train the model from scratch.

linhduongtuan avatar Mar 13 '23 01:03 linhduongtuan
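Training from scratch explains the crash: the loading code in pyllama's examples sets `params.vocab_size = tokenizer.n_words` only when restoring a checkpoint, so skipping that step leaves the `-1` sentinel in place. A minimal sketch of a workaround, assuming the stock `llama.tokenizer.Tokenizer` and an illustrative tokenizer path:

```python
from llama.model_single import ModelArgs, Transformer
from llama.tokenizer import Tokenizer

# Load only the SentencePiece tokenizer (path is illustrative); no model
# weights are needed to train from scratch.
tokenizer = Tokenizer(model_path="/path/to/tokenizer.model")

params = ModelArgs()
params.vocab_size = tokenizer.n_words  # 32000 for the LLaMA tokenizer

# The embedding becomes nn.Embedding(32000, 512) instead of
# nn.Embedding(-1, 512), so construction succeeds and training can start
# from the randomly initialized weights.
model = Transformer(params)
```

If no tokenizer file is at hand, hard-coding `params.vocab_size` to the intended vocabulary size (32000 for LLaMA) has the same effect.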