LISA
The model "llava-7b-llama-2-7b-chat" that I merged myself fails during training.
Hello, we merged the model "zhangyupeng/llava-7b-llama-2-7b-chat" ourselves. We are training on two 3090 GPUs with batch_size=2 and grad_accumulation_steps=40. The following error appears during training. Could it be caused by our self-merged model?
Traceback (most recent call last):
  File "/home/zhangyupeng/anaconda3/envs/lisa/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/zhangyupeng/anaconda3/envs/lisa/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 61, in fetch
    return self.collate_fn(data)
  File "/mnt/21T/zhangyupeng/code/LISA/utils/dataset.py", line 135, in collate_fn
    assert cur_len == total_len
AssertionError
[2023-09-05 20:58:14,118] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 77018
[2023-09-05 20:58:14,119] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 77019
[2023-09-05 20:58:15,023] [ERROR] [launch.py:321:sigkill_handler] ['/home/zhangyupeng/anaconda3/envs/lisa/bin/python', '-u', 'train_ds.py', '--local_rank=1'] exits with return code = 1
I think this is caused by the datasets. Can you check whether the datasets are organized correctly?
Hi @X-Lai, after downloading and unzipping the datasets, we renamed the files as you requested and uploaded them to the server. Are there other changes that need to be made?
I feel like this is a model problem, because I can run llama-13b but can't run the merged llama-7b. I don't know how to solve this.
I solved this. Just add legacy=True in:
    tokenizer = transformers.AutoTokenizer.from_pretrained(
        args.version,
        cache_dir=None,
        model_max_length=args.model_max_length,
        padding_side="right",
        use_fast=False,
        legacy=True,
    )
Refer to link
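For anyone wondering why a tokenizer flag can trip `assert cur_len == total_len`: judging by the variable names in the traceback, the check appears to compare a token count accumulated piece by piece against the token count of the whole conversation. The sketch below illustrates that kind of consistency check with two toy tokenizers. It is not LISA's `collate_fn` and not the real Llama tokenizer; `SEP`, `consistent_tokenize`, `inconsistent_tokenize`, and `lengths_match` are hypothetical names made up for this illustration.

```python
from typing import Callable, List

SEP = "</s>"  # assumed round separator, for illustration only


def consistent_tokenize(text: str) -> List[str]:
    """Toy tokenizer: whitespace split, separator kept as its own token."""
    return text.replace(SEP, f" {SEP} ").split()


def inconsistent_tokenize(text: str) -> List[str]:
    """Toy tokenizer that emits an extra marker token after a separator
    when more text follows, so tokenizing the whole text gives a different
    count than tokenizing each round on its own."""
    tokens: List[str] = []
    for i, piece in enumerate(text.split(SEP)):
        if i > 0:
            tokens.append(SEP)
            if piece.strip():
                tokens.append("<extra>")  # hypothetical extra token
        tokens.extend(piece.split())
    return tokens


def lengths_match(tokenize: Callable[[str], List[str]], conversation: str) -> bool:
    """Compare the whole-conversation token count with the per-round sum."""
    total_len = len(tokenize(conversation))
    rounds = [r for r in conversation.split(SEP) if r.strip()]
    cur_len = sum(len(tokenize(r + SEP)) for r in rounds)
    return cur_len == total_len


conversation = "USER: hi ASSISTANT: hello</s>USER: bye ASSISTANT: goodbye</s>"
print(lengths_match(consistent_tokenize, conversation))
print(lengths_match(inconsistent_tokenize, conversation))
```

If the two counts disagree for the real tokenizer, an assertion of this shape fails regardless of how the datasets are organized, which would be consistent with the fix being on the tokenizer side.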
Worked for me, thanks!
Dear @AmrinKareem,
I ran into the same issue, and this solution also works for me. May I ask whether it will affect the results of training the LISA model?
An answer would be super helpful for me.
Best regards and many thanks,