Editing Llama-7b with MEMIT on multiple GPUs: Expected all tensors to be on the same device
Thanks for this great repo. I'm trying to edit Llama-7b with the MEMIT algorithm. With two GPUs, I get the following error:
```
    return self.edit_requests(requests, sequential_edit, verbose, test_generation=test_generation, **kwargs)
  File "easyeditor/editors/editor.py", line 302, in edit_requests
    metrics = {"pre": compute_edit_quality(self.model, self.model_name, self.hparams, self.tok, request, self.hparams.device, eval_metric=eval_metric, test_generation=test_generation)}
  File "easyeditor/evaluate/evaluate.py", line 64, in compute_edit_quality
    ret = compute_rewrite_or_rephrase_quality(model, model_name, hparams, tok,
  File "easyeditor/evaluate/evaluate.py", line 133, in compute_rewrite_or_rephrase_quality
    acc = test_prediction_acc(model, tok, hparams, prompt, target_new, device)
  File "easyeditor/evaluate/evaluate_utils.py", line 132, in test_prediction_acc
    outputs = model(**prompt_target_tok)
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/accelerate/hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1189, in forward
    outputs = self.model(
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 977, in forward
    position_embeddings = self.rotary_emb(hidden_states, position_ids)
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/accelerate/hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 209, in forward
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
```
My setup:
- I have the latest version of EasyEdit.
- Two A100 80GB GPUs.
- `CUDA_VISIBLE_DEVICES` is set.
- In the hparams file, `model_parallel` is set to `true` and `device` to `0` (excerpt below).
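The relevant lines of my config (all other fields omitted):

```yaml
# Excerpt from my MEMIT hparams file; remaining fields omitted.
device: 0
model_parallel: true
```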
I understand that editing Llama-7b requires no more than 80GB and can run on a single GPU, but I am testing the feasibility of running it on multiple GPUs. Any insights or suggestions would be greatly appreciated.
Thank you very much for your interest in EasyEdit. We will debug and address this issue soon. If you are in a hurry, you can manually move the offending tensors to cuda:0, which should work around the problem temporarily.
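As a concrete (untested) sketch of that workaround: your traceback ends at `outputs = model(**prompt_target_tok)` in `test_prediction_acc` (`easyeditor/evaluate/evaluate_utils.py`), so you could try forcing the tokenized inputs onto `cuda:0` right before that call and let accelerate's hooks move activations between devices:

```python
# Untested workaround sketch for test_prediction_acc in
# easyeditor/evaluate/evaluate_utils.py: move every input tensor to
# cuda:0 before calling the model that accelerate has split across GPUs.
prompt_target_tok = {k: v.to("cuda:0") for k, v in prompt_target_tok.items()}
outputs = model(**prompt_target_tok)
```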
Thanks for responding. I tried your proposed workaround, but the same device-mismatch error then appears at other steps, so moving the tensors manually didn't seem to be a feasible workaround. I'll keep an eye out for the next update. Thank you!
Hello @XeeKee, do you have any idea what might be causing this error, or any other suggestions I could try?
Dear fateme-hshm96:
Hello, you can try our latest code. I tested it locally with two A800 GPUs, and it runs smoothly with `model_parallel: true` enabled.
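For reference, this is roughly the script I used for the test. The hparams path and the edit request are placeholders; adjust them to your own setup:

```python
from easyeditor import BaseEditor, MEMITHyperParams

# Hyperparameter file with model_parallel: true and device: 0
# (the path is a placeholder; point it at your own MEMIT config).
hparams = MEMITHyperParams.from_hparams('./hparams/MEMIT/llama-7b.yaml')
editor = BaseEditor.from_hparams(hparams)

# A single illustrative edit request.
metrics, edited_model, _ = editor.edit(
    prompts=['Ray Charles, the'],
    target_new=['violin'],
    subject=['Ray Charles'],
)
print(metrics)
```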
EasyEdit Team
Hi, do you have any further issues?