Editing Llama-7b with MEMIT on multiple GPUs: Expected all tensors to be on the same device
Thanks for this great repo. I'm trying to edit Llama-7b with the MEMIT algorithm. With two GPUs, I get the following error:
```
    return self.edit_requests(requests, sequential_edit, verbose, test_generation=test_generation, **kwargs)
  File "easyeditor/editors/editor.py", line 302, in edit_requests
    metrics = {"pre": compute_edit_quality(self.model, self.model_name, self.hparams, self.tok, request, self.hparams.device, eval_metric=eval_metric, test_generation=test_generation)}
  File "easyeditor/evaluate/evaluate.py", line 64, in compute_edit_quality
    ret = compute_rewrite_or_rephrase_quality(model, model_name, hparams, tok,
  File "easyeditor/evaluate/evaluate.py", line 133, in compute_rewrite_or_rephrase_quality
    acc = test_prediction_acc(model, tok, hparams, prompt, target_new, device)
  File "easyeditor/evaluate/evaluate_utils.py", line 132, in test_prediction_acc
    outputs = model(**prompt_target_tok)
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/accelerate/hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1189, in forward
    outputs = self.model(
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 977, in forward
    position_embeddings = self.rotary_emb(hidden_states, position_ids)
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/accelerate/hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 209, in forward
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
```
My setup:
- I have the latest version of EasyEdit.
- Two A100 80GB GPUs.
- `CUDA_VISIBLE_DEVICES` is set.
- In the hparams file, `model_parallel` is set to `true` and `device` to `0` (excerpt below).
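The relevant lines of my config (all other fields omitted):

```yaml
# Excerpt from my MEMIT hparams file; remaining fields omitted.
device: 0
model_parallel: true
```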
I understand that editing Llama-7b requires no more than 80GB and can run on a single GPU, but I am testing the feasibility of running it on multiple GPUs. Any insights or suggestions would be greatly appreciated.
Thank you very much for your interest in EasyEdit. We will debug and address this issue soon. If you are in a hurry, you can manually move the offending tensors to cuda:0, which should work around the problem temporarily.
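As a concrete (untested) sketch of that workaround: your traceback ends at `outputs = model(**prompt_target_tok)` in `test_prediction_acc` (`easyeditor/evaluate/evaluate_utils.py`), so you could try forcing the tokenized inputs onto `cuda:0` right before that call and let accelerate's hooks move activations between devices:

```python
# Untested workaround sketch for test_prediction_acc in
# easyeditor/evaluate/evaluate_utils.py: move every input tensor to
# cuda:0 before calling the model that accelerate has split across GPUs.
prompt_target_tok = {k: v.to("cuda:0") for k, v in prompt_target_tok.items()}
outputs = model(**prompt_target_tok)
```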
Thanks for responding. I tried your proposed workaround, but the same device-mismatch error then appears at other steps, so moving the tensors manually didn't seem to be a feasible workaround. I'll keep an eye out for the next update. Thank you!
Hello @XeeKee, do you have any idea what might be causing this error, or any other suggestions I could try?
Dear fateme-hshm96:
Hello, you can try our latest code. I tested it locally with two A800 GPUs, and it runs smoothly with `model_parallel: true` enabled.
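For reference, this is roughly the script I used for the test. The hparams path and the edit request are placeholders; adjust them to your own setup:

```python
from easyeditor import BaseEditor, MEMITHyperParams

# Hyperparameter file with model_parallel: true and device: 0
# (the path is a placeholder; point it at your own MEMIT config).
hparams = MEMITHyperParams.from_hparams('./hparams/MEMIT/llama-7b.yaml')
editor = BaseEditor.from_hparams(hparams)

# A single illustrative edit request.
metrics, edited_model, _ = editor.edit(
    prompts=['Ray Charles, the'],
    target_new=['violin'],
    subject=['Ray Charles'],
)
print(metrics)
```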
EasyEdit Team
Hi, do you have any further issues?