
ERROR in running 'run_scripts/pope_eval.py'

Open anotherbricki opened this issue 1 year ago • 6 comments

Thanks for your good work! But I got confused when I ran evaluation of POPE.

At first, I run: python run_scripts/pope_eval.py --model llava-1.5 --data_path /home/duyuetian/COCO/val2014 -d vcd --pope_type random --num_images 100 --seed 0 --gpu_id 0 --output_dir ./generated_captions/ --noise_step 100

but it didn't work because of the error below: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:3! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

So I roughly solved it by adding the "CUDA_VISIBLE_DEVICES" param: CUDA_VISIBLE_DEVICES=0 python run_scripts/pope_eval.py --model llava-1.5 --data_path /home/duyuetian/COCO/val2014 -d vcd --pope_type random --num_images 100 --seed 0 --gpu_id 0 --output_dir ./generated_captions/ --noise_step 100

But this workaround doesn't generalize to multi-GPU setups, so I wonder whether there is a way to solve the problem at the root? Appreciate it.
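For reference, the shell prefix can also be applied inside the script itself, as long as it runs before torch is imported — a minimal sketch of the same workaround in code:

```python
import os

# Must be set before `import torch`: once torch initializes CUDA,
# changing this variable has no effect. With only one GPU visible,
# every tensor lands on cuda:0 and mixed-device matmuls cannot occur.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch  # noqa: E402  (intentionally imported after setting the env var)
```

This only hides the other GPUs; it doesn't fix the underlying mis-placed tensors in the code base.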

anotherbricki avatar May 29 '24 05:05 anotherbricki

Thanks for your question! I also encountered this problem with LLaVA-1.5 on two devices. It might be due to some intermediate variables being accidentally cast to the other device when we developed the code base. However, there is no such issue with the other three VLMs, so you can try running them as well. I will check the code and let you know of any updates. Thanks!
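Until a fix lands, a common mitigation for this class of bug is to cast the input to the device of the layer that consumes it, so the matmul never sees mixed devices. A minimal sketch (the helper name is hypothetical, not part of HALC):

```python
import torch
import torch.nn as nn

def call_on_module_device(module: nn.Module, x: torch.Tensor) -> torch.Tensor:
    # Move the input to whichever device the module's parameters live on,
    # so the matmul inside never mixes cuda:0 and cuda:3 tensors.
    device = next(module.parameters()).device
    return module(x.to(device))

# Works on CPU too; on a multi-GPU box it would rescue an intermediate
# tensor that was accidentally created on the wrong device.
head = nn.Linear(3, 4)
out = call_on_module_device(head, torch.randn(2, 3))
```

The same `.to(device)` pattern can be applied at whichever call site in the repo raises the `wrapper_CUDA_addmm` error.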

BillChan226 avatar May 31 '24 09:05 BillChan226

Thanks a lot!

anotherbricki avatar Jun 04 '24 03:06 anotherbricki

Hello! I have the same problem. Is there a solution that would allow me to run LLaVA-1.5 normally?

Shenshen7 avatar Jun 26 '24 08:06 Shenshen7

I encountered the same issue. When I use the LLaVA-1.5 model and specify the GPU with CUDA_VISIBLE_DEVICES=0, I get a memory error: torch.cuda.OutOfMemoryError: CUDA out of memory. However, when I don't specify the GPU, I get the following error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat1 in method wrapper_CUDA_addmm).

darwann avatar Sep 06 '24 08:09 darwann

Hi, the OOM error likely just means a single GPU doesn't have enough memory for the model. I will try to fix the CUDA device issue this week.
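If the model only barely doesn't fit on one GPU, loading it in half precision roughly halves the parameter memory. A sketch of the idea (whether the repo's model builder exposes a dtype option is an assumption; adapt to the actual loading code):

```python
import torch
import torch.nn as nn

def halve_param_memory(model: nn.Module) -> nn.Module:
    # Converting float32 parameters and buffers to float16 halves their
    # memory footprint, a common first step when a single GPU hits
    # CUDA out-of-memory.
    return model.half()

model = halve_param_memory(nn.Linear(8, 8))
```

Reducing `--num_images` or batch size are other easy levers before resorting to multi-GPU sharding.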

BillChan226 avatar Sep 06 '24 08:09 BillChan226

Thanks! :)

darwann avatar Sep 20 '24 08:09 darwann