
V100 cannot support load_in_4bit and fp16?

Open • tongwwt opened this issue 1 year ago • 4 comments

I can only run the code with load_in_8bit on a V100 with 32GB, and I cannot use args.fp16, because it causes the error "RuntimeError: expected scalar type Half but found Float". How can I solve this?

Traceback (most recent call last):
  File "qlora.py", line 854, in <module>
    train()
  File "qlora.py", line 763, in train
    train_result = trainer.train(resume_from_checkpoint=checkpoint_dir)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/transformers/trainer.py", line 1696, in train
    return inner_training_loop(
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/transformers/trainer.py", line 1971, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/transformers/trainer.py", line 2797, in training_step
    self.scaler.scale(loss).backward()
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 157, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 476, in backward
    grad_A = torch.matmul(grad_output, CB).view(ctx.grad_shape).to(ctx.dtype_A)
RuntimeError: expected scalar type Half but found Float

tongwwt • May 28 '23 16:05

Same issue for me.

amdnsr • May 28 '23 18:05

load_in_4bit should be supported on most GPUs. However, some GPUs might not support bfloat16. In that case, I recommend switching to float32 for your compute data type. This may slow down your computation, but we found float16 is not always stable. Let me know if this helps!

artidoro • May 28 '23 20:05

According to Nvidia, the V100 does NOT support the int4 data type. [image: Nvidia GPU data-type support table]

Maxwell-Lyu • May 30 '23 15:05

So the V100 does support 8-bit and fp16?

feng-1985 • Aug 01 '23 11:08