qlora
V100 cannot support load_in_4bit and fp16?
I can only run the code with load_in_8bit on a V100 with 32 GB. I also cannot use "args.fp16", which causes the error "RuntimeError: expected scalar type Half but found Float". How can I solve this?
Traceback (most recent call last):
  File "qlora.py", line 854, in <module>
    train()
  File "qlora.py", line 763, in train
    train_result = trainer.train(resume_from_checkpoint=checkpoint_dir)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/transformers/trainer.py", line 1696, in train
    return inner_training_loop(
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/transformers/trainer.py", line 1971, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/transformers/trainer.py", line 2797, in training_step
    self.scaler.scale(loss).backward()
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 157, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/mnt/cache/tongwenwen1/miniconda3/envs/qloracu113/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 476, in backward
    grad_A = torch.matmul(grad_output, CB).view(ctx.grad_shape).to(ctx.dtype_A)
RuntimeError: expected scalar type Half but found Float
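For context, the 8-bit setup that does run on my V100 looks roughly like this (the model id is just a placeholder; fp16 is left off because enabling it triggers the error above):

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# Loading in 8-bit works on the V100; the traceback above only
# appears once args.fp16 is enabled on top of it.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",  # placeholder model id
    load_in_8bit=True,
    device_map="auto",
)

training_args = TrainingArguments(
    output_dir="./output",
    fp16=False,  # fp16=True raises "expected scalar type Half but found Float"
)
```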
Same issue for me.
load_in_4bit should be supported on most GPUs. However, some GPUs do not support bfloat16. In that case, I recommend switching to float32 as your computation data type. This may slow down your computation, but we found that float16 is not always stable. Let me know if this helps!
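A minimal sketch of what that change looks like with transformers and bitsandbytes (the model id is a placeholder, not from the qlora script itself):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with a float32 compute dtype. The weights
# are stored in 4 bits but dequantized to the compute dtype for each
# matmul, so the computation itself runs in float32.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float32,  # float32 instead of bfloat16 on pre-Ampere GPUs like the V100
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",  # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
```

Keep fp16 disabled in TrainingArguments as well, so the gradient scaler does not mix Half and Float tensors in the backward pass.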
According to NVIDIA, the V100 DOES NOT support the int4 data type.
But the V100 does support 8-bit and fp16, right?