InternVL
cuBLAS error and bfloat16-to-float16 warning
Hi,
I am trying out single-image query and batch inference with the int8 model. I created a new environment following the installation instructions and am running the demo line by line, and I hit the cuBLAS error below, along with a warning.
dynamic ViT batch size: 5
/envs/internvl/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py:316: UserWarning: MatMul8bitLt: inputs will be cast from torch.bfloat16 to float16 during quantization
warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
cuBLAS API failed with status 15
A: torch.Size([5125, 3200]), B: torch.Size([9600, 3200]), C: (5125, 9600); (lda, ldb, ldc): (c_int(164000), c_int(307200), c_int(164000)); (m, n, k): (c_int(5125), c_int(9600), c_int(3200))
The error comes from bitsandbytes:
File "/envs/internvl/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py", line 395, in forward
out32, Sout32 = F.igemmlt(C32A, state.CxB, SA, state.SB)
File "/envs/internvl/lib/python3.9/site-packages/bitsandbytes/functional.py", line 2337, in igemmlt
raise Exception("cublasLt ran into an error!")
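For context, here is my understanding of what the warning and the logged shapes describe. This is a small sketch with stand-in shapes, not the actual bitsandbytes code: MatMul8bitLt casts bfloat16 activations to float16 before the int8 matmul, and the kernel computes C = A @ B^T, which matches the (m, n, k) values in the log.

```python
import torch

# bitsandbytes' MatMul8bitLt casts bfloat16 inputs to float16 before the
# int8 matmul; that is exactly what the UserWarning above reports.
# (Small stand-in shapes here; the real call in the log is
# A (5125, 3200) against B (9600, 3200), giving C (5125, 9600).)
A = torch.randn(4, 8, dtype=torch.bfloat16)
B = torch.randn(6, 8, dtype=torch.bfloat16)

# The cast the warning describes:
A16, B16 = A.to(torch.float16), B.to(torch.float16)

# The kernel computes C = A @ B^T, so C has shape (rows_A, rows_B).
C = A16.float() @ B16.float().T
print(A16.dtype, C.shape)  # torch.float16 torch.Size([4, 6])
```

So the warning itself is benign (just a dtype cast); the actual failure is the `igemmlt` call returning cuBLAS status 15 afterwards.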