InternVL
cuBLAS error and bfloat16-to-float16 warning
Hi,
I am trying out single-image query and batch inference with the int8 model. I created a new environment following the installation instructions and am running the demo line by line, and I hit the cuBLAS error below, along with a warning.
dynamic ViT batch size: 5
/envs/internvl/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py:316: UserWarning: MatMul8bitLt: inputs will be cast from torch.bfloat16 to float16 during quantization
warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
cuBLAS API failed with status 15
A: torch.Size([5125, 3200]), B: torch.Size([9600, 3200]), C: (5125, 9600); (lda, ldb, ldc): (c_int(164000), c_int(307200), c_int(164000)); (m, n, k): (c_int(5125), c_int(9600), c_int(3200))
The error comes from bitsandbytes:
File "/envs/internvl/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py", line 395, in forward
out32, Sout32 = F.igemmlt(C32A, state.CxB, SA, state.SB)
File "/envs/internvl/lib/python3.9/site-packages/bitsandbytes/functional.py", line 2337, in igemmlt
raise Exception("cublasLt ran into an error!")
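For context, here is my understanding of what the warning and the logged shapes describe. This is a small sketch with stand-in shapes, not the actual bitsandbytes code: MatMul8bitLt casts bfloat16 activations to float16 before the int8 matmul, and the kernel computes C = A @ B^T, which matches the (m, n, k) values in the log.

```python
import torch

# bitsandbytes' MatMul8bitLt casts bfloat16 inputs to float16 before the
# int8 matmul; that is exactly what the UserWarning above reports.
# (Small stand-in shapes here; the real call in the log is
# A (5125, 3200) against B (9600, 3200), giving C (5125, 9600).)
A = torch.randn(4, 8, dtype=torch.bfloat16)
B = torch.randn(6, 8, dtype=torch.bfloat16)

# The cast the warning describes:
A16, B16 = A.to(torch.float16), B.to(torch.float16)

# The kernel computes C = A @ B^T, so C has shape (rows_A, rows_B).
C = A16.float() @ B16.float().T
print(A16.dtype, C.shape)  # torch.float16 torch.Size([4, 6])
```

So the warning itself is benign (just a dtype cast); the actual failure is the `igemmlt` call returning cuBLAS status 15 afterwards.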