Slowdown again with pascal cards.

Open ENjoyBlue2021 opened this issue 2 years ago • 5 comments

I couldn't reopen my original issue, so I hope it's fine if I open another bug. The Pascal fix is broken again, at least for me. The following check does not work:

q4_matmul.cu:

#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 700

const float alpha = 1.0f;
const float beta = no_zero ? 1.0f : 0.0f;
cublasSgemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N, width, height, dim, &alpha, buffers->temp_dq, CUDA_R_16F, width,
              x_mapped, CUDA_R_16F, dim, &beta, out, CUDA_R_16F, width);

#else

const half alpha = __float2half(1.0f);
const half beta = no_zero ? __float2half(1.0f) : __float2half(0.0f);
cublasHgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, width, height, dim, &alpha, buffers->temp_dq, width, x_mapped, dim, &beta, out, width);

#endif

Taking out the #if and always calling cublasSgemmEx works. This is on a dual-GPU setup: 1080 Ti + 1080.

ENjoyBlue2021, Jul 15 '23 08:07
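One possible explanation, assuming the guarded cuBLAS calls sit in a host function: nvcc defines `__CUDA_ARCH__` only while it compiles device code, so a host-side `#if defined(__CUDA_ARCH__)` guard is always false and the `cublasHgemm` branch is the one that gets built. A run-time check on the device's compute capability avoids that. The sketch below is not exllama's code. `arch_value` and `select_gemm_path` are hypothetical helpers, and the `major`/`minor` arguments are assumed to come from the `major` and `minor` fields of `cudaDeviceProp` (filled by `cudaGetDeviceProperties`).

```cpp
// Hypothetical helpers, not part of exllama: choose the GEMM path at run time
// from the device's compute capability instead of relying on __CUDA_ARCH__.
enum class GemmPath { SgemmEx, Hgemm };

// Same encoding nvcc uses for __CUDA_ARCH__: sm_61 -> 610, sm_70 -> 700.
constexpr int arch_value(int major, int minor) {
    return major * 100 + minor * 10;
}

// Mirrors the "< 700" threshold from the guard quoted in this issue.
constexpr GemmPath select_gemm_path(int major, int minor) {
    return arch_value(major, minor) < 700 ? GemmPath::SgemmEx : GemmPath::Hgemm;
}

// Compile-time checks for the two cases discussed in the thread.
static_assert(select_gemm_path(6, 1) == GemmPath::SgemmEx, "Pascal (sm_61)");
static_assert(select_gemm_path(7, 0) == GemmPath::Hgemm, "Volta (sm_70)");
```

In host code the result would drive an ordinary `if`, calling `cublasSgemmEx` on `GemmPath::SgemmEx` and `cublasHgemm` otherwise, with each GPU in a multi-GPU setup queried separately.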

This sounds like __CUDA_ARCH__ is either undefined or defined incorrectly. Could you try changing the first line to just:

#if __CUDA_ARCH__ < 700

That should fail to compile if the symbol is missing. If it compiles it means you've got it incorrectly defined, somehow. Which I guess would suggest a (very strange) driver issue...?

turboderp, Jul 15 '23 13:07
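A caveat worth keeping in mind when running this kind of test: under the standard C and C++ preprocessor rules, an identifier that is not defined as a macro is replaced by 0 inside an `#if` expression, so a bare `#if` on a missing macro normally compiles silently rather than failing. A plain C++ `if (...)` is different, because the name is then an ordinary identifier and the compiler reports it as undefined. The sketch below shows the preprocessor side only. `HYPOTHETICAL_ARCH_MACRO` is a made-up name used for illustration and is deliberately never defined.

```cpp
// HYPOTHETICAL_ARCH_MACRO is a made-up name that is never defined anywhere.
// Inside #if, an undefined identifier evaluates to 0, so this compiles
// without error and the first branch is taken (0 < 700).
#if HYPOTHETICAL_ARCH_MACRO < 700
constexpr bool took_bare_branch = true;
#else
constexpr bool took_bare_branch = false;
#endif

// With a defined() guard, the same undefined macro selects the #else branch.
#if defined(HYPOTHETICAL_ARCH_MACRO) && HYPOTHETICAL_ARCH_MACRO < 700
constexpr bool took_guarded_branch = true;
#else
constexpr bool took_guarded_branch = false;
#endif
```

Some compilers can flag the first form with a warning option such as GCC's `-Wundef`, but it is not an error by default.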

Hmm, you are correct.

Putting in a

if (__CUDA_ARCH__ < 700)
{

}

gives me an error:

/media/w/PhoenixSSD/oobabooga/miniconda/envs/textgen/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -lineinfo -std=c++17 -c /media/w/PhoenixSSD/oobabooga/text-generation-webui/repositories/exllama/exllama_ext/cuda_func/q4_matmul.cu -o q4_matmul.cuda.o
/media/w/PhoenixSSD/oobabooga/text-generation-webui/repositories/exllama/exllama_ext/cuda_func/q4_matmul.cu(247): error: identifier "__CUDA_ARCH__" is undefined

I'm on Ubuntu and really don't want to mess too much with the NVIDIA drivers. It's very much possible that it's something on my end.

That would be my nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   59C    P0    66W / 210W |   1399MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:02:00.0 Off |                  N/A |
|  0%   41C    P8     9W / 200W |      9MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

ENjoyBlue2021, Jul 16 '23 07:07

Your driver is probably fine. It's the venv. I too have CUDA 12 on the system and CUDA 11.8 in the venv. I had to download all the CUDA toolkit stuff again for that. Conda was actually useful.

Ph0rk0z, Jul 16 '23 15:07

I think you are right. I played around for a couple of hours trying to uninstall the old version. I reinstalled CUDA toolkit 11.8 in the venv, but that didn't fix anything.

I can't seem to properly remove the toolkit and drivers from my system before installing another version. I suppose I need to install 11.8 system-wide to match the venv, but all my attempts to clean up and purge the current version failed. So I'm giving up; this is the only area that's causing problems for me anyway.

ENjoyBlue2021, Jul 24 '23 01:07

This is why I like conda. A fresh environment with new cu118 torch and reqs usually fixes things. Although I've yet to mess up a single conda env or venv, knock on wood.

Ph0rk0z, Jul 24 '23 12:07