text-generation-webui
Add Cuda installation requirement to docs
Hi! First of all, thank you for your work.
I was just wondering whether the 4-bit installation guide should mention that CUDA 11.7 (the version PyTorch is compiled against) is required to run python setup_cuda.py install. I was getting an error while building about a missing hip_runtime_api.h.
I just spent a couple of hours figuring that out; this info would probably be helpful to other people.
One can download and install CUDA 11.7 here: https://developer.nvidia.com/cuda-11-7-0-download-archive
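As a sanity check before running the build, the CUDA toolkit version that nvcc will use should match the version PyTorch was compiled against (torch.version.cuda). A minimal sketch of that comparison, with illustrative version strings:

```python
# Hedged sketch of the version check: compare the installed CUDA
# toolkit version against the one PyTorch was compiled with
# (torch.version.cuda). A minor mismatch only triggers a warning,
# as the build log later in this thread shows.
def check_cuda_match(toolkit: str, torch_cuda: str) -> str:
    t = tuple(int(x) for x in toolkit.split(".")[:2])
    p = tuple(int(x) for x in torch_cuda.split(".")[:2])
    if t == p:
        return "ok"
    if t[0] == p[0]:
        return "minor mismatch"  # e.g. toolkit 11.8 vs PyTorch's 11.7
    return "major mismatch"

print(check_cuda_match("11.7", "11.7"))  # ok
print(check_cuda_match("11.8", "11.7"))  # minor mismatch
```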
you mean 11.7 right ? 👀
Additionally, Pascal or newer is required as of recently. See https://github.com/qwopqwop200/GPTQ-for-LLaMa/issues/88
quant_cuda_kernel.cu(654): error: identifier "__hfma2" is undefined
In my case, my server has both Maxwell and Pascal cards, so I just needed to run CUDA_VISIBLE_DEVICES=1,2 python setup_cuda.py install instead, to limit the build to the P40 cards. (This is just another reason I really need to replace that M40, but for now I'll use conda's environment vars to set the visible devices by default.)
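A small illustration of why filtering out the older card helps: per the GPTQ-for-LLaMa issue linked above, the 4-bit kernels need a GPU of Pascal class or newer. The compute capabilities below are from Nvidia's published tables:

```python
# Illustrative check: the 4-bit CUDA kernels need Pascal (compute
# capability 6.0) or newer, per the GPTQ-for-LLaMa issue linked above.
def is_supported(capability: tuple[int, int]) -> bool:
    return capability >= (6, 0)

print(is_supported((6, 1)))  # True:  P40 (Pascal)
print(is_supported((5, 2)))  # False: M40 (Maxwell) hits the __hfma2 error
```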
you mean 11.7 right ? 👀
Yes, I edited and fixed this mistake, thank you
@ReFruity By "require Cuda 11.7" do you mean 11.8 won't work? I'm using the Dockerfile (based on nvidia/cuda:11.8.0-devel-ubuntu22.04) to test on an "Nvidia A10G large", but the build failed with quant_cuda_kernel.cu(654): error: identifier "__hfma2" is undefined, with this warning above the failure:
--> RUN . /build/venv/bin/activate && python3 setup_cuda.py bdist_wheel -d .
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
running bdist_wheel
/build/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_ext
/build/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:388: UserWarning: The detected CUDA version (11.8) has a minor version mismatch with the version that was used to compile PyTorch (11.7). Most likely this shouldn't be a problem.
warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
/build/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:398: UserWarning: There are no x86_64-linux-gnu-g++ version bounds defined for CUDA version 11.8
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'quant_cuda' extension
I'm pretty sure it's not about "Pascal or newer", so is it about mixing 11.8 and 11.7?
EDIT: tried 11.7 and it also failed; really puzzled. Code at https://github.com/utensil/text-generation-webui/tree/dev , almost identical to the upstream main branch.
I changed TORCH_CUDA_ARCH_LIST in the .env file from 5.0, which I thought was right, to 7.5, and it fixed my error. Do you know if your server has an nvidia card and what the arch should be?
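For context, TORCH_CUDA_ARCH_LIST entries are compute capabilities, not CUDA toolkit versions. A few mappings for cards mentioned in this thread, listed here for illustration (values from Nvidia's capability tables):

```python
# TORCH_CUDA_ARCH_LIST takes compute capabilities, not toolkit
# versions. Cards from this thread (per Nvidia's tables):
ARCH = {
    "Tesla M40": "5.2",  # Maxwell, too old for these kernels
    "Tesla P40": "6.1",  # Pascal
    "Tesla T4": "7.5",   # Turing
    "A10G": "8.6",       # Ampere, the card in the report above
}
# Then, in the .env file: TORCH_CUDA_ARCH_LIST=8.6 (for an A10G)
print(ARCH["A10G"])  # 8.6
```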
I changed TORCH_CUDA_ARCH_LIST in the .env file from 5.0 which I thought was right to 7.5 and it fixed my error. Do you know if your server has an nvidia card and what the arch should be?
I had tried a few combinations, including the full list, 7.5, etc., and all failed. Then I moved on to other models and the triton branch and haven't tested the same settings again yet. Will try again and see if the error goes away.
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.