[AMD] Fix compilation issue with ROCm
Problem: Unable to install the package on a Linux machine with an AMD 6800XT GPU using ROCm.
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video docker.io/rocm/dev-ubuntu-22.04:5.6-complete bash
CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers
# Compiling after setting CC and CXX env variables also failed with a similar error.
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers
Error logs: https://gist.github.com/bhargav/7f8c2984ba32ff99ce8e93433d9059a6
Solution: The failures are caused by references to CUDA library imports instead of their HIP equivalents when compiling for AMD.
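For context, ggml's HIP port compiles the same ggml-cuda.cu source for both backends by aliasing CUDA API names to their HIP equivalents with preprocessor defines when built for ROCm; the failing references presumably bypassed that mapping. A minimal sketch of the pattern (the guard macro and alias list here are illustrative, not the exact set in the patch):

// Sketch of the CUDA-to-HIP aliasing pattern used for ROCm builds.
#include <stddef.h>
#if defined(GGML_USE_HIPBLAS)
#include <hip/hip_runtime.h>
#define cudaError_t        hipError_t
#define cudaSuccess        hipSuccess
#define cudaMalloc         hipMalloc
#define cudaFree           hipFree
#define cudaGetErrorString hipGetErrorString
#else
#include <cuda_runtime.h>
#endif

// With the aliases in place, backend-agnostic code compiles unchanged
// for both CUDA and ROCm:
static void * alloc_device_buffer(size_t n) {
    void * ptr = NULL;
    cudaError_t err = cudaMalloc(&ptr, n);
    return err == cudaSuccess ? ptr : NULL;
}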
Verified that the project builds with the fixes:
apt-get update && apt-get install -y git
git clone https://github.com/bhargav/ctransformers.git
cd ctransformers
git checkout bhargav/fix_rocm_compile
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install .
Build log: https://gist.github.com/bhargav/65bbbd039bda6f39504448656e88ab6b
The package installs successfully, and I was able to run model inference on the GPU.
I can confirm: it compiles without an issue now (ROCm nightly and an old Vega 64 :)
Hello,
Installation finished without errors, but prompting the model with
llm = AutoModelForCausalLM.from_pretrained("/models/llama-7b.Q3_K_M.gguf", model_type="llama", local_files_only=True, gpu_layers=100)
print(llm("AI is going to"))
exits with the error:
CUDA error 98 at ~/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function
When using ROCm, a "CUDA error 98 ... invalid device function" (as far as I know) usually means the kernels were built for a different GPU architecture than the one the HIP stack reports. It is most likely solvable with:
export HSA_OVERRIDE_GFX_VERSION=11.0.0  # for RDNA3 GPUs
export HSA_OVERRIDE_GFX_VERSION=10.3.0  # for older ones
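To check which architecture string the HIP runtime actually reports for your card (and whether the override takes effect), a small HIP program can print each device's gcnArchName; a minimal sketch, assuming a working ROCm install with hipcc on the PATH:

// Build with: hipcc print_arch.cpp -o print_arch
#include <cstdio>
#include <hip/hip_runtime.h>

int main() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        fprintf(stderr, "no HIP devices found\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t props;
        if (hipGetDeviceProperties(&props, i) == hipSuccess) {
            // Kernels must be compiled for an ISA compatible with this
            // string (e.g. gfx1030), or launches fail with
            // "invalid device function".
            printf("device %d: %s (%s)\n", i, props.name, props.gcnArchName);
        }
    }
    return 0;
}

Running it with and without HSA_OVERRIDE_GFX_VERSION set shows whether the override changes what the runtime reports.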
Hi, in the referenced container (docker.io/rocm/dev-ubuntu-22.04:5.6-complete) it works on a gfx1030 system:
apt show rocm-libs -a
Package: rocm-libs
Version: 5.6.0.50600-67~22.04
Priority: optional
Section: devel
But in "rocm/pytorch:latest-release"
apt show rocm-libs -a
Package: rocm-libs
Version: 5.7.0.50700-63~20.04
Priority: optional
Raises the" invalid device function" Error.
NOTE:
If you accidentally import the cloned directory instead of the installed package, you get:
OSError: libcudart.so.12: cannot open shared object file: No such file or directory
NOTE: I disabled the CPU's integrated gfx1036 GPU, so only the gfx1030 card is present.
I assume the "invalid device function" error depends on the environment's library version(s).
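One way to compare the working 5.6 container against the failing 5.7 one beyond the apt metadata is to ask the HIP runtime itself which version is loaded; a minimal sketch:

// Build with: hipcc print_version.cpp -o print_version
#include <cstdio>
#include <hip/hip_runtime.h>

int main() {
    int runtime = 0, driver = 0;
    hipRuntimeGetVersion(&runtime);  // version of the HIP runtime actually loaded
    hipDriverGetVersion(&driver);    // version the driver stack supports
    printf("HIP runtime: %d, driver: %d\n", runtime, driver);
    return 0;
}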
/lgtm
I have exported HSA_OVERRIDE_GFX_VERSION=11.0.0 and am running HSA_OVERRIDE_GFX_VERSION=11.0.0 python index.py, but I still get:
CUDA error 98 at /home/gingi/github/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function
The command below worked well for me:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DLLAMA_CLBLAST=on -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCMAKE_PREFIX_PATH=/opt/rocm" FORCE_CMAKE=1 pip install ctransformers --no-binary ctransformers