Need help with olmOCR Setup on RX 7900 XTX with ROCm 6.2 and vLLM
Hi @jakep-allenai,
I’m trying to set up olmOCR on an AMD RX 7900 XTX with ROCm 6.2 in a Docker container (Ubuntu 22.04 base), but I’m hitting persistent errors in vLLM’s setup.py (AssertionError: CUDA_HOME is not set or TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'). I commented on #111 (where you confirmed olmOCR works with ROCm vLLM), but since that issue is closed, I’m opening this new one for help.
My Setup:
Base Image: rocm/dev-ubuntu-22.04:6.2
PyTorch: 2.3.1 (from https://download.pytorch.org/whl/rocm6.2/)
vLLM: Built from source (tried v0.6.2, v0.6.1) with VLLM_BUILD_ROCM=1
olmOCR: Installed with pip install olmocr[gpu] --no-deps
Environment Variables: ROCM_HOME=/opt/rocm, HSA_OVERRIDE_GFX_VERSION=11.0.0, HIP_VISIBLE_DEVICES=0
Hardware: AMD RX 7900 XTX, Ryzen 9 9950X, 128GB DDR5, Ubuntu host
Patches:
Modified vllm/setup.py to bypass the CUDA checks (e.g., replaced the CUDA_HOME block with cuda_version = "rocm6.2" for ROCm builds).
Patched olmocr/check.py to skip the CUDA GPU memory checks for ROCm.
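For concreteness, a ROCm-aware GPU check of the kind described above might look roughly like this. This is an illustrative sketch only, not olmOCR's actual check.py; the function name and the 15 GB VRAM threshold are made up for the example:

```python
# Illustrative sketch of a ROCm-aware GPU check -- not olmOCR's actual check.py.
import torch

def gpu_backend_ok(min_vram_gb: float = 15.0) -> bool:
    """Accept either a CUDA or a ROCm (HIP) build of PyTorch with enough VRAM."""
    if not torch.cuda.is_available():
        return False
    # torch.version.hip is a string on ROCm builds of PyTorch and None on CUDA builds.
    backend = "rocm" if torch.version.hip else "cuda"
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / (1024 ** 3)
    print(f"backend={backend}, device={props.name}, vram={vram_gb:.1f} GiB")
    return vram_gb >= min_vram_gb
```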
Issue:
Despite these patches, the vLLM installation still fails with AssertionError: CUDA_HOME is not set. rocm-smi detects the GPU, but torch.cuda.is_available() often returns False with a “No NVIDIA driver” error. I suspect vLLM’s ROCm support needs specific build tweaks I’m missing.
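A quick sanity check I can run inside the container to confirm which PyTorch build is actually installed (a minimal diagnostic sketch; on ROCm wheels torch.version.hip is set and the version string carries a +rocm suffix, while a CUDA wheel would produce exactly the “No NVIDIA driver” behavior described above):

```python
# Sanity check: is the installed torch a ROCm (HIP) build or a CUDA build?
# A CUDA wheel inside a ROCm-only container would explain the
# "No NVIDIA driver" error even though rocm-smi sees the GPU.
import torch

print("torch version:", torch.__version__)        # ROCm wheels look like "2.x.y+rocmA.B"
print("hip runtime:  ", torch.version.hip)        # non-None only on ROCm builds
print("cuda runtime: ", torch.version.cuda)       # non-None only on CUDA builds
print("is_available: ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:       ", torch.cuda.get_device_name(0))
```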
Questions:
What vLLM version/branch did you use for olmOCR on ROCm?
Are there specific patches for setup.py or other files to ensure ROCm compatibility?
Any Dockerfile or host setup details for the RX 7900 XTX?
Should I try a non-Docker setup or a different vLLM version?
Thanks for your help, @jakep-allenai! Let me know if you need my Dockerfile or other details.
I want to know this too.
Hey, I posted in another thread about the setup we used:
https://github.com/allenai/olmocr/issues/111#issuecomment-3276316770
Check my comment too.