text-generation-inference
Add AMD gfx110* support
Feature request
Add support for the gfx1100 and gfx1101 GPU architectures. The official documentation currently lists this hardware as unsupported.
Motivation
Allow developers with a Radeon RX 7900 XT or 7900 XTX GPU to run local LLMs with TGI.
I have successfully run the rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0 image on such a machine and passed basic GPU-accelerated compute smoke checks, so I believe support is achievable with some changes to the build configuration.
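For reference, a sketch of the kind of smoke check I ran; the device flags follow ROCm's standard container setup (passing `/dev/kfd` and `/dev/dri` into the container), and the one-liner simply confirms PyTorch can see the GPU:

```shell
# Hedged sketch: run the rocm/pytorch image with the AMD GPU devices mapped in
# and check that PyTorch's (ROCm-backed) CUDA API surface sees the card.
docker run --rm \
  --device=/dev/kfd --device=/dev/dri --group-add video \
  rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0 \
  python3 -c 'import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))'
```

On a working setup this prints `True` followed by the device name of the 7900-series card.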
Your contribution
Looking at the upstream PyTorch AMD Docker build definition, PyTorch is built with gfx1100 support.
Looking at the TGI AMD Docker build definition, PyTorch is not currently compiled with gfx1100 support.
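A minimal sketch of the change this would imply, assuming TGI's build respects PyTorch's standard `PYTORCH_ROCM_ARCH` variable for selecting ROCm compile targets (the existing architecture list shown here is an illustration, not a quote from Dockerfile_amd):

```shell
# Hedged sketch: extend the ROCm compile-target list with the RDNA3
# architectures (gfx1100 covers the 7900 XT/XTX; gfx1101 covers other
# Navi 32 consumer cards) before rebuilding the image.
export PYTORCH_ROCM_ARCH="gfx90a;gfx942;gfx1100;gfx1101"

# Then rebuild TGI's AMD image with that variable visible to the build, e.g.:
# docker build -f Dockerfile_amd -t tgi-rocm-rdna3 .
```

Each extra architecture in the list lengthens the build and grows the image, so the list is normally kept to the targets actually needed.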
Some of the environment variables related to ROCm tuning may also need to change to support consumer GPUs: https://github.com/huggingface/text-generation-inference/blob/main/Dockerfile_amd#L323-L326
I plan to experiment with a rebuild configured for gfx1100 to see whether support is achievable without further modifications.