text-generation-inference
Add AMD gfx110* support
Feature request
Add support for the gfx1100 and gfx1101 GPU architectures. The official documentation currently lists this hardware as unsupported.
Motivation
Allow developers with a Radeon RX 7900 XT or 7900 XTX GPU to run local LLMs with TGI.
I have successfully run the rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0 image on such a machine and passed basic GPU-accelerated compute smoke checks, so I believe support is achievable with some changes to the build configuration.
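For reference, a sketch of the kind of smoke check I ran; the device flags follow ROCm's standard container setup (passing `/dev/kfd` and `/dev/dri` into the container), and the one-liner simply confirms PyTorch can see the GPU:

```shell
# Hedged sketch: run the rocm/pytorch image with the AMD GPU devices mapped in
# and check that PyTorch's (ROCm-backed) CUDA API surface sees the card.
docker run --rm \
  --device=/dev/kfd --device=/dev/dri --group-add video \
  rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0 \
  python3 -c 'import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))'
```

On a working setup this prints `True` followed by the device name of the 7900-series card.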
Your contribution
Looking at the upstream PyTorch AMD Docker build definition, PyTorch is built with gfx1100 support.
Looking at the TGI AMD Docker build definition, PyTorch is not currently compiled with gfx1100 support.
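A minimal sketch of the change this would imply, assuming TGI's build respects PyTorch's standard `PYTORCH_ROCM_ARCH` variable for selecting ROCm compile targets (the existing architecture list shown here is an illustration, not a quote from Dockerfile_amd):

```shell
# Hedged sketch: extend the ROCm compile-target list with the RDNA3
# architectures (gfx1100 covers the 7900 XT/XTX; gfx1101 covers other
# Navi 32 consumer cards) before rebuilding the image.
export PYTORCH_ROCM_ARCH="gfx90a;gfx942;gfx1100;gfx1101"

# Then rebuild TGI's AMD image with that variable visible to the build, e.g.:
# docker build -f Dockerfile_amd -t tgi-rocm-rdna3 .
```

Each extra architecture in the list lengthens the build and grows the image, so the list is normally kept to the targets actually needed.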
Some of the environment variables related to ROCm tuning may also need to change to support consumer GPUs: https://github.com/huggingface/text-generation-inference/blob/main/Dockerfile_amd#L323-L326
I plan to experiment with a rebuild configured for gfx1100 to see whether support is achievable without further modifications.