text-embeddings-inference
Add support for the Blackwell architecture (sm120)
What does this PR do?
This PR adds support for the Blackwell architecture, related to issue #652.
Since I wanted to run TEI on my RTX 5090, I went through a few iterations and got it working; tested with Qwen3-Embedding-0.6B.
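For context, TEI's CUDA Dockerfiles select the target architecture via a `CUDA_COMPUTE_CAP` build arg. A small helper like the following (hypothetical, not part of this PR) shows the mapping from the compute capability nvidia-smi reports (e.g. `12.0` on an RTX 5090) to that build arg:

```shell
# Hypothetical helper: turn a compute capability string as reported by
# nvidia-smi (e.g. "12.0") into TEI's CUDA_COMPUTE_CAP build-arg value.
cap_to_build_arg() {
  # Strip the dot: "12.0" -> "120"
  echo "${1/./}"
}

# Example (build command shown for context, assuming the standard build arg):
# docker build -f Dockerfile-cuda --build-arg CUDA_COMPUTE_CAP=$(cap_to_build_arg 12.0) .
cap_to_build_arg 12.0
```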
Before submitting
- [X] Did you read the contributor guideline?
Yes
- [X] Was this discussed/approved via a GitHub issue or the forum?
Not discussed or approved, but it's a known issue, and a related issue is already open at https://github.com/huggingface/text-embeddings-inference/issues/652
- [X] Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
Documentation updated to mention the new compute cap.
- [X] Did you write any new necessary tests? If applicable, did you include or update the insta snapshots?
I have updated the only existing test to validate the compute cap.
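The kind of check being tested can be sketched roughly as follows (a hypothetical function, not TEI's actual code; the set of supported caps is illustrative):

```rust
// Sketch of a compute-capability gate, with sm120 (Blackwell consumer
// GPUs such as the RTX 5090) added to the accepted values.
// Illustrative only: the real list lives in TEI's CUDA backend.
fn compute_cap_supported(cap: usize) -> bool {
    matches!(cap, 75 | 80 | 86 | 89 | 90 | 120)
}

fn main() {
    assert!(compute_cap_supported(120)); // Blackwell now accepted
    assert!(!compute_cap_supported(70)); // unsupported caps still rejected
    println!("ok");
}
```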
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Hey @danielealbano, thanks for the changes, I'll take a look and test those over the week (as I just got back from a 2-week break), thanks for the contribution (and the patience) 🤗
@alvarobartt by the way, you can support a CUDA 12.9 docker image even if your system runs 12.2. Just add the following to the Dockerfile:
```dockerfile
# Remove the compat folder to avoid conflicts with host GPU drivers
RUN rm -rf /usr/local/cuda-12.9/compat
ENV NVIDIA_DISABLE_REQUIRE=true
```
I built the image using Dockerfile-cuda-blackwell (git revision c406619) and it launched TEI successfully on my RTX 5090, with GPU processing working properly. Great work!
For those who want to quickly test TEI on Blackwell like I did, I've made the image available here (note: this is not actively maintained, but feel free to use it for testing):
- https://hub.docker.com/r/hotchpotch/tei-blackwell-testing
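If you want to try that test image, a run command along these lines should work (the `--model-id` flag is TEI's standard CLI; the model is the one tested in this PR — adjust as needed):

```shell
# Run the community Blackwell test image on a GPU host; TEI listens on
# port 80 inside the container, exposed here on 8080.
docker run --rm --gpus all -p 8080:80 \
  hotchpotch/tei-blackwell-testing \
  --model-id Qwen/Qwen3-Embedding-0.6B
```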