
feat: support Triton with Nvidia embedders


Related Issues

  • support for Triton-hosted embedding models: almost any Hugging Face model can be hosted on Triton and thus benefit from Triton's hosting and scaling features. This is especially helpful if you want to self-host arbitrary open-source models using NVIDIA's inference engines (see the request sketch below).
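
For context, Triton serves models over the standard KServe v2 HTTP API, so an embedding model hosted this way can be queried directly. The sketch below is only illustrative: the model name (`multilingual_e5_base`) and the input/output tensor names (`text`, `embedding`) are assumptions and depend on how the model repository is configured.

```python
# Minimal sketch: querying a Triton-hosted embedding model via the KServe v2
# HTTP API. Model and tensor names are illustrative assumptions.
import requests

TRITON_URL = "http://localhost:8000"          # default Triton HTTP port
MODEL_NAME = "multilingual_e5_base"           # hypothetical model name

payload = {
    "inputs": [
        {
            "name": "text",                   # assumed input tensor name
            "shape": [1, 1],
            "datatype": "BYTES",
            "data": ["query: What is Haystack?"],
        }
    ],
    "outputs": [{"name": "embedding"}],       # assumed output tensor name
}

response = requests.post(
    f"{TRITON_URL}/v2/models/{MODEL_NAME}/infer",
    json=payload,
    timeout=30,
)
response.raise_for_status()
embedding = response.json()["outputs"][0]["data"]
print(f"Got embedding of length {len(embedding)}")
```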

Proposed Changes:

  • add Triton backend support for the Nvidia embedders (see the usage sketch below)
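
A rough idea of how this could be used from Haystack; the exact parameters (e.g. whether `api_url` simply points at the Triton server or a dedicated backend selector is added) are assumptions and may differ from the final implementation in this PR.

```python
# Usage sketch (parameter names are assumptions, not the final API): point the
# Nvidia text embedder at a self-hosted Triton endpoint instead of the hosted
# NVIDIA API.
from haystack_integrations.components.embedders.nvidia import NvidiaTextEmbedder

embedder = NvidiaTextEmbedder(
    model="intfloat-multilingual-e5-base",   # hypothetical model name on Triton
    api_url="http://localhost:8000",         # local Triton HTTP endpoint
)
# NVIDIA_API_KEY may or may not be required for a self-hosted endpoint; set the
# environment variable if the embedder expects it.
embedder.warm_up()
result = embedder.run(text="What is Haystack?")
print(len(result["embedding"]))
```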

How did you test it?

  • added tests
  • ran integration tests locally against a Triton embedding endpoint
  • a sample embedding Triton server can be started with (see the readiness-check sketch below):
    docker run --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 tstadelds/triton-embedding:intfloat-multilingual-e5-base-o4-v0.0.2
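
To sanity-check that the sample server is up before running the integration tests, Triton's standard health and model-metadata endpoints can be queried. The model name below is an assumption based on the image tag.

```python
# Quick readiness check against the sample Triton server started above.
import requests

TRITON_URL = "http://localhost:8000"
MODEL_NAME = "intfloat-multilingual-e5-base"   # hypothetical model name

# Server-level readiness (standard Triton / KServe v2 endpoint).
ready = requests.get(f"{TRITON_URL}/v2/health/ready", timeout=5)
print("server ready:", ready.status_code == 200)

# Model metadata: lists the model's input/output tensor names and shapes.
meta = requests.get(f"{TRITON_URL}/v2/models/{MODEL_NAME}", timeout=5)
print(meta.json())
```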
    

Notes for the reviewer

  • a template for creating Triton images can be found here: https://github.com/deepset-ai/nvidia-triton-inference

Checklist
