haystack-core-integrations
feat: support Triton with Nvidia embedders
Related Issues
- Support for Triton-hosted embedding models: almost any Hugging Face model can be hosted on Triton and thus benefit from Triton's serving and scaling features. This is especially helpful if you want to self-host arbitrary open-source models using NVIDIA's inference engines.
Proposed Changes:
- Add Triton backend support for the Nvidia embedders
How did you test it?
- added unit tests
- ran integration tests locally against a Triton embedding endpoint
- a sample Triton embedding server can be started with:

```
docker run --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 tstadelds/triton-embedding:intfloat-multilingual-e5-base-o4-v0.0.2
```
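For reviewers unfamiliar with Triton: requests to such a server follow the KServe v2 HTTP protocol that Triton exposes. The sketch below shows roughly what a request body for an embedding model looks like; the input tensor name `text` is an assumption and depends on the model's `config.pbtxt`, so this is illustrative only, not the exact payload the embedders in this PR produce.

```python
import json

def build_infer_request(texts):
    """Build a KServe v2 inference request body for a Triton-hosted
    embedding model. The input name "text" is an assumption; check the
    model's config.pbtxt for the actual tensor name."""
    return {
        "inputs": [
            {
                "name": "text",            # assumed input tensor name
                "shape": [len(texts), 1],  # one string per batch item
                "datatype": "BYTES",       # strings are sent as BYTES
                "data": texts,
            }
        ]
    }

# The resulting JSON would be POSTed to
# http://localhost:8000/v2/models/<model_name>/infer
body = build_infer_request(["hello world"])
print(json.dumps(body, indent=2))
```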
Notes for the reviewer
- A template for creating Triton images can be found here: https://github.com/deepset-ai/nvidia-triton-inference
Checklist
- I have read the contributors guidelines and the code of conduct
- I have updated the related issue with new insights and changes
- I added unit tests and updated the docstrings
- I've used one of the conventional commit types for my PR title: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.