Support for text embedding models

Open SupreethRao99 opened this issue 1 year ago • 4 comments

With the popularity of RAG, it would be great if TensorRT-LLM supported text-embedding and re-ranking models from sentence-transformers.

SupreethRao99 avatar Mar 01 '24 16:03 SupreethRao99

+1 to this. It would be great if embedding models could be served on Triton servers.

jasonngap1 avatar Apr 16 '24 02:04 jasonngap1

Is there any update on this? I'm also interested in this capability.

FernandoDorado avatar Nov 06 '24 16:11 FernandoDorado

cc @ncomly-nvidia @AdamzNV @laikhtewari for vis

nv-guomingz avatar Nov 14 '24 07:11 nv-guomingz

Is there any plan to put this feature on the roadmap?

FernandoDorado avatar Jan 13 '25 13:01 FernandoDorado

Just following up on this too! Would really appreciate it if there was text embedding support!

neilbhutada avatar Oct 09 '25 11:10 neilbhutada

@SupreethRao99, @FernandoDorado, @neilbhutada, there were discussions about this, but the team ultimately decided to stay focused on text generation rather than take on the additional complexity of supporting models with fundamentally different characteristics.

karljang avatar Oct 21 '25 22:10 karljang

Issue has not received an update in over 14 days. Adding stale label.

github-actions[bot] avatar Nov 05 '25 03:11 github-actions[bot]