text-embeddings-inference
A blazing fast inference solution for text embeddings models
Test model: `BAAI/bge-reranker-base`. Duplicate of PR [357](https://github.com/huggingface/text-embeddings-inference/pull/357)
Hello. I'm trying to estimate how much GPU equipment is needed to serve a `bge-reranker-v2-m3` fp16 reranker model. We plan to input about **23 documents** with a **chunk size of...
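For a sizing question like the one above, a rough first step is to count the tokens a single rerank request produces, since each document is scored as its own (query, document) pair. The sketch below is only back-of-the-envelope arithmetic, not a TEI API: the function names are hypothetical, the query and chunk lengths are placeholder inputs (the chunk size in the question is truncated), and the four special tokens per pair assume an XLM-RoBERTa-style pair encoding. It does not estimate GPU memory or latency, which have to be measured.

```python
import math


def estimate_request_tokens(num_documents, query_tokens, chunk_tokens, special_tokens=4):
    """Approximate tokens for one rerank request.

    Each document is paired with the query, so the query is counted once
    per document. special_tokens=4 is an assumption (XLM-RoBERTa-style
    pair encoding), not a value taken from the issue.
    """
    per_pair = query_tokens + chunk_tokens + special_tokens
    return num_documents * per_pair


def batches_needed(total_tokens, max_batch_tokens):
    """Number of batches required if a batch holds at most max_batch_tokens."""
    return math.ceil(total_tokens / max_batch_tokens)


if __name__ == "__main__":
    # 23 documents comes from the question; 32 and 512 are placeholder lengths.
    total = estimate_request_tokens(num_documents=23, query_tokens=32, chunk_tokens=512)
    print("tokens per request:", total)
    # 10000 is an illustrative token budget, e.g. a value passed to --max-batch-tokens.
    print("batches at a 10000-token budget:", batches_needed(total, 10000))
```

Multiplying the per-request figure by the expected request rate gives a token throughput target that can then be checked against a benchmark on the actual hardware.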
### Feature request TEI 1.5 introduced [feat(onnx): add onnx runtime for better CPU perf #328](https://github.com/huggingface/text-embeddings-inference/pull/328). Request to not use onnx runtime on CPU. It seems there is no way to...
### System Info ubuntu2024 RTX3070 ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications...
### System Info Tested with TEI 1.2, 1.4, and latest (ghcr.io/huggingface/text-embeddings-inference:cuda-latest) OS: Docker on Debian 12 Model: dophys/bge-m3_finetuned Hardware: 1 NVIDIA_L4 ### Information - [X] Docker - [ ] The...
# What does this PR do? Fixes #605 The `pooler` layer loads its weight using an incorrect key name, causing the classifier and reranker based on GTE to produce wrong...
### System Info text-embeddings-inference version 1.7 (cpu, volta, hopper) ### Information - [x] Docker - [ ] The CLI directly ### Tasks - [x] An officially supported command - [...
# What does this PR do? This PR adds an integration test for Gaudi, with the goal of eventually including it in the CI pipeline. The CI pipeline will be...
### System Info # image > text-embeddings-inference:turing-1.6-grpc # model id > sentence-transformers/distiluse-base-multilingual-cased-v2 ### Information - [x] Docker - [ ] The CLI directly ### Tasks - [ ] An officially...
### System Info I am trying to run the BAAI/bge-large-en-v1.5 model ``` - command: - bash - -c - text-embeddings-router --model-id 'BAAI/bge-large-en-v1.5' --max-batch-tokens 10000 --max-client-batch-size 10 --payload-limit 500000000 --dtype float16...