text-embeddings-inference

A blazing fast inference solution for text embeddings models

180 text-embeddings-inference issues, sorted by most recently updated

After deploying a model with TEI, can I request the model service over IPv6?
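
For reference, a minimal sketch of what such a request could look like over IPv6, assuming the router is reachable at the loopback address `[::1]` on port 8080 (both assumptions); the `/embed` route and payload follow the standard TEI HTTP API:

```python
import requests

# Assumed IPv6 deployment: the router is reachable at [::1]:8080.
# IPv6 literals must be wrapped in square brackets in the URL.
url = "http://[::1]:8080/embed"

resp = requests.post(url, json={"inputs": "What is Deep Learning?"})
resp.raise_for_status()

embedding = resp.json()[0]  # /embed returns one vector per input string
print(len(embedding))       # embedding dimensionality
```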

### Model description The Salesforce/SFR-Embedding model excels in multilingual and multi-task code/text retrieval, demonstrating superior performance in CoIR benchmarks (e.g., 67.4 NDCG@10 for the 2B model). Sincerely hoping that the...

For models like [jinaai/jina-embeddings-v2-base-code](https://hf.co/jinaai/jina-embeddings-v2-base-code), the Rust side downloads two separate directories, `models--jinaai--jina-embeddings-v2-base-code/` and `models--jinaai--jina-bert-v2-qk-post-norm/`, while the existing implementation only passes one Path to the Python backend. This PR refines model file download...

### Feature request There is a Python server: https://github.com/bernardo-sb/image-embedding-inference. Inputs: a base64-encoded image; output: embeddings for each image. ### Motivation The Python version is large and slow... ![Image](https://github.com/user-attachments/assets/57da9bcd-8458-4bea-ad04-0f668b923f9f) ### Your contribution PR
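
To make the requested interface concrete, here is a rough sketch of what the proposed request/response could look like; the `/embed_image` route, payload shape, and response format are all hypothetical and do not exist in TEI today:

```python
import base64
import requests

# Hypothetical image-embedding request against a TEI-like server.
with open("photo.jpg", "rb") as f:
    b64_image = base64.b64encode(f.read()).decode("ascii")

resp = requests.post(
    "http://localhost:8080/embed_image",  # hypothetical route
    json={"inputs": [b64_image]},         # one base64 string per image (assumed shape)
)
embeddings = resp.json()                  # assumed: one embedding vector per image
```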

### System Info Cannot load https://huggingface.co/Qodo/Qodo-Embed-1-1.5B. When I investigated, there was an error regarding the tokenizer... I was curious, so I tried the latest version (0.21.0) of tokenizers and...

### Feature request Add a "Supported Models" section to the README highlighting the recommended `inputs` format for each model. For example, for Qwen2, should we use the `Instruct:...
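
As an illustration of why such a section would help, here is a sketch of an instruction-prefixed query sent to `/embed`; the exact prefix format below is an assumption for illustration, and pinning down the correct one per model is precisely what the requested README section would do:

```python
import requests

# Assumed instruction-style prefix; the recommended format varies by model.
task = "Given a web search query, retrieve relevant passages that answer the query"
query = "how do transformers handle long sequences?"
prompted_query = f"Instruct: {task}\nQuery: {query}"

resp = requests.post("http://localhost:8080/embed", json={"inputs": prompted_query})
query_embedding = resp.json()[0]
```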

### System Info ### Information - [x] Docker - [ ] The CLI directly ### Tasks - [x] An officially supported command - [ ] My own modifications ### Reproduction...

### System Info - Python 3.10.12 - text-embeddings-router 1.6.0, installed using `cargo install --path router -F candle-cuda -F http --no-default-features` - Platform: Ubuntu 22.04 with an NVIDIA A100 GPU - Model:...

### Feature request Would appreciate documentation on using TEI as a Rust library or sidecar (non-Docker) to run on consumer hardware (typically macOS, or Windows without NVIDIA). ### Motivation I'd like to...

### System Info Version of Text Embeddings Inference: 1.6 (Turing) GPU: 1x Tesla T4 16GB Deployment environment: OpenShift 4 - Kubernetes version v1.28.15+ff493be Service info: `{"model_id":"naver/efficient-splade-VI-BT-large-doc","model_sha":"main","model_dtype":"float16","model_type":{"embedding":{"pooling":"splade"}},"max_concurrent_requests":512,"max_input_length":512,"max_batch_tokens":16384,"max_batch_requests":null,"max_client_batch_size":32,"auto_truncate":false,"tokenization_workers":1,"version":"1.6.0","sha":"57d8fc8128ab94fcf06b4463ba0d83a4ca25f89b","docker_label":"sha-57d8fc8"}` ### Information - [x] Docker...