text-embeddings-inference
A blazing fast inference solution for text embeddings models
### Feature request The OpenAI API `/embeddings` endpoint accepts input both as text (a list of strings) and as [tokenized input](https://github.com/openai/openai-openapi/blob/893ba52242dbd5387a97b96444ee1c742cfce9bd/openapi.yaml#L8832-L8850) (a list of integers). text-embeddings-inference should also support lists of integers (token IDs)...
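For illustration, a minimal sketch of the two input shapes, assuming a local TEI instance with an OpenAI-compatible `/embeddings` route on port 8080; the token IDs below are placeholders, not real tokenizer output:

```python
import requests

BASE = "http://localhost:8080"  # assumed local TEI instance

# Input as a list of strings (already supported).
requests.post(f"{BASE}/embeddings", json={
    "model": "BAAI/bge-small-en-v1.5",
    "input": ["first passage", "second passage"],
})

# Input as lists of token IDs -- the shape this feature request asks TEI to accept.
# The IDs here are placeholders, not real tokenizer output.
requests.post(f"{BASE}/embeddings", json={
    "model": "BAAI/bge-small-en-v1.5",
    "input": [[101, 7592, 102], [101, 2129, 2024, 102]],
})
```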
### System Info Version: v1.4.0 Cargo version: cargo 1.79.0 (ffa9cf99a 2024-06-03) GCC version: 11.4.1 GPU: compiled with CUDA_COMPUTE_CAP=86 on a machine without a GPU (but with CUDA 12.1). I plan to use...
# What does this PR do? - **Change**: Moves the functions `batch` and `sort_embeddings` from `backends/candle/tests/` to `backends/candle`. - **Motivation**: Crates consuming `text-embeddings-inference` as a dependency (and not as a standalone server)...
### System Info Sample Docker Compose file:

```yaml
embedding:
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.0
  platform: linux/amd64
  volumes:
    - embed_data:/data
  command: --model-id BAAI/bge-small-en-v1.5
  ports:
    - "8080:80"
```

When hitting the endpoint `/embed` over and over...
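As a quick sanity check, a minimal sketch of calling the container's `/embed` route from the host, assuming the service above is up and container port 80 is mapped to host port 8080 as in the compose file:

```python
import requests

resp = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": "What is Deep Learning?"},
)
resp.raise_for_status()
print(resp.json())  # one embedding vector per input
```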
### System Info Hi, on Hugging Face Inference Endpoints, TGI works for classifiers, but it doesn't work here. Is the DeBERTa v3 classifier not supported? ### Information - [ ]...
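For context, this is roughly how a sequence-classification model would be queried once loaded, assuming TEI's `/predict` route and a local instance on port 8080; whether a DeBERTa v3 classifier loads at all is exactly what this issue asks:

```python
import requests

# Hedged sketch: /predict is TEI's route for classification models.
resp = requests.post(
    "http://localhost:8080/predict",
    json={"inputs": "I like you. I love you."},
)
print(resp.json())  # expected: a list of {label, score} pairs
```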
### System Info text-embeddings-inference:1.3.0 ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [ ] An officially supported command - [ ] My own modifications...
### System Info I am mostly working with the ghcr.io/huggingface/text-embeddings-inference:cpu-1.2 Docker image on macOS. Currently, I am only trying to find out which reranker models with a context size...
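One way to check a running model's context size is TEI's `GET /info` route; a minimal sketch, assuming a local instance on port 8080 and that the response carries `model_id` and `max_input_length` fields (field names to verify against the running server):

```python
import requests

info = requests.get("http://localhost:8080/info").json()
# Field names below are assumptions to verify against the actual response.
print(info.get("model_id"), "max input length:", info.get("max_input_length"))
```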
### System Info Image: v1.2 CPU. Model used: jinaai/jina-embeddings-v2-base-de. Deployment: Docker / RH OpenShift ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An...
### Feature request Support BAAI/bge-reranker-v2-minicpm-layerwise. ### Motivation BAAI/bge-reranker-v2-minicpm-layerwise inference is very slow with the default approach. ### Your contribution None