                        TEI failed to serve fine-tuned bge-m3 model
System Info
Tested with TEI 1.2, 1.4, and latest (ghcr.io/huggingface/text-embeddings-inference:cuda-latest)
OS: Docker on Debian 12
Model: dophys/bge-m3_finetuned
Hardware: 1x NVIDIA L4
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
#!/bin/bash
IMAGE="ghcr.io/huggingface/text-embeddings-inference:cuda-latest"
MODEL="dophys/bge-m3_finetuned"
docker pull "$IMAGE"
docker run \
  --shm-size=1G \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e MODEL_ID="$MODEL" \
  -e JSON_OUTPUT=true \
  -e PORT=8080 \
  -p 7080:8080 \
  --runtime=nvidia \
  "$IMAGE"
The container exits with the following error:
Error: Could not create backend
Caused by:
    Could not start backend: cannot find tensor embeddings.word_embeddings.weight
(TEI 1.2 and 1.4 instead fail with a different error about tokenizer.json)
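The error suggests the checkpoint may not contain the tensor name TEI looks up for XLM-RoBERTa-style models like bge-m3. One way to narrow this down is to list the tensor names in the repo's model.safetensors and compare them against the base BAAI/bge-m3 checkpoint. A minimal stdlib-only sketch that parses the safetensors header (the file path and the demo header below are illustrative, not taken from the actual repo):

```python
import json
import os
import struct
import tempfile


def safetensors_tensor_names(path):
    """Return the tensor names stored in a .safetensors file.

    The format starts with an 8-byte little-endian length, followed by a
    JSON header mapping tensor names to dtype/shape/offset info.
    """
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
    return sorted(k for k in header if k != "__metadata__")


if __name__ == "__main__":
    # Demo on a tiny synthetic file; in real use, point this at the
    # model.safetensors downloaded from the Hub repo.
    header = {
        "embeddings.word_embeddings.weight": {
            "dtype": "F32",
            "shape": [4, 2],
            "data_offsets": [0, 32],
        }
    }
    payload = json.dumps(header).encode()
    with tempfile.NamedTemporaryFile(suffix=".safetensors", delete=False) as f:
        f.write(struct.pack("<Q", len(payload)) + payload + b"\x00" * 32)
        tmp = f.name
    print(safetensors_tensor_names(tmp))
    os.unlink(tmp)
```

If the fine-tuned repo stores the weights under different names (e.g. with a framework-specific prefix added during fine-tuning), that would explain why TEI cannot find `embeddings.word_embeddings.weight` even though the base model serves fine.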
Expected behavior
Expected the model to be served successfully, since its base model BAAI/bge-m3 can be served with TEI and the model card carries the text-embeddings-inference tag.