                        TEI failed to serve fine-tuned bge-m3 model
System Info
Tested with TEI 1.2, 1.4, and latest (ghcr.io/huggingface/text-embeddings-inference:cuda-latest)
OS: Docker on Debian 12
Model: dophys/bge-m3_finetuned
Hardware: 1x NVIDIA L4
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
#!/bin/bash
IMAGE="ghcr.io/huggingface/text-embeddings-inference:cuda-latest"
MODEL="dophys/bge-m3_finetuned"
docker pull "$IMAGE"
docker run \
  --shm-size=1G \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e MODEL_ID="$MODEL" \
  -e JSON_OUTPUT=true \
  -e PORT=8080 \
  -p 7080:8080 \
  --runtime=nvidia \
  "$IMAGE"
The container exits with the following error:
Error: Could not create backend
Caused by:
    Could not start backend: cannot find tensor embeddings.word_embeddings.weight
(TEI 1.2 and 1.4 instead fail with a different error about tokenizer.json)
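The error suggests the checkpoint may not contain the tensor name TEI looks up for XLM-RoBERTa-style models like bge-m3. One way to narrow this down is to list the tensor names in the repo's model.safetensors and compare them against the base BAAI/bge-m3 checkpoint. A minimal stdlib-only sketch that parses the safetensors header (the file path and the demo header below are illustrative, not taken from the actual repo):

```python
import json
import os
import struct
import tempfile


def safetensors_tensor_names(path):
    """Return the tensor names stored in a .safetensors file.

    The format starts with an 8-byte little-endian length, followed by a
    JSON header mapping tensor names to dtype/shape/offset info.
    """
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
    return sorted(k for k in header if k != "__metadata__")


if __name__ == "__main__":
    # Demo on a tiny synthetic file; in real use, point this at the
    # model.safetensors downloaded from the Hub repo.
    header = {
        "embeddings.word_embeddings.weight": {
            "dtype": "F32",
            "shape": [4, 2],
            "data_offsets": [0, 32],
        }
    }
    payload = json.dumps(header).encode()
    with tempfile.NamedTemporaryFile(suffix=".safetensors", delete=False) as f:
        f.write(struct.pack("<Q", len(payload)) + payload + b"\x00" * 32)
        tmp = f.name
    print(safetensors_tensor_names(tmp))
    os.unlink(tmp)
```

If the fine-tuned repo stores the weights under different names (e.g. with a framework-specific prefix added during fine-tuning), that would explain why TEI cannot find `embeddings.word_embeddings.weight` even though the base model serves fine.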
Expected behavior
Expected the model to be served successfully, since its base model BAAI/bge-m3 can be served with TEI and the model card carries the text-embeddings-inference tag.