text-embeddings-inference
Different behavior between SentenceTransformer and TEI when using gte-large-en-v1.5
System Info
$ text-embeddings-router --version
text-embeddings-router 1.5.0
Information
- [X] Docker
- [X] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
# using TEI
model=Alibaba-NLP/gte-large-en-v1.5
text-embeddings-router --model-id $model --port 8080
curl -X POST "http://localhost:8080/embeddings" \
-H "Content-Type: application/json" \
-d '{"input":["Dimension table for main account?"]}'
<Response [200]>
Alibaba-NLP/gte-large-en-v1.5
[-0.0006371783,-0.03931647,-0.010235489,-0.019322978,-0.014273809,0.022573953]
# using SentenceTransformer
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)
embeddings = model.encode(['Dimension table for main account?'])
print(list(embeddings[0][:6]))
[-0.015188057, -0.9458093, -0.24485634, -0.4617836, -0.3435278, 0.53972]
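A quick sanity check on the two six-value prefixes printed above: if the element-wise ratio is roughly constant, the vectors point in nearly the same direction and differ mainly in scale, which would be consistent with one side being L2-normalized and the other not. This is only a sketch over the six values shown here, not the full embeddings, and the `cosine` helper is a hypothetical name introduced for illustration.

```python
import math

# First six components copied from the outputs in this report.
tei = [-0.0006371783, -0.03931647, -0.010235489, -0.019322978, -0.014273809, 0.022573953]
st = [-0.015188057, -0.9458093, -0.24485634, -0.4617836, -0.3435278, 0.53972]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (illustrative helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# A near-constant ratio means the two vectors differ mostly by a scale factor.
ratios = [s / t for s, t in zip(st, tei)]
similarity = cosine(tei, st)

print("SentenceTransformer / TEI ratios:", [round(r, 2) for r in ratios])
print("cosine similarity of the 6-value prefixes:", similarity)
```

If the full vectors show the same pattern, calling `model.encode(..., normalize_embeddings=True)` on the SentenceTransformer side should put both outputs on the same scale. Whether TEI normalizes by default for this endpoint is an assumption here and has not been confirmed in this report.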
When loading the model, SentenceTransformer downloads additional code from a separate repo named Alibaba-NLP/new-impl (see the log below), whereas TEI may be using the original model implementation.
/home/smilencer/miniconda3/envs/ml/lib/python3.12/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
configuration.py: 7.13kB [00:00, 25.2MB/s]
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- configuration.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
modeling.py: 59.0kB [00:00, 350kB/s]
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- modeling.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
Is there any way to make TEI use Alibaba-NLP/new-impl?
I tried modifying the repo files as described in https://huggingface.co/Alibaba-NLP/new-impl/discussions/2, but it did not work.
Expected behavior
TEI and SentenceTransformer should return the same embeddings for the same input.