Data type returned by text-embeddings-inference does not match the specified type
System Info
docker-compose.yml:

```yaml
version: "3"
services:
  embedding:
    image: ghcr.io/huggingface/text-embeddings-inference:0.6
    container_name: tei_bge
    restart: always
    command: --model-id /data/checkpoints/bge_large/ --dtype float32
    ports:
      - "8080:80"
    volumes:
      - /data:/data
    deploy:
      resources:
        reservations:
          devices:
            - driver: "nvidia"
              device_ids: [ "0" ]
              capabilities: [ "gpu" ]
```
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
Describe the bug
When using text-embeddings-inference, I specified `--dtype float32`, but the data returned seems to be of float16 precision.
Steps to reproduce
- Start text-embeddings-inference with `--dtype float32` specified.
- Call the model via the interface (see the sketch below).
- Check the returned data.
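
For the second step, this is roughly how the model is called (a minimal sketch assuming the default `/embed` route of text-embeddings-inference and the `8080:80` port mapping from the compose file above; the input text is just an example):

```python
# Sketch: request one embedding from the TEI container and print the
# first few values of the returned vector.
import requests

response = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": "What is deep learning?"},
)
response.raise_for_status()

embedding = response.json()[0]  # one embedding vector per input string
print(embedding[:5])
```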
Expected behavior
I expect the data returned to be of float32 type, consistent with what I specified when starting text-embeddings-inference.
Actual behavior
The data returned seems to be of float16 precision.
Additional context
The printout when starting the model is:
```
Args { model_id: "/dat//rge/", revision: None, tokenization_workers: None, dtype: Some(Float32), pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, hf_api_token: None, hostname: "842c3aa5253b", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), json_output: false, otlp_endpoint: None }
```
An example of the returned data is: `[0.03771281, -0.035546724, -0.04045037, 0.04029212, -0.003022021, ...]`.
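
One quick way to check the suspicion (a sketch assuming NumPy is available; the values are pasted from the response above): if the server computed the embedding in float16, every returned value should survive a float32 → float16 → float32 round trip unchanged.

```python
# Sketch: values that came out of a float16 computation are exactly
# representable in float16, so the round trip should not change them.
import numpy as np

values = np.array(
    [0.03771281, -0.035546724, -0.04045037, 0.04029212, -0.003022021],
    dtype=np.float32,
)
roundtrip = values.astype(np.float16).astype(np.float32)
print(np.array_equal(values, roundtrip))  # True would point to float16
```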
Could you please advise on how to ensure the data returned is always of float32 type? Thank you.