
Model downloads just *hang*

djanito opened this issue 9 months ago • 0 comments

System Info

services:
  bge-reranker:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.2.3
    ports:
      - "8082:80"
    volumes:
      - model_cache_huggingface:/data
    environment:
      - MODEL_ID=BAAI/bge-reranker-large
volumes:
  model_cache_huggingface:

I start the stack with docker compose up -d.
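While the download appears to hang, one way to tell whether bytes are still arriving is to watch the shared cache volume grow. A sketch (the service name bge-reranker and the /data mount come from the compose file above; this assumes the image ships a shell with du, which may not hold for every tag):

```shell
# Follow the router logs of the reranker service:
docker compose logs -f bge-reranker

# In another terminal, check the size of the model cache periodically;
# if it stops growing for several minutes, the download has truly stalled:
docker compose exec bge-reranker sh -c 'du -sh /data'
```

If the volume keeps growing, the "hang" is just a slow transfer of the ~2 GB bge-reranker-large weights rather than a stuck process.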

Information

  • [X] Docker
  • [ ] The CLI directly

Tasks

  • [X] An officially supported command
  • [ ] My own modifications

Reproduction

2024-04-30 15:38:20 {"timestamp":"2024-04-30T13:38:20.528463Z","level":"INFO","message":"Args { model_id: \"BAA*/***-********-**rge\", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, hf_api_token: None, hostname: \"b70decaa228d\", port: 80, uds_path: \"/tmp/text-embeddings-inference-server\", huggingface_hub_cache: Some(\"/data\"), payload_limit: 2000000, api_key: None, json_output: true, otlp_endpoint: None, cors_allow_origin: None }","target":"text_embeddings_router","filename":"router/src/main.rs","line_number":140}

2024-04-30 15:38:20 {"timestamp":"2024-04-30T13:38:20.528544Z","level":"INFO","message":"Token file not found \"/root/.cache/huggingface/token\"","log.target":"hf_hub","log.module_path":"hf_hub","log.file":"/usr/local/cargo/git/checkouts/hf-hub-1aadb4c6e2cbe1ba/b167f69/src/lib.rs","log.line":55,"target":"hf_hub","filename":"/usr/local/cargo/git/checkouts/hf-hub-1aadb4c6e2cbe1ba/b167f69/src/lib.rs","line_number":55}
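The "Token file not found" line should be harmless for a public model like BAAI/bge-reranker-large. For completeness: since MODEL_ID is already being picked up from the environment, a Hub token could presumably be passed the same way for gated models. This is a sketch; whether TEI maps the hf_api_token argument shown in the Args log to an HF_API_TOKEN environment variable is my assumption, not something confirmed in this issue:

```yaml
    environment:
      - MODEL_ID=BAAI/bge-reranker-large
      - HF_API_TOKEN=${HF_TOKEN}  # assumed env-var name for the hf_api_token arg
```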

2024-04-30 15:38:23 {"timestamp":"2024-04-30T13:38:23.388497Z","level":"INFO","message":"Starting download","target":"text_embeddings_core::download","filename":"core/src/download.rs","line_number":20,"span":{"name":"download_artifacts"},"spans":[{"name":"download_artifacts"}]}

After a long time (about ten minutes, going by the timestamps), I get this:

2024-04-30 15:48:04 {"timestamp":"2024-04-30T13:48:04.675753Z","level":"WARN","message":"Could not find a Sentence Transformers config","target":"text_embeddings_router","filename":"router/src/lib.rs","line_number":165}

2024-04-30 15:48:04 {"timestamp":"2024-04-30T13:48:04.675777Z","level":"INFO","message":"Maximum number of tokens per request: 512","target":"text_embeddings_router","filename":"router/src/lib.rs","line_number":169}

2024-04-30 15:48:04 {"timestamp":"2024-04-30T13:48:04.676595Z","level":"INFO","message":"Starting 8 tokenization workers","target":"text_embeddings_core::tokenization","filename":"core/src/tokenization.rs","line_number":23}

2024-04-30 15:48:06 {"timestamp":"2024-04-30T13:48:06.076074Z","level":"INFO","message":"Starting model backend","target":"text_embeddings_router","filename":"router/src/lib.rs","line_number":194}

2024-04-30 15:48:06 {"timestamp":"2024-04-30T13:48:06.090020Z","level":"INFO","message":"Starting Bert model on Cpu","target":"text_embeddings_backend_candle","filename":"backends/candle/src/lib.rs","line_number":132}

And then the container shuts down without reporting any error.
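A silent shutdown right after "Starting Bert model on Cpu" can also mean the kernel OOM-killed the process while it was loading the weights, which Docker does not surface in the application logs. A hypothetical way to check (service name assumed from the compose file above):

```shell
# Show the final state of the service, including its exit code:
docker compose ps -a bge-reranker

# exit=137 together with oom=true would indicate an out-of-memory kill:
docker inspect --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}}' \
  "$(docker compose ps -q bge-reranker)"
```

If it is an OOM kill, raising the memory available to the container (or to Docker Desktop's VM) would be the thing to try.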

Expected behavior

I expect the model to download and the server to start correctly, given that I'm using Docker and following the Hugging Face tutorial.

This issue seems very similar to this one.

djanito avatar Apr 30 '24 13:04 djanito