Update `ort` crate version to `2.0.0-rc.4` to support ONNX IR version 10

Open kozistr opened this issue 1 year ago • 0 comments

What does this PR do?

Fixes #355

  • I believe ONNX IR version 10 is supported starting with onnxruntime 1.18.0, which the `ort` crate uses from version 2.0.0-rc.3 onward. So this PR upgrades to the latest version, 2.0.0-rc.4.
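
For reference, the version bump could look roughly like the fragment below. This is only a sketch: the manifest location and any feature flags on the dependency are not shown in this description, so the single line here is an assumption rather than the actual diff. The `=` prefix pins the exact release candidate, which is a common way to depend on pre-release versions.

```toml
# Hypothetical Cargo.toml fragment; the real manifest may set extra features.
[dependencies]
ort = "=2.0.0-rc.4"
```
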
$ ./target/release/text-embeddings-router --model-id dunzhang/stella_en_400M_v5 --revision refs/pr/3 --port 8080
2024-07-26T16:50:55.128433Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "dun*****/******_**_***M_v5", revision: Some("refs/pr/3"), tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "0.0.0.0", port: 8080, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: None, payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-07-26T16:50:55.139343Z  INFO hf_hub: /home/zero/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/home/zero/.cache/huggingface/token"
2024-07-26T16:50:55.211483Z  INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:38: Downloading `1_Pooling/config.json`
2024-07-26T16:50:55.213326Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-07-26T16:50:55.213366Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-07-26T16:50:55.213371Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-07-26T16:50:55.213381Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-07-26T16:50:55.217605Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:368: Downloading `model.onnx`
2024-07-26T16:50:55.482762Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:372: Could not download `model.onnx`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/dunzhang/stella_en_400M_v5/resolve/refs%2Fpr%2F3/model.onnx)
2024-07-26T16:50:55.482789Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:373: Downloading `onnx/model.onnx`
2024-07-26T16:50:55.483067Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:379: Downloading `model.onnx_data`
2024-07-26T16:50:55.673022Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:383: Could not download `model.onnx_data`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/dunzhang/stella_en_400M_v5/resolve/refs%2Fpr%2F3/model.onnx_data)
2024-07-26T16:50:55.673127Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:384: Downloading `onnx/model.onnx_data`
2024-07-26T16:50:55.866739Z  WARN download_artifacts: text_embeddings_backend: backends/src/lib.rs:388: Could not download `onnx/model.onnx_data`: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/dunzhang/stella_en_400M_v5/resolve/refs%2Fpr%2F3/onnx/model.onnx_data)
2024-07-26T16:50:55.866772Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 653.40509ms
2024-07-26T16:50:55.888893Z  INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-07-26T16:50:55.894090Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 4 tokenization workers
2024-07-26T16:50:55.913947Z  INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
Error: Model backend is not healthy

Caused by:
    Unknown output keys: [Output { name: "sentence_embedding", output_type: Tensor { ty: Float32, dimensions: [-1, 1024] } }]
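
The error above reports that the model's only output, `sentence_embedding`, was not one the backend recognised. A minimal sketch of that kind of check is shown below. It is not the text-embeddings-inference implementation: the function name `select_output_key`, the error type `UnknownOutputKeys`, and the list of accepted output names are all assumptions made for illustration.

```rust
use std::fmt;

/// Output names this sketch accepts (assumed for illustration only).
const KNOWN_OUTPUT_KEYS: [&str; 2] = ["last_hidden_state", "token_embeddings"];

/// Hypothetical error carrying the output names the model actually exposes.
#[derive(Debug, PartialEq)]
pub struct UnknownOutputKeys(pub Vec<String>);

impl fmt::Display for UnknownOutputKeys {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Unknown output keys: {:?}", self.0)
    }
}

impl std::error::Error for UnknownOutputKeys {}

/// Returns the first accepted key that the model exposes, or an error
/// listing every output name the model declared.
pub fn select_output_key(model_outputs: &[&str]) -> Result<&'static str, UnknownOutputKeys> {
    KNOWN_OUTPUT_KEYS
        .iter()
        .copied()
        .find(|known| model_outputs.contains(known))
        .ok_or_else(|| {
            UnknownOutputKeys(model_outputs.iter().map(|name| name.to_string()).collect())
        })
}
```

Under these assumptions, a model that exposes only `sentence_embedding` matches none of the accepted names, so the call returns the error instead of a key.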

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline, Pull Request section?
  • [x] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@OlivierDehaene OR @Narsil

kozistr · Jul 26 '24 16:07