GenAIExamples
[Bug] TEI Gaudi 2 image is failing to launch
Priority
Undecided
OS type
Ubuntu
Hardware type
Gaudi2
Installation method
- [X] Pull docker images from hub.docker.com
- [ ] Build docker images from source
Deploy method
- [X] Docker compose
- [ ] Docker
- [ ] Kubernetes
- [ ] Helm
Running nodes
Single Node
What's the version?
latest
Description
The TEI Gaudi container starts, but the Python backend fails to initialize and the service exits with `ValueError: CPU device only supports float32 dtype` (full log below).
Reproduce steps
After running docker compose, the container for the "opea/tei-gaudi:latest" image starts but fails shortly after launch; a sketch of the commands used is below, followed by the raw log.
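For reference, a minimal sketch of the launch and log-collection steps (the compose directory and image name are taken from the log below; the exact compose invocation and environment setup are assumed):

```bash
# Assumed reproduce steps; the compose directory matches the shell prompt in the raw log below.
# Environment variables (host_ip, HUGGINGFACEHUB_API_TOKEN, proxies) are assumed to be
# exported beforehand as described in the ChatQnA Gaudi README.
cd ChatQnA/docker_compose/intel/hpu/gaudi   # inside a GenAIExamples checkout
docker compose up -d

# The opea/tei-gaudi:latest container starts and then exits; capture its logs.
docker ps -a | grep tei-gaudi
docker logs <tei-gaudi-container-id>
```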
Raw log
/ChatQnA/docker_compose/intel/hpu/gaudi$ docker logs 349bc3685e97
2024-09-15T19:24:20.318465Z INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "BAA*/***-****-**-v1.5", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: true, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "349bc3685e97", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2024-09-15T19:24:20.318939Z INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2024-09-15T19:24:20.445084Z INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:45: Downloading `1_Pooling/config.json`
2024-09-15T19:24:21.565035Z INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:108: Downloading `config_sentence_transformers.json`
2024-09-15T19:24:21.693734Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2024-09-15T19:24:21.693774Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:22: Downloading `config.json`
2024-09-15T19:24:21.823411Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:25: Downloading `tokenizer.json`
2024-09-15T19:24:22.137401Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:52: Downloading `model.safetensors`
2024-09-15T19:24:43.732689Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:39: Model artifacts downloaded in 22.038954482s
2024-09-15T19:24:44.030320Z INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 512
2024-09-15T19:24:44.049828Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:26: Starting 152 tokenization workers
2024-09-15T19:24:44.374782Z INFO text_embeddings_router: router/src/lib.rs:250: Starting model backend
2024-09-15T19:24:44.375230Z INFO text_embeddings_backend_python::management: backends/python/src/management.rs:58: Starting Python backend
2024-09-15T19:24:48.899061Z WARN python-backend: text_embeddings_backend_python::logging: backends/python/src/logging.rs:39: Could not import Flash Attention enabled models: No module named 'dropout_layer_norm'
2024-09-15T19:24:50.018977Z ERROR python-backend: text_embeddings_backend_python::logging: backends/python/src/logging.rs:40: Error when initializing model
Traceback (most recent call last):
  File "/usr/local/bin/python-text-embeddings-server", line 8, in <module>
    sys.exit(app())
  File "/usr/local/lib/python3.10/dist-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/typer/core.py", line 716, in main
    return _main(
  File "/usr/local/lib/python3.10/dist-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params) # type: ignore
  File "/usr/src/backends/python/server/text_embeddings_server/cli.py", line 51, in serve
    server.serve(model_path, dtype, uds_path)
  File "/usr/src/backends/python/server/text_embeddings_server/server.py", line 88, in serve
    asyncio.run(serve_inner(model_path, dtype))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/usr/src/backends/python/server/text_embeddings_server/server.py", line 57, in serve_inner
    model = get_model(model_path, dtype)
  File "/usr/src/backends/python/server/text_embeddings_server/models/__init__.py", line 56, in get_model
    raise ValueError("CPU device only supports float32 dtype")
ValueError: CPU device only supports float32 dtype
Error: Could not create backend
Caused by:
    Could not start backend: Python backend failed to start
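Note that the final ValueError is raised because the Python backend ended up on the CPU device (with a non-float32 dtype) instead of the HPU. The commands below are a sketch of how one might confirm whether the Gaudi devices are exposed to this container (container id taken from the log above; hl-smi assumes the Habana host tools are installed):

```bash
# Check that the Gaudi cards are visible on the host (requires Habana tools).
hl-smi

# Inspect how the exited tei-gaudi container was started:
# HostConfig.Runtime should be "habana" and HABANA_VISIBLE_DEVICES should be set.
docker inspect 349bc3685e97 --format '{{.HostConfig.Runtime}}'
docker inspect 349bc3685e97 | grep -i habana
```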