
RuntimeError: Engine core initialization failed. Failed core proc(s): {}

[Open] Tarunrao0 opened this issue 8 months ago • 3 comments

I'm currently working with the ART library on Kaggle and trying to use both of the available T4 GPUs. Specifically, I'm experimenting with the Tic-Tac-Toe example and have tried to enable multi-GPU support as described in the recent updates.

Here's how I'm configuring the model:

import os
import torch
import art

model = art.TrainableModel(
    name="001-script",
    project="tic-tac-toe-local",
    base_model="Qwen/Qwen3-8B",
    _internal_config=art.dev.InternalModelConfig(
        engine_args=art.dev.EngineArgs(
            tensor_parallel_size=torch.cuda.device_count(),  
            gpu_memory_utilization=0.9  
        ),
        torchtune_args=art.dev.TorchtuneArgs(
            model="qwen3_8b",           
            model_type="QWEN3",         
            async_weight_syncing=False,
            enable_activation_offloading=False
        ),
    ),
)

await model.register(backend)
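As an aside, this is the kind of capability check I run first on the Kaggle node before picking engine args (a hypothetical helper, not part of ART; it assumes torch is installed). Kaggle's T4s report compute capability SM 7.5, which is below the SM 8.0 (Ampere) floor for bfloat16 and FlashAttention, so vLLM auto-selects the xformers attention backend on them:

```python
def supports_bf16(major: int, minor: int) -> bool:
    """bfloat16 tensor cores need Ampere (SM 8.0) or newer; T4s are SM 7.5."""
    return (major, minor) >= (8, 0)

def describe_gpus() -> None:
    # torch is imported lazily so supports_bf16 can be used without CUDA
    import torch
    for i in range(torch.cuda.device_count()):
        major, minor = torch.cuda.get_device_capability(i)
        name = torch.cuda.get_device_name(i)
        print(f"cuda:{i} ({name}): sm_{major}{minor}, "
              f"bf16={supports_bf16(major, minor)}")
```

On a two-T4 node this should report `sm_75` and `bf16=False` for both devices; the registration call above is unchanged.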

Running this configuration fails during engine initialization with the following error:

(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527] WorkerProc hit an exception.
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527] Traceback (most recent call last):
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/multiproc_executor.py", line 522, in worker_busy_loop
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     output = func(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]              ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return func(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/worker/gpu_worker.py", line 205, in determine_available_memory
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     self.model_runner.profile_run()
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 2012, in profile_run
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     hidden_states = self._dummy_run(self.max_num_tokens)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return func(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1847, in _dummy_run
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     outputs = model(
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]               ^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return forward_call(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen3.py", line 301, in forward
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     hidden_states = self.model(input_ids, positions, intermediate_tensors,
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/compilation/decorators.py", line 239, in __call__
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     output = self.compiled_callable(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/_dynamo/eval_frame.py", line 655, in _fn
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return fn(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/model_executor/models/qwen2.py", line 336, in forward
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     def forward(
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return forward_call(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/_dynamo/eval_frame.py", line 838, in _fn
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return fn(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/fx/graph_module.py", line 830, in call_wrapped
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._wrapped_call(self, *args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/fx/graph_module.py", line 406, in __call__
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     raise e
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/fx/graph_module.py", line 393, in __call__
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return forward_call(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "<eval_with_key>.74", line 303, in forward
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     submod_1 = self.submod_1(getitem, s0, getitem_1, getitem_2);  getitem = getitem_1 = getitem_2 = None
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/fx/graph_module.py", line 830, in call_wrapped
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._wrapped_call(self, *args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/fx/graph_module.py", line 406, in __call__
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     raise e
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/fx/graph_module.py", line 393, in __call__
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return forward_call(*args, **kwargs)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "<eval_with_key>.2", line 5, in forward
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     unified_attention = torch.ops.vllm.unified_attention(query_1, key_1, v, 'model.layers.0.self_attn.attn');  query_1 = key_1 = v = None
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/torch/_ops.py", line 1158, in __call__
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     return self._op(*args, **(kwargs or {}))
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/attention/layer.py", line 402, in unified_attention
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     output = self.impl.forward(self, query, key, value, kv_cache,
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/attention/backends/xformers.py", line 547, in forward
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     get_num_prefill_decode_query_kv_tokens(attn_metadata, attn_type)
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]   File "/usr/local/lib/python3.11/dist-packages/vllm/attention/backends/utils.py", line 584, in get_num_prefill_decode_query_kv_tokens
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]     num_prefill_query_tokens = attn_metadata.num_prefill_tokens
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527]                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527] AttributeError: 'NoneType' object has no attribute 'num_prefill_tokens'
(VllmWorker rank=0 pid=1065) ERROR 07-07 18:08:26 [multiproc_executor.py:527] 
ERROR 07-07 18:08:26 [core.py:515] EngineCore failed to start.
ERROR 07-07 18:08:26 [core.py:515] Traceback (most recent call last):
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 506, in run_engine_core
ERROR 07-07 18:08:26 [core.py:515]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 07-07 18:08:26 [core.py:515]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 390, in __init__
ERROR 07-07 18:08:26 [core.py:515]     super().__init__(vllm_config, executor_class, log_stats,
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 83, in __init__
ERROR 07-07 18:08:26 [core.py:515]     self._initialize_kv_caches(vllm_config)
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 141, in _initialize_kv_caches
ERROR 07-07 18:08:26 [core.py:515]     available_gpu_memory = self.model_executor.determine_available_memory()
ERROR 07-07 18:08:26 [core.py:515]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/abstract.py", line 76, in determine_available_memory
ERROR 07-07 18:08:26 [core.py:515]     output = self.collective_rpc("determine_available_memory")
ERROR 07-07 18:08:26 [core.py:515]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/multiproc_executor.py", line 220, in collective_rpc
ERROR 07-07 18:08:26 [core.py:515]     result = get_response(w, dequeue_timeout)
ERROR 07-07 18:08:26 [core.py:515]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-07 18:08:26 [core.py:515]   File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/multiproc_executor.py", line 207, in get_response
ERROR 07-07 18:08:26 [core.py:515]     raise RuntimeError(
ERROR 07-07 18:08:26 [core.py:515] RuntimeError: Worker failed with error ''NoneType' object has no attribute 'num_prefill_tokens'', please check the stack trace above for the root cause
terminate called after throwing an instance of 'c10::Error'
  what():  Trying to free a pointer not allocated here
Exception raised from raw_delete at /pytorch/torch/csrc/cuda/CUDAPluggableAllocator.cpp:149 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x98 (0x7867847785e8 in /usr/local/lib/python3.11/dist-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x6a (0x78678470d5b6 in /usr/local/lib/python3.11/dist-packages/torch/lib/libc10.so)
frame #2: torch::cuda::CUDAPluggableAllocator::CUDAPluggableAllocator::raw_delete(void*) + 0x227 (0x78672d5f3df7 in /usr/local/lib/python3.11/dist-packages/torch/lib/libtorch_cuda.so)
frame #3: <unknown function> + 0x20766 (0x786784b6f766 in /usr/local/lib/python3.11/dist-packages/torch/lib/libc10_cuda.so)
frame #4: <unknown function> + 0x20e0b (0x786784b6fe0b in /usr/local/lib/python3.11/dist-packages/torch/lib/libc10_cuda.so)
frame #5: <unknown function> + 0x39012 (0x786784b88012 in /usr/local/lib/python3.11/dist-packages/torch/lib/libc10_cuda.so)
frame #6: c10::cuda::MemPool::~MemPool() + 0x1b9 (0x786784b71999 in /usr/local/lib/python3.11/dist-packages/torch/lib/libc10_cuda.so)
frame #7: <unknown function> + 0xbff0ba (0x78677cd4e0ba in /usr/local/lib/python3.11/dist-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0x3885b0 (0x78677c4d75b0 in /usr/local/lib/python3.11/dist-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x388bf1 (0x78677c4d7bf1 in /usr/local/lib/python3.11/dist-packages/torch/lib/libtorch_python.so)
frame #10: /usr/bin/python3() [0x58c8ae]
frame #11: /usr/bin/python3() [0x52d32b]
frame #12: /usr/bin/python3() [0x52dace]
frame #13: /usr/bin/python3() [0x58c3a4]
frame #14: /usr/bin/python3() [0x52dd3f]
frame #15: /usr/bin/python3() [0x5ace0e]
frame #16: /usr/bin/python3() [0x523375]
frame #17: /usr/bin/python3() [0x644a40]
frame #18: Py_FinalizeEx + 0x14b (0x632c8b in /usr/bin/python3)
frame #19: Py_Exit + 0xc (0x65430c in /usr/bin/python3)
frame #20: /usr/bin/python3() [0x64462f]
frame #21: PyErr_PrintEx + 0x16 (0x644416 in /usr/bin/python3)
frame #22: PyRun_SimpleStringFlags + 0x70 (0x622940 in /usr/bin/python3)
frame #23: Py_RunMain + 0x366 (0x63e3e6 in /usr/bin/python3)
frame #24: Py_BytesMain + 0x2d (0x603f2d in /usr/bin/python3)
frame #25: <unknown function> + 0x29d90 (0x786786194d90 in /lib/x86_64-linux-gnu/libc.so.6)
frame #26: __libc_start_main + 0x80 (0x786786194e40 in /lib/x86_64-linux-gnu/libc.so.6)
frame #27: _start + 0x25 (0x603db5 in /usr/bin/python3)

ERROR 07-07 18:08:29 [multiproc_executor.py:140] Worker proc VllmWorker-1 died unexpectedly, shutting down executor.
Process EngineCore_0:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 519, in run_engine_core
    raise e
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 506, in run_engine_core
    engine_core = EngineCoreProc(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 390, in __init__
    super().__init__(vllm_config, executor_class, log_stats,
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 83, in __init__
    self._initialize_kv_caches(vllm_config)
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core.py", line 141, in _initialize_kv_caches
    available_gpu_memory = self.model_executor.determine_available_memory()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/abstract.py", line 76, in determine_available_memory
    output = self.collective_rpc("determine_available_memory")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/multiproc_executor.py", line 220, in collective_rpc
    result = get_response(w, dequeue_timeout)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/v1/executor/multiproc_executor.py", line 207, in get_response
    raise RuntimeError(
RuntimeError: Worker failed with error ''NoneType' object has no attribute 'num_prefill_tokens'', please check the stack trace above for the root cause
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_827/2726409425.py in <cell line: 1>()
     21 )
     22 
---> 23 await model.register(backend)

/usr/local/lib/python3.11/dist-packages/art/model.py in register(self, backend, _openai_client_config)
    293     ) -> None:
    294         await super().register(backend)
--> 295         base_url, api_key = await backend._prepare_backend_for_training(
    296             self, _openai_client_config
    297         )

/usr/local/lib/python3.11/dist-packages/art/local/backend.py in _prepare_backend_for_training(self, model, config)
    239     ) -> tuple[str, str]:
    240         service = await self._get_service(model)
--> 241         await service.start_openai_server(config=config)
    242         server_args = (config or {}).get("server_args", {})
    243 

/usr/local/lib/python3.11/dist-packages/mp_actors/traceback.py in async_wrapper(*args, **kwargs)
     25                 return await func(*args, **kwargs)
     26             except Exception as e:
---> 27                 raise e.with_traceback(streamlined_traceback())
     28 
     29         return cast(T, async_wrapper if is_async else wrapper)

/usr/local/lib/python3.11/dist-packages/art/torchtune/service.py in start_openai_server()
     32     async def start_openai_server(self, config: dev.OpenAIServerConfig | None) -> None:
     33         await openai_server_task(
---> 34             engine=await self.llm,
     35             config=dev.get_openai_server_config(
     36                 model_name=self.model_name,

/usr/lib/python3.11/asyncio/futures.py in __await__()
    285         if not self.done():
    286             self._asyncio_future_blocking = True
--> 287             yield self  # This tells Task to wait for completion.
    288         if not self.done():
    289             raise RuntimeError("await wasn't used with future")

/usr/lib/python3.11/asyncio/tasks.py in __wakeup()
    347     def __wakeup(self, future):
    348         try:
--> 349             future.result()
    350         except BaseException as exc:
    351             # This may also be a cancellation.

/usr/lib/python3.11/asyncio/futures.py in result()
    201         self.__log_traceback = False
    202         if self._exception is not None:
--> 203             raise self._exception.with_traceback(self._exception_tb)
    204         return self._result
    205 

/usr/lib/python3.11/asyncio/tasks.py in __step()
    275                 # We use the `send` method directly, because coroutines
    276                 # don't have `__iter__` and `__next__` methods.
--> 277                 result = coro.send(None)
    278             else:
    279                 result = coro.throw(exc)

/usr/local/lib/python3.11/dist-packages/art/vllm/engine.py in get_llm()
     36     envs.VLLM_USE_V1 = "1"
     37     # Create engine
---> 38     llm = AsyncLLM.from_engine_args(
     39         replace(
     40             args,

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/async_llm.py in from_engine_args()
    187 
    188         # Create the AsyncLLM.
--> 189         return cls(
    190             vllm_config=vllm_config,
    191             executor_class=executor_class,

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/async_llm.py in __init__()
    122         # EngineCore (starts the engine in background process).
    123 
--> 124         self.engine_core = EngineCoreClient.make_async_mp_client(
    125             vllm_config=vllm_config,
    126             executor_class=executor_class,

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core_client.py in make_async_mp_client()
     91             return DPAsyncMPClient(vllm_config, executor_class, log_stats,
     92                                    client_addresses, client_index)
---> 93         return AsyncMPClient(vllm_config, executor_class, log_stats,
     94                              client_addresses, client_index)
     95 

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core_client.py in __init__()
    714                  client_addresses: Optional[dict[str, str]] = None,
    715                  client_index: int = 0):
--> 716         super().__init__(
    717             asyncio_mode=True,
    718             vllm_config=vllm_config,

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core_client.py in __init__()
    420 
    421             if client_addresses is None:
--> 422                 self._init_engines_direct(vllm_config, local_only,
    423                                           local_start_index, input_address,
    424                                           output_address, executor_class,

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core_client.py in _init_engines_direct()
    489 
    490             # Wait for engine core process(es) to start.
--> 491             self._wait_for_engine_startup(handshake_socket, input_address,
    492                                           output_address)
    493 

/usr/local/lib/python3.11/dist-packages/vllm/v1/engine/core_client.py in _wait_for_engine_startup()
    509             "called with CoreEngineProcManager")
    510 
--> 511         wait_for_engine_startup(
    512             handshake_socket,
    513             addresses,

/usr/local/lib/python3.11/dist-packages/vllm/v1/utils.py in wait_for_engine_startup()
    492             if coord_process is not None and coord_process.exitcode is not None:
    493                 finished[coord_process.name] = coord_process.exitcode
--> 494             raise RuntimeError("Engine core initialization failed. "
    495                                "See root cause above. "
    496                                f"Failed core proc(s): {finished}")

RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
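As a point of comparison, here is a minimal fallback configuration I could try, which pins vLLM to a single GPU instead of using tensor parallelism (this reuses only the `art.TrainableModel` / `art.dev.EngineArgs` options from the snippet above; the reduced `gpu_memory_utilization` value is just a guess to leave headroom for the trainer, not something I've confirmed works):

```python
import art

# Hypothetical single-GPU fallback for the same Tic-Tac-Toe setup:
# drop tensor parallelism entirely and lower the memory utilization
# so the vLLM engine and the trainer can coexist on one T4.
model = art.TrainableModel(
    name="001-script-single-gpu",
    project="tic-tac-toe-local",
    base_model="Qwen/Qwen3-8B",
    _internal_config=art.dev.InternalModelConfig(
        engine_args=art.dev.EngineArgs(
            tensor_parallel_size=1,      # single GPU, no multiproc workers
            gpu_memory_utilization=0.8,  # assumed headroom value, untested
        ),
    ),
)
```

If the engine comes up with `tensor_parallel_size=1`, that would at least narrow the failure down to the multi-GPU worker path rather than the model or memory settings.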

Tarunrao0 · Jul 07 '25 18:07