[Bug]: vllm stall on llama3-70b warmup with 0.4.1
Your current environment
I'm running a minimally modified image of nvcr.io/nvidia/pytorch:23.10-py3 with Python 3.11.
PyTorch version: 2.2.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.29.2
Libc version: glibc-2.35
Python version: 3.11.5 (main, Aug 26 2023, 07:22:50) [Clang 16.0.3 ] (64-bit runtime)
Python platform: Linux-4.4.0-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 12.2.140
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: NVIDIA A100-SXM4-80GB
GPU 1: NVIDIA A100-SXM4-80GB
Nvidia driver version: 535.129.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.5
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 5
On-line CPU(s) list: 0-4
Vendor ID: GenuineIntel
Model name: unknown
CPU family: 6
Model: 85
Thread(s) per core: 0
Core(s) per socket: 0
Socket(s): 0
Stepping: unknown
BogoMIPS: 2200.19
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_vnni md_clear arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.19.3
[pip3] torch==2.2.1
[pip3] triton==2.2.0
[pip3] vllm_nccl_cu11==2.18.1.0.3.0
[conda] Could not collect
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.4.1
vLLM Build Flags:
CUDA Archs: 5.2 6.0 6.1 7.0 7.2 7.5 8.0 8.6 8.7 9.0+PTX; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
🐛 Describe the bug
I am trying to boot a Llama-3 70B model with vLLM, but it times out during instantiation. So far I've let it sit for upwards of 10 minutes with minimal observed GPU activity. It allocates a maximum of 7GB despite having access to the full 160GB (2x A100 80GB).
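For scale, a rough back-of-the-envelope estimate of the expected weight footprint helps put the 7GB figure in context. The sketch below assumes a nominal 70 billion parameters, 2 bytes per parameter (bfloat16), and an even split across both GPUs. The helper name `weight_gib_per_gpu` is made up for illustration, and real usage would also include the KV cache and activations.

```python
def weight_gib_per_gpu(num_params: float, bytes_per_param: int, tp_size: int) -> float:
    """Approximate per-GPU weight memory in GiB, assuming an even tensor-parallel split."""
    total_bytes = num_params * bytes_per_param
    return total_bytes / tp_size / 2**30

# Assumed values: nominal 70e9 parameters, bfloat16 (2 bytes), tensor_parallel_size=2.
per_gpu = weight_gib_per_gpu(70e9, 2, 2)
print(f"~{per_gpu:.1f} GiB of weights per GPU")  # ~65.2 GiB of weights per GPU
```

Under these assumptions each GPU would hold roughly 65 GiB of weights once loading finishes, so a peak of 7GB suggests the stall happens before the weights are loaded.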
MODEL_DIR = "/model"
MODEL_NAME = "NousResearch/Meta-Llama-3-70B-Instruct"
Before this run, I download the safetensors files through huggingface-cli and place them in the local /model volume. No downloading seems to happen when the entrypoint boots.
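To rule out a missing shard triggering a download attempt, one could check the local directory against the checkpoint index before starting the engine. This is a minimal sketch, assuming the directory follows the usual Hugging Face sharded layout with a `model.safetensors.index.json` file whose `weight_map` lists the shard filenames. The function name `missing_shards` is hypothetical.

```python
import json
from pathlib import Path

def missing_shards(model_dir: str) -> list[str]:
    """Return shard filenames listed in the index but absent from model_dir."""
    root = Path(model_dir)
    index_path = root / "model.safetensors.index.json"
    if not index_path.is_file():
        # No index: either a single-file checkpoint or an incomplete download.
        raise FileNotFoundError(f"no index file at {index_path}")
    weight_map = json.loads(index_path.read_text())["weight_map"]
    # Many parameters map to the same shard, so deduplicate the filenames.
    expected = sorted(set(weight_map.values()))
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_shards("/model")` would return an empty list when every shard referenced by the index is present on the volume.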
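from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

engine_args = AsyncEngineArgs(
    model=MODEL_DIR,
    tensor_parallel_size=2,
    gpu_memory_utilization=0.90,
    enforce_eager=False,
    disable_log_requests=True,
)
engine = AsyncLLMEngine.from_engine_args(engine_args)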
Starting cold inference...
2024-04-22 22:29:35,056 INFO worker.py:1749 -- Started a local Ray instance.
INFO 04-22 22:29:40 llm_engine.py:98] Initializing an LLM engine (v0.4.1) with config: model='/model', speculative_config=None, tokenizer='/model', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=8192, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=2, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), seed=0)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO 04-22 22:29:56 utils.py:570] Found nccl from environment variable VLLM_NCCL_SO_PATH=/usr/lib/x86_64-linux-gnu/libnccl.so
(RayWorkerWrapper pid=419) INFO 04-22 22:29:56 utils.py:570] Found nccl from environment variable VLLM_NCCL_SO_PATH=/usr/lib/x86_64-linux-gnu/libnccl.so
INFO 04-22 22:29:58 selector.py:28] Using FlashAttention backend.
(RayWorkerWrapper pid=419) INFO 04-22 22:29:58 selector.py:28] Using FlashAttention backend.
INFO 04-22 22:29:59 pynccl_utils.py:45] vLLM is using nccl==2.19.3
(RayWorkerWrapper pid=419) INFO 04-22 22:29:59 pynccl_utils.py:45] vLLM is using nccl==2.19.3
Note that before getting here, my env was throwing an exception related to https://github.com/vllm-project/vllm/issues/4257, so I'm providing the NCCL path manually:
export VLLM_NCCL_SO_PATH="/usr/lib/x86_64-linux-gnu/libnccl.so"
Can you try running with export VLLM_TRACE_FUNCTION=1? This should give you a hint about which function crashes or hangs.
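A trace in this format can be long, so it may help to reduce it to the calls that never returned. The sketch below assumes each trace line has the form `<date> <time> Call to <func> in <path>:<line>` or `<date> <time> Return from <func> in <path>:<line>`, as in the output posted later in this thread. The function name `open_calls` is made up for illustration and is not part of vLLM.

```python
import re

# Matches lines such as:
# 2024-04-22 23:10:33.263368 Call to init_logger in /path/to/logger.py:57
TRACE_RE = re.compile(r"^\S+ \S+ (Call to|Return from) (\S+) in (.+):(\d+)$")

def open_calls(lines):
    """Return (function, file) pairs that were called but never returned, outermost first."""
    stack = []
    for line in lines:
        match = TRACE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a trace record
        kind, func, path, _lineno = match.groups()
        if kind == "Call to":
            stack.append((func, path))
        elif stack and stack[-1] == (func, path):
            stack.pop()  # matching return closes the innermost open call
    return stack

sample = [
    "2024-04-22 23:10:33.256602 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/worker/worker.py:0",
    "2024-04-22 23:10:33.263368 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57",
    "2024-04-22 23:10:33.263979 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69",
]
for func, path in open_calls(sample):
    print(func, path)
```

The last entry of the returned list is the innermost frame still open, which is where the process was when the trace stopped growing.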
@youkaichao Attached the truncated tail. It stays this way indefinitely with no additional calls written. Seems like it's either legitimately stalling out there or trying to retry the download...
2024-04-22 23:10:33.256602 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/worker/worker.py:0
2024-04-22 23:10:33.259526 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/__init__.py:0
2024-04-22 23:10:33.259735 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/__init__.py:0
2024-04-22 23:10:33.262792 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:0
2024-04-22 23:10:33.263368 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:33.263979 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:33.266077 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:0
2024-04-22 23:10:33.266392 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:33.267045 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:33.267345 Call to find_nccl_library in /usr/local/lib/python3.11/site-packages/vllm/utils.py:556
2024-04-22 23:10:33.268880 Call to _patched_makeRecord in /usr/local/lib/python3.11/site-packages/ray/autoscaler/_private/cli_logger.py:116
2024-04-22 23:10:33.269158 Return from _patched_makeRecord in /usr/local/lib/python3.11/site-packages/ray/autoscaler/_private/cli_logger.py:143
2024-04-22 23:10:33.269388 Call to format in /usr/local/lib/python3.11/site-packages/vllm/logger.py:23
2024-04-22 23:10:33.269707 Return from format in /usr/local/lib/python3.11/site-packages/vllm/logger.py:28
2024-04-22 23:10:33.269904 Return from find_nccl_library in /usr/local/lib/python3.11/site-packages/vllm/utils.py:581
2024-04-22 23:10:33.270178 Call to nccl_integrity_check in /usr/local/lib/python3.11/site-packages/vllm/utils.py:534
2024-04-22 23:10:34.128144 Return from nccl_integrity_check in /usr/local/lib/python3.11/site-packages/vllm/utils.py:553
2024-04-22 23:10:34.129958 Call to NcclUniqueId in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:81
2024-04-22 23:10:34.130212 Return from NcclUniqueId in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:82
2024-04-22 23:10:34.130539 Call to ncclDataType_t in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:112
2024-04-22 23:10:34.130806 Return from ncclDataType_t in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:131
2024-04-22 23:10:34.131093 Call to ncclRedOp_t in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:151
2024-04-22 23:10:34.131233 Return from ncclRedOp_t in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:160
2024-04-22 23:10:34.131465 Call to NCCLCommunicator in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:194
2024-04-22 23:10:34.134067 Return from NCCLCommunicator in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:259
2024-04-22 23:10:34.134443 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:194
2024-04-22 23:10:34.135002 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.135374 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.135873 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:68
2024-04-22 23:10:34.137903 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/custom_all_reduce.py:0
2024-04-22 23:10:34.159647 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.160260 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.160809 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.160920 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.161374 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.161567 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.161830 Call to CustomAllreduce in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/custom_all_reduce.py:171
2024-04-22 23:10:34.162045 Return from CustomAllreduce in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/custom_all_reduce.py:259
2024-04-22 23:10:34.162227 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/custom_all_reduce.py:171
2024-04-22 23:10:34.164356 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/worker/cache_engine.py:0
2024-04-22 23:10:34.166890 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/__init__.py:0
2024-04-22 23:10:34.170287 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/__init__.py:0
2024-04-22 23:10:34.170658 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/__init__.py:0
2024-04-22 23:10:34.176090 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:0
2024-04-22 23:10:34.176636 Call to AttentionBackend in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:8
2024-04-22 23:10:34.177014 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.177212 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.177461 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.177686 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.177983 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.178150 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.178368 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.178561 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.178784 Return from AttentionBackend in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:42
2024-04-22 23:10:34.178992 Call to AttentionMetadataPerStage in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:49
2024-04-22 23:10:34.179098 Return from AttentionMetadataPerStage in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:53
2024-04-22 23:10:34.179899 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.180006 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.180164 Call to AttentionMetadata in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:66
2024-04-22 23:10:34.180454 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.180589 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.180861 Return from AttentionMetadata in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:90
2024-04-22 23:10:34.181151 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.181407 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.182503 Call to AttentionImpl in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:98
2024-04-22 23:10:34.182684 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.182863 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.183197 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.183396 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.183631 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.183810 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.184076 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.184222 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.184482 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.184657 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.184877 Return from AttentionImpl in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:113
2024-04-22 23:10:34.185033 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/backends/abstract.py:98
2024-04-22 23:10:34.187041 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/layer.py:0
2024-04-22 23:10:34.189205 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/selector.py:0
2024-04-22 23:10:34.189473 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.190079 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.190437 Call to _Backend in /usr/local/lib/python3.11/site-packages/vllm/attention/selector.py:17
2024-04-22 23:10:34.190777 Return from _Backend in /usr/local/lib/python3.11/site-packages/vllm/attention/selector.py:21
2024-04-22 23:10:34.191303 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.191411 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.191537 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.191638 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.191867 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/selector.py:50
2024-04-22 23:10:34.192031 Call to Attention in /usr/local/lib/python3.11/site-packages/vllm/attention/layer.py:12
2024-04-22 23:10:34.192225 Return from Attention in /usr/local/lib/python3.11/site-packages/vllm/attention/layer.py:39
2024-04-22 23:10:34.192452 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/layer.py:12
2024-04-22 23:10:34.192610 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/attention/__init__.py:7
2024-04-22 23:10:34.192911 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.193375 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.193523 Call to CacheEngine in /usr/local/lib/python3.11/site-packages/vllm/worker/cache_engine.py:14
2024-04-22 23:10:34.193663 Return from CacheEngine in /usr/local/lib/python3.11/site-packages/vllm/worker/cache_engine.py:84
2024-04-22 23:10:34.193842 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/worker/cache_engine.py:104
2024-04-22 23:10:34.196290 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/worker/model_runner.py:0
2024-04-22 23:10:34.198871 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:0
2024-04-22 23:10:34.201085 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/punica.py:0
2024-04-22 23:10:34.201515 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/punica.py:100
2024-04-22 23:10:34.204006 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/logits_processor.py:0
2024-04-22 23:10:34.204411 Call to LogitsProcessor in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/logits_processor.py:11
2024-04-22 23:10:34.204950 Return from LogitsProcessor in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/logits_processor.py:61
2024-04-22 23:10:34.205158 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/logits_processor.py:82
2024-04-22 23:10:34.207462 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py:0
2024-04-22 23:10:34.207668 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.207944 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.208279 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.208368 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.208501 Call to VocabParallelEmbedding in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py:35
2024-04-22 23:10:34.208790 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.208913 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.209118 Return from VocabParallelEmbedding in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py:89
2024-04-22 23:10:34.209427 Call to ParallelLMHead in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py:109
2024-04-22 23:10:34.209683 Return from ParallelLMHead in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py:145
2024-04-22 23:10:34.209924 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py:109
2024-04-22 23:10:34.210342 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.210485 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.210865 Call to LoRAMapping in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:119
2024-04-22 23:10:34.211055 Return from LoRAMapping in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:126
2024-04-22 23:10:34.212198 Call to BaseLayerWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:131
2024-04-22 23:10:34.212508 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.212630 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.212844 Return from BaseLayerWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:167
2024-04-22 23:10:34.213007 Call to VocabParallelEmbeddingWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:174
2024-04-22 23:10:34.213327 Return from VocabParallelEmbeddingWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:309
2024-04-22 23:10:34.213623 Call to ColumnParallelLinearWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:315
2024-04-22 23:10:34.213916 Return from ColumnParallelLinearWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:425
2024-04-22 23:10:34.214071 Call to MergedColumnParallelLinearWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:433
2024-04-22 23:10:34.214238 Return from MergedColumnParallelLinearWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:535
2024-04-22 23:10:34.214490 Call to QKVParallelLinearWithLora in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:542
2024-04-22 23:10:34.214740 Return from QKVParallelLinearWithLora in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:600
2024-04-22 23:10:34.214946 Call to MergedQKVParallelLinearWithLora in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:607
2024-04-22 23:10:34.215088 Return from MergedQKVParallelLinearWithLora in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:776
2024-04-22 23:10:34.215297 Call to RowParallelLinearWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:783
2024-04-22 23:10:34.215549 Return from RowParallelLinearWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:914
2024-04-22 23:10:34.215821 Call to LogitsProcessorWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:920
2024-04-22 23:10:34.215979 Return from LogitsProcessorWithLoRA in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:1089
2024-04-22 23:10:34.216101 Call to <setcomp> in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:1096
2024-04-22 23:10:34.216415 Return from <setcomp> in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:1099
2024-04-22 23:10:34.216797 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.216977 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.217239 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.217446 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.217718 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.217883 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.218120 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.218292 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.218453 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/layers.py:1117
2024-04-22 23:10:34.220434 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:0
2024-04-22 23:10:34.222358 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:0
2024-04-22 23:10:34.225205 Call to <module> in /usr/local/lib/python3.11/site-packages/safetensors/__init__.py:0
2024-04-22 23:10:34.232361 Return from <module> in /usr/local/lib/python3.11/site-packages/safetensors/__init__.py:2
2024-04-22 23:10:34.234383 Call to <module> in /usr/local/lib/python3.11/site-packages/safetensors/torch.py:0
2024-04-22 23:10:34.234760 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.235013 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.235281 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.235595 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.235756 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.235825 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.236042 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.236329 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.236598 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.236716 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.236899 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.237106 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.237510 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.237884 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.238215 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.238343 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.238541 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.238750 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.239003 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.239274 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.239659 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.239851 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.240157 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.240251 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.240527 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.240681 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.240867 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.241021 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.241244 Return from <module> in /usr/local/lib/python3.11/site-packages/safetensors/torch.py:455
2024-04-22 23:10:34.243113 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/lora.py:0
2024-04-22 23:10:34.243365 Call to LoRALayerWeights in /usr/local/lib/python3.11/site-packages/vllm/lora/lora.py:8
2024-04-22 23:10:34.243564 Return from LoRALayerWeights in /usr/local/lib/python3.11/site-packages/vllm/lora/lora.py:59
2024-04-22 23:10:34.243750 Call to PackedLoRALayerWeights in /usr/local/lib/python3.11/site-packages/vllm/lora/lora.py:93
2024-04-22 23:10:34.244077 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.244233 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.245241 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.245601 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.245999 Return from PackedLoRALayerWeights in /usr/local/lib/python3.11/site-packages/vllm/lora/lora.py:160
2024-04-22 23:10:34.246411 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/lora.py:93
2024-04-22 23:10:34.248336 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/utils.py:0
2024-04-22 23:10:34.248528 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.249016 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.249310 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/utils.py:19
2024-04-22 23:10:34.249514 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.250278 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.250609 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.250881 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.251216 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.251454 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.251773 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.251943 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.252261 Call to LoRAModel in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:105
2024-04-22 23:10:34.252435 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.252511 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.252680 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.252902 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.253110 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.253212 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.253521 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.253685 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.253922 Return from LoRAModel in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:191
2024-04-22 23:10:34.254205 Call to LoRAModelManager in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:257
2024-04-22 23:10:34.254438 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.254568 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.254767 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.254940 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.255155 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.255265 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.255432 Return from LoRAModelManager in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:539
2024-04-22 23:10:34.255569 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.255644 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.255768 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.255967 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.256147 Call to LoRALRUCache in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:558
2024-04-22 23:10:34.256473 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.256760 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.256920 Return from LoRALRUCache in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:565
2024-04-22 23:10:34.257230 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.257348 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.257444 Call to LRUCacheLoRAModelManager in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:571
2024-04-22 23:10:34.257635 Return from LRUCacheLoRAModelManager in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:616
2024-04-22 23:10:34.257842 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.257940 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.258139 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.258251 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.258342 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/models.py:623
2024-04-22 23:10:34.258550 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.258973 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.259195 Call to AbstractWorkerLoRAManager in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:16
2024-04-22 23:10:34.259510 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.259668 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.259914 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.260111 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.260522 Return from AbstractWorkerLoRAManager in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:61
2024-04-22 23:10:34.260769 Call to WorkerLoRAManager in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:65
2024-04-22 23:10:34.261133 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.261350 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.262086 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.262348 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.262744 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.263085 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.263251 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.263368 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.263594 Return from WorkerLoRAManager in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:194
2024-04-22 23:10:34.263873 Call to LRUCacheWorkerLoRAManager in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:198
2024-04-22 23:10:34.264037 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.264141 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.264317 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.264438 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.264558 Return from LRUCacheWorkerLoRAManager in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:236
2024-04-22 23:10:34.264721 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/lora/worker_manager.py:198
2024-04-22 23:10:34.268244 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/__init__.py:0
2024-04-22 23:10:34.272273 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py:0
2024-04-22 23:10:34.274688 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:0
2024-04-22 23:10:34.275863 Call to find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:57
2024-04-22 23:10:34.276306 Call to _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:24
2024-04-22 23:10:34.276524 Return from _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:27
2024-04-22 23:10:34.276762 Return from find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:59
2024-04-22 23:10:34.277042 Call to init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:57
2024-04-22 23:10:34.277668 Return from init_logger in /usr/local/lib/python3.11/site-packages/vllm/logger.py:69
2024-04-22 23:10:34.277894 Call to TensorizerConfig in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:40
2024-04-22 23:10:34.278264 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.278496 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.278942 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.279442 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.279922 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.280169 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.280601 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.281044 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.281341 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.281461 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.281654 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.281726 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.281822 Return from TensorizerConfig in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:80
2024-04-22 23:10:34.283986 Call to TensorizerArgs in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:100
2024-04-22 23:10:34.284313 Return from TensorizerArgs in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:234
2024-04-22 23:10:34.285818 Call to TensorizerAgent in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:243
2024-04-22 23:10:34.286070 Return from TensorizerAgent in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:305
2024-04-22 23:10:34.286416 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.286611 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.286868 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.287152 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.287408 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.287585 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.287777 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/tensorizer.py:347
2024-04-22 23:10:34.290062 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/utils.py:0
2024-04-22 23:10:34.290439 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.290627 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.290775 Return from <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/utils.py:39
2024-04-22 23:10:34.292657 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/weight_utils.py:0
2024-04-22 23:10:34.292961 Call to __getattr__ in /usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py:482
2024-04-22 23:10:34.295654 Call to <module> in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:0
2024-04-22 23:10:34.298713 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/__init__.py:0
2024-04-22 23:10:34.302152 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/_version.py:0
2024-04-22 23:10:34.302351 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/_version.py:20
2024-04-22 23:10:34.304474 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:0
2024-04-22 23:10:34.304848 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.305029 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.305261 Call to BaseCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:39
2024-04-22 23:10:34.305362 Return from BaseCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:62
2024-04-22 23:10:34.305588 Call to MMapCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:72
2024-04-22 23:10:34.305710 Return from MMapCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:147
2024-04-22 23:10:34.305966 Call to ReadAheadCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:153
2024-04-22 23:10:34.306147 Return from ReadAheadCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:169
2024-04-22 23:10:34.306392 Call to FirstChunkCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:195
2024-04-22 23:10:34.306548 Return from FirstChunkCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:208
2024-04-22 23:10:34.306835 Call to BlockCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:226
2024-04-22 23:10:34.306920 Return from BlockCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:326
2024-04-22 23:10:34.307030 Call to BytesCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:366
2024-04-22 23:10:34.307234 Return from BytesCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:455
2024-04-22 23:10:34.307409 Call to AllBytes in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:459
2024-04-22 23:10:34.307497 Return from AllBytes in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:476
2024-04-22 23:10:34.307615 Call to KnownPartsOfAFile in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:480
2024-04-22 23:10:34.307821 Return from KnownPartsOfAFile in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:535
2024-04-22 23:10:34.308153 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.308354 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.308568 Call to UpdatableLRU in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:578
2024-04-22 23:10:34.308678 Call to CacheInfo in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:585
2024-04-22 23:10:34.309033 Return from CacheInfo in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:589
2024-04-22 23:10:34.309847 Return from UpdatableLRU in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:628
2024-04-22 23:10:34.310165 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.310444 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.310559 Call to BackgroundBlockCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:638
2024-04-22 23:10:34.310642 Return from BackgroundBlockCache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:805
2024-04-22 23:10:34.310883 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.311117 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.311495 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.311758 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.312045 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.312399 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.312670 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.312791 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.313160 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.313396 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.313526 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.313619 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.313760 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.313964 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.314161 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.314345 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.314442 Call to register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:851
2024-04-22 23:10:34.314514 Return from register_cache in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:867
2024-04-22 23:10:34.314783 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/caching.py:870
2024-04-22 23:10:34.316716 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:0
2024-04-22 23:10:34.316883 Call to Callback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:4
2024-04-22 23:10:34.317313 Return from Callback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:195
2024-04-22 23:10:34.317461 Call to NoOpCallback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:207
2024-04-22 23:10:34.317587 Return from NoOpCallback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:212
2024-04-22 23:10:34.317688 Call to DotPrinterCallback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:216
2024-04-22 23:10:34.317777 Return from DotPrinterCallback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:232
2024-04-22 23:10:34.317870 Call to TqdmCallback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:237
2024-04-22 23:10:34.317972 Return from TqdmCallback in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:320
2024-04-22 23:10:34.318195 Call to __init__ in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:25
2024-04-22 23:10:34.318297 Return from __init__ in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:29
2024-04-22 23:10:34.318387 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/callbacks.py:324
2024-04-22 23:10:34.320280 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:0
2024-04-22 23:10:34.322346 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/utils.py:0
2024-04-22 23:10:34.322778 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/utils.py:707
2024-04-22 23:10:34.325291 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:0
2024-04-22 23:10:34.327114 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/config.py:0
2024-04-22 23:10:34.327490 Call to set_conf_files in /usr/local/lib/python3.11/site-packages/fsspec/config.py:64
2024-04-22 23:10:34.327915 Return from set_conf_files in /usr/local/lib/python3.11/site-packages/fsspec/config.py:82
2024-04-22 23:10:34.328108 Call to set_conf_env in /usr/local/lib/python3.11/site-packages/fsspec/config.py:14
2024-04-22 23:10:34.328823 Return from set_conf_env in /usr/local/lib/python3.11/site-packages/fsspec/config.py:59
2024-04-22 23:10:34.328976 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/config.py:131
2024-04-22 23:10:34.330900 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/dircache.py:0
2024-04-22 23:10:34.331187 Call to DirCache in /usr/local/lib/python3.11/site-packages/fsspec/dircache.py:6
2024-04-22 23:10:34.331502 Return from DirCache in /usr/local/lib/python3.11/site-packages/fsspec/dircache.py:94
2024-04-22 23:10:34.331777 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/dircache.py:6
2024-04-22 23:10:34.333462 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:0
2024-04-22 23:10:34.333587 Call to Transaction in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:4
2024-04-22 23:10:34.333840 Return from Transaction in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:39
2024-04-22 23:10:34.334074 Call to FileActor in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:52
2024-04-22 23:10:34.334233 Return from FileActor in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:66
2024-04-22 23:10:34.334416 Call to DaskTransaction in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:70
2024-04-22 23:10:34.334696 Return from DaskTransaction in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:83
2024-04-22 23:10:34.334831 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/transaction.py:70
2024-04-22 23:10:34.335058 Call to _Cached in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:35
2024-04-22 23:10:34.335261 Return from _Cached in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:64
2024-04-22 23:10:34.335471 Call to AbstractFileSystem in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:96
2024-04-22 23:10:34.335963 Return from AbstractFileSystem in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:1560
2024-04-22 23:10:34.336214 Call to __init__ in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:52
2024-04-22 23:10:34.336419 Return from __init__ in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:62
2024-04-22 23:10:34.336639 Call to AbstractBufferedFile in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:1568
2024-04-22 23:10:34.336879 Return from AbstractBufferedFile in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:1964
2024-04-22 23:10:34.337079 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:1568
2024-04-22 23:10:34.337275 Call to register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:17
2024-04-22 23:10:34.337367 Return from register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:50
2024-04-22 23:10:34.337484 Call to register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:17
2024-04-22 23:10:34.337599 Return from register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:50
2024-04-22 23:10:34.338405 Call to find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:57
2024-04-22 23:10:34.338745 Call to _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:24
2024-04-22 23:10:34.339021 Return from _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:27
2024-04-22 23:10:34.339111 Return from find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:59
2024-04-22 23:10:34.339321 Call to register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:17
2024-04-22 23:10:34.339447 Return from register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:50
2024-04-22 23:10:34.339555 Call to register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:17
2024-04-22 23:10:34.339630 Return from register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:50
2024-04-22 23:10:34.339692 Call to register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:17
2024-04-22 23:10:34.339857 Return from register_compression in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:50
2024-04-22 23:10:34.357009 Call to find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:57
2024-04-22 23:10:34.357609 Call to _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:24
2024-04-22 23:10:34.358544 Return from _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:27
2024-04-22 23:10:34.358909 Return from find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:59
2024-04-22 23:10:34.359196 Call to SnappyFile in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:107
2024-04-22 23:10:34.359497 Return from SnappyFile in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:132
2024-04-22 23:10:34.360565 Call to find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:57
2024-04-22 23:10:34.360998 Call to _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:24
2024-04-22 23:10:34.361355 Return from _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:27
2024-04-22 23:10:34.362020 Return from find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:59
2024-04-22 23:10:34.363134 Call to find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:57
2024-04-22 23:10:34.363384 Call to _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:24
2024-04-22 23:10:34.363608 Return from _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:27
2024-04-22 23:10:34.363826 Return from find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:59
2024-04-22 23:10:34.365110 Call to find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:57
2024-04-22 23:10:34.365465 Call to _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:24
2024-04-22 23:10:34.365915 Return from _module_matches_namespace in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:27
2024-04-22 23:10:34.366239 Return from find_spec in /usr/local/lib/python3.11/site-packages/setuptools/extern/__init__.py:59
2024-04-22 23:10:34.366642 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/compression.py:172
2024-04-22 23:10:34.368975 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/core.py:0
2024-04-22 23:10:34.371247 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/registry.py:0
2024-04-22 23:10:34.371555 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/registry.py:296
2024-04-22 23:10:34.371818 Call to OpenFile in /usr/local/lib/python3.11/site-packages/fsspec/core.py:31
2024-04-22 23:10:34.372063 Return from OpenFile in /usr/local/lib/python3.11/site-packages/fsspec/core.py:137
2024-04-22 23:10:34.372411 Call to OpenFiles in /usr/local/lib/python3.11/site-packages/fsspec/core.py:146
2024-04-22 23:10:34.372707 Return from OpenFiles in /usr/local/lib/python3.11/site-packages/fsspec/core.py:200
2024-04-22 23:10:34.373020 Call to PickleableTextIOWrapper in /usr/local/lib/python3.11/site-packages/fsspec/core.py:694
2024-04-22 23:10:34.373111 Return from PickleableTextIOWrapper in /usr/local/lib/python3.11/site-packages/fsspec/core.py:713
2024-04-22 23:10:34.373318 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/core.py:694
2024-04-22 23:10:34.375488 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/exceptions.py:0
2024-04-22 23:10:34.375644 Call to BlocksizeMismatchError in /usr/local/lib/python3.11/site-packages/fsspec/exceptions.py:7
2024-04-22 23:10:34.375826 Return from BlocksizeMismatchError in /usr/local/lib/python3.11/site-packages/fsspec/exceptions.py:8
2024-04-22 23:10:34.376031 Call to FSTimeoutError in /usr/local/lib/python3.11/site-packages/fsspec/exceptions.py:14
2024-04-22 23:10:34.376245 Return from FSTimeoutError in /usr/local/lib/python3.11/site-packages/fsspec/exceptions.py:15
2024-04-22 23:10:34.376552 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/exceptions.py:14
2024-04-22 23:10:34.378648 Call to <module> in /usr/local/lib/python3.11/site-packages/fsspec/mapping.py:0
2024-04-22 23:10:34.379032 Call to FSMap in /usr/local/lib/python3.11/site-packages/fsspec/mapping.py:13
2024-04-22 23:10:34.379326 Return from FSMap in /usr/local/lib/python3.11/site-packages/fsspec/mapping.py:195
2024-04-22 23:10:34.379609 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/mapping.py:210
2024-04-22 23:10:34.379844 Call to get_versions in /usr/local/lib/python3.11/site-packages/fsspec/_version.py:20
2024-04-22 23:10:34.380020 Return from get_versions in /usr/local/lib/python3.11/site-packages/fsspec/_version.py:21
2024-04-22 23:10:34.380138 Call to process_entries in /usr/local/lib/python3.11/site-packages/fsspec/__init__.py:41
2024-04-22 23:10:34.444849 Call to register_implementation in /usr/local/lib/python3.11/site-packages/fsspec/registry.py:17
2024-04-22 23:10:34.445113 Return from register_implementation in /usr/local/lib/python3.11/site-packages/fsspec/registry.py:45
2024-04-22 23:10:34.445328 Return from process_entries in /usr/local/lib/python3.11/site-packages/fsspec/__init__.py:53
2024-04-22 23:10:34.445472 Return from <module> in /usr/local/lib/python3.11/site-packages/fsspec/__init__.py:70
2024-04-22 23:10:34.446736 Call to HfFileSystemResolvedPath in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:49
2024-04-22 23:10:34.447007 Return from HfFileSystemResolvedPath in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:61
2024-04-22 23:10:34.448093 Call to HfFileSystem in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:71
2024-04-22 23:10:34.448433 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.448784 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.449026 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.449293 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.449833 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.450024 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.450370 Call to _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2696
2024-04-22 23:10:34.450591 Return from _check_generic in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2704
2024-04-22 23:10:34.450715 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.450803 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.451271 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.451670 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.451953 Return from HfFileSystem in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:666
2024-04-22 23:10:34.452165 Call to __init__ in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:52
2024-04-22 23:10:34.452291 Return from __init__ in /usr/local/lib/python3.11/site-packages/fsspec/spec.py:62
2024-04-22 23:10:34.452430 Call to HfFileSystemFile in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:673
2024-04-22 23:10:34.452565 Return from HfFileSystemFile in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:744
2024-04-22 23:10:34.452730 Call to HfFileSystemStreamFile in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:748
2024-04-22 23:10:34.452843 Return from HfFileSystemStreamFile in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:843
2024-04-22 23:10:34.453084 Return from <module> in /usr/local/lib/python3.11/site-packages/huggingface_hub/hf_file_system.py:866
2024-04-22 23:10:34.453371 Return from __getattr__ in /usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py:497
2024-04-22 23:10:34.453508 Call to __getattr__ in /usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py:482
2024-04-22 23:10:34.455396 Call to <module> in /usr/local/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py:0
2024-04-22 23:10:34.455715 Call to get_logger in /usr/local/lib/python3.11/site-packages/huggingface_hub/utils/logging.py:78
2024-04-22 23:10:34.455944 Return from get_logger in /usr/local/lib/python3.11/site-packages/huggingface_hub/utils/logging.py:100
2024-04-22 23:10:34.456339 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.456458 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.456786 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.456976 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.457215 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.457436 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.457711 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.457825 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.458060 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.458391 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.458726 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.459009 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.459316 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.459586 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.459887 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.460024 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.460300 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.460488 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.460756 Call to _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2761
2024-04-22 23:10:34.460935 Return from _collect_parameters in /usr/local/lib/python3.11/site-packages/typing_extensions.py:2797
2024-04-22 23:10:34.461193 Call to validate_hf_hub_args in /usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py:47
2024-04-22 23:10:34.461682 Return from validate_hf_hub_args in /usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py:121
2024-04-22 23:10:34.461828 Return from <module> in /usr/local/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py:35
2024-04-22 23:10:34.462002 Return from __getattr__ in /usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py:4
Looks like a huggingface issue. Can you try huggingface first without vllm, to confirm if you can download and load the model successfully?
@youkaichao Hugging Face by itself seems to be working fine. It loads the full model in about 3.5 minutes.
from transformers import AutoModel, AutoTokenizer
print("Loading initial model")
model = AutoModel.from_pretrained(MODEL_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
print("Loaded model")
Loading checkpoint shards: 37%|███▋ | 11/30 [01:24<02:02, 6.43s/it]
Loading checkpoint shards: 57%|█████▋ | 17/30 [01:58<01:14, 5.75s/it]
Loading checkpoint shards: 100%|██████████| 30/30 [03:14<00:00, 6.47s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loaded model
Just to be clear, the model is already downloaded into the MODEL_DIR - so no actual fetching of remote dependencies should be taking place.
From the trace, the last call into vllm is:
2024-04-22 23:10:34.292657 Call to <module> in /usr/local/lib/python3.11/site-packages/vllm/model_executor/model_loader/weight_utils.py:0
2024-04-22 23:10:34.292961 Call to __getattr__ in /usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py:482
I think this is a problem of huggingface. You can try to execute:
from huggingface_hub import HfFileSystem, snapshot_download
See if it hangs.
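If an import like this did hang, a watchdog can show where it is stuck. The following is a minimal sketch using the standard-library `faulthandler` module; the helper name `run_with_hang_dump` and the 60-second default are assumptions for illustration, not part of vllm or huggingface_hub.

```python
import faulthandler
import sys


def run_with_hang_dump(fn, timeout_s=60.0, file=None):
    """Run fn(); while it is still running after timeout_s seconds,
    dump the Python stack of every thread to `file` (default: stderr).

    `run_with_hang_dump` is a hypothetical helper name.
    """
    out = sys.stderr if file is None else file
    # repeat=True keeps dumping every timeout_s seconds until cancelled.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)
    try:
        return fn()
    finally:
        # Always disarm the watchdog, whether fn() returned or raised.
        faulthandler.cancel_dump_traceback_later()


# Example usage (commented out so the sketch runs without huggingface_hub):
# run_with_hang_dump(lambda: __import__("huggingface_hub"), timeout_s=60)
```

If the wrapped call finishes before the timeout, nothing is printed; otherwise the dumped stack shows the line the interpreter is blocked on.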
@youkaichao That seems fine too:
print("Trying to load hub...")
from huggingface_hub import HfFileSystem, snapshot_download
print("Did load hub...")
Trying to load hub...
Did load hub...
:( then I don't know what's wrong.
One option is to inspect the log to see where the process currently is.
For example, the last line is 2024-04-22 23:10:34.462002 Return from __getattr__ in /usr/local/lib/python3.11/site-packages/huggingface_hub/__init__.py:4, so you need to figure out which function the process is returning to.
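One way to answer that question mechanically is to replay the Call/Return lines and keep a stack of frames that were entered but never left. This is a rough sketch that assumes the log format shown in this thread ("<timestamp> Call to <func> in <path>:<line>" and "<timestamp> Return from <func> in <path>:<line>"); the function name `open_frames` is made up for illustration.

```python
import re

# Matches the two line shapes seen in the trace, for example:
#   "2024-04-22 23:10:34.453508 Call to __getattr__ in /path/file.py:482"
#   "2024-04-22 23:10:34.453371 Return from __getattr__ in /path/file.py:497"
# Anything after the line number (such as "from X in ...") is ignored.
TRACE_LINE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<kind>Call to|Return from) "
    r"(?P<func>\S+) in (?P<path>\S+?):(?P<line>\d+)"
)


def open_frames(lines):
    """Return the (func, path) frames that were called but never returned.

    The last element is the innermost frame still running when the log ends.
    """
    stack = []
    for raw in lines:
        m = TRACE_LINE.match(raw.strip())
        if m is None:
            continue  # not a trace line (comments, PART markers, ...)
        frame = (m["func"], m["path"])
        if m["kind"] == "Call to":
            stack.append(frame)
        elif frame in stack:
            # Pop back to the nearest matching call. Returns with no
            # recorded call (log started mid-run) are simply skipped.
            while stack.pop() != frame:
                pass
    return stack
```

Feeding it the full trace file, for example `open_frames(open("trace.log"))` with a hypothetical file name, leaves the chain of functions the process is still inside; the top of that stack is the function it is returning to.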
~~Hey @piercefreeman! AFAIK tensorizer shouldn't be supported when TP > 1.~~
~~@sangstar Could you take a look at this issue?~~
Edit: Doesn't look like this is related to tensorizer.
@piercefreeman can you try applying https://github.com/vllm-project/vllm/pull/4278 to your code to see where the process gets stuck?
@youkaichao Sure thing, here's the environment with the additional logging. Based on this, it looks like the stall is happening somewhere other than the HF hub:
2024-04-23 01:29:36.601523 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 01:29:36.601619 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 01:29:36.602484 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 01:29:36.602746 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 01:29:36.602937 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 01:29:36.603564 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 01:29:36.603776 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 01:29:36.603890 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 01:29:36.603968 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1909
Yeah, that seems close to the truth. You should add more lines, until it lands in vllm-related code.
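A quick way to find where the trace last touched vllm code is to scan for the last line whose path contains the vllm package directory. This is only a sketch: the default path fragment is taken from the paths seen earlier in this thread, and `last_line_matching` is a hypothetical helper name.

```python
def last_line_matching(lines, needle="/site-packages/vllm/"):
    """Return (index, line) for the last line containing `needle`, or None.

    `needle` defaults to the vllm install path fragment seen in the trace;
    adjust it if the package lives somewhere else.
    """
    found = None
    for i, line in enumerate(lines):
        if needle in line:
            found = (i, line.rstrip("\n"))
    return found
```

Everything in the log after the returned index happened below that vllm frame, which narrows down which vllm call led into the torch.distributed code.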
@youkaichao Here's the full stack trace:
PART 0 up_helper in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1429
2024-04-23 05:48:36.124556 Return from pg_to_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:534 to _new_process_group_helper in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1429
2024-04-23 05:48:36.124669 Return from _new_process_group_helper in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1430 to _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3695
2024-04-23 05:48:36.124884 Call to <dictcomp> in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3708 from _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3708
2024-04-23 05:48:36.125075 Return from <dictcomp> in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3708 to _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3708
2024-04-23 05:48:36.125365 Call to pg_group_ranks in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:490 from _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3708
2024-04-23 05:48:36.125654 Return from pg_group_ranks in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:498 to _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3708
2024-04-23 05:48:36.125877 Call to _is_barrier_after_init in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:966 from _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3712
2024-04-23 05:48:36.126708 Return from _is_barrier_after_init in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:971 to _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3712
2024-04-23 05:48:36.127066 Return from _new_group_with_tag in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3735 to new_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3624
2024-04-23 05:48:36.127274 Return from new_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3624 to wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:86
2024-04-23 05:48:36.127489 Call to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:45 from wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:89
2024-04-23 05:48:36.127716 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:46
2024-04-23 05:48:36.127947 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.128111 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.128212 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.128368 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.128509 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:46
2024-04-23 05:48:36.128656 Call to _get_process_group_name in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3991 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:50
2024-04-23 05:48:36.128804 Call to pg_names in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:480 from _get_process_group_name in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3992
2024-04-23 05:48:36.128968 Return from pg_names in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:488 to _get_process_group_name in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3992
2024-04-23 05:48:36.129133 Return from _get_process_group_name in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:3992 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:50
2024-04-23 05:48:36.129397 Call to get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1024 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:51
2024-04-23 05:48:36.129540 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1038
2024-04-23 05:48:36.129699 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.129808 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.130024 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.130371 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.130548 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.130648 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.130719 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.130901 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.131188 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.131424 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.131645 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1038
2024-04-23 05:48:36.131918 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1041
2024-04-23 05:48:36.132088 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:751 to get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1041
2024-04-23 05:48:36.132288 Call to pg_map in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:467 from get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1043
2024-04-23 05:48:36.132485 Return from pg_map in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:478 to get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1043
2024-04-23 05:48:36.132607 Call to pg_map in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:467 from get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1043
2024-04-23 05:48:36.132685 Return from pg_map in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:478 to get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1043
2024-04-23 05:48:36.132922 Return from get_backend in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1045 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:51
2024-04-23 05:48:36.133018 Call to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1539 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:52
2024-04-23 05:48:36.133109 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1552
2024-04-23 05:48:36.133316 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1552
2024-04-23 05:48:36.133561 Call to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:833 from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555
2024-04-23 05:48:36.133715 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:835
2024-04-23 05:48:36.133805 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.133886 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.134031 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:835
2024-04-23 05:48:36.134169 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:836
2024-04-23 05:48:36.134242 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.134317 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.134471 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.134688 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.134835 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.134970 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.135083 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.135366 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.135496 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.135642 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.135850 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:836
2024-04-23 05:48:36.136032 Return from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:837 to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555
2024-04-23 05:48:36.136145 Return from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:52
2024-04-23 05:48:36.136229 Call to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1539 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:53
2024-04-23 05:48:36.136305 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1552
2024-04-23 05:48:36.136369 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1552
2024-04-23 05:48:36.136455 Call to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:833 from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555
2024-04-23 05:48:36.136646 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:835
2024-04-23 05:48:36.136884 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.137227 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.137357 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:835
2024-04-23 05:48:36.137506 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:836
2024-04-23 05:48:36.137647 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.137799 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.137884 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.138018 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.138168 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.138446 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.138597 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.138714 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.138869 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.138990 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.139099 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:836
2024-04-23 05:48:36.139229 Return from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:837 to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555
2024-04-23 05:48:36.139294 Return from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:53
2024-04-23 05:48:36.139385 Call to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1512 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:54
2024-04-23 05:48:36.139565 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1529
2024-04-23 05:48:36.139663 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1529
2024-04-23 05:48:36.139939 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1532
2024-04-23 05:48:36.140052 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.140194 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.140381 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.140567 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.140688 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.140850 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.141022 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.141264 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.141459 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.141609 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.141717 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1532
2024-04-23 05:48:36.141796 Return from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1534 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:54
2024-04-23 05:48:36.141878 Call to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1512 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:55
2024-04-23 05:48:36.142096 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1529
2024-04-23 05:48:36.142311 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1529
2024-04-23 05:48:36.142471 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1532
2024-04-23 05:48:36.142531 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.142585 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.142639 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.142696 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.142750 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.142803 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.142858 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.142982 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.143095 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.143335 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.143427 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1532
2024-04-23 05:48:36.143495 Return from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1534 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:55
2024-04-23 05:48:36.143601 Call to version in /usr/local/lib/python3.11/site-packages/torch/cuda/nccl.py:34 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:58
2024-04-23 05:48:36.150967 Return from version in /usr/local/lib/python3.11/site-packages/torch/cuda/nccl.py:41 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:58
2024-04-23 05:48:36.151219 Call to <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.151501 Return from <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.151625 Call to <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.151786 Return from <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.152016 Call to <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.152213 Return from <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.152504 Call to <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.152752 Return from <genexpr> in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59 to _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:59
2024-04-23 05:48:36.153052 Return from _get_msg_dict in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:65 to wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:89
2024-04-23 05:48:36.153400 Call to _patched_makeRecord in /usr/local/lib/python3.11/site-packages/ray/autoscaler/_private/cli_logger.py:116 from _log in /usr/local/lib/python3.11/logging/__init__.py:1632
2024-04-23 05:48:36.153749 Return from _patched_makeRecord in /usr/local/lib/python3.11/site-packages/ray/autoscaler/_private/cli_logger.py:143 to _log in /usr/local/lib/python3.11/logging/__init__.py:1632
2024-04-23 05:48:36.153959 Return from wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:93 to init_distributed_environment in /usr/local/lib/python3.11/site-packages/vllm/distributed/parallel_state.py:74
2024-04-23 05:48:36.154209 Return from init_distributed_environment in /usr/local/lib/python3.11/site-packages/vllm/distributed/parallel_state.py:77 to init_worker_distributed_environment in /usr/local/lib/python3.11/site-packages/vllm/worker/worker.py:288
2024-04-23 05:48:36.154527 Call to is_initialized in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:24 from init_worker_distributed_environment in /usr/local/lib/python3.11/site-packages/vllm/worker/worker.py:291
2024-04-23 05:48:36.154699 Return from is_initialized in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:26 to init_worker_distributed_environment in /usr/local/lib/python3.11/site-packages/vllm/worker/worker.py:291
2024-04-23 05:48:36.154862 Call to init_process_group in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:39 from init_worker_distributed_environment in /usr/local/lib/python3.11/site-packages/vllm/worker/worker.py:301
2024-04-23 05:48:36.155055 Call to is_initialized in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:24 from init_process_group in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:43
2024-04-23 05:48:36.155266 Return from is_initialized in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:26 to init_process_group in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:43
2024-04-23 05:48:36.155442 Call to ncclGetVersion in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:69 from init_process_group in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:45
2024-04-23 05:48:36.167192 Return from ncclGetVersion in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:78 to init_process_group in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:45
2024-04-23 05:48:36.167646 Call to _patched_makeRecord in /usr/local/lib/python3.11/site-packages/ray/autoscaler/_private/cli_logger.py:116 from _log in /usr/local/lib/python3.11/logging/__init__.py:1632
2024-04-23 05:48:36.167930 Return from _patched_makeRecord in /usr/local/lib/python3.11/site-packages/ray/autoscaler/_private/cli_logger.py:143 to _log in /usr/local/lib/python3.11/logging/__init__.py:1632
2024-04-23 05:48:36.168112 Call to format in /usr/local/lib/python3.11/site-packages/vllm/logger.py:23 from format in /usr/local/lib/python3.11/logging/__init__.py:953
2024-04-23 05:48:36.168446 Return from format in /usr/local/lib/python3.11/site-packages/vllm/logger.py:28 to format in /usr/local/lib/python3.11/logging/__init__.py:953
2024-04-23 05:48:36.168884 Call to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:196 from init_process_group in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl_utils.py:46
2024-04-23 05:48:36.169231 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:208
2024-04-23 05:48:36.169350 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.169463 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.169599 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.169905 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.170247 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:208
2024-04-23 05:48:36.170590 Call to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1512 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:220
2024-04-23 05:48:36.170816 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1529
2024-04-23 05:48:36.170966 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1529
2024-04-23 05:48:36.171234 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1532
2024-04-23 05:48:36.171429 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.171549 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.171673 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.171763 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.172046 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.172226 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.172309 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.172385 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.172566 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.172784 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.173007 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1532
2024-04-23 05:48:36.173135 Return from get_rank in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1534 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:220
2024-04-23 05:48:36.173350 Call to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1539 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:221
2024-04-23 05:48:36.173448 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1552
2024-04-23 05:48:36.173559 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1552
2024-04-23 05:48:36.173690 Call to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:833 from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555
2024-04-23 05:48:36.173790 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:835
2024-04-23 05:48:36.173878 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.173983 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.174084 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:835
2024-04-23 05:48:36.174322 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:836
2024-04-23 05:48:36.174466 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.174576 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.174746 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.174972 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.175116 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:36.175223 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:36.175311 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.175499 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.175595 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:36.175672 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:36.175823 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:836
2024-04-23 05:48:36.176023 Return from _get_group_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:837 to get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555
2024-04-23 05:48:36.176306 Return from get_world_size in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1555 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:221
2024-04-23 05:48:36.181381 Call to set_device in /usr/local/lib/python3.11/site-packages/torch/cuda/__init__.py:396 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:228
2024-04-23 05:48:36.181600 Call to _get_device_index in /usr/local/lib/python3.11/site-packages/torch/cuda/_utils.py:9 from set_device in /usr/local/lib/python3.11/site-packages/torch/cuda/__init__.py:406
2024-04-23 05:48:36.182099 Return from _get_device_index in /usr/local/lib/python3.11/site-packages/torch/cuda/_utils.py:26 to set_device in /usr/local/lib/python3.11/site-packages/torch/cuda/__init__.py:406
2024-04-23 05:48:36.188177 Return from set_device in /usr/local/lib/python3.11/site-packages/torch/cuda/__init__.py:408 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:228
2024-04-23 05:48:36.188403 Call to ncclGetUniqueId in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:92 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:230
2024-04-23 05:48:37.010534 Return from ncclGetUniqueId in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:96 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:230
2024-04-23 05:48:37.170633 Call to wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:69 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:235
2024-04-23 05:48:37.171061 Call to broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1877 from wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:72
2024-04-23 05:48:37.171310 Call to _check_single_tensor in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:841 from broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1898
2024-04-23 05:48:37.171454 Return from _check_single_tensor in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:843 to broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1898
2024-04-23 05:48:37.171541 Call to _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:747 from broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1899
2024-04-23 05:48:37.171684 Return from _rank_not_in_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:750 to broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1899
2024-04-23 05:48:37.175292 Call to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:974 from broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1909
2024-04-23 05:48:37.175427 Call to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:948 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:37.175498 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:37.175569 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:37.175632 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:37.175706 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950
2024-04-23 05:48:37.175769 Return from is_initialized in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:950 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:976
2024-04-23 05:48:37.175842 Call to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:583 from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:37.175917 Call to default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:453 from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:37.176051 Return from default_pg in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:461 to WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585
2024-04-23 05:48:37.176221 Return from WORLD in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:585 to _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981
2024-04-23 05:48:37.176318 Return from _get_default_group in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:981 to broadcast in /usr/local/lib/python3.11/site-packages/torch/distributed/distributed_c10d.py:1909
Last vllm traces look to be:
packages/torch/cuda/__init__.py:408 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:228
2024-04-23 05:48:36.188403 Call to ncclGetUniqueId in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:92 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:230
2024-04-23 05:48:37.010534 Return from ncclGetUniqueId in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:96 to __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:230
2024-04-23 05:48:37.170633 Call to wrapper in /usr/local/lib/python3.11/site-packages/torch/distributed/c10d_logger.py:69 from __init__ in /usr/local/lib/python3.11/site-packages/vllm/distributed/device_communicators/pynccl.py:235
I've confirmed this particular trace reproduces across multiple runs.
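One way to double-check where the trace stops is to replay it as a call stack and list the calls that never returned. The following is only a sketch: the regular expression assumes the "Call to ... in ... from ..." / "Return from ... in ... to ..." line format shown above, `open_calls` is a hypothetical helper name, and the shortened paths in the sample are placeholders.

```python
import re

# Matches the timestamp, the event kind, the function name and its location.
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<kind>Call to|Return from) (?P<func>\S+) in (?P<loc>\S+)"
)

def open_calls(lines):
    """Return (timestamp, function, location) for calls with no matching return."""
    stack = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that is not a trace line
        if m["kind"] == "Call to":
            stack.append((m["ts"], m["func"], m["loc"]))
        elif stack and stack[-1][1] == m["func"]:
            # A return that matches the innermost open call closes it.
            # Returns without a recorded call (trace started mid-stream) are ignored.
            stack.pop()
    return stack

sample = [
    "2024-04-23 05:48:36.188403 Call to ncclGetUniqueId in /x/pynccl.py:92 from __init__ in /x/pynccl.py:230",
    "2024-04-23 05:48:37.010534 Return from ncclGetUniqueId in /x/pynccl.py:96 to __init__ in /x/pynccl.py:230",
    "2024-04-23 05:48:37.170633 Call to wrapper in /x/c10d_logger.py:69 from __init__ in /x/pynccl.py:235",
]
for ts, func, loc in open_calls(sample):
    print(ts, func, loc)
```

Run against the full trace file, the entries left on the stack should be the frames the process was sitting in when it stalled, innermost last.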
- Try running with export NCCL_DEBUG=TRACE, which will give you more information from NCCL.
- Try running with export NCCL_P2P_DISABLE=1, which might resolve the problem.
I think the problem might come from the NCCL side.
If you are patient, you can also try applying https://github.com/vllm-project/vllm/pull/4248 in your environment and see if it helps.
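For anyone who launches the server from Python rather than a shell, the two NCCL environment variables suggested above can be set programmatically. This is a minimal sketch: `nccl_debug_env` is a hypothetical helper, and it assumes the worker processes inherit the launcher's environment, which depends on how the Ray cluster was started.

```python
import os

def nccl_debug_env(base_env=None, disable_p2p=False):
    """Return a copy of the environment with the NCCL settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_DEBUG"] = "TRACE"  # verbose NCCL logging
    if disable_p2p:
        env["NCCL_P2P_DISABLE"] = "1"  # steer NCCL away from the GPU peer-to-peer path
    return env

# First run: collect NCCL diagnostics only.
diagnose_env = nccl_debug_env()
# Second run: also disable peer-to-peer transfers as a possible workaround.
workaround_env = nccl_debug_env(disable_p2p=True)
```

Either dict can be passed as the env argument of subprocess.run together with whatever command you already use to start vLLM. The variables need to be in place before NCCL is initialized in each process.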
I suspected there was something funky going on with NCCL. The image / hardware configuration is relatively conventional, though, so I wonder if there's something amiss at the host OS level.
modal:2:2 [0] NCCL INFO Bootstrap : Using eth0:172.20.104.2<0>
modal:2:2 [0] NCCL INFO Bootstrap : Using eth0:172.20.104.2<0>
modal:2:2 [0] NCCL INFO cudaDriverVersion 12020
NCCL version 2.19.3+cuda12.3
modal:2:774 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
modal:2:774 [0] NCCL INFO P2P plugin IBext_v7
modal:2:774 [0] NCCL INFO NET/IB : No device found.
modal:2:774 [0] NCCL INFO NET/IB : No device found.
modal:2:774 [0] NCCL INFO NET/Socket : Using [0]eth0:172.20.104.2<0>
modal:2:774 [0] NCCL INFO Using non-device net plugin version 0
modal:2:774 [0] NCCL INFO Using network Socket
modal:2:774 [0] NCCL INFO comm 0x55c216794980 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 50 commId 0x78bc1d8495bf9cb5 - Init START
modal:2:774 [0] NCCL INFO Channel 00/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 01/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 02/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 03/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 04/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 05/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 06/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 07/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 08/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 09/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 10/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 11/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 12/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 13/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 14/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 15/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 16/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 17/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 18/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 19/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 20/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 21/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 22/24 : 0 1
modal:2:774 [0] NCCL INFO Channel 23/24 : 0 1
modal:2:774 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] -1/-1/-1->0->1 [9] -1/-1/-1->0->1 [10] -1/-1/-1->0->1 [11] -1/-1/-1->0->1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] 1/-1/-1->0->-1 [15] 1/-1/-1->0->-1 [16] 1/-1/-1->0->-1 [17] 1/-1/-1->0->-1 [18] -1/-1/-1->0->1 [19] -1/-1/-1->0->1 [20] -1/-1/-1->0->1 [21] -1/-1/-1->0->1 [22] -1/-1/-1->0->1 [23] -1/-1/-1->0->1
modal:2:774 [0] NCCL INFO P2P Chunksize set to 524288
modal:2:774 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 16/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 17/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 18/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 19/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 20/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 21/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 22/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:774 [0] NCCL INFO Channel 23/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:777 [0] proxy.cc:1323 NCCL WARN Cuda failure 304 'OS call failed or operation not supported on this OS'
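For anyone triaging a similar dump: a small sketch, not part of vLLM or NCCL, that pulls the NCCL WARN lines out of a long NCCL_DEBUG=INFO log like the one above so the failure is not buried under the channel setup lines. The function name find_nccl_warnings and the regular expression are illustrative assumptions, based only on the line format visible in this log.

```python
import re

# Matches lines shaped like the one in this thread:
#   modal:2:777 [0] proxy.cc:1323 NCCL WARN Cuda failure 304 '...'
# The pattern is inferred from this log only; other NCCL versions may differ.
NCCL_WARN = re.compile(
    r"^(?P<host>\S+):(?P<pid>\d+):(?P<tid>\d+) \[(?P<dev>\d+)\] "
    r"(?P<src>\S+:\d+) NCCL WARN (?P<msg>.*)$"
)


def find_nccl_warnings(log_text):
    """Return one dict per NCCL WARN line found in log_text (hypothetical helper)."""
    hits = []
    for line in log_text.splitlines():
        match = NCCL_WARN.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


sample = """\
modal:2:774 [0] NCCL INFO Channel 23/0 : 0[0] -> 1[1] via P2P/CUMEM/read
modal:2:777 [0] proxy.cc:1323 NCCL WARN Cuda failure 304 'OS call failed or operation not supported on this OS'
"""

warnings = find_nccl_warnings(sample)
for w in warnings:
    print(f"{w['src']} (device {w['dev']}): {w['msg']}")
```

In the log above, the only warning comes right after the channels are set up over P2P/CUMEM, which is what pointed at the P2P path.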
Disabling NCCL_P2P does in fact take care of the stall:
(RayWorkerWrapper pid=660) INFO 04-23 06:43:04 selector.py:28] Using FlashAttention backend.
INFO 04-23 06:43:05 pynccl_utils.py:45] vLLM is using nccl==2.18.1
(RayWorkerWrapper pid=660) INFO 04-23 06:43:05 pynccl_utils.py:45] vLLM is using nccl==2.18.1
INFO 04-23 06:43:16 utils.py:115] generating GPU P2P access cache for in /root/.config/vllm/gpu_p2p_access_cache_for_0,1.json
INFO 04-23 06:43:16 utils.py:129] reading GPU P2P access cache from /root/.config/vllm/gpu_p2p_access_cache_for_0,1.json
(RayWorkerWrapper pid=660) INFO 04-23 06:43:16 utils.py:129] reading GPU P2P access cache from /root/.config/vllm/gpu_p2p_access_cache_for_0,1.json
INFO 04-23 06:44:38 model_runner.py:173] Loading model weights took 65.7114 GB
(RayWorkerWrapper pid=660) INFO 04-23 06:44:38 model_runner.py:173] Loading model weights took 65.7114 GB
INFO 04-23 06:44:51 ray_gpu_executor.py:217] # GPU blocks: 1303, # CPU blocks: 1638
(RayWorkerWrapper pid=660) INFO 04-23 06:44:56 model_runner.py:976] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
(RayWorkerWrapper pid=660) INFO 04-23 06:44:56 model_runner.py:980] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
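For reference, a minimal sketch of one way to disable NCCL P2P for a launch. It assumes the workaround is applied through NCCL's documented NCCL_P2P_DISABLE environment variable and that the variable is set before the server process (and any local Ray workers it spawns) starts. The helper name nccl_env_without_p2p is made up for illustration, and this is not necessarily how the run above was configured.

```python
import os


def nccl_env_without_p2p(base_env=None):
    """Return a copy of the environment with NCCL peer-to-peer transport disabled.

    Hypothetical helper: it only builds the environment, it does not launch anything.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_P2P_DISABLE"] = "1"         # NCCL falls back to non-P2P transports
    env.setdefault("NCCL_DEBUG", "INFO")  # keep the transport lines visible in the log
    return env


env = nccl_env_without_p2p({"PATH": "/usr/bin"})
print(env["NCCL_P2P_DISABLE"], env["NCCL_DEBUG"])
```

The resulting dict can be passed as env= to subprocess.run or subprocess.Popen when starting the server. Expect lower inter-GPU bandwidth with P2P off, so this is a workaround rather than a fix.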
I'll try building a wheel from #4248 tomorrow and see if that helps with NCCL P2P still enabled.
I hit the same problem with CohereForAI/c4ai-command-r-plus on the docker image vllm/vllm-openai:v0.4.1, although llama3-70b-instruct works well.
I killed the docker container and created a new one to load CohereForAI/c4ai-command-r-plus with the docker image vllm/vllm-openai:v0.4.1. Everything is OK now.
@majestichou At least in my case, the issue was triggered by cross-GPU coordination (NCCL in particular) on an inference box. So far it doesn't seem to be tied to the model architecture, so it may crop up with other models just as it did with llama for me, as long as you're serving with vllm on multiple GPUs.
Similar problem here:
singularity run \
--nv \
--env HF_HOME=/workspace/huggingface/hub \
--writable-tmpfs \
--bind $volume:/workspace/huggingface/hub \
--env HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN \
docker://vllm/vllm-openai:v0.4.1 \
--model casperhansen/llama-3-70b-instruct-awq \
--tensor-parallel-size 4
Everything just hangs with this output:
WARNING 05-06 00:07:12 config.py:169] awq quantization is not fully optimized yet. The speed can be slower than non-quantized models.
2024-05-06 00:07:15,100 INFO worker.py:1749 -- Started a local Ray instance.
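When a startup hangs silently like this, one generic way to see where the Python side is stuck is the standard library's faulthandler module. The sketch below is an assumption-laden illustration, not something vLLM does for you: the log path, the 600-second timeout, and the placeholder work are all made up, and it only covers the process it runs in (Ray workers are separate processes).

```python
import faulthandler
import os
import tempfile

# Hypothetical location for the traceback dump.
log_path = os.path.join(tempfile.gettempdir(), "startup_hang_traceback.log")

with open(log_path, "w") as log_file:
    # If still armed after 600 s, write every thread's traceback to log_file,
    # and keep doing so every 600 s until cancelled.
    faulthandler.dump_traceback_later(600, repeat=True, file=log_file)
    try:
        # Placeholder for the slow step being investigated (e.g. engine startup).
        startup_result = sum(range(1000))
    finally:
        # Disarm before the file is closed so nothing is written late.
        faulthandler.cancel_dump_traceback_later()

print("startup finished, watchdog log at", log_path)
```

If the dump shows the main thread waiting inside a collective or in CUDA graph capture, that is consistent with the cross-GPU stall discussed earlier in this thread.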
We have added documentation for this situation in #5430. Please take a look.