
SGLang can't start up

Open goodmaney opened this issue 9 months ago • 5 comments

🐛 Describe the bug

Environment: WSL2, Ubuntu 22.04, 4090 GPU.

The SGLang server task ends right after the model is loaded. SGLang works in my other environment.

2025-03-07 23:50:27,531 - __main__ - INFO - [2025-03-07 23:50:27 TP0] Automatically reduce --mem-fraction-static to 0.760 because this is a multimodal model.
2025-03-07 23:50:27,531 - __main__ - INFO - [2025-03-07 23:50:27 TP0] Automatically turn off --chunked-prefill-size and disable radix cache for qwen2-vl.
2025-03-07 23:50:27,532 - __main__ - INFO - [2025-03-07 23:50:27 TP0] Init torch distributed begin.
2025-03-07 23:50:27,532 - __main__ - WARNING - Attempt 13: Please wait for sglang server to become ready...
2025-03-07 23:50:27,727 - __main__ - INFO - [2025-03-07 23:50:27 TP0] Load weight begin. avail mem=22.46 GB
2025-03-07 23:50:28,550 - __main__ - WARNING - Attempt 14: Please wait for sglang server to become ready...
2025-03-07 23:50:28,606 - __main__ - INFO - [2025-03-07 23:50:28 TP0] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]
2025-03-07 23:50:29,569 - __main__ - WARNING - Attempt 15: Please wait for sglang server to become ready...
Loading safetensors checkpoint shards:  25% Completed | 1/4 [00:00<00:01,  2.04it/s]
Loading safetensors checkpoint shards:  50% Completed | 2/4 [00:00<00:00,  2.85it/s]
Loading safetensors checkpoint shards:  75% Completed | 3/4 [00:01<00:00,  2.27it/s]
2025-03-07 23:50:30,587 - __main__ - WARNING - Attempt 16: Please wait for sglang server to become ready...
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:01<00:00,  2.04it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:01<00:00,  2.15it/s]
2025-03-07 23:50:31,078 - __main__ - INFO -
2025-03-07 23:50:31,173 - __main__ - INFO - [2025-03-07 23:50:31 TP0] Load weight end. type=Qwen2VLForConditionalGeneration, dtype=torch.bfloat16, avail mem=6.73 GB
2025-03-07 23:50:31,259 - __main__ - INFO - [2025-03-07 23:50:31 TP0] KV Cache is allocated. K size: 0.67 GB, V size: 0.67 GB.
2025-03-07 23:50:31,260 - __main__ - INFO - [2025-03-07 23:50:31 TP0] Memory pool end. avail mem=4.47 GB
2025-03-07 23:50:31,373 - __main__ - INFO - [2025-03-07 23:50:31 TP0] Capture cuda graph begin. This can take up to several minutes.
2025-03-07 23:50:31,605 - __main__ - WARNING - Attempt 17: Please wait for sglang server to become ready...
  0%|          | 0/4 [00:00<?, ?it/s]INFO -
2025-03-07 23:50:31,654 - __main__ - INFO - [2025-03-07 23:50:31 TP0] Scheduler hit an exception: Traceback (most recent call last):
2025-03-07 23:50:31,654 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 1773, in run_scheduler_process
2025-03-07 23:50:31,654 - __main__ - INFO -     scheduler = Scheduler(server_args, port_args, gpu_id, tp_rank, dp_rank)
2025-03-07 23:50:31,654 - __main__ - INFO -                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,654 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 239, in __init__
2025-03-07 23:50:31,654 - __main__ - INFO -     self.tp_worker = TpWorkerClass(
2025-03-07 23:50:31,654 - __main__ - INFO -                      ^^^^^^^^^^^^^^
2025-03-07 23:50:31,654 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 68, in __init__
2025-03-07 23:50:31,655 - __main__ - INFO -     self.model_runner = ModelRunner(
2025-03-07 23:50:31,655 - __main__ - INFO -                         ^^^^^^^^^^^^
2025-03-07 23:50:31,655 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 214, in __init__
2025-03-07 23:50:31,655 - __main__ - INFO -     self.init_cuda_graphs()
2025-03-07 23:50:31,655 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 730, in init_cuda_graphs
2025-03-07 23:50:31,655 - __main__ - INFO -     self.cuda_graph_runner = CudaGraphRunner(self)
2025-03-07 23:50:31,655 - __main__ - INFO -                              ^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,655 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 226, in __init__
2025-03-07 23:50:31,655 - __main__ - INFO -     self.capture()
2025-03-07 23:50:31,655 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 292, in capture
2025-03-07 23:50:31,655 - __main__ - INFO -     ) = self.capture_one_batch_size(bs, forward)
2025-03-07 23:50:31,655 - __main__ - INFO -         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,656 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 351, in capture_one_batch_size
2025-03-07 23:50:31,656 - __main__ - INFO -     self.model_runner.attn_backend.init_forward_metadata_capture_cuda_graph(
2025-03-07 23:50:31,656 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/layers/attention/flashinfer_backend.py", line 259, in init_forward_metadata_capture_cuda_graph
2025-03-07 23:50:31,656 - __main__ - INFO -     self.indices_updater_decode.update(
2025-03-07 23:50:31,656 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/layers/attention/flashinfer_backend.py", line 504, in update_single_wrapper
2025-03-07 23:50:31,656 - __main__ - INFO -     self.call_begin_forward(
2025-03-07 23:50:31,656 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/layers/attention/flashinfer_backend.py", line 595, in call_begin_forward
2025-03-07 23:50:31,656 - __main__ - INFO -     create_flashinfer_kv_indices_triton[(bs,)](
2025-03-07 23:50:31,656 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/runtime/jit.py", line 345, in <lambda>
2025-03-07 23:50:31,656 - __main__ - INFO -     return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
2025-03-07 23:50:31,656 - __main__ - INFO -                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,656 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/runtime/jit.py", line 662, in run
2025-03-07 23:50:31,656 - __main__ - INFO -     kernel = self.compile(
2025-03-07 23:50:31,656 - __main__ - INFO -              ^^^^^^^^^^^^^
2025-03-07 23:50:31,657 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/compiler/compiler.py", line 250, in compile
2025-03-07 23:50:31,657 - __main__ - INFO -     metadata_group = fn_cache_manager.get_group(metadata_filename) or {}
2025-03-07 23:50:31,657 - __main__ - INFO -                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,657 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/runtime/cache.py", line 88, in get_group
2025-03-07 23:50:31,657 - __main__ - INFO -     grp_data = json.load(f)
2025-03-07 23:50:31,657 - __main__ - INFO -                ^^^^^^^^^^^^
2025-03-07 23:50:31,657 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/__init__.py", line 293, in load
2025-03-07 23:50:31,657 - __main__ - INFO -     return loads(fp.read(),
2025-03-07 23:50:31,657 - __main__ - INFO -            ^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,657 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/__init__.py", line 346, in loads
2025-03-07 23:50:31,657 - __main__ - INFO -     return _default_decoder.decode(s)
2025-03-07 23:50:31,657 - __main__ - INFO -            ^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,657 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/decoder.py", line 337, in decode
2025-03-07 23:50:31,657 - __main__ - INFO -     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
2025-03-07 23:50:31,657 - __main__ - INFO -                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-07 23:50:31,657 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/decoder.py", line 355, in raw_decode
2025-03-07 23:50:31,657 - __main__ - INFO -     raise JSONDecodeError("Expecting value", s, err.value) from None
2025-03-07 23:50:31,657 - __main__ - INFO - json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2025-03-07 23:50:31,658 - __main__ - INFO -
2025-03-07 23:50:31,658 - __main__ - INFO - [2025-03-07 23:50:31] Received sigquit from a child proces. It usually means the child failed.
2025-03-07 23:50:31,848 - __main__ - WARNING - SGLang server task ended
2025-03-07 23:50:32,624 - __main__ - WARNING - Attempt 18: Please wait for sglang server to become ready...
2025-03-07 23:50:33,642 - __main__ - WARNING - Attempt 19: Please wait for sglang server to become ready...
2025-03-07 23:50:34,661 - __main__ - WARNING - Attempt 20: Please wait for sglang server to become ready...
2025-03-07 23:50:35,680 - __main__ - WARNING - Attempt 21: Please wait for sglang server to become ready..
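
The last frames of the traceback above show `json.load` failing inside `triton/runtime/cache.py` with `Expecting value: line 1 column 1 (char 0)`, which is the error Python's `json` module raises for an empty file. One possible reading is that a metadata file in the Triton compilation cache was left empty or truncated. The following is a minimal diagnostic sketch under that assumption, not a confirmed fix: `find_unreadable_json` is a hypothetical helper name, and the cache location (the `TRITON_CACHE_DIR` environment variable, otherwise `~/.triton/cache`) is assumed to match the Triton version listed under Versions.

```python
import json
import os
from pathlib import Path


def find_unreadable_json(cache_dir: Path) -> list[Path]:
    """Return JSON files under cache_dir that are empty or fail to parse."""
    bad = []
    for path in sorted(cache_dir.rglob("*.json")):
        try:
            with path.open() as f:
                json.load(f)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # An empty file fails with "Expecting value: line 1 column 1 (char 0)".
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Assumption: Triton uses TRITON_CACHE_DIR if set, else ~/.triton/cache.
    cache_dir = Path(
        os.environ.get("TRITON_CACHE_DIR") or Path.home() / ".triton" / "cache"
    )
    if not cache_dir.is_dir():
        print(f"no cache directory found at {cache_dir}")
    else:
        for path in find_unreadable_json(cache_dir):
            print(f"unreadable cache metadata: {path}")
```

If the script reports any files, moving the cache directory aside and restarting the server would be a reasonable next step, since Triton should recompile and rewrite its cache entries when they are missing.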

Versions

Python 3.11.11 aiohappyeyeballs==2.5.0 aiohttp==3.11.13 aiosignal==1.3.2 annotated-types==0.7.0 anthropic==0.49.0 anyio==4.8.0 asttokens==3.0.0 attrs==25.1.0 beaker-py==1.34.1 bleach==6.2.0 boto3==1.37.7 botocore==1.37.7 cached_path==1.6.7 cachetools==5.5.2 certifi==2025.1.31 cffi==1.17.1 charset-normalizer==3.4.1 click==8.1.8 cloudpickle==3.1.1 compressed-tensors==0.8.0 cryptography==44.0.2 cuda-bindings==12.8.0 cuda-python==12.8.0 datasets==3.3.2 decorator==5.2.1 decord==0.6.0 dill==0.3.8 diskcache==5.6.3 distro==1.9.0 docker==7.1.0 einops==0.8.1 executing==2.2.0 fastapi==0.115.11 filelock==3.17.0 flashinfer==0.1.6+cu124torch2.4 frozenlist==1.5.0 fsspec==2024.12.0 ftfy==6.3.1 gguf==0.10.0 google-api-core==2.24.1 google-auth==2.38.0 google-cloud-core==2.4.2 google-cloud-storage==2.19.0 google-crc32c==1.6.0 google-resumable-media==2.7.2 googleapis-common-protos==1.69.0 h11==0.14.0 hf_transfer==0.1.9 httpcore==1.0.7 httptools==0.6.4 httpx==0.28.1 huggingface-hub==0.27.1 idna==3.10 importlib_metadata==8.6.1 interegular==0.3.3 ipython==9.0.1 ipython_pygments_lexers==1.1.1 jedi==0.19.2 Jinja2==3.1.6 jiter==0.8.2 jmespath==1.0.1 jsonschema==4.23.0 jsonschema-specifications==2024.10.1 lark==1.2.2 lingua-language-detector==2.0.2 litellm==1.63.2 llvmlite==0.44.0 lm-format-enforcer==0.10.11 markdown-it-py==3.0.0 markdown2==2.5.3 MarkupSafe==3.0.2 matplotlib-inline==0.1.7 mdurl==0.1.2 mistral_common==1.5.3 modelscope==1.23.2 mpmath==1.3.0 msgpack==1.1.0 msgspec==0.19.0 multidict==6.1.0 multiprocess==0.70.16 nest-asyncio==1.6.0 networkx==3.4.2 numba==0.61.0 numpy==1.26.4 nvidia-cublas-cu12==12.4.5.8 nvidia-cuda-cupti-cu12==12.4.127 nvidia-cuda-nvrtc-cu12==12.4.127 nvidia-cuda-runtime-cu12==12.4.127 nvidia-cudnn-cu12==9.1.0.70 nvidia-cufft-cu12==11.2.1.3 nvidia-curand-cu12==10.3.5.147 nvidia-cusolver-cu12==11.6.1.9 nvidia-cusparse-cu12==12.3.1.170 nvidia-cusparselt-cu12==0.6.2 nvidia-ml-py==12.570.86 nvidia-nccl-cu12==2.21.5 nvidia-nvjitlink-cu12==12.4.127 
nvidia-nvtx-cu12==12.4.127 -e git+https://github.com/allenai/olmocr.git@d006e8f33111771355f24c81521f3715a1a3735e#egg=olmocr openai==1.65.4 opencv-python-headless==4.11.0.86 orjson==3.10.15 outlines==0.0.46 packaging==24.2 pandas==2.2.3 parso==0.8.4 partial-json-parser==0.2.1.1.post5 pexpect==4.9.0 pillow==11.1.0 prometheus-fastapi-instrumentator==7.0.2 prometheus_client==0.21.1 prompt_toolkit==3.0.50 propcache==0.3.0 proto-plus==1.26.0 protobuf==5.29.3 psutil==7.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 py-cpuinfo==9.0.0 pyairports==2.1.1 pyarrow==19.0.1 pyasn1==0.6.1 pyasn1_modules==0.4.1 pycountry==24.6.1 pycparser==2.22 pydantic==2.10.6 pydantic_core==2.27.2 Pygments==2.19.1 pypdf==5.3.1 pypdfium2==4.30.1 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-multipart==0.0.20 pytz==2025.1 PyYAML==6.0.2 pyzmq==26.2.1 ray==2.43.0 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 rich==13.9.4 rpds-py==0.23.1 rsa==4.9 s3transfer==0.11.4 safetensors==0.5.3 sentencepiece==0.2.0 setproctitle==1.3.5 sgl-kernel==0.0.3.post1 sglang==0.4.2 six==1.17.0 smart-open==7.1.0 sniffio==1.3.1 stack-data==0.6.3 starlette==0.46.0 sympy==1.13.1 tiktoken==0.9.0 tokenizers==0.21.0 torch==2.5.1 torchao==0.9.0 torchvision==0.20.1 tqdm==4.67.1 traitlets==5.14.3 transformers==4.49.0 triton==3.1.0 typing_extensions==4.12.2 tzdata==2025.1 urllib3==2.3.0 uvicorn==0.34.0 uvloop==0.21.0 vllm==0.6.4.post1 watchfiles==1.0.4 wcwidth==0.2.13 webencodings==0.5.1 websockets==15.0.1 wrapt==1.17.2 xformers==0.0.28.post3 xgrammar==0.1.15 xxhash==3.5.0 yarl==1.18.3 zipp==3.21.0 zstandard==0.23.0

goodmaney, Mar 07 '25 15:03

After several restart attempts, the pipeline exits automatically:

2025-03-08 02:04:48,693 - __main__ - INFO - [2025-03-08 02:04:48 TP0] Scheduler hit an exception: Traceback (most recent call last):
2025-03-08 02:04:48,693 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 1773, in run_scheduler_process
2025-03-08 02:04:48,693 - __main__ - INFO -     scheduler = Scheduler(server_args, port_args, gpu_id, tp_rank, dp_rank)
2025-03-08 02:04:48,693 - __main__ - INFO -                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,693 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 239, in __init__
2025-03-08 02:04:48,694 - __main__ - INFO -     self.tp_worker = TpWorkerClass(
2025-03-08 02:04:48,694 - __main__ - INFO -                      ^^^^^^^^^^^^^^
2025-03-08 02:04:48,694 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 68, in __init__
2025-03-08 02:04:48,694 - __main__ - INFO -     self.model_runner = ModelRunner(
2025-03-08 02:04:48,694 - __main__ - INFO -                         ^^^^^^^^^^^^
2025-03-08 02:04:48,694 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 214, in __init__
2025-03-08 02:04:48,694 - __main__ - INFO -     self.init_cuda_graphs()
2025-03-08 02:04:48,694 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/model_runner.py", line 730, in init_cuda_graphs
2025-03-08 02:04:48,694 - __main__ - INFO -     self.cuda_graph_runner = CudaGraphRunner(self)
2025-03-08 02:04:48,694 - __main__ - INFO -                              ^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,694 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 226, in __init__
2025-03-08 02:04:48,694 - __main__ - INFO -     self.capture()
2025-03-08 02:04:48,694 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 292, in capture
2025-03-08 02:04:48,695 - __main__ - INFO -     ) = self.capture_one_batch_size(bs, forward)
2025-03-08 02:04:48,695 - __main__ - INFO -         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,695 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/cuda_graph_runner.py", line 351, in capture_one_batch_size
2025-03-08 02:04:48,695 - __main__ - INFO -     self.model_runner.attn_backend.init_forward_metadata_capture_cuda_graph(
2025-03-08 02:04:48,695 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/layers/attention/flashinfer_backend.py", line 259, in init_forward_metadata_capture_cuda_graph
2025-03-08 02:04:48,695 - __main__ - INFO -     self.indices_updater_decode.update(
2025-03-08 02:04:48,695 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/layers/attention/flashinfer_backend.py", line 504, in update_single_wrapper
2025-03-08 02:04:48,695 - __main__ - INFO -     self.call_begin_forward(
2025-03-08 02:04:48,695 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/layers/attention/flashinfer_backend.py", line 595, in call_begin_forward
2025-03-08 02:04:48,695 - __main__ - INFO -     create_flashinfer_kv_indices_triton[(bs,)](
2025-03-08 02:04:48,695 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/runtime/jit.py", line 345, in <lambda>
2025-03-08 02:04:48,696 - __main__ - INFO -     return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
2025-03-08 02:04:48,696 - __main__ - INFO -                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,696 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/runtime/jit.py", line 662, in run
2025-03-08 02:04:48,696 - __main__ - INFO -     kernel = self.compile(
2025-03-08 02:04:48,696 - __main__ - INFO -              ^^^^^^^^^^^^^
2025-03-08 02:04:48,696 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/compiler/compiler.py", line 250, in compile
2025-03-08 02:04:48,696 - __main__ - INFO -     metadata_group = fn_cache_manager.get_group(metadata_filename) or {}
2025-03-08 02:04:48,696 - __main__ - INFO -                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,696 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/triton/runtime/cache.py", line 88, in get_group
2025-03-08 02:04:48,696 - __main__ - INFO -     grp_data = json.load(f)
2025-03-08 02:04:48,696 - __main__ - INFO -                ^^^^^^^^^^^^
2025-03-08 02:04:48,696 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/__init__.py", line 293, in load
2025-03-08 02:04:48,696 - __main__ - INFO -     return loads(fp.read(),
2025-03-08 02:04:48,697 - __main__ - INFO -            ^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,697 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/__init__.py", line 346, in loads
2025-03-08 02:04:48,697 - __main__ - INFO -     return _default_decoder.decode(s)
2025-03-08 02:04:48,697 - __main__ - INFO -            ^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,697 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/decoder.py", line 337, in decode
2025-03-08 02:04:48,697 - __main__ - INFO -     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
2025-03-08 02:04:48,697 - __main__ - INFO -                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-08 02:04:48,697 - __main__ - INFO -   File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/json/decoder.py", line 355, in raw_decode
2025-03-08 02:04:48,697 - __main__ - INFO -     raise JSONDecodeError("Expecting value", s, err.value) from None
2025-03-08 02:04:48,697 - __main__ - INFO - json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2025-03-08 02:04:48,697 - __main__ - INFO -
2025-03-08 02:04:48,697 - __main__ - INFO - [2025-03-08 02:04:48] Received sigquit from a child proces. It usually means the child failed.
2025-03-08 02:04:48,879 - __main__ - WARNING - SGLang server task ended
2025-03-08 02:04:48,880 - __main__ - ERROR - Ended up starting the sglang server more than 5 times, cancelling pipeline
2025-03-08 02:04:48,880 - __main__ - ERROR -
2025-03-08 02:04:48,880 - __main__ - ERROR - Please make sure sglang is installed according to the latest instructions here: https://docs.sglang.ai/start/install.html
Exception ignored in atexit callback: <function sglang_server_task.<locals>._kill_proc at 0x7f1a109714e0>
Traceback (most recent call last):
  File "/home/xx/olmocr/olmocr/pipeline.py", line 535, in _kill_proc
    proc.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/subprocess.py", line 143, in terminate
    self._transport.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 149, in terminate
    self._check_proc()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 142, in _check_proc
    raise ProcessLookupError()
ProcessLookupError:
Exception ignored in atexit callback: <function sglang_server_task.<locals>._kill_proc at 0x7f1a0f5a9120>
Traceback (most recent call last):
  File "/home/xx/olmocr/olmocr/pipeline.py", line 535, in _kill_proc
    proc.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/subprocess.py", line 143, in terminate
    self._transport.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 149, in terminate
    self._check_proc()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 142, in _check_proc
    raise ProcessLookupError()
ProcessLookupError:
Exception ignored in atexit callback: <function sglang_server_task.<locals>._kill_proc at 0x7f1a0f582de0>
Traceback (most recent call last):
  File "/home/xx/olmocr/olmocr/pipeline.py", line 535, in _kill_proc
    proc.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/subprocess.py", line 143, in terminate
    self._transport.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 149, in terminate
    self._check_proc()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 142, in _check_proc
    raise ProcessLookupError()
ProcessLookupError:
Exception ignored in atexit callback: <function sglang_server_task.<locals>._kill_proc at 0x7f1a10931f80>
Traceback (most recent call last):
  File "/home/xx/olmocr/olmocr/pipeline.py", line 535, in _kill_proc
    proc.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/subprocess.py", line 143, in terminate
    self._transport.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 149, in terminate
    self._check_proc()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 142, in _check_proc
    raise ProcessLookupError()
ProcessLookupError:
Exception ignored in atexit callback: <function sglang_server_task.<locals>._kill_proc at 0x7f1a0f5a8b80>
Traceback (most recent call last):
  File "/home/xx/olmocr/olmocr/pipeline.py", line 535, in _kill_proc
    proc.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/subprocess.py", line 143, in terminate
    self._transport.terminate()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 149, in terminate
    self._check_proc()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_subprocess.py", line 142, in _check_proc
    raise ProcessLookupError()
ProcessLookupError:
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-2' coro=<sglang_server_host() done, defined at /home/xx/olmocr/olmocr/pipeline.py:614> exception=SystemExit(1)>
Traceback (most recent call last):
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
    self.run_forever()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
    self._run_once()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/home/xx/olmocr/olmocr/pipeline.py", line 627, in sglang_server_host
    sys.exit(1)
SystemExit: 1

goodmaney, Mar 07 '25 18:03

I tried running an SGLang server on its own, and it reported this error:

[2025-03-08 15:16:07 TP0] Scheduler hit an exception: Traceback (most recent call last):
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1446, in _call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/repro/after_dynamo.py", line 129, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/__init__.py", line 2234, in __call__
    return compile_fx(model_, inputs_, config_patches=self.config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1521, in compile_fx
    return aot_autograd(
           ^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/backends/common.py", line 72, in __call__
    cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 1071, in aot_module_simplified
    compiled_fn = dispatch_and_compile()
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 1056, in dispatch_and_compile
    compiled_fn, _ = create_aot_dispatcher_function(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 522, in create_aot_dispatcher_function
    return _create_aot_dispatcher_function(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 759, in _create_aot_dispatcher_function
    compiled_fn, fw_metadata = compiler_fn(
                               ^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 179, in aot_dispatch_base
    compiled_fw = compiler(fw_module, updated_flat_args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1350, in fw_compiler_base
    return _fw_compiler_base(model, example_inputs, is_inference)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1421, in _fw_compiler_base
    return inner_compile(
           ^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 475, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/repro/after_aot.py", line 85, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 661, in _compile_fx_inner
    compiled_graph = FxGraphCache.load(
                     ^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/codecache.py", line 1324, in load
    compiled_graph = FxGraphCache._lookup_graph(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/codecache.py", line 1051, in _lookup_graph
    for candidate in iterate_over_candidates():
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/codecache.py", line 1022, in iterate_over_candidates
    subdir = FxGraphCache._get_tmp_dir_for_key(key)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/codecache.py", line 983, in _get_tmp_dir_for_key
    return os.path.join(FxGraphCache._get_tmp_dir(), key[1:3], key)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/codecache.py", line 976, in _get_tmp_dir
    return os.path.join(cache_dir(), "fxgraph")
                        ^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_inductor/runtime/runtime_utils.py", line 93, in cache_dir
    os.makedirs(cache_dir, exist_ok=True)
  File "<frozen os>", line 225, in makedirs
FileNotFoundError: [Errno 2] No such file or directory: ''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 1784, in run_scheduler_process
    scheduler.event_loop_normal()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 477, in event_loop_normal
    result = self.run_batch(batch)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/scheduler.py", line 1065, in run_batch
    logits_output, next_token_ids = self.tp_worker.forward_batch_generation(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/managers/tp_worker.py", line 163, in forward_batch_generation
    forward_batch = ForwardBatch.init_new(model_worker_batch, self.model_runner)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/sglang/srt/model_executor/forward_batch_info.py", line 316, in init_new
    ret.positions = clamp_position(batch.seq_lens)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 465, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1269, in __call__
    return self._torchdynamo_orig_callable(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1064, in __call__
    result = self._inner_convert(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 526, in __call__
    return _compile(
           ^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 924, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 666, in compile_inner
    return _compile_inner(code, one_graph, hooks, transform)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_utils_internal.py", line 87, in wrapper_function
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 699, in _compile_inner
    out_code = transform_code_object(code, transform)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1322, in transform_code_object
    transformations(instructions, code_options)
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 219, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 634, in transform
    tracer.run()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2796, in run
    super().run()
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 983, in run
    while self.step():
          ^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 895, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2987, in RETURN_VALUE
    self._return(inst)
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2972, in _return
    self.output.compile_subgraph(
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1117, in compile_subgraph
    self.compile_and_call_fx_graph(tx, list(reversed(stack_values)), root)
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1369, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1416, in call_user_compiler
    return self._call_user_compiler(gm)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xx/anaconda3/envs/olmocr/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1465, in _call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
FileNotFoundError: [Errno 2] No such file or directory: ''

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True


[2025-03-08 15:16:07] Received sigquit from a child proces. It usually means the child failed.
Killed

I tried running this VLM model with SGLang (v0.4.3) in an Xinference env, and it works there.

goodmaney avatar Mar 08 '25 07:03 goodmaney

Well, I solved it by removing the directory /tmp/triton_cache and the TorchInductor cache directory (something like "/tmp/torchinductor_username/"), then restarting the terminal. I don't know why it worked, or which step fixed it.

goodmaney avatar Mar 08 '25 07:03 goodmaney
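The cleanup described in the comment above can be sketched in Python. This is only an illustration, not something the reporter ran: the helper names `candidate_cache_dirs` and `clear_caches` are hypothetical, the two default paths are taken from the comment, and the assumption that `TRITON_CACHE_DIR` and `TORCHINDUCTOR_CACHE_DIR` may point the caches elsewhere should be checked against your installed versions.

```python
import getpass
import os
import shutil
import tempfile
from pathlib import Path


def candidate_cache_dirs():
    """List cache directories that may hold stale compile artifacts.

    The two defaults mirror the paths named in the comment above
    (tempfile.gettempdir() is usually /tmp on Linux). The environment
    variables are included on the assumption that they override the
    cache locations when set.
    """
    tmp = Path(tempfile.gettempdir())
    dirs = [
        tmp / "triton_cache",
        tmp / f"torchinductor_{getpass.getuser()}",
    ]
    for var in ("TRITON_CACHE_DIR", "TORCHINDUCTOR_CACHE_DIR"):
        value = os.environ.get(var)
        if value:
            dirs.append(Path(value))
    return dirs


def clear_caches(dirs, dry_run=True):
    """Remove each existing directory in `dirs`; return the ones matched.

    With dry_run=True nothing is deleted, so the list can be reviewed first.
    """
    matched = []
    for d in dirs:
        d = Path(d)
        if d.is_dir():
            if not dry_run:
                shutil.rmtree(d)
            matched.append(d)
    return matched


if __name__ == "__main__":
    # Review what would be removed before passing dry_run=False.
    for path in clear_caches(candidate_cache_dirs(), dry_run=True):
        print("would remove:", path)
```

Running it with the default `dry_run=True` only prints the directories that exist; deleting them and then restarting the shell and the server matches the sequence reported above.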

> Well, I solved it by removing the directory /tmp/triton_cache and the TorchInductor cache directory (something like "/tmp/torchinductor_username/"), then restarting the terminal. I don't know why it worked, or which step fixed it.

Hello, when you got it working, did you change anything in the code? Could you also check whether your environment is still the same as the one you showed earlier?

Devcode518 avatar Mar 10 '25 02:03 Devcode518

I solved my problem by installing the python3.11-dev package.

franzbischoff avatar Apr 03 '25 09:04 franzbischoff
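One plausible reason the python3.11-dev package helps (an assumption, the thread does not confirm it) is that the compile step needs the CPython header `Python.h`, which that package provides. A minimal check for whether the header is visible to the active interpreter, using only the standard library; the helper names are hypothetical:

```python
import sysconfig
from pathlib import Path


def python_header_path():
    """Return the path where this interpreter expects Python.h to live."""
    return Path(sysconfig.get_paths()["include"]) / "Python.h"


def has_python_headers():
    """True if Python.h exists in the interpreter's include directory."""
    return python_header_path().is_file()


if __name__ == "__main__":
    status = "found" if has_python_headers() else "missing"
    print(f"{python_header_path()}: {status}")
```

If the header is reported missing in the environment that runs the server, installing the matching development package for that Python version is a reasonable thing to try.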

We’ve transitioned from SGLang to vLLM for various reasons, including this one. For now, we’re closing this issue. However, if you’d like to discuss anything specific about the SGLang version, feel free to reopen it.

aman-17 avatar Jul 10 '25 21:07 aman-17