Running the pipeline on a single PDF returns a 502 error.
🐛 Describe the bug
python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/horribleocr.pdf
INFO:olmocr.check:pdftoppm is installed and working.
2025-02-27 15:16:34,235 - main - INFO - Got --pdfs argument, going to add to the work queue
2025-02-27 15:16:34,235 - main - INFO - Loading file at tests/gnarly_pdfs/horribleocr.pdf as PDF document
2025-02-27 15:16:34,235 - main - INFO - Found 1 total pdf paths to add
Sampling PDFs to calculate optimal length: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 601.51it/s]
2025-02-27 15:16:34,238 - main - INFO - Calculated items_per_group: 500 based on average pages per PDF: 1.00
INFO:olmocr.work_queue:Found 1 total paths
INFO:olmocr.work_queue:0 new paths to add to the workspace
2025-02-27 15:16:34,375 - main - INFO - Starting pipeline with PID 61231
INFO:olmocr.work_queue:Initialized local queue with 1 work items
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:34,617 - main - INFO - Attempt 1: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:35,719 - main - INFO - Attempt 2: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:36,820 - main - INFO - Attempt 3: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:37,921 - main - INFO - Attempt 4: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:39,051 - main - INFO - Attempt 5: Unexpected status code 502
2025-02-27 15:16:39,885 - main - INFO - [2025-02-27 15:16:39] server_args=ServerArgs(model_path='allenai/olmOCR-7B-0225-preview', tokenizer_path='allenai/olmOCR-7B-0225-preview', tokenizer_mode='auto', load_format='auto', trust_remote_code=False, dtype='auto', kv_cache_dtype='auto', quantization_param_path=None, quantization=None, context_length=None, device='cuda', served_model_name='allenai/olmOCR-7B-0225-preview', chat_template='qwen2-vl', is_embedding=False, revision=None, skip_tokenizer_init=False, host='127.0.0.1', port=30024, mem_fraction_static=0.8, max_running_requests=None, max_total_tokens=None, chunked_prefill_size=2048, max_prefill_tokens=16384, schedule_policy='lpm', schedule_conservativeness=1.0, cpu_offload_gb=0, prefill_only_one_req=False, tp_size=1, stream_interval=1, stream_output=False, random_seed=188481481, constrained_json_whitespace_pattern=None, watchdog_timeout=300, download_dir=None, base_gpu_id=0, log_level='info', log_level_http='warning', log_requests=False, show_time_cost=False, enable_metrics=False, decode_log_interval=40, api_key=None, file_storage_pth='sglang_storage', enable_cache_report=False, dp_size=1, load_balance_method='round_robin', ep_size=1, dist_init_addr=None, nnodes=1, node_rank=0, json_model_override_args='{}', lora_paths=None, max_loras_per_batch=8, attention_backend='flashinfer', sampling_backend='flashinfer', grammar_backend='outlines', speculative_draft_model_path=None, speculative_algorithm=None, speculative_num_steps=5, speculative_num_draft_tokens=64, speculative_eagle_topk=8, enable_double_sparsity=False, ds_channel_config_path=None, ds_heavy_channel_num=32, ds_heavy_token_num=256, ds_heavy_channel_type='qk', ds_sparse_decode_threshold=4096, disable_radix_cache=False, disable_jump_forward=False, disable_cuda_graph=False, disable_cuda_graph_padding=False, disable_outlines_disk_cache=False, disable_custom_all_reduce=False, disable_mla=False, disable_overlap_schedule=False, enable_mixed_chunk=False, enable_dp_attention=False, enable_ep_moe=False, enable_torch_compile=False, torch_compile_max_bs=32, cuda_graph_max_bs=8, cuda_graph_bs=None, torchao_config='', enable_nan_detection=False, enable_p2p_check=False, triton_attention_reduce_in_fp32=False, triton_attention_num_kv_splits=8, num_continuous_decode_steps=1, delete_ckpt_after_loading=False, enable_memory_saver=False, allow_auto_truncate=False, enable_custom_logit_processor=False, tool_call_parser=None)
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:40,227 - main - INFO - Attempt 6: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:41,329 - main - INFO - Attempt 7: Unexpected status code 502
2025-02-27 15:16:41,800 - main - INFO - Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:42,432 - main - INFO - Attempt 8: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:43,534 - main - INFO - Attempt 9: Unexpected status code 502
2025-02-27 15:16:43,792 - main - INFO - [2025-02-27 15:16:43] Use chat template for the OpenAI-compatible API server: qwen2-vl
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:44,645 - main - INFO - Attempt 10: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:45,746 - main - INFO - Attempt 11: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:46,848 - main - INFO - Attempt 12: Unexpected status code 502
2025-02-27 15:16:47,433 - main - INFO - Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:47,951 - main - INFO - Attempt 13: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:49,053 - main - INFO - Attempt 14: Unexpected status code 502
2025-02-27 15:16:49,529 - main - INFO - [2025-02-27 15:16:49 TP0] Overlap scheduler is disabled for multimodal models.
2025-02-27 15:16:50,153 - main - INFO - [2025-02-27 15:16:50 TP0] Automatically reduce --mem-fraction-static to 0.760 because this is a multimodal model.
2025-02-27 15:16:50,153 - main - INFO - [2025-02-27 15:16:50 TP0] Automatically turn off --chunked-prefill-size and disable radix cache for qwen2-vl.
2025-02-27 15:16:50,153 - main - INFO - [2025-02-27 15:16:50 TP0] Init torch distributed begin.
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:50,155 - main - INFO - Attempt 15: Unexpected status code 502
2025-02-27 15:16:50,319 - main - INFO - [2025-02-27 15:16:50 TP0] Load weight begin. avail mem=21.76 GB
2025-02-27 15:16:51,256 - main - INFO - [2025-02-27 15:16:51 TP0] Using model weights format ['*.safetensors']
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:51,257 - main - INFO - Attempt 16: Unexpected status code 502
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:52,362 - main - INFO - Attempt 17: Unexpected status code 502
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:01<00:03, 1.31s/it]
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:53,464 - main - INFO - Attempt 18: Unexpected status code 502
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:54,579 - main - INFO - Attempt 19: Unexpected status code 502
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:02<00:02, 1.50s/it]
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:55,683 - main - INFO - Attempt 20: Unexpected status code 502
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:04<00:01, 1.55s/it]
INFO:httpx:HTTP Request: GET http://localhost:30024/v1/models "HTTP/1.1 502 Bad Gateway"
2025-02-27 15:16:56,785 - main - INFO - Attempt 21: Unexpected status code 502
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:05<00:00, 1.19s/it]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:05<00:00, 1.30s/it]
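
For anyone triaging the log above: the pipeline launches a local SGLang server on port 30024 and repeatedly polls GET /v1/models until it responds, so some failed attempts while the safetensors shards are still loading are expected, and the retries should normally start succeeding once loading finishes. Below is a minimal sketch of that kind of readiness probe, assuming only the URL and the roughly one-second retry cadence visible in the log (wait_for_server and the 300-second budget are illustrative, not olmocr's actual implementation). Note also that a 502 Bad Gateway, rather than a plain connection error, can sometimes mean an HTTP proxy is intercepting requests to localhost; that possibility comes up again in the comments below.

import time
import httpx

# Minimal sketch of the readiness probe the log shows: poll the local
# server's /v1/models endpoint until it answers 200 OK. The URL and the
# ~1 s retry cadence are taken from the log above; wait_for_server and
# the 300 s budget are illustrative, not olmocr's code.
def wait_for_server(url: str = "http://localhost:30024/v1/models",
                    timeout_s: float = 300.0) -> None:
    deadline = time.monotonic() + timeout_s
    attempt = 0
    while time.monotonic() < deadline:
        attempt += 1
        try:
            resp = httpx.get(url, timeout=5.0)
            if resp.status_code == 200:
                print(f"Server ready after {attempt} attempt(s)")
                return
            print(f"Attempt {attempt}: Unexpected status code {resp.status_code}")
        except httpx.HTTPError:
            print(f"Attempt {attempt}: server not reachable yet")
        time.sleep(1.0)
    raise TimeoutError(f"{url} did not become ready within {timeout_s} s")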
Versions
After installing the base environment, running the single-PDF command returns a 502 error. How do I bring up the local model? Also, can this tool be used from mainland China, or does it require a VPN?
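
A hedged pointer on the mainland-China questions above (none of this is official olmocr guidance): huggingface_hub honors the HF_ENDPOINT environment variable, and hf-mirror.com is a commonly used community mirror; separately, if a system-wide proxy is configured for outbound traffic, probes of the local server at http://localhost:30024 may be routed through that proxy and come back as 502 Bad Gateway, which excluding localhost via NO_PROXY avoids. A minimal Python sketch under those assumptions:

import os

# Assumption: hf-mirror.com is a community mirror of huggingface.co that is
# commonly reachable from mainland China; huggingface_hub honors HF_ENDPOINT.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
# Assumption: a system proxy (http_proxy/https_proxy) is set for VPN use.
# Excluding localhost keeps requests to the local inference server from
# being proxied (and answered with 502 Bad Gateway by the proxy).
os.environ["NO_PROXY"] = "localhost,127.0.0.1"

# Import after setting HF_ENDPOINT, which huggingface_hub reads at import time.
from huggingface_hub import snapshot_download

# Pre-download the model weights so the pipeline can start its local server
# without reaching huggingface.co directly.
snapshot_download("allenai/olmOCR-7B-0225-preview")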
Me too.
+1
Same problem here. Has it been resolved?
We have moved on from SGLang and are currently using vLLM as our backend. This issue might have been resolved by now.