
issue with RTX A6000 execution

Open keithjohngates opened this issue 5 days ago • 9 comments

🐛 Describe the bug

I am using:

NVIDIA RTX A6000 (48 GB)

I followed the instructions carefully, and everything seemed to install fine.

CUDA_DEVICE_ORDER=PCI_BUS_ID python -m olmocr.pipeline ./localworkspace --pdfs /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf

Any ideas?

Could it be something to do with sglang not installing or not running properly? It appears to be present in the conda env, but it doesn't seem to start.
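One quick way to narrow down the "is sglang even installed?" question is to check whether the active conda env's interpreter can find the package at all. This is just a diagnostic sketch, not part of the olmocr instructions:

```shell
# Run inside the activated (olmocr) conda env.
which python   # confirm which interpreter the env resolves to
python -c "import importlib.util as u; print('sglang found' if u.find_spec('sglang') else 'sglang NOT found')"
```

If this prints "sglang NOT found", the pipeline is running against a different interpreter than the one sglang was installed into.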

Running it gives the following output:

(olmocr) pop@pop-os:~/Documents/olmocr$ CUDA_DEVICE_ORDER=PCI_BUS_ID python -m olmocr.pipeline ./localworkspace --pdfs /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf
INFO:olmocr.check:pdftoppm is installed and working.
2025-02-28 13:00:42,979 - main - INFO - Got --pdfs argument, going to add to the work queue
2025-02-28 13:00:42,979 - main - INFO - Loading file at /media/pop/samsung256/x64_gsqld_report_files/e408bd57-03eb-4d08-b92c-ab7bf632cfca/cr_100468_7.pdf as PDF document
2025-02-28 13:00:42,979 - main - INFO - Found 1 total pdf paths to add
Sampling PDFs to calculate optimal length: 100%|███████████████| 1/1 [00:00<00:00, 178.34it/s]
2025-02-28 13:00:42,985 - main - INFO - Calculated items_per_group: 33 based on average pages per PDF: 15.00
INFO:olmocr.work_queue:Found 1 total paths
INFO:olmocr.work_queue:0 new paths to add to the workspace
2025-02-28 13:00:43,106 - main - INFO - Starting pipeline with PID 66979
INFO:olmocr.work_queue:Initialized local queue with 1 work items
2025-02-28 13:00:43,168 - main - WARNING - Attempt 1: All connection attempts failed
2025-02-28 13:00:44,193 - main - WARNING - Attempt 2: All connection attempts failed
2025-02-28 13:00:45,228 - main - WARNING - Attempt 3: All connection attempts failed
2025-02-28 13:00:46,273 - main - WARNING - Attempt 4: All connection attempts failed
2025-02-28 13:00:47,299 - main - WARNING - Attempt 5: All connection attempts failed
2025-02-28 13:00:48,346 - main - WARNING - Attempt 6: All connection attempts failed
2025-02-28 13:00:48,469 - main - INFO - [2025-02-28 13:00:48] server_args=ServerArgs(model_path='allenai/olmOCR-7B-0225-preview', tokenizer_path='allenai/olmOCR-7B-0225-preview', tokenizer_mode='auto', load_format='auto', trust_remote_code=False, dtype='auto', kv_cache_dtype='auto', quantization_param_path=None, quantization=None, context_length=None, device='cuda', served_model_name='allenai/olmOCR-7B-0225-preview', chat_template='qwen2-vl', is_embedding=False, revision=None, skip_tokenizer_init=False, host='127.0.0.1', port=30024, mem_fraction_static=0.8, max_running_requests=None, max_total_tokens=None, chunked_prefill_size=2048, max_prefill_tokens=16384, schedule_policy='lpm', schedule_conservativeness=1.0, cpu_offload_gb=0, prefill_only_one_req=False, tp_size=1, stream_interval=1, stream_output=False, random_seed=136363370, constrained_json_whitespace_pattern=None, watchdog_timeout=300, download_dir=None, base_gpu_id=0, log_level='info', log_level_http='warning', log_requests=False, show_time_cost=False, enable_metrics=False, decode_log_interval=40, api_key=None, file_storage_pth='sglang_storage', enable_cache_report=False, dp_size=1, load_balance_method='round_robin', ep_size=1, dist_init_addr=None, nnodes=1, node_rank=0, json_model_override_args='{}', lora_paths=None, max_loras_per_batch=8, attention_backend='flashinfer', sampling_backend='flashinfer', grammar_backend='outlines', speculative_draft_model_path=None, speculative_algorithm=None, speculative_num_steps=5, speculative_num_draft_tokens=64, speculative_eagle_topk=8, enable_double_sparsity=False, ds_channel_config_path=None, ds_heavy_channel_num=32, ds_heavy_token_num=256, ds_heavy_channel_type='qk', ds_sparse_decode_threshold=4096, disable_radix_cache=False, disable_jump_forward=False, disable_cuda_graph=False, disable_cuda_graph_padding=False, disable_outlines_disk_cache=False, disable_custom_all_reduce=False, disable_mla=False, disable_overlap_schedule=False, enable_mixed_chunk=False, enable_dp_attention=False, enable_ep_moe=False, enable_torch_compile=False, torch_compile_max_bs=32, cuda_graph_max_bs=8, cuda_graph_bs=None, torchao_config='', enable_nan_detection=False, enable_p2p_check=False, triton_attention_reduce_in_fp32=False, triton_attention_num_kv_splits=8, num_continuous_decode_steps=1, delete_ckpt_after_loading=False, enable_memory_saver=False, allow_auto_truncate=False, enable_custom_logit_processor=False, tool_call_parser=None)
2025-02-28 13:00:49,387 - main - WARNING - Attempt 7: All connection attempts failed
2025-02-28 13:00:50,094 - main - INFO - Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
2025-02-28 13:00:50,544 - main - WARNING - Attempt 8: All connection attempts failed
2025-02-28 13:00:51,567 - main - WARNING - Attempt 9: All connection attempts failed
2025-02-28 13:00:51,900 - main - INFO - [2025-02-28 13:00:51] Use chat template for the OpenAI-compatible API server: qwen2-vl
2025-02-28 13:00:52,614 - main - WARNING - Attempt 10: All connection attempts failed
2025-02-28 13:00:53,637 - main - WARNING - Attempt 11: All connection attempts failed
2025-02-28 13:00:54,660 - main - WARNING - Attempt 12: All connection attempts failed
2025-02-28 13:00:55,683 - main - WARNING - Attempt 13: All connection attempts failed
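For what it's worth, the "All connection attempts failed" warnings are the pipeline polling the inference server while it is still starting up; the ServerArgs entry in the log shows it binds host='127.0.0.1', port=30024. A minimal sketch (not olmocr code) for watching that port independently, to distinguish "server is slow to load the model" from "server never comes up":

```python
# Probe the local sglang port from the log (host/port taken from the
# ServerArgs line above) to see whether anything ever starts listening.
import socket
import time

HOST, PORT = "127.0.0.1", 30024  # from ServerArgs in the log output

def wait_for_server(host: str, port: int, timeout_s: float = 300.0) -> bool:
    """Poll until the port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True  # something is listening on the port
        except OSError:
            time.sleep(2.0)  # refused or unreachable; retry shortly
    return False

if __name__ == "__main__":
    # Short probe here; use a longer timeout for a real model load.
    up = wait_for_server(HOST, PORT, timeout_s=5.0)
    print("server is listening" if up else "nothing listening on port 30024 yet")
```

If nothing ever listens on 30024, the sglang server process is crashing during startup (often out-of-memory or a broken install) rather than just loading slowly.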

Versions

Hope this is something obvious!

keithjohngates · Feb 28 '25 04:02