Images Missing in Markdown Output When Using `marker` Command for Multiple PDFs
Description
When using the marker command to convert multiple PDFs to markdown in batch mode, the output markdown files include the extracted text but do not include images. However, when using the marker_single command to convert a single PDF, both text and images are included correctly in the output. This indicates a bug specific to the batch processing functionality of the marker command.
Environment
- Operating System: Windows 11 Home Single Language
- OS Version: 10.0.26100 Build 26100
- System Model: ROG Strix G16 G614JIR
- CPU: Intel(R) Core(TM) i9-14900HX (24 cores, 32 logical processors, 3.66 GHz)
- GPU: NVIDIA GeForce RTX 4070 Laptop GPU
- VRAM: 8.0 GB dedicated, 15.8 GB total
- CUDA Version: 11.8 (as indicated by `torch 2.6.0+cu118`)
- PyTorch Version: 2.6.0+cu118
- Marker Version: v1.6.1
Steps to Reproduce
- Create a folder (e.g., `input_folder`) containing multiple PDFs with images (e.g., `pdf1.pdf`, `pdf2.pdf`).
- Run the `marker` command for batch conversion: `marker --output_dir .\output_folder .\input_folder --workers 4`
- Inspect the markdown files in `output_folder`. The text is present, but images are missing.
- For comparison, run the `marker_single` command on one of the PDFs: `marker_single .\input_folder\pdf1.pdf --output_dir output_folder_single`
- The output from `marker_single` includes both text and images as expected.
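To compare the batch and single-file outputs without opening each file, a small script can count the image references in every generated markdown file. This is only a sketch: the helper names `image_refs` and `report` are made up for illustration, and it assumes images are embedded with standard Markdown image syntax (`![alt](path)`).

```python
import re
from pathlib import Path

# Matches Markdown image syntax such as ![alt](path/to/image.png)
# and captures the target path (an optional "title" after the path is skipped).
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def image_refs(markdown_text):
    """Return the target paths of all Markdown image references in the text."""
    return IMAGE_REF.findall(markdown_text)


def report(output_dir):
    """Map each .md file under output_dir to its number of image references."""
    counts = {}
    for md_file in sorted(Path(output_dir).rglob("*.md")):
        text = md_file.read_text(encoding="utf-8", errors="replace")
        counts[str(md_file)] = len(image_refs(text))
    return counts
```

Running `report("output_folder")` and `report("output_folder_single")` and comparing the counts for the same PDF should show whether the batch run dropped the image references.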
Expected Behavior
- The `marker` command should generate markdown files that include both text and images from the PDFs, consistent with the behavior of `marker_single`.
Actual Behavior
- When using `marker` for batch processing, the markdown files contain only text, with no images included. In contrast, `marker_single` correctly includes both text and images when processing a single PDF.
Possible Cause
- The issue might stem from how the `marker` command handles multiprocessing with CUDA-enabled systems. In the `convert.py` script, models are loaded in the main process and shared across worker processes, which may not properly support image extraction due to CUDA context requirements. The `marker_single` command, running in a single process, avoids this problem by loading and using models directly.
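The difference between sharing models from the main process and loading them inside each worker can be sketched with a pool initializer. This is not marker's code: `load_models` is a hypothetical stand-in for `create_model_dict()`, and `init_worker`, `convert_one`, and `convert_all` are illustrative names. The sketch only shows the pattern in which every worker builds its own copy of the models instead of receiving them from the parent.

```python
import multiprocessing as mp


def load_models():
    """Hypothetical stand-in for create_model_dict(); no real models are loaded."""
    return {"layout": object(), "ocr": object()}


# One copy per process; stays None until init_worker() runs in that process.
_worker_models = None


def init_worker():
    """Pool initializer: load the models inside the worker, not in the parent."""
    global _worker_models
    _worker_models = load_models()


def convert_one(pdf_path):
    """Placeholder conversion that only reports which models this process sees."""
    if _worker_models is None:
        raise RuntimeError("models were not loaded in this process")
    return pdf_path, sorted(_worker_models)


def convert_all(pdf_paths, workers=2):
    """Run convert_one over all paths, with models loaded once per worker."""
    # Windows only supports the "spawn" start method, so request it explicitly.
    ctx = mp.get_context("spawn")
    with ctx.Pool(workers, initializer=init_worker) as pool:
        return pool.map(convert_one, pdf_paths)
```

With the "spawn" start method, `convert_all(["pdf1.pdf", "pdf2.pdf"])` has to be called from a script under an `if __name__ == "__main__":` guard. The trade-off of this pattern is that each worker holds its own copy of the models, which costs more memory than sharing one copy.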
Additional Information
- No error messages appear during the conversion; the process completes successfully but omits images.
- The issue occurs consistently, regardless of the number of PDFs processed.
- Hardware and software details (listed above) may help identify if this is specific to certain GPU or CUDA configurations.
Hardware and Software Details
- GPU Model: NVIDIA GeForce RTX 4070 Laptop GPU
- VRAM: 8.0 GB dedicated, 15.8 GB total
- CUDA Version: 11.8 (as indicated by `torch 2.6.0+cu118`)
- PyTorch Version: 2.6.0+cu118
- Marker Version: v1.6.1
Not sure where the bug is yet, but I observe that the bug disappears when marker/scripts/convert.py is modified so that it no longer uses the global variable and builds the model dict directly instead:

    converter = converter_cls(
        config=config_dict,
        # artifact_dict=model_refs,
        artifact_dict=create_model_dict(),
        processor_list=config_parser.get_processors(),
        renderer=config_parser.get_renderer(),
        llm_service=config_parser.get_llm_service()
    )

Conclusion: torch multiprocessing does not seem to work for the shared model_dict on certain machines (possibly platform-dependent?).
Edit: this fixed it for me, and also explains the observed platform dependence.

    if settings.TORCH_DEVICE == "mps" or settings.TORCH_DEVICE_MODEL == "mps":
        model_dict = None
    else:
        model_dict = None
        # create_model_dict()
        # for k, v in model_dict.items():
        #     v.model.share_memory()

I see exactly the same behavior described by @Saketh-Chandra when using the marker CLI on multiple PDFs. The images are not processed and the final rendering is very bad. It works fine with the marker_single command. I tried the fixes from @conjuncts, but they did not solve the problem.
- Operating System: Windows 11 Professionnel
- OS Version: 24H2 (1000.26100.54.0)
- System Model: Asus ROG Strix Z890-E
- CPU: Intel(R) Core(TM) Ultra 9 285K 3.70 GHz (24 cores)
- GPU: NVIDIA GeForce RTX 4060TI Laptop GPU
- VRAM: 16 GB
- CUDA Version: 11.8 (as indicated by `torch 2.6.0+cu118`)
- PyTorch Version: 2.6.0+cu118
- Marker Version: v1.6.1