
RapidOCR w/ engines other than ONNXRuntime is not supported

Open Voileexperiments opened this issue 9 months ago • 5 comments

RapidOCR supports multiple OCR engines besides rapidocr_onnxruntime, such as rapidocr_paddle.

However, I could not get it to work. Using the RapidOCR example code from the docs gives this error:

  File "/docling/docling_rapidocr_test.py", line 52, in <module>
    main()
  File "/docling/docling_rapidocr_test.py", line 47, in main
    conversion_result: ConversionResult = converter.convert(source=source)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_validate_call.py", line 38, in wrapper_function
    return wrapper(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_validate_call.py", line 111, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
  File "/usr/local/lib/python3.10/dist-packages/docling/document_converter.py", line 203, in convert
    return next(all_res)
  File "/usr/local/lib/python3.10/dist-packages/docling/document_converter.py", line 226, in convert_all
    for conv_res in conv_res_iter:
  File "/usr/local/lib/python3.10/dist-packages/docling/document_converter.py", line 261, in _convert
    for item in map(
  File "/usr/local/lib/python3.10/dist-packages/docling/document_converter.py", line 302, in _process_document
    conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)
  File "/usr/local/lib/python3.10/dist-packages/docling/document_converter.py", line 323, in _execute_pipeline
    pipeline = self._get_pipeline(in_doc.format)
  File "/usr/local/lib/python3.10/dist-packages/docling/document_converter.py", line 289, in _get_pipeline
    self.initialized_pipelines[pipeline_class] = pipeline_class(
  File "/usr/local/lib/python3.10/dist-packages/docling/pipeline/standard_pdf_pipeline.py", line 73, in __init__
    if (ocr_model := self.get_ocr_model(artifacts_path=artifacts_path)) is None:
  File "/usr/local/lib/python3.10/dist-packages/docling/pipeline/standard_pdf_pipeline.py", line 179, in get_ocr_model
    return RapidOcrModel(
  File "/usr/local/lib/python3.10/dist-packages/docling/models/rapid_ocr_model.py", line 38, in __init__
    raise ImportError(
ImportError: RapidOCR is not installed. Please install it via `pip install rapidocr_onnxruntime` to use this OCR engine. Alternatively, Docling has support for other OCR engines. See the documentation.

And there is nothing in the documentation about how to make Docling work with RapidOCR on other engines.

If Docling only supports RapidOCR with ONNXRuntime, this should at least be clearly stated, since RapidOCR recommends against ONNXRuntime for GPU-powered inference, and CPU inference performance becomes unacceptable for larger models.
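In case it helps triage, here is a quick stdlib-only sketch for checking which RapidOCR engine packages (if any) are importable in the current environment. The helper name is mine; the package names are the engine distributions RapidOCR publishes on PyPI. Docling's `ImportError` above fires when `rapidocr_onnxruntime` specifically is missing, even if one of the other engines is installed:

```python
import importlib.util

# RapidOCR engine packages published on PyPI
RAPIDOCR_BACKENDS = (
    "rapidocr_onnxruntime",
    "rapidocr_paddle",
    "rapidocr_openvino",
)

def installed_rapidocr_backends() -> list[str]:
    """Return the RapidOCR engine packages importable in this environment."""
    return [
        name
        for name in RAPIDOCR_BACKENDS
        if importlib.util.find_spec(name) is not None
    ]
```

Running this before invoking Docling makes it obvious whether the failure is a missing package or Docling refusing a non-ONNXRuntime engine.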

Voileexperiments avatar Feb 10 '25 04:02 Voileexperiments

Thanks for the insights. At the moment we support only RapidOCR with ONNXRuntime, so, as suggested, it could be best to mention it in the docs.

It looks like RapidOCR recently published a torch-based inference backend, which could be very interesting for us: it wouldn't require ONNXRuntime and would integrate nicely with the other Docling dependencies.

dolfim-ibm avatar Feb 10 '25 07:02 dolfim-ibm

By the way, when trying to run the example code through rapidocr_onnxruntime, it gives this warning message 3 times:

2025-02-10 08:48:05,619 - OrtInferSession - WARNING: CUDAExecutionProvider is not in available providers (['AzureExecutionProvider', 'CPUExecutionProvider']). Use AzureExecutionProvider inference by default.
2025-02-10 08:48:05,619 - OrtInferSession - INFO: !!!Recommend to use rapidocr_paddle for inference on GPU.
2025-02-10 08:48:05,619 - OrtInferSession - INFO: (For reference only) If you want to use GPU acceleration, you must do:
2025-02-10 08:48:05,619 - OrtInferSession - INFO: First, uninstall all onnxruntime pakcages in current environment.
2025-02-10 08:48:05,619 - OrtInferSession - INFO: Second, install onnxruntime-gpu by `pip install onnxruntime-gpu`.
2025-02-10 08:48:05,619 - OrtInferSession - INFO:       Note the onnxruntime-gpu version must match your cuda and cudnn version.
2025-02-10 08:48:05,619 - OrtInferSession - INFO:       You can refer this link: https://onnxruntime.ai/docs/execution-providers/CUDA-EP.html
2025-02-10 08:48:05,619 - OrtInferSession - INFO: Third, ensure CUDAExecutionProvider is in available providers list. e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']
2025-02-10 08:48:05,619 - OrtInferSession - WARNING: DirectML is only supported in Windows OS. The current OS is Linux. Use AzureExecutionProvider inference by default.

This was what caused me to look into rapidocr_paddle in the first place.
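For anyone hitting the same `CUDAExecutionProvider is not in available providers` warning: a minimal sketch to check whether ONNXRuntime can actually see the GPU before running Docling. The helper name is mine; `get_available_providers()` is the standard ONNXRuntime API, and the guard means this returns `False` rather than raising when `onnxruntime` is not installed:

```python
import importlib.util

def cuda_ep_available() -> bool:
    """True only if onnxruntime is importable and lists CUDAExecutionProvider."""
    if importlib.util.find_spec("onnxruntime") is None:
        return False
    import onnxruntime as ort
    return "CUDAExecutionProvider" in ort.get_available_providers()
```

If this returns `False` on a CUDA machine, the fix suggested in the log (uninstall `onnxruntime`, install a matching `onnxruntime-gpu`) is the thing to try first.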

Voileexperiments avatar Feb 10 '25 08:02 Voileexperiments

@Voileexperiments Did you suppress these warnings? And how did you get rapidocr_paddle working? I am getting a ton of these warnings for large PDFs. Can you help me out?
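(For anyone landing here with the same question: the `OrtInferSession` messages appear to come from Python's standard `logging` module, so a minimal way to silence them is to raise that logger's level before converting. The logger name is taken from the output above; this is a sketch, not an officially documented knob:)

```python
import logging

# "OrtInferSession" is the logger name shown in the warning output above;
# ERROR suppresses its INFO and WARNING messages.
logging.getLogger("OrtInferSession").setLevel(logging.ERROR)
```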

vishaldasnewtide avatar Apr 03 '25 14:04 vishaldasnewtide

@dolfim-ibm Does Docling now support rapidocr_paddle? I think docling-serve still defaults to ONNXRuntime.

lockIchikawa avatar May 23 '25 03:05 lockIchikawa

@Voileexperiments Do you have any good solutions to this problem? Using RapidOCR on GPUs is not very user-friendly.

lockIchikawa avatar May 23 '25 03:05 lockIchikawa