Memory leak caused by EasyOCR

Open evelyn023 opened this issue 8 months ago • 5 comments

Bug

Hello. I have been experimenting with Docling for a while and am impressed by its performance. Everything runs well in my local environment. However, when I ran the same code in a container, CPU memory kept increasing until the process ran out of memory and the container was killed. I traced the problem to a memory leak in EasyOCR's reader.readtext function. A similar issue, https://github.com/JaidedAI/EasyOCR/issues/815, was reported in EasyOCR's repo but remains unresolved.

Steps to reproduce

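The code I used is:

# Import paths as of Docling 2.25.x.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    AcceleratorDevice,
    AcceleratorOptions,
    EasyOcrOptions,
    PdfPipelineOptions,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

# Enable OCR and table structure recognition, forcing EasyOCR on every page.
pipeline_options = PdfPipelineOptions()
pipeline_options.do_ocr = True
pipeline_options.do_table_structure = True
pipeline_options.table_structure_options.do_cell_matching = True
pipeline_options.ocr_options = EasyOcrOptions(force_full_page_ocr=True, lang=['es'])
pipeline_options.accelerator_options = AcceleratorOptions(
    num_threads=8, device=AcceleratorDevice.CPU
)
doc_converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
# input_doc_path is the path to the PDF being converted (not shown in this post).
conv_result = doc_converter.convert(input_doc_path)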

My local environment is macOS (M3) with 18 GB of RAM. My container environment is Linux with an 18 GB memory limit.
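
To check whether memory really grows from one conversion to the next, one option is to log the process's peak resident set size after each run. The following is a minimal sketch using only the standard library, not part of the original report. The helper names `peak_rss_mib` and `track_peak_rss` are hypothetical, and the `resource` module is available on Unix-like systems only.

```python
import resource
import sys
from typing import Callable, List


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(task: Callable[[], object], repeats: int) -> List[float]:
    """Call `task` `repeats` times and record the peak RSS after each call."""
    samples: List[float] = []
    for i in range(repeats):
        task()
        samples.append(peak_rss_mib())
        print(f"run {i + 1}: peak RSS {samples[-1]:.1f} MiB")
    return samples
```

With the converter from the snippet above, a call such as `track_peak_rss(lambda: doc_converter.convert(input_doc_path), repeats=10)` would print one value per conversion. Because peak RSS never decreases, a series that keeps climbing on the same input suggests memory is being retained between runs, while a series that flattens after the first run suggests it is not.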

Docling version

2.25.2

Python version

3.12

evelyn023 avatar Apr 09 '25 14:04 evelyn023

Did you try the latest version of docling?

ColeDrain avatar Apr 09 '25 22:04 ColeDrain

@ColeDrain Yes. The problem persists with 2.28.4.

evelyn023 avatar Apr 10 '25 01:04 evelyn023

Hello, I am noticing a similar issue. When processing a lot of documents, memory usage often climbs progressively and sometimes spikes suddenly, killing the container.

carl-krikorian avatar May 27 '25 07:05 carl-krikorian

Same here. macOS 15.5, clean install of the latest version: the EasyOCR engine causes a memory leak on a single 5-page scanned PDF.

XInTheDark avatar Jun 05 '25 11:06 XInTheDark

Same issue here.

acewhitegui avatar Nov 04 '25 09:11 acewhitegui
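
No fix is proposed in this thread. A generic mitigation for leaks that accumulate across documents is to run each conversion in a short-lived child process, so the operating system reclaims all of its memory on exit. The sketch below illustrates that pattern under stated assumptions: `convert_in_fresh_process` and `CHILD_SCRIPT` are hypothetical names, and the child script here only echoes the path. A real child script would build the `DocumentConverter` and convert `sys.argv[1]`.

```python
import subprocess
import sys

# Hypothetical child script. A real one would build a DocumentConverter and
# convert sys.argv[1]; this placeholder only echoes the path it was given.
CHILD_SCRIPT = "import sys; print('converted ' + sys.argv[1])"


def convert_in_fresh_process(
    path: str, script: str = CHILD_SCRIPT, timeout: float = 600.0
) -> str:
    """Run `script` on `path` in a new interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", script, path],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return result.stdout.strip()
```

The trade-off is that every child process has to load the OCR models again, so this approach exchanges throughput for a bounded memory footprint.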