invalid escape sequence

Open: mophilly opened this issue 5 days ago • 1 comment

Bug

`python3.12/site-packages/docling_ibm_models/reading_order/reading_order_rb.py:204: SyntaxWarning: invalid escape sequence ','
  m1 = re.fullmatch(".+([a-z,-])(\s*)", elem.text)

/python3.12/site-packages/docling_ibm_models/reading_order/reading_order_rb.py:205: SyntaxWarning: invalid escape sequence '\s'
  m2 = re.fullmatch("(\s*[a-z])(.+)", sorted_elements[ind_p1].text)`
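For context, here is a minimal sketch of why Python 3.12 reports this and what the usual fix looks like. It is not taken from the docling source: the helper `compile_warning_names` and the `<example>` filename are made up for illustration, and only the line 205 pattern is reused, exactly as it appears in the warning above. A regular string literal containing `\s` is flagged when the source is compiled; the same literal written as a raw string is not.

```python
import re
import warnings


def compile_warning_names(source: str) -> list[str]:
    """Compile a snippet of source text and return the names of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


# Source text with a regular string literal, as on line 205 of the warning output.
PLAIN_SOURCE = 'pattern = "(\\s*[a-z])(.+)"'
# The same literal with an r prefix, so the backslash is no longer treated as an escape.
RAW_SOURCE = 'pattern = r"(\\s*[a-z])(.+)"'

plain_warnings = compile_warning_names(PLAIN_SOURCE)  # SyntaxWarning on Python 3.12
raw_warnings = compile_warning_names(RAW_SOURCE)      # no warnings expected

# At runtime both spellings produce the same regex, which is why processing still works.
M2_PATTERN = r"(\s*[a-z])(.+)"
match = re.fullmatch(M2_PATTERN, "  abc")
```

Because an unrecognized escape such as `\s` is kept as a backslash followed by `s`, the compiled regex is the same either way. That fits the observation that processing still finishes. The warning only flags a literal that a future Python version may reject.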

Steps to reproduce

Set up a project that invokes docling and writes the pages to a log, install docling, then run the test on a small PDF.

Docling version

Docling version: 2.24.0
Docling Core version: 2.20.0
Docling IBM Models version: 3.4.0
Docling Parse version: 3.4.0
Python: cpython-312 (3.12.8)
Platform: macOS-15.3.1-arm64-arm-64bit

Python version

Python 3.12.8

I am just starting with docling. As a first task I created a small test script.

`    try:
        # create a document loader
        document_loader = DocumentLoaderDocling()
    except Exception as e:
        raise DocumentProcessingError(f"document loader assignment failed: {str(e)}")

    try:
        pages = document_loader.load(source_file_path)
    except Exception as e:
        raise DocumentProcessingError(f"document loader .load() failed: {str(e)}")

    page_num = 0
    for page in pages:
        page_num += 1
        self.logger.info(f"page {page_num}: \n{page}\n***************\n")`

The process completed in spite of the warnings.

mophilly, Feb 21 '25 22:02