docling
invalid escape sequence
Bug
`python3.12/site-packages/docling_ibm_models/reading_order/reading_order_rb.py:204: SyntaxWarning: invalid escape sequence '\,'
  m1 = re.fullmatch(".+([a-z\,\-])(\s*)", elem.text)
/python3.12/site-packages/docling_ibm_models/reading_order/reading_order_rb.py:205: SyntaxWarning: invalid escape sequence '\s'
  m2 = re.fullmatch("(\s*[a-z])(.+)", sorted_elements[ind_p1].text)`
Steps to reproduce
1. Set up a project that invokes docling and writes pages to the log
2. Install docling
3. Run a test on a small PDF
Docling version
Docling version: 2.24.0
Docling Core version: 2.20.0
Docling IBM Models version: 3.4.0
Docling Parse version: 3.4.0
Python: cpython-312 (3.12.8)
Platform: macOS-15.3.1-arm64-arm-64bit
Python version
Python 3.12.8
I am just starting with docling. As a first task I created a small test script.
`try:
    # create a document loader
    document_loader = DocumentLoaderDocling()
except Exception as e:
    raise DocumentProcessingError(f"document loader assignment failed: {e}") from e
try:
    pages = document_loader.load(source_file_path)
except Exception as e:
    raise DocumentProcessingError(f"document loader .load() failed: {e}") from e
# log each page with a 1-based page number
for page_num, page in enumerate(pages, start=1):
    self.logger.info(f"page {page_num}: \n{page}\n***************\n")`
The process did complete in spite of the warnings.
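For reference, a minimal sketch of what I assume the fix would look like. Both flagged lines pass regex patterns as ordinary string literals, so Python 3.12 reports the backslash sequences it does not recognize. Writing the patterns as raw strings should avoid that. The names `TRAILING` and `LEADING` and the sample strings are mine, not from docling_ibm_models, and the character class is written as `[a-z,-]`, where neither the comma nor the trailing hyphen needs a backslash.

```python
import re

# Raw strings (r"...") pass backslashes through to the regex engine untouched,
# so the Python compiler has no escape sequence to complain about.
# TRAILING and LEADING are hypothetical names used only in this sketch.
TRAILING = re.compile(r".+([a-z,-])(\s*)")
LEADING = re.compile(r"(\s*[a-z])(.+)")

# Made-up sample text standing in for elem.text and the following element's text.
m1 = TRAILING.fullmatch("hyphen- ")
m2 = LEADING.fullmatch("ated text")

print(m1.group(1), repr(m1.group(2)))
print(m2.group(1), m2.group(2))
```

The patterns themselves are unchanged. Only the string prefix differs, so matching behavior should stay the same.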