OCRmyPDF
[Feature]: sidecar Support Text Output to io.StringIO()
Describe the proposed feature
Using sidecar="page_text.txt" works great and the generated file on disk has the expected text. However, passing an io.StringIO() to sidecar doesn't seem to work and no text is being saved to the buffer:
import io

import ocrmypdf

output_pdf_file_obj = io.BytesIO()
page_text_buffer = io.StringIO()

ocrmypdf.ocr(
    "source_pdf.pdf",
    output_pdf_file_obj,
    sidecar=page_text_buffer,
    pages=1,
    tesseract_pagesegmode=1,
)
>>> page_text_buffer.seek(0)
>>> print(page_text_buffer.read())
''
>>> page_text_buffer.seek(0)
>>> print(page_text_buffer.getvalue())
''
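
Until something like this is supported, one possible workaround is to let OCRmyPDF write the sidecar to a temporary file (the path form that already works above) and then load that file into an io.StringIO(). The sketch below is only an illustration: the helper name ocr_with_sidecar_buffer and its ocr_func parameter are hypothetical and not part of OCRmyPDF, and it assumes the sidecar file is UTF-8 text.

```python
import io
import tempfile
from pathlib import Path


def ocr_with_sidecar_buffer(input_pdf, output_pdf, ocr_func=None, **ocr_kwargs):
    """Run OCR with a temporary sidecar file and return its text as a StringIO.

    Hypothetical helper, not an OCRmyPDF API. `ocr_func` exists only so the
    helper can be exercised without OCRmyPDF; it defaults to ocrmypdf.ocr.
    """
    if ocr_func is None:
        import ocrmypdf  # imported lazily, only when the real OCR is needed

        ocr_func = ocrmypdf.ocr

    with tempfile.TemporaryDirectory() as tmp_dir:
        # Pass a file path, which is the sidecar form reported to work.
        sidecar_path = Path(tmp_dir) / "page_text.txt"
        ocr_func(input_pdf, output_pdf, sidecar=str(sidecar_path), **ocr_kwargs)
        # Assumption: the sidecar is written as UTF-8 text.
        text = sidecar_path.read_text(encoding="utf-8")

    # The temporary directory is gone here; the text lives on in memory.
    return io.StringIO(text)
```

With this, the call above would become page_text_buffer = ocr_with_sidecar_buffer("source_pdf.pdf", output_pdf_file_obj, pages=1, tesseract_pagesegmode=1), and page_text_buffer.getvalue() should then return whatever text ended up in the sidecar file. It still touches the disk briefly, so it does not replace native io.StringIO() support.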