Add OCR'ed text back to image and generate a PDF

Open: ramiromagno opened this issue 3 years ago • 1 comment

It would be great if this package supported embedding the recognized text back into the raster image and writing the result as a searchable PDF.

For example, calling tesseract directly from the command line does this in a single command:

tesseract --dpi 600 --oem 2 input_01.png output_01 pdf
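
To run the same call over several scans, a small wrapper along these lines might do. This is only a sketch: the function name `ocr_to_pdf` is made up for illustration, and the flags are just the ones from the command above.

```shell
# Sketch: run the command above once per image.
# ocr_to_pdf is a hypothetical helper name, not part of tesseract.
ocr_to_pdf() {
  for img in "$@"; do
    # Strip the extension to get the output base (input_01.png -> input_01);
    # tesseract appends ".pdf" to that base itself.
    tesseract --dpi 600 --oem 2 "$img" "${img%.*}" pdf || return 1
  done
}
```

Calling `ocr_to_pdf input_01.png input_02.png` would then be expected to write `input_01.pdf` and `input_02.pdf` next to the inputs.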

ramiromagno, Nov 06 '20 23:11

Second this.

Relatedly, it would be nice if the package could take in a .pdf that contains images and convert them to editable text, returning a new .pdf, similar to what Adobe does.
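
Outside the package, that workflow can be approximated from the command line. The sketch below assumes poppler's `pdftoppm` is installed alongside tesseract; `pdf_to_searchable` is a hypothetical helper name and the 300 dpi setting is an arbitrary choice. It relies on tesseract accepting a text file that lists one image path per line, which should yield one multi-page PDF.

```shell
# Sketch: rasterize a scanned PDF, then OCR every page into one searchable PDF.
# pdf_to_searchable is a hypothetical helper; pdftoppm comes from poppler.
pdf_to_searchable() {
  in_pdf=$1
  out_base=$2                       # tesseract appends ".pdf" to this base
  work=$(mktemp -d) || return 1     # scratch directory for the page images
  # 1. One PNG per page, named <work>/page-<n>.png
  # 2. List the page images, one path per line
  # 3. OCR the listed pages into a single PDF
  pdftoppm -r 300 -png "$in_pdf" "$work/page" &&
    printf '%s\n' "$work"/page-*.png > "$work/pages.txt" &&
    tesseract "$work/pages.txt" "$out_base" pdf
  status=$?
  rm -rf "$work"                    # clean up whether or not OCR succeeded
  return "$status"
}
```

For example, `pdf_to_searchable scan.pdf scan_ocr` would be expected to write `scan_ocr.pdf`. Note that the result is the page images with an invisible text layer (searchable and selectable), which is not quite the same as fully editable text.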

brshallo, Feb 06 '22 02:02