Convert PDF to TIFF

Open vijtad opened this issue 5 years ago • 3 comments

Do you have an example that converts a multi-page PDF into a multi-page TIFF?

vijtad avatar May 02 '20 15:05 vijtad

Your best bet is probably to use a PDF library or tool that's specifically designed for this.

charlesw avatar May 03 '20 09:05 charlesw

You can use a PDFium C# wrapper to render the pages of the PDF and convert them directly to TIFF. I currently use the library below to convert PDF to PNG and then run the OCR: https://github.com/GowenGit/docnet
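
A minimal sketch of that approach, not a tested recipe: it assumes the Docnet.Core NuGet package for rendering and System.Drawing (Windows, or the System.Drawing.Common package) for writing the multi-page TIFF, with C# 8 or later for `using` declarations. The class name `PdfToTiff`, the method names, and the 1080x1920 render bounds are placeholders chosen for illustration, not anything from the library or this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using Docnet.Core;
using Docnet.Core.Models;
using ImgEncoder = System.Drawing.Imaging.Encoder;

public static class PdfToTiff
{
    // Renders each PDF page with Docnet (PDFium) and flattens it onto white.
    public static List<Bitmap> RenderPages(string pdfPath)
    {
        var pages = new List<Bitmap>();
        // The page dimensions bound the rendered size; the values are arbitrary.
        using var docReader = DocLib.Instance.GetDocReader(pdfPath, new PageDimensions(1080, 1920));
        for (int i = 0; i < docReader.GetPageCount(); i++)
        {
            using var pageReader = docReader.GetPageReader(i);
            int width = pageReader.GetPageWidth();
            int height = pageReader.GetPageHeight();
            byte[] bgra = pageReader.GetImage(); // raw BGRA, 4 bytes per pixel

            // Copy the raw pixels into a 32-bit bitmap.
            using var raw = new Bitmap(width, height, PixelFormat.Format32bppArgb);
            BitmapData data = raw.LockBits(new Rectangle(0, 0, width, height),
                                           ImageLockMode.WriteOnly, raw.PixelFormat);
            Marshal.Copy(bgra, 0, data.Scan0, bgra.Length);
            raw.UnlockBits(data);

            // Draw onto a white 24-bit bitmap so transparent areas do not turn black.
            var flat = new Bitmap(width, height, PixelFormat.Format24bppRgb);
            using (var g = Graphics.FromImage(flat))
            {
                g.Clear(Color.White);
                g.DrawImage(raw, 0, 0, width, height);
            }
            pages.Add(flat);
        }
        return pages;
    }

    // Writes the bitmaps as frames of one multi-page TIFF.
    public static void SaveMultiPageTiff(IReadOnlyList<Bitmap> pages, string tiffPath)
    {
        if (pages.Count == 0) throw new ArgumentException("No pages to save.", nameof(pages));

        ImageCodecInfo tiffCodec = null;
        foreach (var codec in ImageCodecInfo.GetImageEncoders())
            if (codec.FormatID == ImageFormat.Tiff.Guid) tiffCodec = codec;
        if (tiffCodec == null) throw new InvalidOperationException("TIFF encoder not available.");

        using var parameters = new EncoderParameters(1);
        Bitmap first = pages[0];

        // The first frame opens the file in multi-frame mode.
        parameters.Param[0] = new EncoderParameter(ImgEncoder.SaveFlag, (long)EncoderValue.MultiFrame);
        first.Save(tiffPath, tiffCodec, parameters);

        // Each remaining page is appended as an additional frame.
        parameters.Param[0] = new EncoderParameter(ImgEncoder.SaveFlag, (long)EncoderValue.FrameDimensionPage);
        for (int i = 1; i < pages.Count; i++)
            first.SaveAdd(pages[i], parameters);

        // Flush closes the multi-frame file.
        parameters.Param[0] = new EncoderParameter(ImgEncoder.SaveFlag, (long)EncoderValue.Flush);
        first.SaveAdd(parameters);
    }
}
```

Calling `PdfToTiff.SaveMultiPageTiff(PdfToTiff.RenderPages("input.pdf"), "output.tif")` would then produce one TIFF with a frame per page, which can be passed to Tesseract. The bitmaps should be disposed once the file is written.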

MohanVijayakumar avatar Jun 06 '20 08:06 MohanVijayakumar

Look at the "OCR PDF in .NET" article. It describes how to OCR a PDF using Docotic.Pdf and Tesseract.

Pure PDF-to-TIFF conversion is not related to Tesseract at all. Tesseract is for optical character recognition only.

shibaev avatar Jul 07 '20 03:07 shibaev