jbarlow

Results: 380 comments by jbarlow

JPEGXL sounds promising but I don't think it's in the PDF 2.0 spec at all, so we're many years from it being in use in PDF. I personally think JPEG...

It seems like a decent idea to me. (It will break down on low quality images, i.e. when you have a poor sense of what text is there at all,...

Oh, thanks. People often don't follow through on PR-requests. :) That's pretty good for your first few days of Python. I don't mind if you ignore ruffus and do the...

As far as I'm concerned this effort should replace the existing `--rotate-pages` provided we can prove it's an improvement in most cases. That could mean it does Tesseract OSD as...

Some scanners save PDFs with color segmentation, meaning the page is broken up into color, gray and mono regions. If you use image processing, all of the segment images have...

Oh, your case was simpler than I was expecting. Your scanner software is not doing color segmentation; that's just one full page color image. I believe what's happening here is...

I believe what's going on was not leptonica. There's a decision that's made about the minimal colorspace needed to represent the images in the file. In the absence of information...

You're welcome. Probably not necessary for this file, but you seem to keep finding interesting files, so if you can get approval to send them to me as they appear, that might help...