[Feature]: Add MRC (mixed raster content) feature as --optimize 4

Open: keneedy opened this issue 9 months ago • 1 comment

Describe the proposed feature

MRC is extremely useful, but I couldn't find a good application that does it on Windows, other than some sketchy websites that process files on their servers. As far as I'm aware, the Internet Archive has published an approach as archive-pdf-tools, but that isn't accessible to Windows users. Would it be possible to add this feature to ocrmypdf so we have access to hyper-compression?

Thank you.

keneedy, Apr 12 '25 21:04

Would be useful, but I'm not likely to have the time to take on a major feature like this.

Several years ago, a developer from the Internet Archive (pretty sure it was Merlijn Wajer) approached me proposing to add the relevant components from archive-pdf-tools, since I believe they were using ocrmypdf + archive-pdf-tools to finalize their work. However, their license (AGPL, I believe) is not compatible with ocrmypdf's, and their management was not willing to change it. As it is, they also depend on PyMuPDF, which is AGPL as well, so quite a few parties would need to change their minds on licensing. Integration of that IP is a dead end.

I suppose ocrmypdf could call archive-pdf-tools from the command line, similar to our use of GPL Ghostscript, but I'm trying to eliminate command line dependencies rather than add them.

jbarlow83, Apr 13 '25 21:04