scantailor-advanced
Export of image files
Please add export of image files for separate processing of black-and-white and color images (for subsequent pasting using the DjVu Imager program). This option is available in Scan Tailor Featured. (Automatic translation).

It's already implemented. In step 6 (Output), choose Mode -> Mixed and tick Splitting -> Split output.

- When exporting in Scan Tailor Featured, black-and-white files are output for all pages, not just for pages that had a gray part. In addition, the file names change to simple numbering of the form "0000".
- I tried to work around this discrepancy as follows: I took the files from the "foreground" folder and used them to replace the main output files. But when pasting pictures, an error is reported about a mismatch between the pixel sizes of the files. Then I left only one image file for pasting (one that was not mentioned in the error message). The paste worked, but not on the right page: the picture from page 17 was inserted on page 10. That is, the target pages get mixed up. I think this is due to the file names (the program only takes into account the first 4 digits of the file name).
- I propose to do as in Scan Tailor Featured:
3.1 When exporting, make the file names a simple numbering of the form "0000" (without adding "1L" or "2R").
3.2 Black-and-white files should be output for all pages (and not just for those that had a gray part).
3.3 It might be more convenient to export all gray pages with a single click in the menu, as in Scan Tailor Featured. Currently this output has to be set on each page, although the setting is easy to apply to all pages.
If you try to rename the files in another program, the following problem appears: how to establish the correspondence between the names in the "background" folder and the overall serial numbering of the files. For example: "0010_2R", "0016_2R", "0021_1L", "0026_1L", etc. How do I assign file numbers in this case?
(Automatic translation).
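One possible way to work out that correspondence is sketched below in Python. It is only an illustration: the function name `build_rename_map` and its parameters are made up for this example, and it assumes that the names in the main output folder sort alphabetically in page order and that each picture file has the same base name as its page. Whether numbering should start at 0 or 1 for DjVu Imager is not stated in this thread, so `start` is left configurable.

```python
from pathlib import PurePath


def build_rename_map(all_pages, picture_pages, start=1, digits=4):
    """Map each picture file name to a plain serial name such as '0003.tif'.

    all_pages:     file names of every page in the main output folder
    picture_pages: file names found in the "background" folder
    start, digits: numbering assumptions (hypothetical defaults)
    """
    # Serial numbers follow the alphabetical order of the page names,
    # so "0001_1L" comes before "0001_2R", then "0002_1L", and so on.
    stems = sorted({PurePath(name).stem for name in all_pages})
    serial = {stem: start + i for i, stem in enumerate(stems)}

    mapping = {}
    for name in picture_pages:
        path = PurePath(name)
        if path.stem not in serial:
            raise KeyError(f"no page in the main output matches {name!r}")
        mapping[name] = f"{serial[path.stem]:0{digits}d}{path.suffix}"
    return mapping


pages = ["0001_1L.tif", "0001_2R.tif", "0002_1L.tif", "0002_2R.tif"]
print(build_rename_map(pages, ["0002_1L.tif"]))
```

The function only computes the new names; the actual renaming (for example with `Path.rename`) is left out so the result can be checked first.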

It's true, STA generates separate files only for the images that use the Mixed mode. And I think this is the right behavior. To get only black-and-white images, you can copy the files in the foreground folder to the out folder, overwriting the mixed images. That way, all the images in out are bitonal and the background folder contains the pictures. (The foreground folder is no longer needed.)
I seem to remember that DjVu Imager has some problems with files that do not have a simple number name. This only happens when you split pages in two (in step 2). There is no simple workaround for this: you either have to manually change all names from XXXX_1L to YYYY to make them correspond, or manually adjust the page # in DjVu Imager.
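The copy step could be scripted roughly like this. It is a sketch, not something STA provides: `overwrite_with_foreground` is a made-up name, and it assumes the foreground folder is a subfolder of the out folder (pass a different `foreground_name` if your layout differs). It overwrites files, so try it on a copy of the output first.

```python
import shutil
from pathlib import Path


def overwrite_with_foreground(out_dir, foreground_name="foreground"):
    """Copy every file from the foreground folder over the same-named
    file in the out folder and return the names that were copied."""
    out_dir = Path(out_dir)
    copied = []
    for src in sorted((out_dir / foreground_name).iterdir()):
        if src.is_file():
            # copy2 replaces an existing file with the same name.
            shutil.copy2(src, out_dir / src.name)
            copied.append(src.name)
    return copied
```

Pages that were not in Mixed mode have no foreground file and are left untouched.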

I agree with you that STA should have an option to change the output file name format to something more regular. I suggest adding a 4-digit number prefix to the current names. For example, OriginalName_1L would become XXXX_OriginalName_1L (and OriginalName_2R would become XXXX_OriginalName_2R). It would keep the best of both worlds.
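To make the suggested format concrete, here is a small Python sketch of what the names would look like. STA has no such option at the time of this thread; `prefixed_names` is a hypothetical helper, and it assumes the current names already sort in page order and that numbering starts at 1.

```python
def prefixed_names(names, start=1, digits=4):
    """Return {current name: name with a zero-padded serial prefix}."""
    ordered = sorted(names)  # assumed to match page order
    return {
        name: f"{start + i:0{digits}d}_{name}"
        for i, name in enumerate(ordered)
    }


print(prefixed_names(["scan_2R.tif", "scan_1L.tif"]))
```

Because the original name is kept after the prefix, a tool that reads only the first four digits and a person looking for the source scan could both find what they need.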
"There is no simple workaround for this"
The split output function itself implies continued use of DjVu Imager. Therefore, it seems to me that STA should produce files that are convenient for it.
I agree with you. I'm hoping 4lex4 or someone else capable of making it happen also agrees :).
PS: I should add that the separate image method can also be applied to pdf files (via QPDF or Adobe Acrobat, among others).
That is, can I convert separate files not to DJVU, but to Adobe?
It can be done, yes. You can use the overlay function of QPDF to "paste" the images over the text-only PDF. But you need to craft a command line script that calls QPDF on every page. Otherwise it would be manual work, just like before.
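As a rough idea of what such a script might look like, the Python sketch below only builds the QPDF command lines, one per picture page, using QPDF's `--overlay` option with `--from` and `--to`. Everything else is an assumption for illustration: the function name, the intermediate file names, and the premise that each picture has already been converted to a single-page PDF with the same page size as the text-only PDF.

```python
def qpdf_overlay_commands(text_pdf, pictures, output_pdf):
    """Build one qpdf call per picture page.

    pictures: {1-based page number: path to a single-page picture PDF}
    Each call reads the result of the previous one, so the calls must
    be run in the returned order.
    """
    commands = []
    current = text_pdf
    items = sorted(pictures.items())
    for i, (page, picture_pdf) in enumerate(items):
        is_last = i == len(items) - 1
        # Hypothetical intermediate names; only the last call writes output_pdf.
        target = output_pdf if is_last else f"{output_pdf}.step{i}.pdf"
        commands.append([
            "qpdf", current,
            "--overlay", picture_pdf, "--from=1", f"--to={page}", "--",
            target,
        ])
        current = target
    return commands


for cmd in qpdf_overlay_commands("text.pdf", {10: "p10.pdf", 17: "p17.pdf"}, "book.pdf"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right; the intermediate files can be deleted afterwards.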
You can PM me if you need help with this.
Where are the private messages? I couldn't find them...
@d4fe said:
That is, can I convert separate files not to DJVU, but to Adobe?
See also:
- https://github.com/ImageProcessing-ElectronicPublications/python-cropper-tk
- https://github.com/ImageProcessing-ElectronicPublications/python-pfbgmrgr
- https://github.com/ImageProcessing-ElectronicPublications/python-pdfwatermark
- https://github.com/m-click/mcpdf
Thanks! I will take a look at them.