paperless-ng
[Other] Merging pdfs in preconsume script
Hi there and thank you for this wonderful piece of software!
I'm right in the middle of scanning all my files and I have run into a shortcoming of my scanner (Brother ADS-1700W). It can either scan all documents in the ADF into a single PDF, or every page into a separate PDF. This means that the front and back sides are stored as separate PDFs as well.
Now I have a whole bunch of documents that are a single sheet printed on both sides, and scanning each of them separately would be a nightmare.
My idea is to scan them to a subdirectory in the consume folder and have a pre-consume script that merges every two pages together.
But I have some questions about that:
- Does the pre-consume script receive the full path or just the file name?
- Can a pre-consume script abort the consumption? When the first of two pages is stored, it should not be consumed until the second page is also available.
- How do I merge two PDFs the paperless way? I'm using the Docker image and would mount the script directly into the container, so I can use the tooling available inside it rather than installing everything necessary on the host.
Thanks again for this tool and I would really appreciate any help to make my experience even more awesome :)
I'm currently working on a Docker container that does the opposite. You add separator pages between the documents and scan them in one go; the script then splits the file into individual PDFs per document and removes any blank pages. Would that be of interest to you?
Hey Marty, I was doing a little research on this topic because I'll be doing the same. Have you completed your script? I would appreciate hearing from you :)
Greetings
Hi @h0d3nt3uf3l, yes, but I did it with a cron job instead of a pre-consume script. It has worked very well without any issues so far. You need Docker for it, though. You can find it here: https://github.com/Marty/paperless-scripts/blob/main/merge-two-siders.sh
Hey Marty, thanks for your reply. I saw it yesterday and it showed me the way. I will script it in Python for my NAS :)
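For anyone who wants to try the Python route, here is a minimal sketch of the pairwise merge run periodically (for example from a cron job). It is not Marty's script and not an official paperless-ng mechanism. It assumes that the scanner writes one single-page PDF per side into a staging directory outside the consume folder, that the file names sort in scan order, and that the `qpdf` command-line tool is available on the PATH. The function names `pair_up`, `merge_pair` and `main` and both directory arguments are made up for illustration.

```python
#!/usr/bin/env python3
"""Merge single-page scans pairwise (front + back) into two-page PDFs."""
import subprocess
import sys
from pathlib import Path


def pair_up(files):
    """Group files into (front, back) pairs after sorting them.

    Assumption: sorting by name reproduces the scan order. A trailing file
    without a partner is returned separately so it can wait for the next run.
    """
    ordered = sorted(files)
    pairs = [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]
    leftover = ordered[-1] if len(ordered) % 2 else None
    return pairs, leftover


def merge_pair(front, back, out_path):
    """Concatenate two PDFs with the qpdf CLI (assumed to be installed)."""
    subprocess.run(
        ["qpdf", "--empty", "--pages", str(front), str(back), "--", str(out_path)],
        check=True,  # raise if qpdf fails, so the source pages are not deleted
    )


def main(staging_dir, consume_dir):
    staging, consume = Path(staging_dir), Path(consume_dir)
    pairs, leftover = pair_up(staging.glob("*.pdf"))
    for front, back in pairs:
        # The merged file takes the name of the front page.
        merge_pair(front, back, consume / front.name)
        front.unlink()
        back.unlink()
    if leftover is not None:
        print(f"waiting for the partner of {leftover.name}")


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

It would be called with the staging directory and the consume directory as its two arguments. Keeping the single pages out of the consume folder until they are merged sidesteps the question of whether a pre-consume script can hold back the first page. One caveat: if a run starts while the scanner is still in the middle of a sheet, the unpaired page simply stays in staging until the next run.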