ocrmypdf-auto
Docker container to automate use of OCRmyPDF to process documents.
Add a prefix/suffix to the filename with a human-readable timestamp (`datetime.now().strftime("%Y-%m-%d %H:%M:%S")`) and/or a unique number before moving the file to the archive folder. In my opinion, all files must stay in the archive folder without being overwritten. My...
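A minimal sketch of the request above, not taken from the project's code: `archive_path` and `move_to_archive` are hypothetical helper names, and the timestamp format swaps the colons and space of the suggested `%Y-%m-%d %H:%M:%S` for `-` and `_`, on the assumption that the archive may live on a filesystem that rejects colons in filenames. A numeric suffix is appended only when the timestamped name is already taken.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional
import shutil


def archive_path(src: Path, archive_dir: Path, now: Optional[datetime] = None) -> Path:
    """Return a destination inside archive_dir that does not exist yet.

    Hypothetical helper: the name is "<stem>_<timestamp><suffix>", with a
    numeric counter appended if that name is already present.
    """
    now = now or datetime.now()
    # Assumption: colons are avoided so the name is valid on more filesystems.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    candidate = archive_dir / f"{src.stem}_{stamp}{src.suffix}"
    counter = 1
    while candidate.exists():
        candidate = archive_dir / f"{src.stem}_{stamp}_{counter}{src.suffix}"
        counter += 1
    return candidate


def move_to_archive(src: Path, archive_dir: Path, now: Optional[datetime] = None) -> Path:
    """Move src into archive_dir under a non-clashing name and return the new path."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_path(src, archive_dir, now)
    shutil.move(str(src), str(dest))
    return dest
```

The existence check and the move are two separate steps, so this sketch is not safe against two workers archiving the same name at the same instant; a parallel setup would need a lock or an exclusive-create step around it.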
Hi, I tested with chi-tra and chi-sim, but it does not work. Could it be related to the `-` vs. `_` problem?
Sorry -- noob question ahead. First of all, thank you for the fantastic project. Unfortunately, I was not able to install it. I got to the part where I run the...
I have a separate drive for input, output, ocrtemp, etc.; however, I am having issues with the OS drive filling up during conversions. It appears that "/var/lib/docker/overlay2/**container ID**/diff/tmp" is actually being used, and the folder looks...
I realize this may be thoroughly outside the intended scope of this project, but it would be wonderful if it would process not just PDF files, but a variety of...
Hello, I use your container for my paperless office, and it works great. The scanner stores the PDF in a folder (Input) on my server which the users don't have...
Removed from unRAID templates since they're not required, but unRAID generates invalid docker command lines without a specified mount for every volume in a template. Should add documentation to the unRAID...
When an output file is moved or deleted quickly after processing completes, especially in parallel processing of many files, `OcrTask` may not yet have been scheduled to sanity check the...
I tried putting the archives into another location outside the container to get them into an archive but this is resulting in ``` 2022-12-18 22:06:53 [ThreadPoolExecutor-0_2] - Error in OcrTask.process:...