Docker Compose : Missig config file: /root/.config/scan-to-paperless.yaml

Open nodecentral opened this issue 3 years ago • 15 comments

Hi

I’ve created the following Docker Compose file,

version: '2'
services:
    scan-to-paperless:
        image: sbrunner/scan-to-paperless
        container_name: scan-to-paperless
        environment:
            - PUID=1005
            - PGID=1000
            - TZ=Europe/London
        volumes:
            - /share/Container/scan-to-paperless/consume:/destination
            - /share/Container/scan-to-paperless/scan:/source
        restart: unless-stopped

and while it installs and loads fine, I’m unable to do anything with it. If I access the command line via docker exec -it scan-to-paperless /bin/bash and run the scan command, it returns the following message:

[~] # docker exec -it scan-to-paperless /bin/bash
root@986e247365d55:/opt# scan
Missig config file: /root/.config/scan-to-paperless.yaml
The scan folder isn't set, use:
    scan --set-settings scan_folder <a_folder>
    This should be shared with the process container in 'source'.
root@986e62632d55:/opt# 

I’ve tried to address this by adding additional environment variables (see below), but that hasn’t fixed it either.

version: '2'
services:
    scan-to-paperless:
        image: sbrunner/scan-to-paperless
        container_name: scan-to-paperless
        environment:
            - PUID=1005
            - PGID=1000
            - TZ=Europe/London
        volumes:
            - /share/Container/scan-to-paperless/consume:/destination
            - /share/Container/scan-to-paperless/scan:/source
        environment:
            SCAN_SOURCE_FOLDER: source
            SCAN_FINAL_FOLDER: destination
        restart: unless-stopped

Any ideas?

nodecentral avatar Oct 22 '22 23:10 nodecentral

The scan command should run on the client part, see: https://github.com/sbrunner/scan-to-paperless#on-the-desktop :-)

sbrunner avatar Oct 24 '22 06:10 sbrunner

Hi @sbrunner, sorry if I’m being a bit stupid here.

As I’m trying to set up scan-to-paperless via Docker on my QNAP NAS, what is the desktop in this setup?

nodecentral avatar Oct 24 '22 08:10 nodecentral

The server part is where the documents are processed. The client part is the host to which the scanner is connected :-)

sbrunner avatar Oct 24 '22 09:10 sbrunner

Ahh, I see, thanks -

In my setup, I have a network scanner, so everything goes to an FTP share (no client in that sense).

Is there scope in the future for an option to process uploaded files in a network share?

nodecentral avatar Oct 24 '22 09:10 nodecentral

If I understand correctly, with your scanner you push one button, and the result is a PDF in a folder on your NAS?

sbrunner avatar Oct 24 '22 09:10 sbrunner

Correct, I can set it up to send to either an FTP share or a cloud storage service. I’m currently sending everything to a folder for me to review/work on (e.g. split/combine a PDF) before I move it to the Paperless consume folder.

nodecentral avatar Oct 24 '22 19:10 nodecentral

Can you attach a generated PDF to this issue, so that I can do a real test?

sbrunner avatar Oct 25 '22 06:10 sbrunner

Sure, what type of PDF do you want? Although, to test, you can use anything from the internet. The idea is that you can do the following before Paperless-ngx does its bit...

  1. define the split on a multi-page PDF to create separate PDF files
  2. rotate a PDF of a document scanned in upside down
  3. merge multiple PDFs into one single PDF

nodecentral avatar Oct 25 '22 19:10 nodecentral

1 and 3 are not currently implemented; 2 is. I'd just like to have a two-page document: I like to work on real examples, and to have a document from a scanner other than mine :-)

sbrunner avatar Oct 26 '22 09:10 sbrunner

Hi @sbrunner, here you go: it’s a PDF of a 2-page form I found, conveniently scanned in upside down (which was a genuine mistake). Hopefully this helps: BRW008092DC5E53_233662.pdf

nodecentral avatar Oct 27 '22 19:10 nodecentral

Excellent, thanks :-)

sbrunner avatar Oct 28 '22 05:10 sbrunner

No problem..

@sbrunner - I have a dream ! -> https://github.com/paperless-ngx/paperless-ngx/discussions/1848#discussioncomment-3984883

Anything you can do to help make it a reality, would be amazing :-)

nodecentral avatar Oct 28 '22 19:10 nodecentral

Ahh, I see, thanks -

In my setup, I have a network scanner, so everything goes to an FTP share (no client in that sense).

Is there scope in the future for an option to process uploaded files in a network share?

For this I use https://github.com/ocrmypdf/OCRmyPDF

ocrmypdf:
#    user: 0:0
    restart: always
    container_name: Ocrmypdf-x
    image: jbarlow83/ocrmypdf:latest
#    ports:
#       - 5000:5000    
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /share/scanned_originals/OCR-pickup:/input
      - /share/scanned_originals/OCR-done:/output
      - /share/docker-volumes/Home-tools/ocrmypdf/tessdata:/usr/local/share/tessdata
      - /share/docker-volumes/Home-tools/ocrmypdf/config:/usr/local/share/config
#      - /share/docker-volumes/Home-tools/ocrmypdf/tessdata:/usr/share/tesseract-ocr/4.00/tessdata
    environment:
      - TZ=Europe/Amsterdam
      - OCR_ON_SUCCESS_DELETE=1
      - OCR_DESKEW=1
      - OCR_FORCE-OCR=1
      - LANGUAGE=nld+eng
#      - OCR_JSON_SETTINGS='{"rotate_pages": true, "skip_text": true, "language": "nld+eng", "output_type":"pdf"}'
      - PYTHONUNBUFFERED=1
#      - TESSDATA_PREFIX=/usr/local/share/tessdata
#      - TESSDATA_PREFIX=/share/Container-volumes/Home-tools/ocrmypdf/tessdata/
    entrypoint: python3
    command: >- 
      watcher.py

But I'm still fiddling to find the best settings...

Maximus48p avatar Nov 09 '22 22:11 Maximus48p

Hi @Maximus48p thanks so much for sharing that link and your docker compose, I’ll take a look later.

Do you tie this ocrmypdf into your Paperless-ngx workflow?

nodecentral avatar Nov 10 '22 11:11 nodecentral

You probably just need to make the /output of ocrmypdf the same folder as the /consume of paperless-ngx :-)

sbrunner avatar Nov 11 '22 13:11 sbrunner
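
The shared-folder idea from the last comment could look roughly like the Compose fragment below. This is only a sketch under assumptions: the host path /share/Container/paperless/consume is hypothetical, the paperless service shows nothing but the consume volume (its broker, database and port settings are left out), and /usr/src/paperless/consume is assumed to be the consume directory inside the paperless-ngx container, as in its stock Compose files. The ocrmypdf image, entrypoint, command and the /input and /output mount points are taken from the Compose file posted earlier in the thread.

```yaml
services:
  ocrmypdf:
    image: jbarlow83/ocrmypdf:latest
    entrypoint: python3
    command: watcher.py
    volumes:
      # Folder the network scanner uploads to (path taken from the thread).
      - /share/scanned_originals/OCR-pickup:/input
      # Hypothetical host folder: OCRed PDFs are written here.
      - /share/Container/paperless/consume:/output

  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    volumes:
      # Same hypothetical host folder, mounted as the consume directory,
      # so every file ocrmypdf writes is picked up by Paperless-ngx.
      - /share/Container/paperless/consume:/usr/src/paperless/consume
    # Broker, database, ports and other required settings omitted.
```

The only point of the sketch is that both containers mount the same host folder, one as its output and the other as its consume directory.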