import-paperless script fails due to mixed API paths (/api vs /app/api) and async processing in Docspell 0.43
Hi Docspell team,
I'm migrating from Paperless using tools/import-paperless.sh (originally written for Docspell 0.3 beta) on a UGREEN NAS (Docker Compose) and encountered several issues with the current import script:
- Mixed API base paths
  - Open auth endpoint works at root: POST http://localhost:7880/api/v1/open/auth/login → 200 OK
  - All secured endpoints require the /app prefix: http://localhost:7880/app/api/v1/sec/...
  - Using /api/v1/sec/... (without /app) → 404 Not found
  - Conclusion: Login uses /api/v1/open/auth/login; all secured endpoints must use /app/api/v1/sec/...
- Asynchronous document processing not handled
  - Upload endpoint /app/api/v1/sec/upload/item returns success immediately, but documents are queued for processing
  - The script attempts to set metadata immediately after upload, but documents don't have IDs yet
  - Processing can take 15-30+ minutes per document (OCR, NLP analysis, etc.)
  - Solution: Implemented two-pass approach:
    - Pass 1 (mode=upload): Upload all documents quickly
    - Wait for processing queue to complete
    - Pass 2 (mode=metadata): Apply all metadata using /checkfile/{checksum} to get document IDs
- Upload payload format
  - The script was using file=@path, but the web UI uses file[]=@path with specific metadata JSON
  - Required metadata structure:
    {"multiple":true,"flattenArchives":false,"direction":"incoming","folder":null,"skipDuplicates":true,"tags":null,"fileFilter":null,"language":null,"attachmentsOnly":null,"customData":null}
- Shell quoting issues
  - Original script had nested quote problems and missing fields (e.g., "use":"correspondent" for organizations)
  - Fixed by using jq -n to build JSON payloads safely
Environment
- Platform: UGREEN NAS (Docker Compose)
- Docspell version: 0.43.0
- UI base: http://localhost:7880/app/dashboard
- Auth endpoint: /api/v1/open/auth/login
- Secured endpoints: /app/api/v1/sec/...
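Since the two prefixes differ on this setup, one way to keep the script portable is to make them configurable instead of hard-coding them. The following is only a sketch: the variable names BASE_URL, OPEN_PREFIX and SEC_PREFIX and the helpers open_url and sec_url are hypothetical and not part of the original script. The defaults mirror the paths observed in this report.

```shell
#!/bin/sh
# Hypothetical settings; the defaults are the paths observed in this report.
# Override SEC_PREFIX (e.g. SEC_PREFIX=/api/v1/sec) if your installation
# serves the secured API without the /app prefix.
BASE_URL="${BASE_URL:-http://localhost:7880}"
OPEN_PREFIX="${OPEN_PREFIX:-/api/v1/open}"
SEC_PREFIX="${SEC_PREFIX:-/app/api/v1/sec}"

# Build the full URL for an open (unauthenticated) endpoint.
open_url() { printf '%s%s/%s\n' "$BASE_URL" "$OPEN_PREFIX" "$1"; }

# Build the full URL for a secured endpoint.
sec_url() { printf '%s%s/%s\n' "$BASE_URL" "$SEC_PREFIX" "$1"; }
```

With these helpers, a call such as curl ... "$(sec_url upload/item)" picks up whichever prefix is configured.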
Working solution summary
Authentication:
payload=$(printf '{"account":"%s","password":"%s"}' "$user" "$password")
curl -s -X POST -H 'Content-Type: application/json' \
-d "$payload" "http://localhost:7880/api/v1/open/auth/login"
Organization create (with required "use" field):
payload=$(jq -n --arg name "$org_name" \
'{id: "", name: $name, address: {street: "", zip: "", city: "", country: ""}, contacts: [], notes: null,
created: 0, shortName: null, use: "correspondent"}')
curl -s -X POST -H "X-Docspell-Auth: $token" \
-H 'Content-Type: application/json' \
-d "$payload" "http://localhost:7880/app/api/v1/sec/organization"
Document upload (matching web UI format):
meta_json='{"multiple":true,"flattenArchives":false,"direction":"incoming","folder":null,"skipDuplicates":true,"tags":null,"fileFilter":null,"language":null,"attachmentsOnly":null,"customData":null}'
curl -s -X POST -H "X-Docspell-Auth: $token" \
-F "meta=$meta_json" \
-F "file[]=@$filepath" \
"http://localhost:7880/app/api/v1/sec/upload/item"
Check processing status:
curl -s -H "X-Docspell-Auth: $token" \
"http://localhost:7880/app/api/v1/sec/checkfile/$checksum"
Returns: {"exists":false} while processing, {"exists":true,"items":[{"id":"..."}]} when done.
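The wait between the two passes can be automated by polling this endpoint until it reports that the file exists. Below is a minimal sketch, not the attached script: the function names check_file and wait_for_item, the retry defaults, and the BASE_URL and SEC_PREFIX variables are all assumptions, and $token is expected to come from the login step.

```shell
#!/bin/sh
# Hypothetical settings; the defaults are the paths observed in this report.
BASE_URL="${BASE_URL:-http://localhost:7880}"
SEC_PREFIX="${SEC_PREFIX:-/app/api/v1/sec}"

# Fetch the checkfile response for one checksum.
check_file() {
  curl -s -H "X-Docspell-Auth: $token" "$BASE_URL$SEC_PREFIX/checkfile/$1"
}

# Poll until the response contains "exists": true, or give up.
# Usage: wait_for_item CHECKSUM [MAX_TRIES] [SLEEP_SECONDS]
# Returns 0 once the file is known, 1 if all tries are used up.
wait_for_item() {
  checksum=$1
  tries=${2:-60}
  delay=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if check_file "$checksum" | grep -Eq '"exists"[[:space:]]*:[[:space:]]*true'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A possible call, assuming the checksum is the file's SHA-256 hash: wait_for_item "$(sha256sum "$filepath" | cut -d' ' -f1)" 120 30. Given the processing times reported above, generous retry limits are probably needed.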
Results
I successfully imported 51 files end‑to‑end using the two‑pass approach (upload first, then metadata after processing) with the following commands:
Pass 1: upload (fast, queues processing)
./import-paperless.sh \
http://localhost:7880 \
your_user \
'YOUR_PASSWORD' \
/home/user/paperless/data/db.sqlite3 \
/home/user/paperless/media/documents/originals \
upload
Pass 2: metadata (after processing finishes)
./import-paperless.sh \
http://localhost:7880 \
your_user \
'YOUR_PASSWORD' \
/home/user/paperless/data/db.sqlite3 \
/home/user/paperless/media/documents/originals \
metadata
Note: This run was orchestrated with Claude Code; the core changes were aligning paths (auth at /api/v1/open/auth/login, secured endpoints under /app/api/v1/sec/…), switching JSON payloads to printf/jq -n, fixing upload format to file[] with proper metadata JSON, and deferring metadata until items had IDs via /checkfile/{checksum}. Also, I did disable my 2FA in the process, to make things simpler.
Request
- Please clarify if the /api vs /app/api split is intended behavior when the UI is served from /app
- Consider updating import-paperless.sh to:
  a. Support two-pass mode (upload, then metadata after processing)
  b. Use correct API paths consistently
  c. Include required fields like "use":"correspondent" for organizations
  d. Use proper upload format (file[] with metadata JSON)
  e. Build JSON via jq or printf to avoid quoting issues
  f. Handle missing bc command (use $((seconds * 1000)) instead)
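For the last point, the conversion from seconds to milliseconds can be done with plain shell arithmetic, so bc is not needed at all. A small sketch; the function name to_millis is hypothetical:

```shell
#!/bin/sh
# Convert a Unix timestamp in seconds to milliseconds using
# POSIX shell arithmetic instead of bc.
to_millis() {
  echo $(( $1 * 1000 ))
}

to_millis 1700000000   # prints 1700000000000
```

This relies on the shell using 64-bit integers for arithmetic, which holds for bash and dash on current 64-bit systems.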
Attached is the modified version of totti4ever's original import-paperless.sh that successfully imported the documents with full metadata from Paperless-ngx v2.18.4.
Thanks for the excellent project!
Hi @voyager Thank you for sharing your work! The script was 5 years old so this was probably some effort! 💪🏼 I myself don't run paperless and won't have time to look into it. If you want, you can make a PR (just paste your text into the readme so people have some info how to use it). The tools section is really a best-effort thing and I don't expect these scripts to work without taking a look into them in general. They are great if they just work :-) and also when being a starting point for the next person.
Regarding the API paths: This is new to me. There is no separation of these paths. The /app path is always only for the UI and all API calls are behind /api. This is how UI and API are separated. I just checked my installation, and the browser's network tab shows all API requests going to /api. Whenever I use /app/api/* (or /app/* in general), I get the HTML page; I never get an API response. So I'm not sure why you observed this. Really strange to me!