[Bug]: Auto Split PDF page dividers not recognised

Open canders1212 opened this issue 1 month ago • 5 comments

Installation Method

Docker

The Problem

When using the Auto Split PDF function, and uploading a scanned document containing the QR code divider, the divider is not recognised.

Scan is 300 DPI (attached).

If I instead add the divider 'electronically' using the merge, organise, and export functions, it works fine, which suggests the problem is specific to the scanned QR code divider. I have also tried the scan on the newly released V2 website, and it splits fine there.

brn94ddf85af93d_000711.pdf

Version of Stirling-PDF

1.5.0

Last Working Version of Stirling-PDF

No response

Page Where the Problem Occurred

No response

Docker Configuration

version: '3.3'
services:
  stirling-pdf:
    image: docker.stirlingpdf.com/stirlingtools/stirling-pdf:latest
    ports:
      - '7890:8080'
    volumes:
      - ./StirlingPDF/trainingData:/usr/share/tessdata # Required for extra OCR languages
      - ./StirlingPDF/extraConfigs:/configs
      - ./StirlingPDF/customFiles:/customFiles/
      - ./StirlingPDF/logs:/logs/
      - ./StirlingPDF/pipeline:/pipeline/
    environment:
      - DISABLE_ADDITIONAL_FEATURES=false
      - LANGS=en_GB

Relevant Log Output

info.log shows the following when the Auto Split tool is trying to digest the scan:

2025-11-03 23:03:30,880 WARN o.a.p.p.g.c.PDICCBased [qtp1783192968-42] ICC profile is Perceptual, ignoring, treating as Display class

Additional Information

No response

Browsers Affected

Chrome

No Duplicate of the Issue

  • [x] I have verified that there are no existing issues raised related to my problem.

canders1212 avatar Nov 03 '25 23:11 canders1212

Hi @canders1212! I'm Dosu and I’m helping the Stirling-PDF team.

The issue with scanned QR code dividers not being recognized in Stirling-PDF v1.5.0 is likely due to limitations in how that version processes scanned images—especially around image preprocessing, ICC profiles, and QR code detection. Electronically added dividers are cleaner and easier for the system to detect, while scanned dividers can be affected by scan quality, DPI, and color space handling. The ICC profile warning in your logs suggests the image rendering pipeline may not be optimally handling your scan, which can impact detection.

Recent updates in the newly released V2 (see PR #4446) introduced improved support for QR code divider sheets, duplex mode, and better image handling, which is why your scans work fine there. If possible, upgrading to V2 is the most reliable solution.

If you need to use v1.5.0, try rescanning with different settings (e.g., grayscale, higher contrast, or different DPI), and ensure your Docker image includes all necessary dependencies like OpenCV for image processing. Also, use the official divider sheet provided by Stirling-PDF (download link) for best results.

dosubot[bot] avatar Nov 03 '25 23:11 dosubot[bot]

Really odd that the V2 website split it fine.

To my knowledge, it should be running the same backend code as v1.5 for this feature.

Frooodle avatar Nov 03 '25 23:11 Frooodle

I've just checked again and the v2 website handles it no problem.

I think we can rule out some kind of install issue on my end, given that I'm using the stock docker-compose (other than the port), and it works fine if I insert the divider page electronically rather than by scan...?

canders1212 avatar Nov 03 '25 23:11 canders1212

I just updated my docker container and auto split stopped working. Using the files from here: https://github.com/Stirling-Tools/Stirling-PDF/issues/2281#issuecomment-2644132277

Pinning the image to 1.3.2-fat works; 1.4.0 and 1.5.0/latest don't.

edit: the pipeline split stopped working; I haven't tested the web interface
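
For anyone who wants to try the same workaround, here is a minimal sketch of the pin. It assumes the registry path from the compose file in the original report; only the tag changes. Whether the 1.3.2-fat tag is published on that particular registry has not been confirmed in this thread, so adjust the path to whichever registry you normally pull from.

```yaml
services:
  stirling-pdf:
    # Assumption: same registry path as the compose file in the original report.
    # Only the tag changes, from "latest" to the version reported as working.
    image: docker.stirlingpdf.com/stirlingtools/stirling-pdf:1.3.2-fat
    ports:
      - '7890:8080'
```

After changing the tag, the container has to be recreated (for example by pulling the image and bringing the service up again) for the pin to take effect.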

benj919 avatar Nov 04 '25 22:11 benj919

Ahhh, that explains it. Our V2 is based on an older version at the moment; we are in the process of updating V2 to pull the latest OSS.

Must be a bug introduced recently. Let me investigate and resolve this, thanks!

Frooodle avatar Nov 04 '25 23:11 Frooodle