Thread race condition when writing into log file
Describe the bug We use 2 threads if needed in cupsd, and in rare case and if logging is set into a file, the output gets combined, which might corrupt the file for other tools, like grep.
We can sometimes see it in current stable release 2.4.16 (in upstream project CI), but atm I have logs from 2.4.10 from Centos Stream 10 - the log file corruption causes failure of CUPS test suite at point where we look for expected number of warnings. If the file is corrupted (by NULL byte - I see it Vim as ^@), grep does not scan the whole file and prints out only matches before the NULL byte, which causes mismatch in number of warnings.
I see it in the logs (here is the log which contains the NULL byte):
9039 D [05/Dec/2025:11:50:50.263433 +0000] Test3: Generating PPD file from "ipp://localhost:8631/printers/Test2"...
9040 D [05/Dec/2025:11:50:50.263494 +0000] [Notifier] event-notification-attributes-tag:
9041 D [05/Dec/2025:11:50:50.263501 +0000] [Notifier]
9042 D [05/Dec/2025:11:50:50.263506 +0000] [Notifier] notify-charset (charset) utf-8
9043 D [05/Dec/2025:11:50:50.263510 +0000] [Notifier] notify-natural-language (naturalLanguage) en-us
9044 D [05/Dec/2025:11:50:50.263515 +0000] [Notifier] notify-subscription-id (integer) 4
9045 D [05/Dec/2025:11:50:50.263519 +0000] [Notifier] notify-sequence-number (integer) 14
9046 D [05/Dec/2025:11:50:50.263524 +0000] [Notifier] notify-subscribed-event (keyword) printer-state-changed
9047 D [05/Dec/2025:11:50:50.263528 +0000] [Notifier] printer-up-time (integer) 1764935450
9048 D [05/Dec/2025:11:50:50.263532 +0000] [Notifier] notify-text (textWithoutLanguage) Now accepting jobs.
9049 D [05/Dec/2025:11:50:50.263536 +0000] [Notifier] notify-printer-uri (uri) ipp://491a4fd8-fbb9-4631-972a-dd0a5c29331b/printers/Test3
9050 D [05/Dec/2025:11:50:50.263544 +0000] [Notifier] printer-name (nameWithoutLanguage) Test3
9051 D [05/Dec/2025:11:50:50.263548 +0000] [Notifier] printer-state (enum) stopped
9052 D [05/Dec/2025:11:50:50.263552 +0000] [Notifier] printer-state-reasons (keyword) paused
9053 D [05/Dec/2025:11:50:50.263556 +0000] [Notifier] printer-is-accepting-jobs (boolean) true
9054 D [05/Dec/2025:11:50:50.263560 +0000] Test3: Connected to localhost:8631, sending Get-Printer-Attributes request...^@/printers/Test3
9055 D [05/Dec/2025:11:50:50.263568 +0000] [Notifier]
9056 D [05/Dec/2025:11:50:50.263577 +0000] [Notifier] notify-charset (charset) utf-8
9057 D [05/Dec/2025:11:50:50.263581 +0000] [Notifier] notify-natural-language (naturalLanguage) en-us
9058 D [05/Dec/2025:11:50:50.263586 +0000] [Notifier] notify-subscription-id (integer) 4
9059 D [05/Dec/2025:11:50:50.263594 +0000] [Notifier] notify-sequence-number (integer) 15
9060 D [05/Dec/2025:11:50:50.263598 +0000] [Notifier] notify-subscribed-event (keyword) printer-state-changed
and a case when the mixing of logs happens as well, but there is no NULL byte to corrupt the log:
D [05/Dec/2025:11:19:34.144323 +0000] Test3: Generating PPD file from "ipp://localhost:8631/printers/Test2"...
D [05/Dec/2025:11:19:34.144386 +0000] [Notifier] event-notification-attributes-tag:
D [05/Dec/2025:11:19:34.144392 +0000] [Notifier]
D [05/Dec/2025:11:19:34.144397 +0000] [Notifier] notify-charset (charset) utf-8
D [05/Dec/2025:11:19:34.144402 +0000] [Notifier] notify-natural-language (naturalLanguage) en-us
D [05/Dec/2025:11:19:34.144406 +0000] [Notifier] notify-subscription-id (integer) 4
D [05/Dec/2025:11:19:34.144409 +0000] [Notifier] notify-sequence-number (integer) 14
D [05/Dec/2025:11:19:34.144413 +0000] [Notifier] notify-subscribed-event (keyword) printer-state-changed
D [05/Dec/2025:11:19:34.144417 +0000] [Notifier] printer-up-time (integer) 1764933574
D [05/Dec/2025:11:19:34.144421 +0000] [Notifier] notify-text (textWithoutLanguage) Now accepting jobs.
D [05/Dec/2025:11:19:34.144425 +0000] [Notifier] notify-printer-uri (uri) ipp://c4b38e94-e69d-4938-8190-dae9089c5e0c/printers/Test3
D [05/Dec/2025:11:19:34.144429 +0000] [Notifier] printer-name (nameWithoutLanguage) Test3
D [05/Dec/2025:11:19:34.144434 +0000] Test3: Connected to tate (enum) stopped
D [05/Dec/2025:11:19:34.144438 +0000] [Notifier] printer-state-reasons (keyword) paused
D [05/Dec/2025:11:19:34.144455 +0000] [Notifier] printer-state-reasons (keyword) paused
D [05/Dec/2025:11:19:34.144461 +0000] [Notifier] printer-is-accepting-jobs (boolean) true
It looks like it always happens during running the thread for creating PPD and notifier, so I guess it is threading problem.
To Reproduce Steps to reproduce the behavior:
- run CUPS test suite by 'make test'
- sometimes the test suite fails due incorrect number of expected warnings because grep is not able to read whole log file
Test suite log looks like this:
PASS: 33 error messages.
grep: /tmp/cups-cupstester/log/error_log: binary file matches
FAIL: 4 warning messages, expected 14.
W [05/Dec/2025:11:50:40.404924 +0000] No JobPrivateAccess defined in policy default - using defaults.
W [05/Dec/2025:11:50:40.404948 +0000] No JobPrivateValues defined in policy default - using defaults.
W [05/Dec/2025:11:50:40.404952 +0000] No SubscriptionPrivateAccess defined in policy default - using defaults.
W [05/Dec/2025:11:50:40.404954 +0000] No SubscriptionPrivateValues defined in policy default - using defaults.
grep: /tmp/cups-cupstester/log/error_log: binary file matches
Expected behavior Passing test suite, not mixed logs
System Information:
- logs are from Centos Stream 10 with 2.4.10, but seen in upstream project as well sometimes
Strange, since we have a mutex controlling writes to the error_log file...
Hah, I knew it was not my imagination and I saw this symptom even in Github CI - so f.e. CI results from my last PR - https://github.com/OpenPrinting/cups/actions/runs/20136013588/job/57789439542?pr=1454 :
grep: /tmp/cups-runner/log/error_log: binary file matches
FAIL: 4 warning messages, expected 14.
grep: /tmp/cups-runner/log/error_log: binary file matches
W [11/Dec/2025:14:16:02.638950 +0000] No JobPrivateAccess defined in policy default - using defaults.
W [11/Dec/2025:14:16:02.638953 +0000] No JobPrivateValues defined in policy default - using defaults.
W [11/Dec/2025:14:16:02.638956 +0000] No SubscriptionPrivateAccess defined in policy default - using defaults.
W [11/Dec/2025:14:16:02.638958 +0000] No SubscriptionPrivateValues defined in policy default - using defaults.
grep: /tmp/cups-runner/log/error_log: binary file matches
PASS: 0 notice messages.
grep: /tmp/cups-runner/log/error_log: binary file matches
PASS: 127 info messages.
grep: /tmp/cups-runner/log/error_log: binary file matches
PASS: 1961 debug messages.
grep: /tmp/cups-runner/log/error_log: binary file matches
PASS: 6063 debug2 messages.
grep: /tmp/cups-runner/log/error_log: binary file matches
I'll print out the file in the CI runner (I don't know a way how to access the files from CI run...) and see if file contains a NULL character somewhere.
Like we could work it around in tests by using grep -a so the binary files would be taken as a text and the test would have passed, but it really seems there is some threading issue.
Neat! You already set the Github CI to upload test artifacts - so the NULL byte is here in the file:
9135 d [11/Dec/2025:14:16:13.224470 +0000] cupsdStartProcess: Setting file actions.
9136 D [11/Dec/2025:14:16:13.224472 +0000] cupsdStartProcess: Setting file actions.
9137 d [11/Dec/2025:14:16:13.224474 +0000] [Notifier] printer-is-accepting-jobs (boolean) true^@otifier/testnotify"
9138 D [11/Dec/2025:14:16:13.224483 +0000] [Notifier] printer-is-accepting-jobs (boolean) true
9139 d [11/Dec/2025:14:16:13.224485 +0000] cupsdDeleteTemporaryPrinters: Removing unused temporary printers
again, related with Notifier messaging.