briefcase Improve log streaming for macOS and iOS test suites

What is the problem or limitation you are having?

iOS and macOS both use log streaming to surface the output of test suites. However, this approach is unreliable in two ways: the log streamer can (and does) drop some log messages, although it reports when it has done so; and the log streaming process takes some time to start, so output from the start of the test run can be lost.

Describe the solution you'd like

Fundamentally, macOS and iOS should be able to reliably obtain test output, rather than relying on the log streamer not dropping output.

One possible approach: this problem only exists because log streaming is unreliable - calling log stream --predicate ... involves filtering all logs at runtime; if the system is busy, this can overload the ability to live stream, and the log streamer prioritises currency over completeness.

It is possible to obtain a guaranteed complete log after the test suite has executed by invoking log show --start <date> --predicate ... - the same predicate, but with a specific start time.
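As a rough sketch of what that call might look like from Python: the helper names below are hypothetical (they are not existing Briefcase functions), the predicate is a placeholder for whatever predicate the log streamer is already using, and the --style compact choice is an assumption rather than a requirement.

```python
import subprocess
from datetime import datetime


def build_log_show_command(start: datetime, predicate: str) -> list[str]:
    """Build a `log show` call that replays matching logs from `start` onwards.

    Hypothetical helper; `predicate` should be the same predicate that was
    passed to `log stream`.
    """
    return [
        "log",
        "show",
        "--start",
        # `log show` accepts timestamps in "YYYY-MM-DD HH:MM:SS" form.
        start.strftime("%Y-%m-%d %H:%M:%S"),
        "--predicate",
        predicate,
        "--style",
        "compact",
    ]


def fetch_complete_log(start: datetime, predicate: str) -> str:
    """Run `log show` (macOS only) and return everything it prints."""
    result = subprocess.run(
        build_log_show_command(start, predicate),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The start time would need to be recorded just before the test app is launched, so that the replay covers the startup window that the streamer can miss.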

If briefcase test fails, briefcase could call log show --start to get a guaranteed complete log (ensuring that all test failures are visible to the end user). If briefcase test fails to detect an exit status, the content of log show --start can be used to detect the exit status.
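A minimal sketch of that fallback logic, under some assumptions: the exit marker format below is illustrative and may not match what Briefcase actually looks for, and fetch_full_log stands in for whatever would run log show --start. None of these names are existing Briefcase APIs.

```python
import re
from typing import Callable, Optional, Tuple

# Assumed marker format, for illustration only.
EXIT_MARKER = re.compile(r">>>>>>>>>> EXIT (?P<returncode>-?\d+) <<<<<<<<<<")


def find_exit_status(log_text: str) -> Optional[int]:
    """Return the last exit status found in the log, or None if there isn't one."""
    matches = EXIT_MARKER.findall(log_text)
    return int(matches[-1]) if matches else None


def resolve_test_outcome(
    streamed_log: str,
    fetch_full_log: Callable[[], str],
) -> Tuple[Optional[int], Optional[str]]:
    """Decide the test outcome, replaying the complete log only when needed.

    Returns (exit status, complete log). The complete log is None when the
    streamed output already reported success, so nothing is displayed twice.
    """
    status = find_exit_status(streamed_log)
    if status == 0:
        return 0, None

    # Failure, or no exit status seen: the stream may have dropped lines,
    # so fetch the complete log and trust it over the streamed copy.
    full_log = fetch_full_log()
    full_status = find_exit_status(full_log)
    return (full_status if full_status is not None else status), full_log
```

The complete log is only fetched on the failing or uncertain path, so a passing run costs nothing extra.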

This does mean that in the case of a failure, the test suite output would be displayed twice... but I'd argue it's better to guarantee that you have a complete test log than to fail or obscure a test failure because the streamer couldn't keep up.

This is only needed on test suite runs, when the test suite fails (or has an uncertain outcome). Dropping log streams is less critical during normal app execution; and if we know a test suite passes, there's not really anything to diagnose in the logs because all the output will tell us is "everything worked".

Describe alternatives you've considered

Status quo, including hacks that repeat log output with pauses to prevent missing log lines.
Rework how test output is collected so that Apple's log streaming isn't used at all. Something like the debugging hooks being discussed as part of #2147 might work for this, if it were used to inject a debugging hook that installed a stdout/stderr handler, and directed output over a socket to a Briefcase server that is waiting to receive that output. This would also allow the introduction of a character-based, rather than a line-based, console interface. This is an invasive approach, but might work.
Rework how tests are executed at all. This has been discussed in the context of adding a test suite for the web backend - making the app test suite something that runs locally, using a "remote control" mechanism to run the tests in the app container. That way there's no streaming issue - but it does mean that true "run my test suite as the app" testing (such as is required by Python itself, and the Python-support-testbed) isn't possible.
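To make the socket-based alternative a little more concrete, here is a toy sketch of the idea using only the standard library. It is not the mechanism discussed in #2147; the class and function names are invented for illustration, and both ends run in a single process here purely to keep the example self-contained. In practice the writer would be installed inside the app and the receiver would live in Briefcase.

```python
import io
import socket
import sys
import threading


class SocketWriter(io.TextIOBase):
    """File-like object that forwards every write to a connected socket."""

    def __init__(self, sock: socket.socket):
        self._sock = sock

    def writable(self) -> bool:
        return True

    def write(self, text: str) -> int:
        self._sock.sendall(text.encode("utf-8"))
        return len(text)


def receive_all(server_sock: socket.socket, chunks: list) -> None:
    """Accept one connection and collect everything sent until it closes."""
    conn, _ = server_sock.accept()
    with conn:
        while data := conn.recv(4096):
            chunks.append(data)


def demo() -> str:
    # "Briefcase" side: listen on an ephemeral loopback port.
    server = socket.create_server(("127.0.0.1", 0))
    port = server.getsockname()[1]
    chunks: list = []
    receiver = threading.Thread(target=receive_all, args=(server, chunks))
    receiver.start()

    # "App" side: replace stdout with a writer that sends to the server.
    client = socket.create_connection(("127.0.0.1", port))
    original_stdout = sys.stdout
    sys.stdout = SocketWriter(client)
    try:
        print("test_example ... ok")
    finally:
        sys.stdout = original_stdout
        client.close()

    receiver.join()
    server.close()
    return b"".join(chunks).decode("utf-8")
```

Because output arrives as raw bytes rather than as log records, nothing on this path forces line buffering, which is what would make a character-based console possible.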

Additional context

No response

Feb 14 '25 07:02 freakboy3742

Other ideas mentioned in #2086:

Filter the log with the --process option.
Use the --stdout and --stderr arguments to open.
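For reference, a sketch of what those two invocations could look like when built from Python. The helper names, process name, and file paths are all placeholders; the -n (new instance) and -W (wait for exit) flags to open are extra assumptions about how a test run would want to launch the app.

```python
from pathlib import Path


def build_log_stream_command(process_name: str) -> list[str]:
    """Stream logs for a single process, rather than filtering all system logs."""
    return ["log", "stream", "--process", process_name, "--style", "compact"]


def build_open_command(app_path: Path, stdout_path: Path, stderr_path: Path) -> list[str]:
    """Launch an app bundle with stdout/stderr redirected to files by `open`."""
    return [
        "open",
        "-n",  # start a new instance, even if one is already running
        "-W",  # wait until the app exits
        "--stdout",
        str(stdout_path),
        "--stderr",
        str(stderr_path),
        str(app_path),
    ]
```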

Feb 14 '25 12:02 mhsmith

At least for iOS, this may be solved by the method of https://github.com/python/cpython/pull/138018.

Aug 21 '25 09:08 mhsmith

To be clear - the CPython fix won't impact Briefcase (at least, not without additional changes). That fix only corrects the XCUnit operation of the iOSTestbed, which is used by CPython and cibuildwheel, but not Briefcase.

However, it might be possible to use a similar approach, integrated into the Briefcase iOS Xcode template.

Aug 21 '25 09:08 freakboy3742