Intermittent crash in Android testbed
Describe the bug
I keep getting a frustratingly intermittent crash when running the Android testbed test suite. It's reproducible in the sense that I've yet to run the test suite all the way through successfully, but exactly where it happens seems highly variable. I've tried running several of the apparently offending tests in isolation, but haven't gotten any of them to fail except for tests/window/test_window.py::test_window_state_change. That one crashes sometimes.
Very rarely, it crashes before the tests have even started.
Steps to reproduce
- Run the Android testbed: briefcase run android --test
- See error at a seemingly random point.
- Weep.
Expected behavior
The testbed runs to completion without crashing.
Screenshots
No response
Environment
- Operating System: macOS Sonoma 14.7.6
- Python version: 3.12.0
- Software versions:
- Briefcase: 0.3.24 (also tried with 0.3.22)
- Toga: main development branch (also tried with 0.5.1 and 0.5.2)
Logs
When it crashes during testing (which is the vast majority of the time):
--------- beginning of crash
E/AndroidRuntime: FATAL EXCEPTION: main
E/AndroidRuntime: Process: org.beeware.toga.testbed, PID: 10419
E/AndroidRuntime: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
E/AndroidRuntime: at java.util.ArrayList.get(ArrayList.java:437)
E/AndroidRuntime: at android.view.ViewTreeObserver$CopyOnWriteArray$Access.get(ViewTreeObserver.java:1272)
E/AndroidRuntime: at android.view.ViewTreeObserver.dispatchOnGlobalLayout(ViewTreeObserver.java:1061)
E/AndroidRuntime: at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:3197)
E/AndroidRuntime: at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:2126)
E/AndroidRuntime: at android.view.ViewRootImpl$TraversalRunnable.run(ViewRootImpl.java:8653)
E/AndroidRuntime: at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1037)
E/AndroidRuntime: at android.view.Choreographer.doCallbacks(Choreographer.java:845)
E/AndroidRuntime: at android.view.Choreographer.doFrame(Choreographer.java:780)
E/AndroidRuntime: at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1022)
E/AndroidRuntime: at android.os.Handler.handleCallback(Handler.java:938)
E/AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:99)
E/AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:201)
E/AndroidRuntime: at android.os.Looper.loop(Looper.java:288)
E/AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:7839)
E/AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
E/AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
E/AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)
I/Process : Sending signal. PID: 10419 SIG: 9
This seems to always be identical, with the exception of the PID and the index / size of the array.
When it crashes during test collection, I see this:
--------- beginning of crash
E/AndroidRuntime: FATAL EXCEPTION: main
E/AndroidRuntime: Process: org.beeware.toga.testbed, PID: 1869
E/AndroidRuntime: java.lang.RuntimeException: Unable to start activity ComponentInfo{org.beeware.toga.testbed/org.beeware.android.MainActivity}:
com.chaquo.python.PyException: CoverageWarning: Couldn't import C tracer: No module named 'coverage.tracer' (no-ctracer)
E/AndroidRuntime: at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3635)
E/AndroidRuntime: at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3792)
E/AndroidRuntime: at android.app.ActivityThread.handleRelaunchActivityInner(ActivityThread.java:5738)
E/AndroidRuntime: at android.app.ActivityThread.handleRelaunchActivity(ActivityThread.java:5630)
E/AndroidRuntime: at android.app.servertransaction.ActivityRelaunchItem.execute(ActivityRelaunchItem.java:71)
E/AndroidRuntime: at android.app.servertransaction.ActivityTransactionItem.execute(ActivityTransactionItem.java:45)
E/AndroidRuntime: at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
E/AndroidRuntime: at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
E/AndroidRuntime: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2210)
E/AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:106)
E/AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:201)
E/AndroidRuntime: at android.os.Looper.loop(Looper.java:288)
E/AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:7839)
E/AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
E/AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
E/AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)
E/AndroidRuntime: Caused by: com.chaquo.python.PyException: CoverageWarning: Couldn't import C tracer: No module named 'coverage.tracer' (no-ctracer)
E/AndroidRuntime: at <python>.coverage.control._warn(control.py:461)
E/AndroidRuntime: at <python>.coverage.core.__init__(core.py:97)
E/AndroidRuntime: at <python>.coverage.control._init_for_start(control.py:558)
E/AndroidRuntime: at <python>.coverage.control.start(control.py:664)
E/AndroidRuntime: at <python>.__main__.<module>(testbed.py:154)
E/AndroidRuntime: at <python>.runpy._run_code(<frozen runpy>:88)
E/AndroidRuntime: at <python>.runpy._run_module_code(<frozen runpy>:98)
E/AndroidRuntime: at <python>.runpy.run_module(<frozen runpy>:226)
E/AndroidRuntime: at <python>.chaquopy_java.call(chaquopy_java.pyx:352)
E/AndroidRuntime: at <python>.chaquopy_java.Java_com_chaquo_python_PyObject_callAttrThrowsNative(chaquopy_java.pyx:324)
E/AndroidRuntime: at com.chaquo.python.PyObject.callAttrThrowsNative(Native Method)
E/AndroidRuntime: at com.chaquo.python.PyObject.callAttrThrows(PyObject.java:232)
E/AndroidRuntime: at com.chaquo.python.PyObject.callAttr(PyObject.java:221)
E/AndroidRuntime: at org.beeware.android.MainActivity.onCreate(MainActivity.java:105)
E/AndroidRuntime: at android.app.Activity.performCreate(Activity.java:8051)
E/AndroidRuntime: at android.app.Activity.performCreate(Activity.java:8031)
E/AndroidRuntime: at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1329)
E/AndroidRuntime: at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3608)
E/AndroidRuntime: ... 15 more
I/Process : Sending signal. PID: 1869 SIG: 9
(That is, unfortunately, truncated even in the saved log file.)
Additional context
No response
E/AndroidRuntime: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
E/AndroidRuntime: at java.util.ArrayList.get(ArrayList.java:437)
E/AndroidRuntime: at android.view.ViewTreeObserver$CopyOnWriteArray$Access.get(ViewTreeObserver.java:1272)
https://stackoverflow.com/questions/67821711 suggests this may be caused by adding or removing a GlobalLayoutListener while the system is iterating through the listener list. We do this in a few places, but the __del__ method in BaseProbe looks like the most likely candidate to happen at an unpredictable time.
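To make the suspected failure mode concrete, here is a toy Python analogy. It is not Android's actual ViewTreeObserver code, and the class and method names (ListenerList, dispatch_unsafe, dispatch_snapshot) are invented for illustration. It only shows how a dispatcher that captures the list size up front can index past the end when a listener is removed mid-dispatch, which is the same shape as "Index: 3, Size: 3". The real crash may involve a removal triggered from garbage collection rather than from inside a listener; this single-threaded sketch doesn't model that.

```python
class ListenerList:
    """Toy stand-in for a listener list; NOT Android's implementation."""

    def __init__(self):
        self._listeners = []

    def add(self, listener):
        self._listeners.append(listener)

    def remove(self, listener):
        self._listeners.remove(listener)

    def dispatch_unsafe(self):
        # Capture the size once, then index into the live list.
        size = len(self._listeners)
        for i in range(size):
            self._listeners[i]()  # IndexError if the list shrank mid-loop

    def dispatch_snapshot(self):
        # Iterate over a copy, so removals during dispatch are harmless.
        for listener in list(self._listeners):
            listener()


def build(calls):
    """Create a list of three listeners; the first removes itself when called."""
    listeners = ListenerList()

    def a():
        calls.append("a")
        listeners.remove(a)  # mutate the list while it is being dispatched

    def b():
        calls.append("b")

    def c():
        calls.append("c")

    for fn in (a, b, c):
        listeners.add(fn)
    return listeners


unsafe_calls = []
try:
    build(unsafe_calls).dispatch_unsafe()
    unsafe_failed = False
except IndexError:
    unsafe_failed = True

safe_calls = []
build(safe_calls).dispatch_snapshot()

print(unsafe_failed, unsafe_calls, safe_calls)
```

In the unsafe variant, one listener is silently skipped before the index runs off the end, so a crash of this kind could also be hiding missed layout callbacks.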
E/AndroidRuntime: Caused by: com.chaquo.python.PyException: CoverageWarning: Couldn't import C tracer: No module named 'coverage.tracer' (no-ctracer)
E/AndroidRuntime: at <python>.coverage.control._warn(control.py:461)
E/AndroidRuntime: at <python>.coverage.core.__init__(core.py:97)
This looks like a simple case of a warning being raised while Python is set to treat warnings as errors, though I don't know why it would be intermittent.
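As a minimal illustration of that mechanism, the standard warnings module turns any warning into a raised exception once the "error" filter is active. The CoverageLikeWarning class and start_tracing function below are hypothetical stand-ins, not coverage's real code, and I'm not claiming this is how the testbed configures its filters.

```python
import warnings


class CoverageLikeWarning(Warning):
    """Hypothetical stand-in for coverage's CoverageWarning."""


def start_tracing():
    """Emit a warning, then carry on, much like a fallback to a slower tracer."""
    warnings.warn("Couldn't import C tracer (no-ctracer)", CoverageLikeWarning)
    return "started"


# With a permissive filter, the warning is recorded and the call returns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result_default = start_tracing()

# With the "error" filter, the same warning is raised as an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        start_tracing()
        raised = None
    except CoverageLikeWarning as exc:
        raised = exc

print(result_default, len(caught), type(raised).__name__)
```

The same effect can come from running Python with -W error, setting PYTHONWARNINGS=error, or a pytest filterwarnings = error setting, so any of those would be worth checking.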
"... 15 more" is not a truncation; it indicates that the remaining 15 frames of the low-level exception's stack trace are shared with the high-level exception above the "Caused by" line.
https://stackoverflow.com/questions/67821711 suggests this may be caused by adding or removing a GlobalLayoutListener while the system is iterating through the listener list. We do this in a few places, but the __del__ method in BaseProbe looks like the most likely candidate to happen at an unpredictable time.
If nothing else, that should be something we can test. @HalfWhitt, if you comment out the BaseProbe.__del__ method, the app will presumably leak, but it shouldn't crash. Does that fix the problem for you?
If it does, the question becomes whether there's any way to put a concurrency lock on removing a listener, or to restructure the way we're handling the listener so that we don't need to add and remove the LayoutListener instance regularly.
One thought would be to avoid creating a fresh LayoutListener in Probe, and instead add a hook on the LayoutListener that is added as part of the Android implementation of Window. That way we're not adding a fresh listener at the Android level - we're using the same Android listener every time.
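A rough sketch of that last idea, in plain Python with no Android bindings. Everything here is hypothetical: SharedLayoutListener, add_hook, remove_hook and on_global_layout are names made up for illustration (on_global_layout stands in for the Android onGlobalLayout callback), and the real Window implementation may need a different shape. The point is only that the Android-level listener is registered once, while probes add and remove cheap Python-level hooks, guarded by a lock and dispatched from a snapshot.

```python
import threading


class SharedLayoutListener:
    """Hypothetical long-lived listener, registered with Android only once."""

    def __init__(self):
        self._hooks = []
        self._lock = threading.Lock()

    def add_hook(self, hook):
        with self._lock:
            self._hooks.append(hook)

    def remove_hook(self, hook):
        # Removing a hook never touches Android's own listener list.
        with self._lock:
            if hook in self._hooks:
                self._hooks.remove(hook)

    def on_global_layout(self):
        # Take a snapshot under the lock, then call hooks outside it, so a
        # hook (or a finalizer) can remove itself without breaking dispatch.
        with self._lock:
            hooks = tuple(self._hooks)
        for hook in hooks:
            hook()


listener = SharedLayoutListener()
events = []


def probe_hook():
    events.append("layout")
    listener.remove_hook(probe_hook)  # safe: dispatch iterates a snapshot


listener.add_hook(probe_hook)
listener.on_global_layout()  # hook runs once and unregisters itself
listener.on_global_layout()  # nothing left to call

print(events)
```

With this structure, a probe's __del__ would only call remove_hook, so even if it runs at an awkward moment it can't shrink the list Android is iterating over.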