Flaky UI tests for ToggleIcon and Tabulator on MacOS
With the latest commit 9404b4348a80a190b3dcda7f270ba3e5b3c10210 on macOS, I get a few UI test failures.
In one run (see full log), these fail:
FAILED panel/tests/ui/pane/test_textual.py::test_textual_app - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_patch_no_height_resize - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_selection_indices_on_paginated_sorted_and_filtered_data[remote] - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[index-True] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[index-False] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[foo-False] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[foo-True] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
In another run (see full log), these fail:
FAILED panel/tests/ui/io/test_reload.py::test_reload_app_on_local_module_change - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/pane/test_textual.py::test_textual_app - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_patch_no_height_resize - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_selection_indices_on_paginated_sorted_and_filtered_data[remote] - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[index-True] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
I tried turning xdist off via
$ pixi run -e test-ui pytest --ui panel/tests/ui/widgets/test_icon.py -v --browser chromium -n logical --dist no -n 0
but still got test failures (see full log):
FAILED panel/tests/ui/pane/test_textual.py::test_textual_app - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/pane/test_vizzu.py::test_vizzu_click - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/template/test_editabletemplate.py::test_editable_template_drag_item - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_icon.py::test_toggle_icon_width_height - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_icon.py::test_toggle_icon_size - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_patch_no_height_resize - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_header_filter_no_horizontal_rescroll[remote] - AssertionError: assert {'height': 20...: 714, 'y': 9} == {'height': 20...: 264, 'y': 9}
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[index-True] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[index-False] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[foo-True] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[foo-False] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
FAILED panel/tests/ui/widgets/test_tabulator.py::test_selection_indices_on_paginated_sorted_and_filtered_data[remote] - TimeoutError: wait_until timed out in 5000 milliseconds
ERROR panel/tests/ui/widgets/test_tabulator.py::test_tabulator_header_filter_no_horizontal_rescroll[None] - pluggy.PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
The Textual failure is due to a recent breaking API change; see #7117.
The others are flaky tests I think, although this seems to fail for me consistently now:
panel $ pixi run -e test-ui pytest --ui panel/tests/ui/widgets/test_icon.py -k test_toggle_icon_size -v
=============================================================================================== test session starts ================================================================================================
platform darwin -- Python 3.12.5, pytest-7.4.4, pluggy-1.5.0 -- /Users/cdeil/code/oss/panel/.pixi/envs/test-ui/bin/python3.12
cachedir: .pytest_cache
rootdir: /Users/cdeil/code/oss/panel
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-5.0.0, github-actions-annotate-failures-0.2.0, playwright-0.5.0, rerunfailures-14.0, anyio-4.4.0, base-url-2.1.0, xdist-3.6.1
asyncio: mode=Mode.AUTO
collected 17 items / 16 deselected / 1 selected
panel/tests/ui/widgets/test_icon.py::test_toggle_icon_size FAILED [100%]
===================================================================================================== FAILURES =====================================================================================================
______________________________________________________________________________________________ test_toggle_icon_size _______________________________________________________________________________________________
page = <Page url='http://localhost:65348/'>
    def test_toggle_icon_size(page):
        icon = ToggleIcon(size="120px")
        serve_component(page, icon)
        # test defaults
        assert icon.icon == "heart"
        assert not icon.value
        icon_element = page.locator(".ti-heart")
>       wait_until(lambda: icon_element.bounding_box()["width"] == 120)
E       TimeoutError: wait_until timed out in 5000 milliseconds
panel/tests/ui/widgets/test_icon.py:66: TimeoutError
----------------------------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------------------------
Launching server at http://localhost:65348
----------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------
INFO:bokeh.server.server:Starting Bokeh server version 3.5.1 (running on Tornado 6.4.1)
INFO:bokeh.server.tornado:User authentication hooks NOT provided (default user enabled)
INFO:bokeh.server.views.ws:WebSocket connection opened
INFO:bokeh.server.views.ws:ServerConnection created
------------------------------------------------------------------------------------------------ Captured log call -------------------------------------------------------------------------------------------------
INFO tornado.access:web.py:2348 200 GET /liveness (127.0.0.1) 0.39ms
INFO tornado.access:web.py:2348 200 GET / (::1) 17.99ms
INFO tornado.access:web.py:2348 200 GET /static/js/bokeh.min.js?v=276377ed021e1611c60311b355033c865900f31a918aa4565aba37a78700f17b017100a8a618bded4140c6ad247a0b0237d3a02bee9fd722ce67a459479522dc (::1) 1.99ms
INFO tornado.access:web.py:2348 200 GET /static/extensions/panel/bundled/reactiveesm/es-module-shims@%5E1.10.0/dist/es-module-shims.min.js (::1) 2.09ms
INFO tornado.access:web.py:2348 200 GET /static/js/bokeh-gl.min.js?v=70bc1a9856b732e888ed6b2a8e9b6382bf538fee3ec9f1145b8db1778158fd51e478dbe0600650e30d5a0083b12fc43961bc7b2ef3e9f366000199b83b9a1644 (::1) 0.38ms
INFO tornado.access:web.py:2348 200 GET /static/extensions/panel/panel.min.js?v=a91daab4668e3299f59ed231b5da2e657f5e65d10a1d501ff0a660306b1fdb79 (::1) 4.22ms
INFO tornado.access:web.py:2348 200 GET /static/js/bokeh-widgets.min.js?v=8541420c1bb1dbde534df1d9b2be7c8248f61fca353a821ffc4d459b08b79c4b39f0ea1dd6960aa3b734bea988cf822dc6993c786de844db80e4f258dd90727f (::1) 1.91ms
INFO tornado.access:web.py:2348 200 GET /static/js/bokeh-tables.min.js?v=26281191594de496d010d87b3a56c1679330da29fcf72d3dab91ac4a45479c16b36e82ce4325f4217df4614fad13927fd7f1e1be64cf838e4a18a60852e2be0e (::1) 2.00ms
INFO tornado.access:web.py:2348 101 GET /ws (::1) 0.32ms
INFO tornado.access:web.py:2348 200 GET /static/extensions/panel/css/loading.css?v=1.5.0-b.3 (::1) 1.10ms
INFO tornado.access:web.py:2348 200 GET /static/extensions/panel/css/icon.css?v=1.5.0-b.3 (::1) 1.52ms
INFO tornado.access:web.py:2348 200 GET /static/extensions/panel/bundled/theme/default.css?v=1.5.0-b.3 (::1) 4.11ms
INFO tornado.access:web.py:2348 200 GET /static/extensions/panel/bundled/theme/native.css?v=1.5.0-b.3 (::1) 9.85ms
--------------------------------------------------------------------------------------------- Captured stderr teardown ---------------------------------------------------------------------------------------------
INFO:bokeh.server.views.ws:WebSocket connection closed: code=1001, reason=None
============================================================================================= short test summary info ==============================================================================================
FAILED panel/tests/ui/widgets/test_icon.py::test_toggle_icon_size - TimeoutError: wait_until timed out in 5000 milliseconds
========================================================================================= 1 failed, 16 deselected in 6.08s =========================================================================================
I see similar failures in CI: https://github.com/holoviz/panel/actions/runs/10333902177/job/28606768301?pr=7120
I've tried to mitigate some of these but it is indeed a game of whack-a-mole. I also couldn't reproduce a bunch of them so these are the ones I focused on:
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_patch_no_height_resize - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_selection_indices_on_paginated_sorted_and_filtered_data[remote] - TimeoutError: wait_until timed out in 5000 milliseconds
FAILED panel/tests/ui/widgets/test_tabulator.py::test_tabulator_edit_event_and_header_filters_same_column[index-True] - playwright._impl._errors.TimeoutError: Locator.fill: Timeout 20000ms exceeded.
Wow, thanks!
Maybe mark the remaining flaky UI tests, on macOS only, like this to remove the noise?
pytest.mark.xfail(sys.platform == 'darwin', strict=False, reason="Flaky, see GH 7118")
See https://docs.pytest.org/en/7.1.x/explanation/flaky.html
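To make the suggestion concrete, here is a minimal sketch of how such a marker could be defined once and reused. It assumes `xfail` is the intended marker, since `strict` is a parameter of `pytest.mark.xfail` and not of `skipif`. The names `flaky_on_macos` and `test_hypothetical_flaky_ui` are made up for illustration and are not part of Panel's test suite.

```python
import sys

import pytest

# Reusable marker: on macOS a failure is reported as XFAIL and a pass as XPASS.
# With strict=False neither outcome fails the run. On other platforms the
# condition is False and the test runs normally.
flaky_on_macos = pytest.mark.xfail(
    sys.platform == "darwin",
    strict=False,
    reason="Flaky, see GH 7118",
)


@flaky_on_macos
def test_hypothetical_flaky_ui():
    # Placeholder body; a real UI test would drive the page here.
    assert True
```

Unlike a skip, the test still runs on macOS, so the XFAIL/XPASS counts in the summary show whether it has become stable again.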
Or alternatively: do you think it should be possible to get reliable tests? Or is there something fundamental in Panel / Bokeh / Python async & threading / macOS / Playwright / etc. that prevents this?
I saw yesterday that Bokeh doesn't use Playwright or do much UI testing on macOS, probably because they've run into similar issues?
Yes, it should be possible to get more reliable tests. I'm 99% certain this is just about how the tests are structured. Specifically, Playwright operates much faster than any real-world usage ever would, which causes some issues that aren't visible otherwise. By restructuring the tests and/or adding a bunch of additional timeouts, we could probably make them more reliable. You could test that theory by re-running the UI tests with --slowmo 100 or so, which adds a 100 ms delay between all interactions.
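As an illustration of the "restructure the tests" idea, here is a rough sketch of polling for a condition instead of asserting it immediately after an interaction. This is not Panel's actual `wait_until` implementation; `poll_until`, its parameters, and the simulated width function are hypothetical stand-ins that need no browser.

```python
import time


def poll_until(condition, timeout_ms=5000, interval_ms=50):
    """Call condition() repeatedly until it returns a truthy value.

    Exceptions raised by condition() are treated as "not ready yet", which
    covers cases such as an element that has not been rendered so far.
    Raises TimeoutError if the condition is still falsy at the deadline.
    """
    deadline = time.monotonic() + timeout_ms / 1000
    last_error = None
    while True:
        try:
            if condition():
                return
        except Exception as exc:
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"poll_until timed out in {timeout_ms} milliseconds"
            ) from last_error
        time.sleep(interval_ms / 1000)


# Simulated slow render: the "width" only reaches 120 after 200 ms.
start = time.monotonic()


def simulated_width():
    return 120 if time.monotonic() - start > 0.2 else 0


# An immediate `assert simulated_width() == 120` would fail here;
# polling gives the simulated frontend time to catch up.
poll_until(lambda: simulated_width() == 120, timeout_ms=2000)
```

The trade-off is that a longer timeout hides slow rendering rather than fixing it, so this only helps when the test was racing the frontend in the first place.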