trio-websocket
trio-websocket copied to clipboard
tests have flakey coverage?
e.g. coverage tool reports regression on unrelated change:
https://coveralls.io/builds/20028763/source?filename=trio_websocket%2F_impl.py#L842
it's the except clause in _write_pending
if there are timing issues in the tests, that could explain intermittent errors like #68
Thank you for pointing this out, although I'm not sure how to interpret the coverage report.

It says "uncov" on the side but the lines are green?
Anyway, you are right that the coverage of these lines depends on a race. Here's a build where the 3.6 tests covered these lines but the 3.7 lines did not. There are other builds where both 3.6 and 3.7 cover.
Here's an example where test_server_does_not_close_handshake coverage is flakey. The first time it covers the lines in question (842-846), then the next time it does not.
(venv) mhaase@MEH-MBP:/V/C/trio-websocket:master$
rm .coverage; and pytest -k test_server_does_not_close_handshake --cov=trio_websocket; and coverage report -m
=============== test session starts ===============
platform darwin -- Python 3.7.0, pytest-3.10.0, py-1.7.0, pluggy-0.8.0
rootdir: /Volumes/Case_Sensitive/trio-websocket, inifile: pytest.ini
plugins: trio-0.5.1, cov-2.6.0
collected 31 items / 30 deselected
tests/test_connection.py . [100%]
---------- coverage: platform darwin, python 3.7.0-final-0 -----------
Name Stmts Miss Cover
------------------------------------------------
trio_websocket/__init__.py 1 0 100%
trio_websocket/_impl.py 365 160 56%
trio_websocket/version.py 1 0 100%
------------------------------------------------
TOTAL 367 160 56%
=============== 1 passed, 30 deselected in 0.24 seconds ===============
Name Stmts Miss Cover Missing
----------------------------------------------------------
trio_websocket/__init__.py 1 0 100%
trio_websocket/_impl.py 365 160 56% [SNIP] 808-809, 842-846, 852-854 [SNIP]
trio_websocket/version.py 1 0 100%
----------------------------------------------------------
TOTAL 367 160 56%
(venv) mhaase@MEH-MBP:/V/C/trio-websocket:master$
rm .coverage; and pytest -k test_server_does_not_close_handshake --cov=trio_websocket; and coverage report -m
=============== test session starts ===============
platform darwin -- Python 3.7.0, pytest-3.10.0, py-1.7.0, pluggy-0.8.0
rootdir: /Volumes/Case_Sensitive/trio-websocket, inifile: pytest.ini
plugins: trio-0.5.1, cov-2.6.0
collected 31 items / 30 deselected
tests/test_connection.py . [100%]
---------- coverage: platform darwin, python 3.7.0-final-0 -----------
Name Stmts Miss Cover
------------------------------------------------
trio_websocket/__init__.py 1 0 100%
trio_websocket/_impl.py 365 156 57%
trio_websocket/version.py 1 0 100%
------------------------------------------------
TOTAL 367 156 57%
=============== 1 passed, 30 deselected in 0.23 seconds ===============
Name Stmts Miss Cover Missing
----------------------------------------------------------
trio_websocket/__init__.py 1 0 100%
trio_websocket/_impl.py 365 156 57% [SNIP] 808-809, 846, 852-854 [SNIP]
trio_websocket/version.py 1 0 100%
----------------------------------------------------------
TOTAL 367 156 57%
I checked to see if there's any other difference in coverage between these two runs:
trio_websocket/_impl.py 365 160 56% 605-606, 842-846
trio_websocket/_impl.py 365 156 57% 604, 846
Notice that when the _write_pending() exception lines are skipped, so are 605 and 606. Here are the relevant source lines.
595: async def send_message(self, message):
596-602: <docstring>
603: if self._close_reason:
604: raise ConnectionClosed(self._close_reason)
605: self._wsproto.send_data(message)
606: await self._write_pending()
So the race seems to be related to sending at the same time a connection is closed. Sometimes it raise in line 604, and sometimes it raises inside _write_pending(). I don't necessarily think either code path is wrong, but each test needs to have a deterministic shutdown.
I messed around with this a bit on its own, and also in conjunction with implementing timeouts. I have a few ideas about how to improve this situation:
- Many of the WebSocketConnection methods have a guard condition like
if self._close_reason: raise ConnectionClosed. This guard condition is unnecessary and just adds a second code path that is prone to race conditions, as seen in the example at the end of my previous post. Most of these guards can safely be removed. - A lot of the tests have race conditions themselves, e.g. a test creates a client and a server and relies on the automatically closing behavior, so both ends race to initiating the closing handshake. Each test needs to have a single, deterministic order for tearing down connections.
- As an example, we can force the client to initiate closing by having the server call
get_message()when no message is pending, which will raiseConnectionClosedwhen the client has sent the closing handshake. - Alternatively, tests can use Trio events to force an intended ordering of events.
- As an example, we can force the client to initiate closing by having the server call
- I don't think it's possible to make network tests 100% determinstic, so a last resort we can add tests that force cover of problematic areas like lines 842-844 shown above. I.e. some tests may still have slightly different execution paths based on races, but at least one test is guaranteed to deterministically cover those lines. This doesn't prevent intermittent test failures, but it does ensure consistent coverage.
trio.testing may help
https://trio.readthedocs.io/en/latest/reference-testing.html#inter-task-ordering
Essentially the same as using events.
I don't think it's possible to make network tests 100% determinstic, so a last resort we can add tests that force cover of problematic areas like lines 842-844 shown above.
This is what I usually do...