klein icon indicating copy to clipboard operation
klein copied to clipboard

hyperlink generates invalid URLs via hypothesis which sometimes causes our tests to fail

Open glyph opened this issue 3 years ago • 1 comments

for example: https://github.com/twisted/klein/runs/6682625798?check_suite_focus=true

Error: 
Traceback (most recent call last):
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/klein/test/test_request_compat.py", line 99, in test_uri
    def test_uri(self, url: DecodedURL) -> None:
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/hypothesis/core.py", line 1235, in wrapped_test
    raise the_error_hypothesis_found
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/hyperlink/hypothesis.py", line 321, in decoded_urls
    return DecodedURL(draw(encoded_urls()))
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/hyperlink/_url.py", line 2046, in __init__
    self.host, self.userinfo, self.path, self.query, self.fragment
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/hyperlink/_url.py", line 2179, in path
    for p in self._url.path
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/hyperlink/_url.py", line 2179, in <listcomp>
    for p in self._url.path
  File "/home/runner/work/klein/klein/.tox/coverage-py37-tw203/lib/python3.7/site-packages/hyperlink/_url.py", line 766, in _percent_decode
    return unquoted_bytes.decode(subencoding)
builtins.UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

klein.test.test_request_compat.HTTPRequestWrappingIRequestTests.test_uri

glyph avatar Jul 14 '22 23:07 glyph

The bug in Hyperlink is here: https://github.com/python-hyper/hyperlink/issues/153

glyph avatar Jul 14 '22 23:07 glyph

We should really just stop using hypothesis. I don't think it's providing any value and these constant intermittent failures are definitely not worth whatever issues it is supposedly addressing.

glyph avatar May 02 '23 18:05 glyph

If the idea is "who cares about bugs in the URL parser, just let it be broken", then the Hypothesis strategy can be adjusted to avoid those cases.

exarkun avatar May 02 '23 19:05 exarkun

If the idea is "who cares about bugs in the URL parser, just let it be broken", then the Hypothesis strategy can be adjusted to avoid those cases.

There's no URL parser in this project. If you look at the traceback, it's entirely in Hyperlink and Hypothesis, so the bug should be dealt with over there. But Hypothesis is used in a few other places too, making the tests pointlessly slow (it specifically sets up a "patient" strategy so it can be extra slow) and feeding tons of test highly variable data into things that aren't parsing it and don't have any interesting branches around its inputs. If hyperlink is using hypothesis to good effect, then great, it's a parser, it benefits from this type of testing. But at best we are just replicating the work of testing it in this CI, which is not helpful.

I'm going to put in a simple stub module so the tests themselves remain somewhat compatibly parameterized if anyone cares to put it back.

glyph avatar May 02 '23 20:05 glyph

Also I don't know if it's doing a great job validating Hyperlink right now… https://github.com/python-hyper/hyperlink/issues/181

glyph avatar May 02 '23 20:05 glyph

Also, to be clear, the bug here is that encoded_urls generate garbage by instantiating garbage directly, rather than parsing garbage, so this is just data that we will never be dealing with anyway. So yes, there's a bug in the strategy too (and it has been filed upstream, as in the above comment) but this traceback just isn't relevant to Klein in any way.

glyph avatar May 02 '23 20:05 glyph