
[FEATURE] Distinguish more errors for Hypothesis' shrinking logic

Open • Zac-HD opened this issue 4 years ago • 1 comment

Ideally, Schemathesis would produce a single minimal example of each distinct error. We use Hypothesis' shrinking for this, which requires us to distinguish errors by exception type, code location, and __context__. Schemathesis therefore creates a new exception type for each distinct error message:

https://github.com/schemathesis/schemathesis/blob/7e71b00f285139c79d96d2dfb5cb16f6463e2544/src/schemathesis/exceptions.py#L43-L47
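For readers who do not follow the link, the general technique can be sketched roughly as below. This is an illustration only, not the code at that commit: `CheckFailed`, `get_exception`, and `status_code_error` are assumed names. It also shows why keying on the status code alone gives every HTTP 500 the same exception type.

```python
from typing import Dict, Type


class CheckFailed(AssertionError):
    """Hypothetical base class for a failed check."""


# One dynamically created class per distinct name, reused on later calls.
_CACHE: Dict[str, Type[CheckFailed]] = {}


def get_exception(name: str) -> Type[CheckFailed]:
    """Return the exception class for `name`, creating it on first use."""
    try:
        return _CACHE[name]
    except KeyError:
        cls = type(name, (CheckFailed,), {})
        _CACHE[name] = cls
        return cls


def status_code_error(status_code: int) -> Type[CheckFailed]:
    """Key the class on the status code only (an assumed naming scheme)."""
    return get_exception(f"StatusCodeError{status_code}")


# Two unrelated server bugs that both answer with 500 share one class,
# so a shrinker that groups by exception type sees a single error.
first_bug = status_code_error(500)
second_bug = status_code_error(500)
```

Here `first_bug is second_bug` holds, which is the under-counting discussed in this issue.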

Collapsing errors by message seems fine in cases like {expected_contenttype}_{got_contenttype}, but collapsing all HTTP 500 errors together because they share a status code probably causes under-counting.

Re-running on the fixed version will re-find the remaining errors, so that's not a disaster, but it would be a nicer UX (and paper result) to report them all in the first place.

Possible fixes

  • If we can access the original exception when using the ASGI/WSGI integration, then raise schemathesis_exc from the_inner_exc is sufficient (perhaps also enabling, e.g., Flask's debug mode to ensure such exceptions are propagated).

  • Maybe we could distinguish HTTP 5xx by message as well as status code, if we removed the dynamic response content? This risks not deduplicating though, which I don't like.

  • Dynamically create a "from" exception using locations and messages from stringified stack traces and/or locations in responses, for example from Jupyter Server and JupyterHub, and then raise from that. This seems promising to me, and reasonably easy to implement on a per-programming-language basis.
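A minimal sketch of the last idea, assuming the response body contains a plain-text CPython traceback (not HTML-escaped). All names here (`FRAME_RE`, `ServerError`, `origin_from_body`, `fail`) are hypothetical and not part of Schemathesis. Whether a synthetic cause that was never itself raised is enough for Hypothesis to separate the failures is not verified here.

```python
import re
from typing import Dict, Optional, Type

# Matches frame lines of a standard CPython traceback, e.g.
#   File "/app/views.py", line 42, in create_user
FRAME_RE = re.compile(r'File "([^"]+)", line (\d+), in (\S+)')


class ServerError(Exception):
    """Hypothetical stand-in for the Schemathesis-side failure."""


# One synthetic exception class per distinct server-side location.
_ORIGINS: Dict[str, Type[Exception]] = {}


def origin_from_body(body: str) -> Optional[Exception]:
    """Build a synthetic exception from the innermost frame found in `body`."""
    frames = FRAME_RE.findall(body)
    if not frames:
        return None  # nothing recognizable: no extra information to attach
    file, line, func = frames[-1]  # the last frame is where the error was raised
    key = f"{file}:{line}:{func}"
    if key not in _ORIGINS:
        _ORIGINS[key] = type("RemoteOrigin", (Exception,), {"origin": key})
    return _ORIGINS[key](key)


def fail(status_code: int, body: str) -> None:
    """Raise the failure, chained to the synthetic origin when one exists."""
    raise ServerError(f"Received HTTP {status_code}") from origin_from_body(body)
```

Two 500 responses whose tracebacks end in different frames then carry causes of different types, while repeated hits on the same frame reuse one type, so deduplication is keyed on location rather than on dynamic message content.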

Zac-HD, Oct 04 '21 03:10

Hi @Zac-HD !

Thank you for creating an issue about this! :) I am inclined to think that having access to the underlying exception would be the best way to go in principle. However, since HTTP response content is generally opaque, we might need some kind of agent on the application side (similar to what Sentry does) to get the data needed to re-create the exception on the Schemathesis side.

For such opaque cases, it could be another Schemathesis hook, as it depends heavily on the app implementation. For example, the hook would accept the app response and the generated case, and return an exception class. The user would then be able to connect to the agent inside the hook and get the app-level exception.
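A rough sketch of what such a hook could look like. Nothing below is an existing Schemathesis API: `Case`, `Response`, `CheckFailed`, `classify_failure`, the `X-Request-Id` header, and the in-memory `AGENT_EVENTS` table (standing in for a real app-side agent) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple, Type


@dataclass
class Case:
    """Hypothetical stand-in for a generated test case."""
    method: str
    path: str


@dataclass
class Response:
    """Hypothetical stand-in for the app response."""
    status_code: int
    headers: Dict[str, str] = field(default_factory=dict)


class CheckFailed(AssertionError):
    """Hypothetical base class for a failed check."""


# Stand-in for the app-side agent: request ID -> (exception name, location).
AGENT_EVENTS: Dict[str, Tuple[str, str]] = {
    "req-1": ("KeyError", "app/views.py:42"),
    "req-2": ("ZeroDivisionError", "app/billing.py:7"),
}

_CLASSES: Dict[Tuple[str, str], Type[CheckFailed]] = {}


def classify_failure(response: Response, case: Case) -> Optional[Type[CheckFailed]]:
    """Return an exception class for this failure, or None to use the default.

    `case` is unused in this sketch but would be available to a real hook.
    """
    if response.status_code < 500:
        return None
    event = AGENT_EVENTS.get(response.headers.get("X-Request-Id", ""))
    if event is None:
        return None  # the agent knows nothing about this request
    if event not in _CLASSES:
        exc_name, location = event
        _CLASSES[event] = type(f"{exc_name}@{location}", (CheckFailed,), {})
    return _CLASSES[event]
```

Returning `None` lets the caller fall back to the default grouping, so apps without an agent would behave as they do today.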

Though, for the ASGI/WSGI case, it could be a built-in thing.

> Maybe we could distinguish HTTP 5xx by message as well as status code, if we removed the dynamic response content? This risks not deduplicating though, which I don't like.

Indeed, that looks very fragile to me.

Stranger6667, Oct 16 '21 10:10