pook icon indicating copy to clipboard operation
pook copied to clipboard

Deprecate non-`bytes` return types from mocked urllib responses

Open sarayourfriend opened this issue 3 months ago • 1 comments

How to prepare your code for this change

Note

The version with the deprecation notice has not yet been released, this copy is here just to prepare for its release. This notice will be removed once the deprecation is released.

Summary

Pook's mocked responses will, for urllib and HTTP client libraries that wrap it, incorrectly return strings rather than bytes for response content. The primary negative effect of this is many users will have introduced unnecessary code to handle the string responses from Pook, when their HTTP client libraries would under normal circumstances (e.g., during real application runtime) return bytes.

To fix this behaviour, call pook.apply_binary_body_fix() at least once (e.g., in your conftest.py if you use pytest). Additionally, remove any calls to Pook's .body() where binary is explicitly set. You may then remove any code that erroneously treated response content from your HTTP client library as non-bytes.

For example, you may have code that looks like this, despite the fact that urllib will always return bytes from Response::read under normal circumstances:

from urllib.request import urlopen

res = urlopen("...")
content = res.read()
if hasattr(content, "decode"):
    content = content.decode("utf-8")

After pook.apply_binary_body_fix(), you may remove the hasattr check, because now res.read() and its equivalents in other libraries will always return bytes.

Deprecation implementation details

This problem is caused by this set of lines from the Response class:

https://github.com/h2non/pook/blob/ff30ca754b7569245640e593ffc7389c91a3e500/src/pook/response.py#L173-L174

Combined with implementations like this in the interceptors:

https://github.com/h2non/pook/blob/d2c7acb633c3d734306390b30b65c39366ec3c3b/src/pook/interceptors/http.py#L95

The effect of this, is that pook's mocked response classes behave differently from the actual implementations (unless binary=True is passed to Response::body).

from urllib.request import urlopen
import pook

res = urlopen("https://httpbin.org/status/200")
assert type(res.read()) is bytes

with pook.use():
    pook.get("https://httpbin.org/status/200").reply(200).body("hello from pook")
    res = urlopen("https://httpbin.org/status/200")

    # This will fail!
    content = res.read()
    assert type(content) is bytes
    assert type(content) is str

Worse, if you pass bytes without binary=True, it will convert it to a string! binary=True alleviates some of this issue, but still maintains the default behaviour.

This must be deprecated. For users of pook and urllib, it introduces the need for code paths in application code that are always dead in actual operation. We could fix this just for urllib, but it's worth simply deprecating non-binary bodies whole-sale, and always handling body as bytes (and converting to bytes when given a string, for ease of use), so that pook's mocked responses always match those of the actual HTTP clients it mocks.

The deprecation process will follow:

  • [ ] In a patch release, introduce a DeprecationWarning any time body is called. To suppress the warning, users will need to call pook.apply_binary_body_fix at least once. Most users will need to make zero changes after doing this. Users who are explicitly passing binary=True or binary=False (the latter being highly unlikely), will have addtional deprecation warnings logged noting that the binary parameter itself will also be deprecated and should not be used going forward.
  • [ ] In a major release, convert Response::body's chunked to kwarg-only and remove binary. Convert all code to work exclusively using bytes bodies. This will be a major version release because it introduces significant changes to pook and will force users to update code who haven't seen the deprecation warnings and removed the binary kwarg from body. Update Response::body to remove the binary kwargs. Convert apply_binary_body_fix to a noop. We'll remove it with the next major pook version after this one, whenever that happens.

Because affected pook users will already have code handling both bytes and strings (becasue the real library returns bytes and only pook returns strings), in almost all expected circumstances, the only change they will need to "handle" this is remove code that checks for bytes vs. strings.

sarayourfriend avatar Mar 23 '24 21:03 sarayourfriend