openapi-python-client icon indicating copy to clipboard operation
openapi-python-client copied to clipboard

Use httpx.Client Directly

Open kalzoo opened this issue 3 years ago • 6 comments

As of version 0.6.1, the generated Client is somewhat configurable - headers, cookies, and timeout. However, these are all abstractions which have to then be handled explicitly within each generated API method.

Would it be simpler to just make calls using an httpx.Client or httpx.AsyncClient instance, and allow consumers to configure that directly? Advantages:

  • Multiple versions of httpx can be supported, and there's less likelihood that you'll have to change your package due to changes or new features in httpx.
  • It's more efficient than direct calls to httpx.get etc, and explicitly what httpx recommends in its documentation:

If you do anything more than experimentation, one-off scripts, or prototypes, then you should use a Client instance.

of course, this package does use the context manager within API operations, but that doesn't allow multiple calls to share the same client and thus connection.

  • Everything else good in that documentation, like the ability to use the generated client package as a WSGI test client
  • Event hooks will allow consumers to implement our own global retry logic (like refreshing authentication tokens) prior to official retry support from httpx itself.
  • AuthenticatedClient and Client can just each just become an httpx.Client configured with different headers.

tl;dr: it decreases coupling between the two packages and lets you worry less about the client configuration and how to abstract it. More httpx functionality will be directly available to consumers, so you'll get fewer (actionable) feature requests. Future breaking changes here will be less likely. Seems like this alone would allow closing a couple currently pending issues (retries, different auth methods, response mimetypes), by putting them entirely in the hands of the consumer.

Describe the solution you'd like

There are a few options.

  1. The httpx.Client could be used directly (i.e. replace client.py entirely). API methods would just accept the client and use it directly, and it would be up to the caller to configure and manage it. This is the simplest for sure, and meets the current use case. This is what I'd recommend.
def sync_detailed(
    *,
    client: httpx.Client,
    json_body: CreateUserRequest,
) -> Response[Union[User, Error]]:
    kwargs = _get_kwargs(
        client=client,
        json_body=json_body,
    )

    response = client.post(
        **kwargs,
    )

    return _build_response(response=response)
  1. The Client could wrap an httpx.Client which allows you to add convenience methods as needed, and stay in control of the Client object itself. This abstraction layer offers protected variation, but wouldn't be used for anything right now - headers, timeouts, and cookies can all be configured directly on an httpx.Client. However this need could also be met with configuration values passed directly to each API operation.
def sync_detailed(
    *,
    client: Client,
    json_body: CreateUserRequest,
) -> Response[Union[User, Error]]:
    kwargs = _get_kwargs(
        client=client.httpx_client,
        json_body=json_body,
    )

    response = client.httpx_client.post(
        **kwargs,
    )

    return _build_response(response=response)
  1. Keep the Client and proxy calls (with __getattr__) to an inner client, or typecheck client on each API operation to see if you've got a Client or httpx.Client. This allows them to be used interchangeably in API operations. This one's the most fragile and doesn't offer any advantages at the moment.

Of course, this would all apply to AsyncClient for the asyncio calls.

Additional context

Happy to send a PR, can do it pretty quickly. Am looking to use this in production, and would love to stay on (and contribute to) mainline rather than a fork!

kalzoo avatar Sep 30 '20 03:09 kalzoo

Thanks for suggestion @kalzoo, I really appreciate you taking the time to think this through. Here are the concerns I have initially:

  1. While this decreases internal coupling to httpx, it couples our API to httpx which I've been trying to avoid. Right now, we could swap out the internals without it being a breaking change for generated clients. While it's not on the roadmap, I can see a world in which someone wants to generate a client that uses requests instead of httpx to avoid an extra dependency. Maybe even generate a client which supports either as optional dependencies and will use whatever's installed.
  2. Today, users of the client don't need to know anything about httpx (or even that it exists) to use the generated client. Making them explicitly provide a client changes that.
  3. We do a couple things in the generated Client which using only an httpx.Client would not get us:
    1. Storing the base_url to use as a prefix in all endpoints.
    2. Setting the "Authorization" header. A user could do this themselves but the objective is really to have AuthenticatedClient be generated with whatever security mechanism the OpenAPI document specified so that callers don't have to worry about whether it's a header or cookie or what the name of it is.

So I think those concerns rule out the first option of using httpx.Client directly. The second option (including an httpx.Client in Client could definitely work and get us what we need, we'll just have to think about how to design the API in a way that makes sense.

It seems to be that the only really safe way to use httpx.Client is in a context manager. Right now I'm thinking about maybe something like this:

Simple use case, same generated API as today, user just calls the function

# No connection pooling, just like today
result = something.sync(client=client)
other = other.sync(client=client)

User wants connection pooling or to manage httpx themselves

with client as httpx_client:  # httpx_client is some new HttpxClient which inherits from Client
    httpx_client.httpx  # This is httpx.Client so do whatever you want
    result = something.sync(client=httpx_client)
    other = other.sync(client=httpx_client)

Generated functions look like this

def sync(*, client: Client):
    with client as _client:
        # if client was the base Client, this gives you a new HttpxClient which manages connection internally
        # If client was already an HttpxClient (user controlling it themselves) then it returns itself
        response = _client.httpx.post()

Then for async variants I think we can declare __aenter__ and __aexit__ and use those instead to return an AsyncHttpxClient maybe? I'll have to play with it to see what makes the most sense.

Anyway, that's generally what I'm thinking so far. Definitely let me know what you think!

dbanty avatar Sep 30 '20 16:09 dbanty

Thanks for the thoughtful reply!

You bet - I realized I had forgotten base_url shortly afterwards. Option 1 instead could be that all api calls - sync and such - would take another parameter, maybe ClientConfig, which carries that information separately from the httpx client. IMO Option 2 is better in that case.

You make a good point that we can't just generate an httpx.Client and stick that into an AuthenticatedClient because it would be too easy for the caller not to clean up/shutdown that inner client. And another good point that the AuthenticatedClient is simple now but won't stay that way as more schema auth methods are supported.

Another possibility, then: accepting a callback to build a client. Thus, the Client would carry an implementation which could be overridden by the user, but it will remain within a context manager by default. I'll try that out, and if it turns out alright I'll send a PR soon.

kalzoo avatar Oct 06 '20 19:10 kalzoo

A couple quick points:

  • Seems httpx.Client does support base_url option. Am I misunderstanding something here?
  • httpx.Client also supports passing headers. That being the case, couldn't we effectively defer authentication to the user?

A naive, simple approach would be to add an additional method, similar to what endpoint_module.pyi::_get_kwargs already does:

import httpx  # We're already importing httpx so this doesn't seem like a big deal.

# ...

def httpx_request({{ arguments(endpoint) | indent(4) }}) -> Response[{{ return_string }}]:
    {{ header_params(endpoint) | indent(4) }}

    {{ query_params(endpoint) | indent(4) }}

    {{ json_body(endpoint) | indent(4) }}

    response = client.request(
        {{ endpoint.method }},
        {{ endpoint.path }},
        {% if endpoint.json_body %}
        json={{ "json_" + endpoint.json_body.python_name }},
        {% endif %}
        {% if endpoint.query_parameters %}
        params=params,
        {% endif %}
        {% if endpoint.form_body_reference %}
        "data": asdict(form_data),
        {% endif %}
        {% if endpoint.multipart_body_reference %}
        "files": multipart_data.to_dict(),
        {% endif %}
    )

    return _build_response(response=response)

If this method is provided in addition to existing signatures, nothing breaks and httpx dependency is not pushed off to the client (they can use the existing methods). That said, if you no longer want to support httpx at a future date, the external API breaks, requiring a major version bump.


Tangentially, I think it's possible users may want customize other parts of _build_response, for instance, it may be nice raise Error rather than return it. My point here is less about this particular point of view than demonstrating user opinions differ based on their use case. Rather than taking stance on these ambiguous issues in this repo, it may be nice to give user more tools for customization.

I think JupyterHub provides some potential guidance here. They allow users to customize their UI Jinja templates. They accomplish this by:

  • allowing the user to pass a custom templates directory. When the JupyterHub server is rendering a template, it first looks to see if the user passed a custom templates directory, checks to see if the user has provided a custom template for the template path in question, and falls back to the default template if not.
  • making extensive use of Jinja template blocks. This mitigates the potential for the user having to copy and paste a bunch of boilerplate. They can customize a small, focused part of the template and leave the rest in tact. This may not be as valuable here, because the JupyterHub templates are fairly large html pages.
{% extends "templates/spawn_pending.html" %}

{% block message %}
{{ super() }}
<p>Patience is a virtue.</p>
{% endblock %}

I'm not seeing Jinja blocks in this repo, so that second point may be a bigger lift than what's worthwhile.

That said, providing a custom templates path seems like a fairly straightforward change, giving users more customization power and reducing responsibility of this repo to take a stance on ambiguous issues.

erichulburd avatar Oct 29 '20 18:10 erichulburd

@dbanty Let me know if this sort of change is welcome. Doesn't break the existing API and should allow users to workaround any particular template implementations in this repository: https://github.com/triaxtec/openapi-python-client/pull/231

erichulburd avatar Nov 03 '20 00:11 erichulburd

@erichulburd I'm absolutely on board with having users be able to provide their own templates for either partial or full override of the generated client. It was brought up before in #171 though I think I only discussed particulars briefly. I'm going to leave a more thoughtful response on the matter in #231 .

While custom templates are great for a wide variety of things, for this issue- I do think we still want to utilize httpx clients more effectively in the default template to benefit from connection pooling.

dbanty avatar Nov 03 '20 16:11 dbanty

+1

I do wait for this Enhancement , Here is my usecase.

I like to do deep debugging with the clients generated using openapi-python-client. I would like to see request, response objects for every api call is logged/ dumped for analysis.

From the httpx documentation, its possible using Event Hooks (https://www.python-httpx.org/advanced/).

But I could not successfully able to hook the events.


def log_httpx_request(request: httpx.Request) -> None:
    print ("*** ", type(request))
    print(f"+++Request event hook: {request.method} {request.url} - Waiting for response")


def log_httpx_response(response: httpx.Response) -> None:
    print ("***", type(response))
    request = response.request
    print(f"+++Response event hook: {request.method} {request.url} - Status {response.status_code}")

...
    client = Client(
        base_url="https://codebeamer.xxx.com/cb",
        headers={"Authorization": "Basic " + base64.b64encode(basic_auth_data).decode("ascii")},
        timeout=10.0,
    )

    client.event_hooks={'request': [log_httpx_request], 'response': [log_httpx_response]}

    version_data = get_server_version.sync(client=client)
...

HTTPx event hook is not called at all. Looks like event hooks work with httpx client object but the generated code directly uses "httpx.request" helper function. Please provide a way to do a full debugging / event hooks.

Environment:

OS: Ubuntu 22.04.1 LTS
Python Version: Python 3.10.4
openapi-python-client version 0.11.5

bakkiaraj avatar Sep 13 '22 05:09 bakkiaraj

I know a ton of use-cases are waiting on this issue—so if anyone wants to see my general thinking around it, you can check out #775. This will be a breaking change, sometimes subtly, to do it how I think is the "right" way (encourage consumers to always use context managers)

dbanty avatar Jun 29 '23 23:06 dbanty