async-http-client icon indicating copy to clipboard operation
async-http-client copied to clipboard

#700 & #701: HTTPClient.shared a globally shared singleton & .browserLike configuration

Open weissi opened this issue 1 year ago • 4 comments

Offer HTTPClient.shared, a global singleton HTTPClient that doesn't need shutdown (because it lives as long as your application). This globally shared HTTPClient does not accept any special configuration and uses HTTPClient.Configuration.browserLike (also introduced in this PR).

  • fixes #700
  • fixes #701

weissi avatar Aug 12 '23 14:08 weissi

I'm somewhat worried about this tbh and am wordering if rephrasing these APIs a bit could help?

What if we documented "do a let http = HTTPClient.makeShared() when you need one" and at least then if necessary moving to a configured one is less painful than the pain from moving of a HTTPClient.shared sprinkled allover the codebase...?

The docs could literarily say that the simplest way is:

let http = HTTPClient.makeShared()

and you can just exit the process if you want to -- you may lose in flight requests or whatever. And then we explain, well, if you do care -- please hook into service lifecycle, (and here's how ...). If folks don't care they can literarily plop a global let http into their codebase and call it a day.

The "make" could function the same as this ".shared" but I wonder how we should be recommending this to be used, such that moving to "oh no, actually I do need to configure it" isn't a world of pain... Otherwise we'll end up allowing "configure the global shared one" which is not something we'd want I feel...

Note also that a global will become main actor isolated in Swift 6 most likely, so there will be a hop like that. The SDK will have the same issue, so yeah we'd be on equal footing, but I am wondering if we can't do better and recommend things like:

actor INeedHTTP {
  let http: HTTPClient = .shared // .makeShared() ?

}

to avoid un-necessary hops to the global actor...?


Either way, at least tracing in applications "will work" but Fabian's concern that parallel testing with asserting on traces will be impacted and not usable if shared instances are used.


So... I guess this is fine, but I'm not very happy about it but I see the value provided of the simplicity here -- I'd like to consider if we can at least recommend "to the global variable yourself" and change how we phrase it in order to make "move away from static global func everywhere" less painful when people will need to do so.

ktoso avatar Aug 15 '23 08:08 ktoso

Hi, Has there been any progress on this PR?

MartinLau7 avatar Jan 09 '24 13:01 MartinLau7

I'm eager for this change. It resolves the main issues I see people (including myself at times) run into with libraries that depend on AHC, and I feel that 99% of people will want the default configuration for AHC anyways. This change makes it much easier to set up libraries based on AHC for end-users, in addition to making it more accessible to send HTTP requests.

One issue I tend to run into during (Lambda) application setup is that I need to fetch secrets before booting the remainder of services. If anything errors during this phase, and AHC is not correctly shut down under failing circumstances, the deinit will trigger a crash before the error is caught and displayed to the user. This leads to very confusing scenarios to debug, and is very hard to understand for the less-than-expert Swift developer.

Joannis avatar Jan 24 '24 16:01 Joannis

I'm also looking forward to be able to use this, but I do still want the configurability.

Perhaps we can use some mechanism like swift-log's LoggingSystem bootstrap to let the shared HTTPClient be configurable?

Some libraries like the "aws-lambda" can provide APIs to users to be able to do a "pre-setup" so users can setup stuff like the LoggingSystem and HTTPClient as soon as they should. Otherwise if the user is in charge of the main function of the entrypoint they should be able to setup the HTTPClient soon enough.

MahdiBM avatar Jan 26 '24 17:01 MahdiBM

After reading a recommendation to switch over from URLSession to this package, I too would love to see this land!

gregcotten avatar Mar 01 '24 05:03 gregcotten

Just today was working with AHC and wondering—what is the best way to work with the client? It's not very clear from description. My initial assumption was shutdown is when an application stops working. But apparently from description one should use shutdown to prevent memory leaks? It's also somewhat not very clear what Shuts down the client and event loop gracefully. does really mean...

The thing is, it became very manual to check all creations/shutdowns of clients, even for our small service, wondering how it's for bigger apps... So this PR is very welcoming.

Though I have some questions:

  1. Won't there be any memory leak issue then? 🤔 Maybe missing something, not very familiar with the code tbh.
  2. What is advantage of using let httpClient = HTTPClient.shared rather than defining let httpClient = HTTPClient(eventLoopGroupProvider: .singleton) in your app?
  3. Reading through comments, think maybe having a function makeShared() as @ktoso mentioned is actually a better approach? Otherwise it's a global instance and can be misused or lead to unexpected errors (e.g. when you want shared per some system but end with shared between them).

Anyway, looking forward for PR to be merged! 🙂

akbashev avatar Mar 27 '24 20:03 akbashev

If you are writing an app, then using a singleton like this is almost certainly the best approach. The main reason to use non-singleton clients is to isolate state. This is useful if you have separate privacy domains (e.g. different credentials, TLS identities, cookies, etc.) or if you want to be sure you're starting from a clean slate when you're writing tests.

Otherwise, you should have long-lived clients and except in rare cases you should never shut them down.

Lukasa avatar Mar 28 '24 09:03 Lukasa

build timeout:

[519/522] Testing AsyncHTTPClientTests.StressGetHttpsTests/testStressGetHttps
[520/522] Testing AsyncHTTPClientTests.TestIdleTimeoutNoReuse/testIdleTimeoutNoReuse
[521/522] Testing AsyncHTTPClientTests.NoBytesSentOverBodyLimitTests/testNoBytesSentOverBodyLimit
Build timed out (after 10 minutes). Marking the build as failed.
$ ssh-agent -k

weissi avatar Apr 03 '24 12:04 weissi

@swift-server-bot test failed

weissi avatar Apr 03 '24 12:04 weissi

@swift-server-bot test this please

weissi avatar Apr 03 '24 12:04 weissi

Would love to see a release with this in. I'm just about to release the beta for Soto and this allows me to get rid of the requirement to shutdown the Soto client, as I don't have to manage an HTTPClient

adam-fowler avatar Apr 04 '24 08:04 adam-fowler

I'm just about to release the beta for Soto and this allows me to get rid of the requirement to shutdown the Soto client, as I don't have to manage an HTTPClient

I would caution against Soto relying solely on HTTPClient.shared. It's almost inevitable that you'll need to set up your own HTTP client with special configuration (maybe number of connections to host, connection pool size, ...).

If you design Soto to not have a shutdown, then Soto essentially becomes a singleton-only library which I don't think is a good and scalable design.

What I would recommend is to design Soto to have a correct lifecycle and make shutdown calls always required. But additionally you may offer your own SotoS3Client.shared or similar for the users that are happy with losing control over resources and losing configurability.

Personally I think it's really important that on the server side all important libraries have correct lifecycles and here are some of the reasons why:

  • Resource limit control (maybe you want to allow 100k connections at the same time, maybe only 1)
  • Resource control (by running shutdown you can be sure that all resources have been terminated, with singletons that's impossible)
  • Resource isolation (Often servers have more than one component in one address space. But frequently these components have a differing level of importance. For example you may have your service & and admin HTTP endpoint. There may come a time when you discover that your admin endpoint is eating up resources that your service needs. With correctly managed lifecycles and no singletons you can now configure your admin endpoint to get its own EventLoopGroup (just 1 thread), its own instances of HTTPClient (configured with smaller limits), ...)
  • other configuration like Network.framework vs. BSD sockets that may differ in different parts of a program

weissi avatar Apr 04 '24 10:04 weissi

I would caution against Soto relying solely on HTTPClient.shared. It's almost inevitable that you'll need to set up your own HTTP client with special configuration (maybe number of connections to host, connection pool size, ...).

Soto would not rely on the shared instance. You will still be able to provide your own HTTPClient instance. But I was mistaken I thought the only resources Soto relied on that needed a shutdown were controlled by HTTPClient that is not the case.

But I would like to be in the position where Soto is not controlling the lifecycle of of a HTTPClient. Currently the options for Soto are provide your own HTTPClient or Soto will create one for you. I would prefer it to be provide your own HTTPClient or Soto will use the shared instance.

adam-fowler avatar Apr 04 '24 10:04 adam-fowler

I would caution against Soto relying solely on HTTPClient.shared. It's almost inevitable that you'll need to set up your own HTTP client with special configuration (maybe number of connections to host, connection pool size, ...).

Soto would not rely on the shared instance. You will still be able to provide your own HTTPClient instance. But I was mistaken I thought the only resources Soto relied on that needed a shutdown were controlled by HTTPClient that is not the case.

But I would like to be in the position where Soto is not controlling the lifecycle of of a HTTPClient. Currently the options for Soto are provide your own HTTPClient or Soto will create one for you. I would prefer it to be provide your own HTTPClient or Soto will use the shared instance.

Yes, all are options. I'd just make sure that the API contracts are spelled with an always/never. Like in HTTPClient (I hope we removed exceptions):

  • If you call HTTPClient.init() you always have to call client.shutdown()
  • If you use HTTPClient.shared you never have/should to call shutdown()

In the past we tried this misguided EventLoopGroupProvider thing which was just not a good idea. The problem is that you end up with objects that you sometimes need to shut down, depending on what value you passed into the init, that just doesn't seem right.

I'd suggest a similar design in Soto: Either lifecycle or singleton, but not a combination thereof.


Aside: I would imagine that over time Soto will have to manage other resources that aren't just a HTTPClient too. If you already have a shutdown mechanism: easy: add it there. If not: screwed :|. And libraries of the size/shape of Soto will always need other resources in my experience. The only way out is to lean into singletons all the way but then we lose resource control which is fine for certain smaller applications but certainly not always true (+ it's a pain to test).

weissi avatar Apr 04 '24 10:04 weissi