Add keepalive_requests and keepalive_time parameters for connections
Is your feature request related to a problem?
In a Kubernetes environment, when pods behind a ClusterIP Service are scaled up, the newly created pods receive almost no requests.
Describe the solution you'd like
When Kubernetes pods are scaled up, the number of connections to new and old pods should remain relatively balanced.
Describe alternatives you've considered
For example, nginx sets an upper limit on the number of requests served over each connection and a limit on each connection's total lifetime (the keepalive_requests and keepalive_time directives). Once either limit is exceeded, the connection is closed, so new pods of the service also get the opportunity to receive requests.
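For illustration, a hypothetical sketch of what such parameters might look like on aiohttp's TCPConnector; only keepalive_timeout exists today, and the commented-out names merely mirror the nginx directives:

```python
import aiohttp

async def make_connector() -> aiohttp.TCPConnector:
    # Hypothetical API sketch, not implemented in aiohttp: the commented-out
    # parameters mirror nginx's keepalive_requests / keepalive_time directives.
    return aiohttp.TCPConnector(
        keepalive_timeout=15,       # existing: close after 15 s of idle time
        # keepalive_requests=1000,  # hypothetical: close after N requests
        # keepalive_time=3600.0,    # hypothetical: close after 1 h total lifetime
    )
```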
Related component
Client
Additional context
When I was stress testing the backend service, after the pods were scaled up, the CPU usage of the new and old pods differed greatly because the new pods had far fewer connections than the old ones.
Code of Conduct
- [x] I agree to follow the aio-libs Code of Conduct
This doesn't look related to aiohttp.
Hi @webknjaz, I think when the connection pool is large enough and requests arrive continuously (with gaps much shorter than keepalive_timeout), requests may always reuse old connections, and the aiohttp client rarely gets the opportunity to establish new connections to new server instances (pods in k8s), resulting in load imbalance. If connections were re-established periodically or after a certain number of requests, as nginx does, this problem would be solved.
Are you saying you're actually using aiohttp and this is about the client side?
@webknjaz Yes, I use an aiohttp ClientSession (with the default keepalive_timeout of 15 s); the server is built with FastAPI and Uvicorn and its keepalive timeout is 90 s. I think the imbalance in the number of connections on the server side is caused by the aiohttp client not re-establishing connections regularly, so requests keep reusing the old connections.
Have you tried tweaking the pool size or setting up force close and similar settings? https://docs.aiohttp.org/en/stable/client_advanced.html#limiting-connection-pool-size / https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp-client-reference-connectors
@webknjaz Yes, I have tried force_close, but it hurts performance because every request has to establish a new connection. I don't think the settings that limit the number of connections are related to this problem.
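For context, a minimal sketch of the two connector configurations being discussed (standard TCPConnector parameters; values shown are aiohttp's defaults):

```python
import aiohttp

async def make_session(force_close: bool = False) -> aiohttp.ClientSession:
    if force_close:
        # Disables keep-alive entirely: every request opens and closes its
        # own connection, which spreads load across pods but adds TCP/TLS
        # setup overhead to each request.
        connector = aiohttp.TCPConnector(force_close=True)
    else:
        # Default behaviour: idle connections are kept for keepalive_timeout
        # seconds (15 s by default) and reused, so new pods rarely see new
        # connections while traffic is continuous.
        connector = aiohttp.TCPConnector(keepalive_timeout=15)
    return aiohttp.ClientSession(connector=connector)
```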
Honestly, there's not enough context to make any more guesses.
cc @Dreamsorcerer?
I think keepalive_timeout is the time the connection can be idle for. If the connection is not idle, then it won't close. Looks like there's a limit_per_host, though that's not exactly what you're looking for either.
We also now have a socket_factory, so if you can figure out any socket-level flags that achieve what you want, that would work. But, at the aiohttp level we don't have anything equivalent of those nginx options.
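For completeness, a minimal sketch of the limit_per_host setting mentioned above; it caps concurrent connections per host but does not age connections out:

```python
import aiohttp

async def make_session() -> aiohttp.ClientSession:
    # limit caps the total pool size; limit_per_host caps concurrent
    # connections per (host, port, is_ssl) endpoint.  Neither setting
    # ages connections out, so by itself this does not redistribute
    # load to newly added pods.
    connector = aiohttp.TCPConnector(limit=100, limit_per_host=10)
    return aiohttp.ClientSession(connector=connector)
```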
Thanks @Dreamsorcerer, but socket-level flags cannot solve this problem.
@webknjaz @Dreamsorcerer I drew a picture to describe this issue. When the connection pool already has enough connections and the server is scaled up, the aiohttp client never establishes new connections to the new service instances (pod4 and pod5). nginx's mechanism of periodically closing keep-alive connections solves this problem.
Yes, I already understood the problem. There's no feature in aiohttp to resolve that, except closing the connections on an event (as I'm assuming you control both sides and could therefore send a message to alert when a new deployment occurs).
@Dreamsorcerer Yes, relying on external events lets the client learn promptly about changes in the number of server instances, but it is more complicated. It would be simpler if the client could proactively close connections, like nginx does. So would you consider adding such a feature (e.g. keepalive_requests or keepalive_time) to solve this kind of problem in the future?
I suspect it could have an impact on performance. If you can implement it without a noticeable impact on performance, then we'll certainly consider merging such a change.
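In the meantime, one application-level approximation of the requested behaviour is a wrapper that recycles its session after a request count or age limit. A minimal sketch, assuming the application controls when requests are in flight (the class name and thresholds are illustrative, not an aiohttp API):

```python
import time
from typing import Optional

import aiohttp

class RecyclingSession:
    """Approximates nginx's keepalive_requests / keepalive_time on the
    client side: replace the whole connection pool after a request count
    or an age limit so fresh connections (which can land on newly
    scaled-up pods) get opened."""

    def __init__(self, max_requests: int = 1000, max_age: float = 3600.0):
        self._max_requests = max_requests
        self._max_age = max_age
        self._session: Optional[aiohttp.ClientSession] = None
        self._requests = 0
        self._created = 0.0

    async def get(self, url: str, **kwargs) -> aiohttp.ClientResponse:
        session = await self._current_session()
        self._requests += 1
        return await session.get(url, **kwargs)

    async def _current_session(self) -> aiohttp.ClientSession:
        now = time.monotonic()
        expired = self._session is not None and (
            self._requests >= self._max_requests
            or now - self._created > self._max_age
        )
        if expired:
            # Simplified: assumes no requests are still in flight here.
            await self._session.close()
            self._session = None
        if self._session is None:
            self._session = aiohttp.ClientSession()
            self._requests = 0
            self._created = now
        return self._session

    async def close(self) -> None:
        if self._session is not None:
            await self._session.close()
```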
I'm a little confused as to why nginx can't send TCP FINs (half-close the streams). That would close the underlying TCP connections while letting the remaining HTTP responses reach the client, right?
I don't think nginx is involved here. nginx settings were just being used as an example of what they wanted aiohttp to do.
Those settings limit the number of requests or amount of time that a keepalive connection will be used for and then will close them. In the context of nginx, that is the connection from nginx (acting as a client) to the backend. So, they want the same behaviour from aiohttp when connecting directly to the backend.
These settings would help to ensure that connections are recycled and thus help distribute the load over multiple backends as new backends are scaled up.
Based on the descriptions so far, I assume there is no proxy server like nginx that holds connections directly to the backends (which would easily solve the issue). Without adding this feature to aiohttp, they would need to trigger the connections to close (from either end) when receiving a notification that a new backend has been started.
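A minimal sketch of that event-driven approach, assuming some external mechanism (a k8s watch, a deploy hook, etc.) delivers the notification; nothing here is an aiohttp API:

```python
from typing import Optional

import aiohttp

class SharedClient:
    """Keeps one shared session and replaces it when notified that new
    backend pods exist.  How the notification is produced is outside
    aiohttp and is only assumed here."""

    def __init__(self) -> None:
        self._session: Optional[aiohttp.ClientSession] = None

    def session(self) -> aiohttp.ClientSession:
        if self._session is None:
            self._session = aiohttp.ClientSession()
        return self._session

    async def on_scale_up(self) -> None:
        # Drop the pooled connections so subsequent requests reconnect
        # (and can land on the new pods).  Simplified: assumes no requests
        # are in flight on the old session at this point.
        old, self._session = self._session, None
        if old is not None:
            await old.close()
```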
Ah, I didn't realize they meant the client side of the reverse proxy setup.
FWIW, my understanding is that k8s distributions can be configured to scale the old pods down, sending several time-spaced signals. First SIGINT, then SIGTERM, and some time later SIGKILL. And in such environments, apps in the pods should handle the signals gracefully, shutting down connections, cleaning up resources, etc.
Containers not propagating signals isn't something new and unsolvable:
- https://hynek.me/articles/docker-signals/
- https://github.com/fastapi/fastapi/discussions/10609#discussioncomment-8494216
Yes, but in this case, they are simply scaling up, so they have no need to shut down the old ones. Otherwise, every time they scale up, they'd need to deploy n+1 pods and then scale down n pods in order to force the redistribution of connections.
Ah, fair enough!