linkerd2 icon indicating copy to clipboard operation
linkerd2 copied to clipboard

Configurable Inbound/Outbound HTTP/1 Connection Idle Timeouts

Open txabman42 opened this issue 2 years ago • 6 comments

Set LINKERD2_PROXY_INBOUND_HTTP1_CONNECTION_POOL_IDLE_TIMEOUT and LINKERD2_PROXY_OUTBOUND_HTTP1_CONNECTION_POOL_IDLE_TIMEOUT during injection

This MR introduces the ability to configure inbound and outbound HTTP/1 connection pool idle timeouts on a per-service. Currently, linkerd2-proxy has default idle timeouts of 3 seconds for both inbound and outbound HTTP/1 connections.

For example, this change is beneficial for services requiring persistent TCP connections across operational rounds.

There was the intention to do something similar, as commented some time ago here https://github.com/linkerd/linkerd2/issues/9273#issuecomment-1241116227:

Proposal 2.13.0 Make this timeout configurable via the policy controller, controlled by an annotation set on a Server or Pod.

This MR aims to add the corresponding annotations in order to make it configurable via annotations in the pod.

txabman42 avatar Nov 21 '23 18:11 txabman42

Thanks @txabman42! Structurally, this change looks good but I want to make sure this won't conflict with future plans to make this configurable through the proxy API (as indicated by https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/app/src/env.rs#L286-L294). @olix0r do you have any thoughts on how near-term making this configurable through the API is?

adleong avatar Nov 21 '23 19:11 adleong

@olix0r could you take an eye on this MR?

txabman42 avatar Nov 27 '23 07:11 txabman42

@mateiidavid could you review it again? TY in advance

txabman42 avatar Dec 26 '23 07:12 txabman42

@txabman42 Happy new year! Think you can resolve the conflict and take a look at the failing test here? 🙂

kflynn avatar Jan 04 '24 15:01 kflynn

@kflynn Conflict resolved and tests fixed, and happy new year!!!

txabman42 avatar Jan 09 '24 08:01 txabman42

Hi, @txabman42! Thanks for your patience. It's taken us quite a bit of time to review this since we've been focused on upcoming releases. The change itself looks good, but I think we'd like to have a bit more time to think through the contract that we're introducing here. This includes the annotation names, the annotation key domain, and whether this should be configured globally, per-workload, or both.

In the mean time, since this clearly addresses a problem you're having, I wanted to suggest a stop gap solution. We've introduced a feature that allows us to use static configuration overrides (through environment variables). These would apply globally; you can use them to tweak the idle timeouts.

It's not documented but you can have a look at:

mateiidavid avatar Feb 09 '24 12:02 mateiidavid

@txabman42 going to close this PR since it's been open for a long time. Please let me know if the workaround I mentioned above is good for you. If you want this re-open, let us know. :)

mateiidavid avatar May 03 '24 10:05 mateiidavid