Configurable max connections for ingress-gateway

Open arushi315 opened this issue 3 years ago • 9 comments

Team, I have a service deployed in a mesh, and I am using an ingress gateway to route traffic from outside to my service. My service serves HTTP traffic over long-running connections, and a single instance can have up to 25k open connections.

I am simulating 20k open connections for performance evaluation and running into this error: Sending local reply with details upstream_reset_before_response_started{overflow} and receiving 503 response for the request.

Here is a diagram which shows the communication flow: [image: Screen Shot 2022-06-06 at 3 37 15 PM]

As I understand, I would need to update the max connection configuration at two places:

  • connect-proxy (running as a sidecar task for Service A)
  • Ingress gateway

Here is the configuration that I have tried, with no luck resolving the issue:

  • Configuration for connect-proxy:
        connect {
          sidecar_service {
            proxy {
              config {
                local_request_timeout_ms = 0
                envoy_local_cluster_json = <<EOF
{
  "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
  "name": "local_app",
  "type": "STATIC",
  "connect_timeout": "5s",
  "circuit_breakers": {
    "thresholds": [
      {
        "priority": "DEFAULT",
        "max_connections": 10000,
        "max_requests": 10000
      }
    ]
  },
  "load_assignment": {
    "cluster_name": "local_app",
    "endpoints": [
      {
        "lb_endpoints": [
          {
            "endpoint": {
              "address": {
                "socket_address": {
                  "address": "127.0.0.1",
                  "port_value": 2001
                }
              }
            }
          }
        ]
      }
    ]
  }
}
EOF
              }
            }
          }
        }

I was able to verify the max connection change in the config_dump on the connect-proxy.

  • Configuration for ingress-gateway
consul config read -kind service-defaults -name ingress
{
    "Kind": "service-defaults",
    "Name": "ingress",
    "TransparentProxy": {},
    "MeshGateway": {},
    "Expose": {},
    "UpstreamConfig": {
        "Defaults": {
            "Limits": {
                "MaxConnections": 10000,
                "MaxPendingRequests": 512,
                "MaxConcurrentRequests": 10000
            },
            "MeshGateway": {}
        }
    }
}

The config_dump on the ingress gateway did not reflect this change.
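One way to compare the two proxies is to pull the circuit-breaker thresholds out of each Envoy config_dump. The following is a minimal sketch, assuming the dump has already been parsed from JSON into a dict and follows Envoy's v3 admin layout (a `ClustersConfigDump` section with `static_clusters` and `dynamic_active_clusters`). The function name `cluster_limits`, the sample data, and the cluster name `service-a` are hypothetical and only for illustration.

```python
# Type URL Envoy uses for the clusters section of /config_dump.
CLUSTERS_DUMP_TYPE = "type.googleapis.com/envoy.admin.v3.ClustersConfigDump"


def cluster_limits(config_dump):
    """Return {cluster name: circuit-breaker thresholds}; [] means none are set."""
    limits = {}
    for section in config_dump.get("configs", []):
        if section.get("@type") != CLUSTERS_DUMP_TYPE:
            continue
        entries = section.get("static_clusters", []) + section.get(
            "dynamic_active_clusters", []
        )
        for entry in entries:
            cluster = entry.get("cluster", {})
            breakers = cluster.get("circuit_breakers", {})
            limits[cluster.get("name", "<unnamed>")] = breakers.get("thresholds", [])
    return limits


# Hypothetical, heavily trimmed dump for illustration only (not real output).
sample_dump = {
    "configs": [
        {"@type": "type.googleapis.com/envoy.admin.v3.BootstrapConfigDump"},
        {
            "@type": CLUSTERS_DUMP_TYPE,
            "static_clusters": [
                {
                    "cluster": {
                        "name": "local_app",
                        "circuit_breakers": {
                            "thresholds": [
                                {
                                    "priority": "DEFAULT",
                                    "max_connections": 10000,
                                    "max_requests": 10000,
                                }
                            ]
                        },
                    }
                }
            ],
            "dynamic_active_clusters": [{"cluster": {"name": "service-a"}}],
        },
    ]
}

for name, thresholds in cluster_limits(sample_dump).items():
    print(name, thresholds if thresholds else "no explicit thresholds")
```

A cluster that reports no explicit thresholds is presumably still running with Envoy's default limit, which is the situation described here for the ingress gateway.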

Is this configuration even supported? If it is not currently supported, is there any workaround that would at least let me test it?

Appreciate any assistance on this ticket.

Versions: Consul 1.10.8, Nomad 1.2.6

Related ticket: https://github.com/hashicorp/consul/issues/12373

arushi315 avatar Jun 06 '22 20:06 arushi315

Hey @arushi315

Sorry to hear you're experiencing this issue. Unfortunately, we don't currently support escape hatch overrides for the Envoy proxy in ingress gateways. We are tracking this feature in https://github.com/hashicorp/consul/issues/8722, though, so I encourage you to add a 👍 to that issue to throw your support behind it.

Have you tried spinning up multiple instances of your ingress gateway to spread out the connections?

EDIT: Escape hatch overrides are supported for cluster config, just not listeners.

Amier3 avatar Jun 08 '22 19:06 Amier3

Thanks @Amier3 for the response. A single instance of my service can support up to 25k connections, and we will have multiple instances of the service running. With an ingress gateway only able to support 1024 connections, bringing up a corresponding number of ingress instances is neither feasible nor ideal.
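To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes each ingress gateway instance is capped at 1024 upstream connections and that connections spread evenly across instances; both are simplifications, and the helper name `gateways_needed` is made up for this example.

```python
import math


def gateways_needed(target_connections, per_gateway_limit=1024):
    """Minimum gateway instances if each one is capped at per_gateway_limit."""
    return math.ceil(target_connections / per_gateway_limit)


# 20k is the load being simulated; 25k is the capacity of one service instance.
for target in (20_000, 25_000):
    print(target, "connections ->", gateways_needed(target), "gateway instances")
```

Under these assumptions, the number of gateways grows linearly with every additional service instance, which is why scaling out the ingress tier is not practical here.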

Is there any configuration workaround that we could explore in the meantime?

arushi315 avatar Jun 08 '22 19:06 arushi315

Hi @Amier3, just checking back to see if you have any suggestions for a workaround in the meantime.

arushi315 avatar Jun 21 '22 20:06 arushi315

Hey @arushi315

Thanks for the ping. I would really like to help you find a workaround for this. I saw you put your Nomad version in the original post. Are you using Nomad to run these apps? Knowing whether it's Nomad, VMs, or Kubernetes might help me narrow down some options.

Amier3 avatar Jun 23 '22 20:06 Amier3

Hi @Amier3, thanks for getting back to me. I am using Nomad to run all the services.

arushi315 avatar Jun 23 '22 21:06 arushi315

Hey @arushi315

Sorry for the delay. Unfortunately, I wasn't able to find a workaround that doesn't involve using lots of replicas. One of the potential solutions I was looking into was using API gateway instead of ingress gateway, but we're still in the process of supporting API gateway for Nomad.

The good news is that I did get this issue on the roadmap, so keep an eye out for the max value to be increased significantly within the next couple of releases. In the meantime, I'd suggest experimenting with replicas to see if your apps perform as desired with ~5-10 replicas/5k-10k max connections per service.

I'm sorry we weren't able to figure out a workaround. Let me know if you have any more questions 👍

Amier3 avatar Jul 05 '22 15:07 Amier3

Thanks @Amier3 for your response. Increasing replicas is not going to be a feasible solution for us, so for the time being we will not deploy our service in the mesh and will figure out another architecture. Once this setting is configurable for ingress, we will re-evaluate and even look into API gateway.

arushi315 avatar Jul 05 '22 15:07 arushi315

@DerekStrickland Following up on https://github.com/hashicorp/nomad/issues/11392#issuecomment-1249584991, it would be great if you could include this one in your backlog too :-)

Lord-Y avatar Sep 18 '22 08:09 Lord-Y

I think the solution is to add a max_connection setting to the ingress-gateway config entry. This way, we can configure the maximum number of connections to the upstream connect_proxy. The default will be 1024 connections, as set by Envoy. For example:

 {
   Port = 18082
   Services = [
     {
       Name = "foo"
     }
   ]
   MaxConnections = 10000   # increase the max connections for this service
 },
 {
   Port = 18083
   Protocol = "http"
   Services = [
     {
       Name = "bar"
     }
   ]
   MaxConnections = 2000    # increase the max connections for this service
 }

huikang avatar Sep 20 '22 14:09 huikang

Hi @arushi315, we just merged a PR that addresses this: https://github.com/hashicorp/consul/pull/14749. It should go into our next series of patch releases.

david-yu avatar Oct 04 '22 15:10 david-yu