Graceful shutdown is not working as expected with default setup.

Open davem-git opened this issue 6 months ago • 13 comments

Description: I'm working on implementing envoy-gateway as a replacement for our nginx controller. I have some basic tests: a pod that returns a JSON block when an endpoint is hit. Using k6 as the test suite, I set up the following test.

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 50 }, // ramp-up to 50 users
    { duration: '6m', target: 50 }, // stay at 50 users
    { duration: '2m', target: 0 },  // ramp-down to 0 users
  ],
};

export default function () {
  http.get('<url>/'); // <url> is a placeholder for the redacted gateway URL
  sleep(1);
}

When I run this test and start a rollout restart of the envoy pods, I get the following errors:

WARN[0070] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49547-><valid public address>:443: read: connection reset by peer"
WARN[0070] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49570-><valid public address>:443: read: connection reset by peer"
WARN[0071] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49587-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49573-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49601-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49594-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49555-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49595-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49589-><valid public address>:443: read: connection reset by peer"
WARN[0082] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49572-><valid public address>:443: read: connection reset by peer"
WARN[0082] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49563-><valid public address>:443: read: connection reset by peer"
WARN[0083] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49566-><valid public address>:443: read: connection reset by peer"
WARN[0488] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49619-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49636-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49623-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49667-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49611-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49604-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49637-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49649-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49642-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49624-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49598-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49830-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49634-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49617-><valid public address>:443: read: connection reset by peer"
WARN[0502] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49643-><valid public address>:443: read: connection reset by peer"
WARN[0502] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49651-><valid public address>:443: read: connection reset by peer"
WARN[0503] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49630-><valid public address>:443: read: connection reset by peer"
WARN[0503] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49660-><valid public address>:443: read: connection reset by peer"
WARN[0503] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49613-><valid public address>:443: read: connection reset by peer"


When I do this on nginx I do not get these errors.

I added these to my custom proxy config and it seemed to fix the issue:

```yaml
shutdown:
  drainTimeout: 600s
  minDrainDuration: 60s
```
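For readers wondering where these settings live, here is a minimal sketch of a complete resource, assuming the "custom proxy config" is an EnvoyProxy resource from the `gateway.envoyproxy.io/v1alpha1` API. The `metadata` values are hypothetical placeholders, and the field comments are a best-effort reading of the API rather than official wording.

```yaml
# Sketch only: metadata values are hypothetical, not taken from this issue.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config        # hypothetical name
  namespace: envoy-gateway-system  # hypothetical namespace
spec:
  shutdown:
    # Upper bound on how long the proxy waits for open connections to drain.
    drainTimeout: 600s
    # Lower bound on how long the proxy keeps draining before it may exit.
    minDrainDuration: 60s
```

For this to take effect, the resource presumably has to be the one referenced by the GatewayClass through `parametersRef`, as with any other custom proxy configuration.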

However, there's no documentation on this. I happened to find it with kube-explain.

I'm on v1.0.1

davem-git · Aug 05 '24 18:08