envoy icon indicating copy to clipboard operation
envoy copied to clipboard

Add command line flag to skip hot restart stats transfer

Open ravenblackx opened this issue 1 year ago • 4 comments

Commit Message: Add command line flag to skip hot restart stats transfer Additional Description: We had a number of issues with production servers having their main thread stuck trying to get stats from the parent instance. We're not entirely sure what the cause was, but we know stats can get big, we know that's where it was stuck, and we know there was 3 minutes of spare time during which the drain should have completed before the parent instance was hard-killed. It's possible that an out of memory occurred, it's possible the stream had a problem with the volume of data, it's possible there's a bug that causes the parent instance to lock up or crash during the stats retrieval, but one thing we can be confident of is that if we don't go down that branch at all then none of those things will happen. Risk Level: Extremely small; no change to the usual behavior, only a change if a new command-line flag is set. Testing: Minimal. The coverage is the same as it was before, or very slightly better. Docs Changes: Added documentation for the command line option. Release Notes: Yes. Platform Specific Features: Not relevant to Windows because hot restart isn't supported there.

ravenblackx avatar May 09 '24 21:05 ravenblackx

As a reminder, PRs marked as draft will not be automatically assigned reviewers, or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/34069 was opened by ravenblackx.

see: more, trace.

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/). envoyproxy/api-shepherds assignee is @abeyad CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/34069 was opened by ravenblackx.

see: more, trace.

/wait

abeyad avatar May 10 '24 13:05 abeyad

/retest

ravenblackx avatar May 10 '24 17:05 ravenblackx

/assign @yanavlasov

abeyad avatar May 13 '24 16:05 abeyad

nevermind, unassigned Yan as I realized Joshua is already on the reviewer list

abeyad avatar May 13 '24 16:05 abeyad

One possible cleanup would be to add HotRestartOptions as a class and pass that around to all the hot-restart classes rather than all the settings as separate params.

Yeah, I think 3 options is the threshold at which I would do that, so, next time if there is one!

ravenblackx avatar May 13 '24 17:05 ravenblackx