Envoy leaks memory under high packet loss when serving HTTPS
Is this the right place to submit this?
- [X] This is not a security vulnerability or a crashing bug
- [X] This is not a question about how to use Istio
Bug Description
We have observed that with Istio 1.19 and later, Envoy leaks memory when handling HTTPS traffic over a network with a high packet loss rate.
Our investigation showed that Envoy's "envoy_listener_downstream_pre_cx_active" metric rises in step with the growth in memory usage.
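For anyone who wants to check the same correlation on their own proxies, here is a minimal sketch that pulls this gauge out of Prometheus-format stats text. The helper name `pre_cx_active` and the sample text are illustrative assumptions, not output captured from our environment, and the label parsing is deliberately simplistic.

```python
import re
from typing import Dict

METRIC = "envoy_listener_downstream_pre_cx_active"

# Matches "name{labels} value" or "name value" in Prometheus text format.
# Simplification: label values containing "}" are not handled.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)


def pre_cx_active(stats_text: str) -> Dict[str, float]:
    """Map each label set to the current value of the pre-connection gauge."""
    result: Dict[str, float] = {}
    for line in stats_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None or match.group("name") != METRIC:
            continue
        result[match.group("labels") or ""] = float(match.group("value"))
    return result


# Made-up sample for illustration only; real values will differ.
SAMPLE = '''\
# TYPE envoy_listener_downstream_pre_cx_active gauge
envoy_listener_downstream_pre_cx_active{listener_address="0.0.0.0_8443"} 1234
envoy_listener_downstream_cx_active{listener_address="0.0.0.0_8443"} 7
'''

print(pre_cx_active(SAMPLE))
```

A value that keeps climbing while regular active connections stay flat would be consistent with sockets stuck in the listener filter stage.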
Digging further, we found this Istio PR, which sets listener_filters_timeout to 0s. That setting disables the timer Envoy uses to enforce a timeout on listener filter execution.
The logic here is flawed: the timer should not be disabled as long as the listener has any listener filters. In our scenario, HTTPS traffic requires the TLSInspector filter, which has to keep reading request data until it has parsed the complete TLS Client Hello.
In an environment with severe packet loss, the server may never receive a complete TLS Client Hello. With no timeout left to close them, the corresponding ActiveTcpSocket objects in Envoy are never reclaimed, which results in a memory leak:
https://github.com/envoyproxy/envoy/blob/5dd33ce39e2977cb85a9c72dbf742e10c8483257/source/common/listener_manager/active_tcp_socket.cc#L153-L154
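To make the effect concrete, below is a toy model of the behavior described above. It is not Envoy's code: `PendingSocket`, `ListenerModel`, and `simulate` are hypothetical names, and the traffic numbers are assumptions chosen for illustration. The only Envoy behavior it relies on is that a listener_filters_timeout of 0 disables the timeout, while Envoy's documented default is 15s.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PendingSocket:
    """Hypothetical stand-in for an accepted socket still in listener filters."""
    accepted_at: float
    client_hello_complete: bool = False


@dataclass
class ListenerModel:
    # 0 means "no timeout", mirroring listener_filters_timeout: 0s.
    listener_filters_timeout: float
    pending: List[PendingSocket] = field(default_factory=list)

    def accept(self, now: float, client_hello_complete: bool) -> None:
        self.pending.append(PendingSocket(now, client_hello_complete))

    def tick(self, now: float) -> None:
        """Drop sockets that finished inspection or exceeded the timeout."""
        survivors = []
        for sock in self.pending:
            if sock.client_hello_complete:
                continue  # becomes a real connection; no longer pending
            timed_out = (
                self.listener_filters_timeout > 0
                and now - sock.accepted_at >= self.listener_filters_timeout
            )
            if not timed_out:
                survivors.append(sock)
        self.pending = survivors


def simulate(timeout: float, seconds: int = 60, stalled_per_second: int = 10) -> int:
    """Return how many sockets are still pending after `seconds` of traffic.

    Every connection is assumed to stall before its Client Hello completes,
    which models the severe packet loss case.
    """
    listener = ListenerModel(listener_filters_timeout=timeout)
    for now in range(seconds):
        for _ in range(stalled_per_second):
            listener.accept(float(now), client_hello_complete=False)
        listener.tick(float(now))
    listener.tick(float(seconds))
    return len(listener.pending)


print("timeout disabled:", simulate(0.0))
print("timeout 15s:", simulate(15.0))
```

With these assumed numbers, the model with the timeout disabled retains all 600 stalled sockets, while the 15s timeout caps the pending set at 140. In the first case the count grows without bound for as long as stalled connections keep arriving, which matches the memory trend we see.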
Version
control plane version: >= 1.19
Additional Information
No response