Request/Response Buffering API support
Feature
I would like a direct API in Envoy Gateway to enable request and response buffering. Today this can only be achieved via EnvoyPatchPolicy, using either the Buffer filter (request buffering only) or the File System Buffer filter (request and response buffering).
I don't want to use EnvoyPatchPolicy in production, so I would prefer a first-class API in Envoy Gateway.
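For context, the EnvoyPatchPolicy workaround looks roughly like the sketch below: a JSONPatch that inserts the Buffer HTTP filter into the generated HTTP Connection Manager. This is a minimal sketch only; the Gateway name, the xDS listener name, and the JSON pointer path are placeholders and have to match the xDS that Envoy Gateway actually generates for your setup.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
  name: request-buffer
  namespace: default
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: eg                      # placeholder Gateway name
  type: JSONPatch
  jsonPatches:
    - type: "type.googleapis.com/envoy.config.listener.v3.Listener"
      name: default/eg/http       # placeholder xDS listener name
      operation:
        op: add
        # JSON pointer into the HTTP Connection Manager's filter list;
        # verify it against the generated xDS before applying.
        path: "/default_filter_chain/filters/0/typed_config/http_filters/0"
        value:
          name: envoy.filters.http.buffer
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.buffer.v3.Buffer
            max_request_bytes: 1048576   # buffer up to 1 MiB per request
```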
Use Case
One of our main reasons for setting up Envoy Gateway is to have a layer between our upstream servers and the AWS ALB that can buffer both requests and responses, to protect against slow-client denial-of-service attacks. The reasons are explained in detail in points 1 and 2 of the "What is it Good For?" section of the File System Buffer documentation:
- To shield a server from intentional or unintentional denial of service via slow requests. Normal requests open a connection and stream the request. If the client streams the request very slowly, the server may have its limited resources held by that connection for the duration of the slow stream. With one of the “always buffer” configurations for requests, the connection to the server is postponed until the entire request has been received by Envoy, guaranteeing that from the server’s perspective the request will be as fast as Envoy can deliver it, rather than at the speed of the client.
- Similarly, to shield a server from clients receiving a response slowly. For this case, an “always buffer” configuration is not a requirement. The standard Envoy behaviour already implements a configurable memory buffer for this purpose, that will allow the server to flush until that buffer hits the “high watermark” that provokes a request for the server to slow down.
Some of our upstream servers run on a request-per-thread model, so the above types of attacks are especially critical for us: they can quickly lead to server downtime.
Isn't this already supported? https://gateway.envoyproxy.io/docs/tasks/traffic/client-traffic-policy/#configure-downstream-per-connection-buffer-limit cc @yaelSchechter
@arkodg I think the bufferLimit field in ClientConnection defines the buffer size of the connection with the downstream (32KB by default, ref). I think it corresponds to per_connection_buffer_limit_bytes in config.listener.v3.Listener. My guess is based on https://github.com/envoyproxy/gateway/blob/84e348f3b90c9f0d7c690b38e8d355eb9b30d7ba/internal/gatewayapi/clienttrafficpolicy.go#L902 and https://github.com/envoyproxy/gateway/blob/84e348f3b90c9f0d7c690b38e8d355eb9b30d7ba/internal/xds/translator/listener.go#L155
The Buffer / File System Buffer filters, on the other hand, operate on the requests arriving over those connections: they define whether a request should be buffered in memory/on disk until the whole request (including the body) has been received, and the sizes they specify apply to requests, not to the connection buffer. Please correct me if I'm wrong here.
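Concretely, the connection-level knob is the Listener field in the fragment below (values illustrative), whereas the Buffer filter's max_request_bytes shown earlier caps an individual request rather than the connection:

```yaml
# Fragment of a raw config.listener.v3.Listener: this is what
# ClientTrafficPolicy's connection.bufferLimit appears to translate to.
# It bounds buffering per downstream connection; it does not hold a whole
# request before forwarding it upstream.
name: example_listener
per_connection_buffer_limit_bytes: 32768   # e.g. the 32KB default mentioned above
```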
maps to https://github.com/kubernetes/ingress-nginx/blob/main/docs/user-guide/nginx-configuration/annotations.md#custom-max-body-size in ingress-nginx
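For reference, that ingress-nginx knob is the proxy-body-size annotation; a minimal example with hypothetical resource names:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example                     # hypothetical Ingress name
  annotations:
    # maps to NGINX's client_max_body_size for this Ingress
    nginx.ingress.kubernetes.io/proxy-body-size: "8m"
spec:
  defaultBackend:
    service:
      name: example-svc             # hypothetical backend Service
      port:
        number: 80
```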
Started a PR for the API - https://github.com/envoyproxy/gateway/pull/5257
Is ClientTrafficPolicy the right place for this?
Glad this got picked up. We have been using EnvoyPatchPolicy to achieve this for now. Thanks @jukie.
I saw the proposed API. A few thoughts:
- Envoy Proxy supports enabling the buffer filter at the cluster level as well, and I feel EG should also support that. Doing this in CTP would not let us do that. Maybe BTP is the right place 🤔
- Regarding FileSystemBuffer: the FileSystemBufferFilterConfig (proto) says
This API feature is currently work-in-progress. API features marked as work-in-progress are not considered stable, are not covered by the threat model, are not supported by the security team, and are subject to breaking changes. Do not use this feature without understanding each of the previous points.
So I am not sure if it's safe to add support for it in EG. When I raised this issue I had also tried to use the file system buffer in my local standalone Envoy Proxy setup (without EG) and wasn't able to make it work. It's possible I did something wrong, but I'm pointing it out here for reference.
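For reference, an "always buffer in both directions" File System Buffer config would look roughly like the sketch below. The field names follow the v3 FileSystemBufferFilterConfig proto as documented, but given the work-in-progress status they may be inaccurate or change, so treat this as a sketch rather than a known-working config.

```yaml
# Sketch only: file_system_buffer filter with "always buffer" behavior for
# both directions. The filter is marked work-in-progress, so verify every
# field against the current docs before using it.
name: envoy.filters.http.file_system_buffer
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.file_system_buffer.v3.FileSystemBufferFilterConfig
  manager_config:
    thread_pool:
      thread_count: 1
  storage_buffer_path: /tmp/envoy-buffer      # where overflow spills to disk
  request:
    behavior:
      fully_buffer: {}                        # hold the whole request first
    memory_buffer_bytes_limit: 1048576        # spill to disk beyond 1 MiB
  response:
    behavior:
      fully_buffer: {}                        # hold the whole response first
    memory_buffer_bytes_limit: 1048576
```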
I think it probably makes sense to add to both?
Is my understanding of the Envoy implementation correct that enabling it at the Listener protects against slow clients sending requests, while enabling it at the Cluster protects against slow clients reading responses?
Updated my PR to drop the FileSystemBuffer filter and instead add the Buffer filter to both BackendConnection (BackendTrafficPolicy) and ClientConnection (ClientTrafficPolicy). Could use some bike-shedding on field names and descriptions.
@luvk1412 could you add some clarity to the above please? I'm not sure my logic is correct and I'm also wondering if file system buffer is the only way to get response buffering (docs link).
Currently I'm planning on updating my PR with support for request buffering via the regular buffer filter and attaching to listeners via CTP.
For the Envoy Gateway maintainers, is there a stance on WIP Envoy features like the FileSystem Buffer and whether they should be considered?
@jukie IMO keeping it in both CTP and BTP doesn't sound right, but I could be wrong. I would ideally have it in BTP and, if the BTP targets a cluster, use extensions.filters.http.buffer.v3.BufferPerRoute internally, and if it targets a listener, use extensions.filters.http.buffer.v3.Buffer, with all of this hidden from the end user. I also feel this should not be part of BackendConnection but rather sit at the root of the spec. The filter applies to each incoming request and has no relation to the connection itself (which is shared across multiple requests). It's more like the rateLimit/timeout/circuitBreaker settings, which apply per request and hence live at the root, IMO.
I would like @arkodg or someone from the EG team to weigh in with suggestions for the API design.
Regarding the FileSystem Buffer, there is an open issue (https://github.com/envoyproxy/envoy/issues/19026) which hasn't seen much progress in some time.
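At the raw Envoy level, the two translations described above would look roughly like the fragments below (filter placement and names are ultimately up to Envoy Gateway's translator; the route prefix and cluster name are hypothetical):

```yaml
# Listener-targeted policy -> Buffer in the HCM filter chain (applies to every route)
http_filters:
  - name: envoy.filters.http.buffer
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.buffer.v3.Buffer
      max_request_bytes: 1048576
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

# Route/cluster-targeted policy -> BufferPerRoute override on the matching route
routes:
  - match:
      prefix: /uploads
    route:
      cluster: upload-backend            # hypothetical cluster name
    typed_per_filter_config:
      envoy.filters.http.buffer:
        "@type": type.googleapis.com/envoy.extensions.filters.http.buffer.v3.BufferPerRoute
        buffer:
          max_request_bytes: 4194304     # larger 4 MiB limit for this route
```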
I think this should be part of the CTP as it's configuring the Client -> Envoy part, buffering client requests in Envoy, before proceeding to the Backend. Then as @luvk1412 mentioned use the Buffer when targeting a Gateway/Listener, otherwise use a BufferPerRoute filter.
Great, thanks!
Upon further reading I realised that a CTP can't target HTTPRoutes, only Gateways, so it wouldn't be possible to set up BufferPerRoute correctly there. I think @luvk1412 had it right all along 😁
Thanks for brainstorming on the location! CTP vs BTP: adding some more questions to help us narrow it down. @markwinter @luvk1412, hoping you can help answer these (without thinking about the implementation details :) )
- Which persona is likely to configure this? Platform engineers/admins, app devs, or both? If it's only the platform engineering team, then CTP sounds like the better place, since it can only target a Gateway; if it's both, then it should be BTP.
- Will these requestBuffer limits vary across apps/backends? If so, then BTP is the right place.
Which persona is likely to configure this? Platform engineers/admins, app devs, or both?
I would say both. In our setup today we have this attached to the gateway with a global buffering limit. But we have seen several use cases where some devs want to override the global value for their clusters; we just haven't been able to do so, because doing it per cluster via EnvoyPatchPolicy would get tedious compared to the single global patch we maintain today.
Will these requestBuffer limits vary across apps/backends?
According to the use case I explained above, the answer is yes.
Based on these use cases, I had initially felt that BTP might be the right place, @arkodg.
At the moment we're planning to enable this only for particular routes/applications. Since buffering can add latency, I think it also makes sense for it to be configurable per application, based on whether the application team needs it or not.
Thanks @markwinter @luvk1412, let's continue with BTP then.
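To make the outcome concrete, a BackendTrafficPolicy carrying a root-level request-buffer knob might look something like the sketch below. The requestBuffer/limit field names and the HTTPRoute target are hypothetical placeholders for illustration only; the actual schema is whatever lands in the PR.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: buffer-uploads
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: uploads                 # hypothetical route name
  # Hypothetical field: a root-level request-buffer setting, sitting next to
  # other per-request knobs like rateLimit/timeout rather than inside
  # BackendConnection.
  requestBuffer:
    limit: 1Mi
```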
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
Keeping this open to track the docs work.
@arkodg is this supported for GRPCRoute or only HTTPRoute? The docs seem to cover only HTTPRoute.