envoy
Enhance ext_proc filter to support MXN streaming
This PR is for issue: https://github.com/envoyproxy/envoy/issues/32090. One of the use cases is compression performed by the external processing server.
It lets the side stream server buffer M request body chunks from Envoy, process them, and then send N chunks back to Envoy in STREAMED mode.
The ext_proc MXN streaming works this way:
- Enable MXN streaming by setting the body mode to MXN in the ext_proc filter config.
- Configure the trailer mode to SEND in the ext_proc filter config.
- Optionally, configure the header mode to SEND in the ext_proc filter config.
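The three steps above might look like the following filter config sketch. This is an assumption-laden illustration, not the PR's final API: the cluster name `ext_proc_server` is hypothetical, and `FULL_DUPLEX_STREAMED` is the enum value this body mode eventually landed under in the ext_proc `ProcessingMode` API, shown here in place of the "MXN" working name used in this thread.

```yaml
# Hedged sketch only: the cluster name and body-mode value are illustrative.
http_filters:
- name: envoy.filters.http.ext_proc
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor
    grpc_service:
      envoy_grpc:
        cluster_name: ext_proc_server
    processing_mode:
      request_header_mode: SEND                 # step 3 (optional)
      request_body_mode: FULL_DUPLEX_STREAMED   # step 1: the MXN body mode
      request_trailer_mode: SEND                # step 2 (required)
```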
With the above config, Envoy sends the headers (if header mode SEND is configured), the body, and the trailers (if present) to the side stream server as they arrive. The server can buffer the headers, all or part of the body (M chunks), and the trailers (if present), then stream the mutated headers, the mutated body (possibly split into N chunks), and the mutated trailers (if present) back to Envoy.
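The M-in/N-out rechunking described above can be illustrated with a plain Python sketch (no gRPC plumbing; `zlib` stands in for whatever transformation the server applies, and `mxn_compress` is a hypothetical helper name): the server buffers the M chunks it received, transforms the combined body, and re-splits the result into however many chunks it likes.

```python
import zlib

def mxn_compress(chunks, out_chunk_size=5):
    """Buffer all M input chunks, compress the combined body, and
    re-split the result into fixed-size output chunks. The number of
    output chunks N is independent of the number of input chunks M."""
    data = b"".join(chunks)  # server-side buffering of the M chunks
    compressor = zlib.compressobj()
    compressed = compressor.compress(data) + compressor.flush()
    # N output chunks, each at most out_chunk_size bytes
    return [compressed[i:i + out_chunk_size]
            for i in range(0, len(compressed), out_chunk_size)]

# M = 3 input chunks in; some other number of chunks comes back out,
# and the reassembled body decompresses to the original data.
m_chunks = [b"aaaa", b"aaaa", b"aaaa"]
n_chunks = mxn_compress(m_chunks)
assert zlib.decompress(b"".join(n_chunks)) == b"".join(m_chunks)
```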
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).
/assign @gbrail @htuch @jmarantz @tyxia @yanavlasov
@gbrail cannot be assigned to this issue.
/assign @tyxia
As codeowner for first pass.
The design is under internal review and discussion.
My opinion here is that we should remove the ext_proc internal buffering, which is essentially the ask in https://github.com/envoyproxy/envoy/issues/32090. But it is also good to hear other reviewers' opinions (i.e., this is not a gating factor if other reviewers approve it).
Adding wait here to avoid a daily ping on the maintainer on-call.
/wait
I agree; I think internal buffering and MxN complicate the model and make it harder to reason about.
IMHO, without an internal buffer, I am not sure how to support the case where the side stream server asks Envoy to send the original data as-is for certain sequences of data during the streaming. However, I am okay with putting this on wait.
I am not sure if no internal buffer, how to support the case if the side stream server ask Envoy to send the original data
IMHO the sidestream server can just send back the original data, as the original data has already been forwarded to the sidestream server.
The injected data is entirely under the sidestream server's control: add, remove, compress, or keep (even just keep certain parts), etc.
Asking the side stream server to send back the original data has network bandwidth and CPU processing implications. Is that a concern? IIUC, avoiding that is one of the reasons the Envoy ext_proc filter has this buffer in the first place.
To your initial point, I am wondering if we are in (or at least closer to) agreement that there is no hard blocker, from a functionality perspective, to removing the internal buffer? Besides, from a design perspective, the internal buffer is probably hard to reason about, as Josh pointed out.
Regarding performance, I agree it could help with the no-mutation case (though that case alone may be the wrong use of this mode, see point 3 below), but more broadly:
- First, removing internal buffering solves the one major perf issue: a large amount of data being buffered in Envoy. This issue is amplified in this mode/use case when the sidestream server wants to buffer data and delays its response.
- Second, transmission overhead depends on the customer use case. If they compress the data, the overhead is already smaller than on the existing request path. If they want to amplify the data, internal buffering will not help either.
- Third, using this mode with no mutation at all is probably the wrong use of it; observability mode exists for that purpose. That said, it is still supported and up to the customer's choice, i.e., it is not a blocking issue for removing the internal buffer.
@tyxia, thanks for the detailed comments!
For 3), I think there are use cases like security validation, where the ext_proc server examines the data content. If the data contains bad content, it sends back an immediate response telling Envoy to send a local reply and close the downstream client connection. If the data is legit, it sends back an empty body response telling Envoy to send the original data as-is. So, with the internal buffer, there is a network bandwidth and performance gain in such a case. IIUC, this case cannot use observability mode even though there is no data mutation.
For 1) and 2), the ext_proc filter chunk_queue only buffers the downstream data up to a certain point (until the watermark is raised). I am not aware of any functional problems with this today in synchronous mode, but please correct me if I am wrong.
@yanjunxiang-google Thanks for the reply!
I think your security validation case is already supported by the existing modes and doesn't effectively use this feature, because it doesn't mutate the data at all. EDIT: Actually, the internal buffer also breaks your security validation case for large data, due to the watermark raised by the internal buffer.
I hope we can step back from one particular use case to the broad view and motivation of this feature, which I understand as arbitrary mutation. That means the data injected by the sidestream server is independent of the mainstream data, so buffering the mainstream data in Envoy is redundant.
But again, I will not be the single blocking factor here if other reviewers approve it :) So it is probably good to convince the other reviewers.
BTW, a meeting with all reviewers would probably be more productive here.
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
hello, @yanjunxiang-google any updates on this PR, thanks?
Thanks for checking. We are changing the design a little bit for this feature. Will update the PR soon.
The PR needs a main merge - CI is breaking because the branch has diverged too much.
/assign @yanavlasov @tyxia @htuch
/assign @jmarantz
Adding @adisuissa for API changes.
Kind ping!
Kind ping!
/retest
Kind Ping!
Will wait for other approvals and then submit.
/wait-any
/wait
/wait (for #36999)
Kind Ping!
@adisuissa I did an upstream merge. It needs your API approval again. Thanks!
LGTM, as a WIP/good start.
I think the open comments above need to be addressed (along with some more tests/load tests) to complete this feature.