Support Body Transformations
Title: Support Body Transformations
Description: Envoy supports manipulating/transforming headers. It would be great to also support transforming the request and response body, to be able to:
- sanitize fields
- support API conversions
Relevant Links:
Adding a list of other proxy implementations that support this:
- Amazon API Gateway https://docs.aws.amazon.com/apigateway/latest/developerguide/rest-api-data-transformations.html
- Tyk https://tyk.io/docs/transform-traffic/request-body/
- KrakenD https://www.krakend.io/docs/enterprise/backends/body-generator/
- Gloo Edge https://docs.solo.io/gloo-edge/latest/guides/traffic_management/request_processing/transformations/
- Kong https://docs.konghq.com/hub/kong-inc/request-transformer/
- Apache Apisix https://apisix.apache.org/docs/apisix/plugins/body-transformer/
- Apigee https://cloud.google.com/apigee/docs/api-platform/develop/shaping-and-converting-messages
It would be great if this feature/filter could also support copying/setting fields from the body into headers, allowing routing based on the request body, which is an AI/LLM gateway use case. More in this doc. cc @robscott
Mutating the request body is not encouraged, because it requires buffering the whole body and breaks streaming processing. But I also admit that there are lots of related requirements.
So, SGTM. I can help review the design and sponsor this new extension if someone wants to take this. (Note: a design proposal is necessary first.)
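To make the body-to-header idea concrete, here is a minimal Python sketch (an illustration, not Envoy code). The `model` field, the `x-llm-model` header name, and the `extract_to_headers` helper are all assumptions made up for this example, and the sketch buffers the whole body before parsing.

```python
import json


def extract_to_headers(body: bytes, headers: dict, mapping: dict) -> dict:
    """Copy top-level JSON fields from a buffered body into headers.

    mapping: JSON field name -> header name (both hypothetical here).
    Returns a new header dict; the input dict is left unmodified.
    """
    out = dict(headers)
    try:
        doc = json.loads(body)
    except ValueError:
        return out  # body is not valid JSON: leave headers untouched
    if not isinstance(doc, dict):
        return out
    for field, header in mapping.items():
        value = doc.get(field)
        # Only copy scalar values; skip objects, arrays, booleans and null.
        if isinstance(value, (str, int, float)) and not isinstance(value, bool):
            out[header] = str(value)
    return out


request_body = b'{"model": "example-model", "messages": []}'
routed = extract_to_headers(
    request_body,
    {":path": "/v1/chat/completions"},
    {"model": "x-llm-model"},  # hypothetical field-to-header mapping
)
```

A route matcher could then select an upstream based on the `x-llm-model` header instead of inspecting the body again.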
@wbpcode I will take a shot at this!
/assign
I think the technical challenge is to make mutations streaming. It's feasible as long as the body is structured. The buffering approach forces large connection buffers in multiplexed protocols, which is not scalable for multi-tenant gateways.
In my experience, in the scenarios where this feature is required, the body is basically always JSON. It's almost impossible to make mutations streaming for that.
At least for now, I don't know how to build a general solution for long-lived streams with unbounded body length. So I am inclined to ignore them at first.
+1 to @wbpcode's suggestion of keeping streaming out of scope
Yes, I didn't mean to require it. But we should be explicit about the limitations of a buffering approach (e.g. connection buffers must be at least the number of streams per connection × max buffered bytes). E.g. 1MB JSON bodies with 100 H2 streams require up to 100MB of connection buffers.
@arkodg and I quickly wrote a simple proposal: https://docs.google.com/document/d/1odMAistdE8OrJHqKvV4ou9vsZxB1Wepy0cxAyBJ-Bes/edit. @wbpcode could you take a look when you get a chance? Thanks in advance!
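The arithmetic behind that bound, as a small Python sketch. The function name is made up for illustration, and 1MB is taken as 1,000,000 bytes here.

```python
def worst_case_connection_buffer(max_streams: int, max_buffered_bytes: int) -> int:
    """Upper bound on per-connection buffering when every stream is fully buffered."""
    return max_streams * max_buffered_bytes


MB = 1_000_000  # assumption: decimal megabytes
bound = worst_case_connection_buffer(max_streams=100, max_buffered_bytes=1 * MB)
print(bound // MB)  # 100
```

The bound scales linearly with both the stream limit and the per-body limit, so halving either one halves the worst case.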
FWIW, the ext_proc filter supports body mutation, as well as header and trailer mutation.
CC @TAOXUY - I think you also need some body transformations and extractions.
Back to my earlier buffering point: I think it would be nice to have separate buffering and streaming modes. You don't need to buffer to sanitize JSON fields, and the only meaningful benefit of this over ext_proc is that it must perform significantly better (e.g. hundreds of microseconds of P50 overhead), which is hard to do when you fully buffer.
Yeah, our case is extracting protobuf and our needs are satisfied by these 2 filters https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/grpc_field_extraction_filter and https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/proto_message_extraction_filter
I'm wondering what the negative effects of being able to do message transformation with a native filter are?
The positives I anticipate from this are ease of configuration for users, compared to developing their own ext_proc server, as well as reduced complexity from avoiding network calls to external processes.
@wbpcode could you take a look at the doc we shared above when you get a chance? tia!
I have some free bandwidth and will give this a try, supporting body transformation based on the substitution formatter. The initial version will support the following features:
- Transform the request and response body based on the substitution format string: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/substitution_format_string.proto#envoy-v3-api-msg-config-core-v3-jsonformatoptions.
- Request/response header mutations based on the body content.
- Filter state mutations based on the body content (to extract data from the body into filter state for logging/stats/other filters without exposing it to clients).
- Only JSON requests and JSON responses are supported (stream/event bodies will be supported in the future).
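For readers trying to picture a template-driven, buffered JSON transformation, here is a rough Python sketch. It is not the proposed filter or Envoy's formatter: the `%BODY(path)%` placeholder syntax and all function names are hypothetical, invented only for this illustration.

```python
import json
import re

# Hypothetical placeholder syntax for this sketch only: %BODY(a.b.c)%
_PLACEHOLDER = re.compile(r"^%BODY\(([\w.]+)\)%$")


def lookup(doc, path):
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    cur = doc
    for key in path.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return None
        cur = cur[key]
    return cur


def render(template, doc):
    """Recursively fill a JSON-shaped template from the parsed body."""
    if isinstance(template, dict):
        return {k: render(v, doc) for k, v in template.items()}
    if isinstance(template, list):
        return [render(v, doc) for v in template]
    if isinstance(template, str):
        m = _PLACEHOLDER.match(template)
        if m:
            return lookup(doc, m.group(1))
    return template  # literals pass through unchanged


def transform_body(body: bytes, template) -> bytes:
    """Buffered transformation: parse the whole body, then emit the rendered template."""
    doc = json.loads(body)
    return json.dumps(render(template, doc)).encode()


template = {"user": "%BODY(user.name)%", "source": "gateway"}
original = b'{"user": {"name": "alice", "ssn": "000-00-0000"}}'
transformed = transform_body(original, template)
```

Because the output contains only what the template names, fields such as `ssn` are dropped, which covers the "sanitize fields" case as a side effect of the rewrite.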