envoy icon indicating copy to clipboard operation
envoy copied to clipboard

Ext-proc STREAMED mode with router Retry_policy will send duplicated request body

Open xr opened this issue 1 year ago • 6 comments

Description: The upstream service will get duplicated payload when:

  1. enable ext-proc requestBody:STREAMED mode
  2. enable router retry_policy. (with any policy)

Repro steps:

envoy version: 7b8baff1758f0a584dcc3cb657b5032000bcb3d7/1.31.0/Distribution/RELEASE/BoringSSL

1. envoycfg.yaml

admin:
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9902
static_resources:

  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          http_filters:
          - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          http_filters:
          - name: envoy.filters.http.ext_proc
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor
              allow_mode_override: false
              failure_mode_allow: false
              processing_mode:
                request_header_mode: SEND
                response_header_mode: SEND
                request_body_mode: STREAMED
                response_body_mode: STREAMED
              message_timeout: 30s
              grpc_service:
                timeout: 60s
                google_grpc:
                  stat_prefix: test-ext-proc
                  target_uri: 127.0.0.1:20001
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: echo_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              retry_policy:
                retry_on: "connect-failure"
                num_retries: 2
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: echo_server

  clusters:
  - name: echo_server
    type: static
    load_assignment:
      cluster_name: echo_server
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 4000

2. echo_server code example:

var express = require('express')
var app = express()

app.post('/', function (req, res) {
    let payloadSize = 0;
    req.on('data', (chunk) => {
      payloadSize += chunk.length;
      console.log(`received chunk length: ${chunk.length}, total: ${payloadSize}`);
    });

    req.on('end', () => {
      console.log(`received total payload size: ${payloadSize}`);
      res.send({
        headers: req.headers,
        query: req.query,
        params: req.params,
        url: req.url,
        body: req.body,
        hostname: req.hostname,
        ip: req.ip,
        ips: req.ips,
        method: req.method,
        baseUrl: req.baseUrl,
        originalUrl: req.originalUrl,
        protocol: req.protocol,
        cookies: req.cookies,
        payloadSize: payloadSize,
      });
    });

})
app.listen(4000);


3. side stream server

here is one example of ext-proc side stream server implementation to use: https://gist.github.com/xr/94a6cb60e11028c837f46fac01ae37bd

4. send request

curl --location 'http://localhost:10000/' \
--header 'Content-Type: application/octet-stream' \
--data '@/Path/to/1MB_file.bin'

The size of file i used is 1048576 bytes.

Problem

I got payload size from the upstream server is 2097152 which is doubled the amount.

Logs from ext-proc side server (looks good):

Received request type: *ext_procv3.ProcessingRequest_RequestHeaders
Received headers: headers:{headers:{key:":authority"  raw_value:"localhost:10000"}  headers:{key:":path"  raw_value:"/"}  headers:{key:":method"  raw_value:"POST"}  headers:{key:":scheme"  raw_value:"http"}  headers:{key:"user-agent"  raw_value:"PostmanRuntime/7.37.3"}  headers:{key:"accept"  raw_value:"*/*"}  headers:{key:"cache-control"  raw_value:"no-cache"}  headers:{key:"postman-token"  raw_value:"9e3d4f39-a71a-4a7d-b004-4b63e5f5bec0"}  headers:{key:"accept-encoding"  raw_value:"gzip, deflate, br"}  headers:{key:"content-length"  raw_value:"1048576"}  headers:{key:"content-type"  raw_value:"application/octet-stream"}  headers:{key:"x-forwarded-proto"  raw_value:"http"}  headers:{key:"x-request-id"  raw_value:"46021c15-769d-4a87-b225-9c2238c25125"}}
Received request type: *ext_procv3.ProcessingRequest_RequestBody
Received request body size 65536, endOfStream: false
Received request type: *ext_procv3.ProcessingRequest_RequestBody
Received request body size 917504, endOfStream: false
Received request type: *ext_procv3.ProcessingRequest_RequestBody
Received request body size 65536, endOfStream: false
Received request type: *ext_procv3.ProcessingRequest_RequestBody
Received request body size 0, endOfStream: true
Received end of stream
Received request type: *ext_procv3.ProcessingRequest_ResponseHeaders
Received request type: *ext_procv3.ProcessingRequest_ResponseBody
Stream closed by proxy

Logs from the upstrea server, which is not good, duplicated payload:

received chunk length: 64903, total: 64903
received chunk length: 633, total: 65536
received chunk length: 65529, total: 131065
received chunk length: 65536, total: 196601
received chunk length: 65536, total: 262137
received chunk length: 65536, total: 327673
received chunk length: 65536, total: 393209
received chunk length: 65536, total: 458745
received chunk length: 65536, total: 524281
received chunk length: 65536, total: 589817
received chunk length: 65536, total: 655353
received chunk length: 65536, total: 720889
received chunk length: 65536, total: 786425
received chunk length: 65536, total: 851961
received chunk length: 65536, total: 917497
received chunk length: 65536, total: 983033
received chunk length: 7, total: 983040
received chunk length: 65520, total: 1048560
received chunk length: 16, total: 1048576
received chunk length: 65510, total: 1114086
received chunk length: 65536, total: 1179622
received chunk length: 65536, total: 1245158
received chunk length: 65536, total: 1310694
received chunk length: 65536, total: 1376230
received chunk length: 65536, total: 1441766
received chunk length: 65536, total: 1507302
received chunk length: 65536, total: 1572838
received chunk length: 65536, total: 1638374
received chunk length: 65536, total: 1703910
received chunk length: 65536, total: 1769446
received chunk length: 65536, total: 1834982
received chunk length: 65536, total: 1900518
received chunk length: 65536, total: 1966054
received chunk length: 65536, total: 2031590
received chunk length: 65536, total: 2097126
received chunk length: 26, total: 2097152
received total payload size: 2097152

envoy logs

[2024-09-13 15:47:55.440][64800441][debug][main] [source/server/server.cc:237] flushing stats
[2024-09-13 15:47:55.988][64800452][debug][http] [source/common/http/conn_manager_impl.cc:385] [Tags: "ConnectionId":"0"] new stream
[2024-09-13 15:47:55.988][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1135] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] request headers complete (end_stream=false):
':authority', 'localhost:10000'
':path', '/'
':method', 'POST'
'user-agent', 'PostmanRuntime/7.37.3'
'accept', '*/*'
'cache-control', 'no-cache'
'postman-token', '5fb57b61-d5b4-43f5-8488-7893728fcbd7'
'accept-encoding', 'gzip, deflate, br'
'connection', 'keep-alive'
'content-length', '1048576'
'content-type', 'application/octet-stream'

[2024-09-13 15:47:55.988][64800452][debug][connection] [./source/common/network/connection_impl.h:98] [Tags: "ConnectionId":"0"] current connecting state: false
[2024-09-13 15:47:55.988][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:338] Opening gRPC stream to external processor
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction INBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:433] Sending headers message
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:408] Watermark raised on decoding
[2024-09-13 15:47:55.989][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1878] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Read-disabling downstream stream due to filter callbacks.
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received request headers response
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:87] applying headers response. body mode = STREAMED
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction INBOUND: timer disabled
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:159] Sending first chunk using buffered data (65536)
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:862] Sending a body chunk of 65536 bytes, end_stream false
[2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction INBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.990][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:416] Watermark lowered on decoding
[2024-09-13 15:47:55.990][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1868] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Read-enabling downstream stream due to filter callbacks.
[2024-09-13 15:47:55.990][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:392] Continuing processing
[2024-09-13 15:47:55.990][64800452][debug][router] [source/common/router/router.cc:525] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] cluster 'echo_server' match for URL '/'
[2024-09-13 15:47:55.990][64800452][debug][router] [source/common/router/router.cc:750] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] router decoding headers:
':authority', 'localhost:10000'
':path', '/'
':method', 'POST'
':scheme', 'http'
'user-agent', 'PostmanRuntime/7.37.3'
'accept', '*/*'
'cache-control', 'no-cache'
'postman-token', '5fb57b61-d5b4-43f5-8488-7893728fcbd7'
'accept-encoding', 'gzip, deflate, br'
'content-type', 'application/octet-stream'
'x-forwarded-proto', 'http'
'x-request-id', '8708f5a3-3446-4a9f-8ed8-2775599dde58'
'x-envoy-expected-rq-timeout-ms', '15000'

[2024-09-13 15:47:55.990][64800452][debug][pool] [source/common/http/conn_pool_base.cc:78] queueing stream due to no available connections (ready=0 busy=0 connecting=0)
[2024-09-13 15:47:55.990][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:291] trying to create new connection
[2024-09-13 15:47:55.990][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:145] creating a new connection (connecting=0)
[2024-09-13 15:47:55.990][64800452][debug][connection] [./source/common/network/connection_impl.h:98] [Tags: "ConnectionId":"4"] current connecting state: true
[2024-09-13 15:47:55.990][64800452][debug][client] [source/common/http/codec_client.cc:57] [Tags: "ConnectionId":"4"] connecting
[2024-09-13 15:47:55.990][64800452][debug][connection] [source/common/network/connection_impl.cc:1017] [Tags: "ConnectionId":"4"] connecting to 127.0.0.1:4000
[2024-09-13 15:47:55.990][64800452][debug][connection] [source/common/network/connection_impl.cc:1036] [Tags: "ConnectionId":"4"] connection in progress
[2024-09-13 15:47:55.990][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:862] Sending a body chunk of 917504 bytes, end_stream false
[2024-09-13 15:47:55.991][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction INBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.991][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received request body response
[2024-09-13 15:47:55.991][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:225] Processing body response
[2024-09-13 15:47:55.991][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction INBOUND: timer disabled
[2024-09-13 15:47:55.991][64800452][debug][connection] [source/common/network/connection_impl.cc:746] [Tags: "ConnectionId":"4"] connected
[2024-09-13 15:47:55.991][64800452][debug][client] [source/common/http/codec_client.cc:88] [Tags: "ConnectionId":"4"] connected
[2024-09-13 15:47:55.991][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:328] [Tags: "ConnectionId":"4"] attaching to next stream
[2024-09-13 15:47:55.991][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:182] [Tags: "ConnectionId":"4"] creating stream
[2024-09-13 15:47:55.991][64800452][debug][router] [source/common/router/upstream_request.cc:595] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] pool ready
[2024-09-13 15:47:55.991][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:862] Sending a body chunk of 65536 bytes, end_stream false
[2024-09-13 15:47:55.992][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction INBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.992][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1118] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] request end stream
[2024-09-13 15:47:55.992][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:862] Sending a body chunk of 0 bytes, end_stream true
[2024-09-13 15:47:55.992][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction INBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.992][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received request body response
[2024-09-13 15:47:55.992][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:225] Processing body response
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction INBOUND: timer disabled
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received request body response
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:225] Processing body response
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction INBOUND: timer disabled
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received request body response
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:225] Processing body response
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction INBOUND: timer disabled
[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:392] Continuing processing
[2024-09-13 15:47:55.993][64800452][debug][router] [source/common/router/router.cc:877] The request payload has at least 2097152 bytes data which exceeds buffer limit 1048576. Give up on the retry/shadow.
[2024-09-13 15:47:55.993][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1878] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Read-disabling downstream stream due to filter callbacks.
[2024-09-13 15:47:55.993][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1868] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Read-enabling downstream stream due to filter callbacks.
[2024-09-13 15:47:55.993][64800452][debug][connection] [source/common/network/connection_impl.cc:594] [Tags: "ConnectionId":"4"] onAboveWriteBufferHighWatermark
[2024-09-13 15:47:55.993][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1878] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Read-disabling downstream stream due to filter callbacks.
[2024-09-13 15:47:55.993][64800452][debug][client] [source/common/http/codec_client.cc:142] [Tags: "ConnectionId":"4"] encode complete
[2024-09-13 15:47:55.994][64800452][debug][connection] [source/common/network/connection_impl.cc:583] [Tags: "ConnectionId":"4"] onBelowWriteBufferLowWatermark
[2024-09-13 15:47:55.994][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1868] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Read-enabling downstream stream due to filter callbacks.
[2024-09-13 15:47:55.996][64800452][debug][router] [source/common/router/router.cc:1551] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] upstream headers complete: end_stream=false
[2024-09-13 15:47:55.996][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction OUTBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.996][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:433] Sending headers message
[2024-09-13 15:47:55.996][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:474] Watermark raised on encoding
[2024-09-13 15:47:55.996][64800452][debug][http] [source/common/http/filter_manager.cc:1736] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Disabling upstream stream due to filter callbacks.
[2024-09-13 15:47:55.997][64800452][debug][client] [source/common/http/codec_client.cc:129] [Tags: "ConnectionId":"4"] response complete
[2024-09-13 15:47:55.997][64800452][debug][pool] [source/common/http/http1/conn_pool.cc:53] [Tags: "ConnectionId":"4"] response complete
[2024-09-13 15:47:55.997][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:215] [Tags: "ConnectionId":"4"] destroying stream: 0 remaining
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received response headers response
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:87] applying headers response. body mode = STREAMED
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction OUTBOUND: timer disabled
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:862] Sending a body chunk of 647 bytes, end_stream true
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:29] Traffic direction OUTBOUND: 30000 ms timer enabled
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:482] Watermark lowered on decoding
[2024-09-13 15:47:55.997][64800452][debug][http] [source/common/http/filter_manager.cc:1741] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Enabling upstream stream due to filter callbacks.
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received response body response
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:225] Processing body response
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction OUTBOUND: timer disabled
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:392] Continuing processing
[2024-09-13 15:47:55.997][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1825] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] encoding headers via codec (end_stream=false):
':status', '200'
'x-powered-by', 'Express'
'access-control-allow-origin', '*'
'content-type', 'application/json; charset=utf-8'
'etag', 'W/"287-ZAOJ/ey45dda2/J3tn21U+HNqLg"'
'date', 'Fri, 13 Sep 2024 12:47:55 GMT'
'x-envoy-upstream-service-time', '3'
'server', 'envoy'

[2024-09-13 15:47:55.997][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1937] [Tags: "ConnectionId":"0","StreamId":"8704630574034820302"] Codec completed encoding stream.
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:392] onDestroy
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction INBOUND: timer disabled
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:55] Traffic direction OUTBOUND: timer disabled
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:374] Calling close on stream
[2024-09-13 15:47:55.997][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/client_impl.cc:64] Closing gRPC stream
[2024-09-13 15:47:55.997][64800452][debug][grpc] [source/common/grpc/google_async_client_impl.cc:283] resetStream
[2024-09-13 15:47:55.997][64800452][debug][grpc] [source/common/grpc/google_async_client_impl.cc:456] Stream cleanup with 2 in-flight tags
[2024-09-13 15:47:55.997][64800452][debug][grpc] [source/common/grpc/google_async_client_impl.cc:445] Deferred delete
[2024-09-13 15:47:55.997][64800452][debug][grpc] [source/common/grpc/google_async_client_impl.cc:194] GoogleAsyncStreamImpl destruct
[2024-09-13 15:48:00.440][64800441][debug][main] [source/server/server.cc:237] flushing stats
[2024-09-13 15:48:00.997][64800452][debug][connection] [source/common/network/connection_impl.cc:714] [Tags: "ConnectionId":"4"] remote close
[2024-09-13 15:48:00.997][64800452][debug][connection] [source/common/network/connection_impl.cc:276] [Tags: "ConnectionId":"4"] closing socket: 0
[2024-09-13 15:48:00.997][64800452][debug][client] [source/common/http/codec_client.cc:107] [Tags: "ConnectionId":"4"] disconnect. resetting 0 pending requests
[2024-09-13 15:48:00.997][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:495] [Tags: "ConnectionId":"4"] client disconnected, failure reason:
[2024-09-13 15:48:00.998][64800452][debug][pool] [source/common/conn_pool/conn_pool_base.cc:463] invoking 1 idle callback(s) - is_draining_for_deletion_=false
[2024-09-13 15:48:01.514][64800441][debug][http] [source/common/http/conn_manager_impl.cc:385] [Tags: "ConnectionId":"2"] new stream
[2024-09-13 15:48:01.514][64800441][debug][http] [source/common/http/conn_manager_impl.cc:1135] [Tags: "ConnectionId":"2","StreamId":"2355020986288252190"] request headers complete (end_stream=false):
':authority', 'localhost:9902'
':path', '/logging'
':method', 'POST'
'connection', 'keep-alive'
'content-length', '17'
'cache-control', 'max-age=0'
'sec-ch-ua', '"Not)A;Brand";v="99", "Google Chrome";v="127", "Chromium";v="127"'
'sec-ch-ua-mobile', '?0'
'sec-ch-ua-platform', '"macOS"'
'upgrade-insecure-requests', '1'
'origin', 'http://localhost:9902'
'content-type', 'application/x-www-form-urlencoded'
'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36'
'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7'
'sec-fetch-site', 'same-origin'
'sec-fetch-mode', 'navigate'
'sec-fetch-user', '?1'
'sec-fetch-dest', 'document'
'referer', 'http://localhost:9902/'
'accept-encoding', 'gzip, deflate, br, zstd'
'accept-language', 'en-GB,en-US;q=0.9,en;q=0.8,zh-CN;q=0.7,zh;q=0.6'

[2024-09-13 15:48:01.514][64800441][debug][connection] [./source/common/network/connection_impl.h:98] [Tags: "ConnectionId":"2"] current connecting state: false
[2024-09-13 15:48:01.514][64800441][debug][http] [source/common/http/conn_manager_impl.cc:1118] [Tags: "ConnectionId":"2","StreamId":"2355020986288252190"] request end stream
[2024-09-13 15:48:01.514][64800441][debug][admin] [source/server/admin/admin_filter.cc:85] [Tags: "ConnectionId":"2","StreamId":"2355020986288252190"] request complete: path: /logging
[2024-09-13 15:48:01.514][64800441][info][misc] [source/common/common/logger.cc:220] change all log levels: level='info'

Other Findings:

  1. change to request_body_mode: BUFFERED or BUFFERED_PARTIAL both works, no duplicated payload
  2. remove retry_policy or set retry_on to empty string "" works.

xr avatar Sep 13 '24 12:09 xr

@yanjunxiang-google @tyxia any ideas? thanks!

xr avatar Sep 13 '24 13:09 xr

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 14 '24 00:10 github-actions[bot]

@xr what if you don't configure router Retry_policy, would you still see the duplicate body?

tyxia avatar Oct 16 '24 21:10 tyxia

ok You mentioned that remove retry_policy or set retry_on to empty string "" works. . Thus the answer to my comment above is : no, this issue only happens when retry is configured.

Can we try to have a simple integration test to reproduce this issue?

Also, add @stevenzzzz who implemented retry in ext_proc.

tyxia avatar Oct 16 '24 21:10 tyxia

the retry_policy in the pasted config seems irrelevant here. tho the retry_policy I added to ext_proc was through the base::grpcService protocol, eventually it will get to the router as well.

I think @xr you might want to check why there was a header_response received before the request header was sent, in the log I saw there was a line suggesting the header response was received [2024-09-13 15:47:55.989][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/ext_proc.cc:1074] Received request headers response

what is your server, is it a standard grpc server?

stevenzzzz avatar Oct 17 '24 14:10 stevenzzzz

I misread the initial issue that this retry is not the envoy_grpc retry that @stevenzzzz added. Sorry. But it is still a interesting clue to explore.

@xr header_response received before the request header was sent in the comment above is false positive. But besides the questions asked above, (1) do you see this behavior with small body? Overall i think it is a good to have a simpler integration test repro with small body. (2) does your ext_proc server amplify the body in anyway?

tyxia avatar Oct 17 '24 18:10 tyxia

Hello @tyxia @stevenzzzz thanks for the answers.

I created a repo for you guys to reproduce easily: https://github.com/xr/envoy-issue

as you can see:

./test.sh 1000 -> ok
./test.sh 2000 -> ok
./test.sh 10000 -> ok
./test.sh 50000 -> ok
./test.sh 100000 -> ok
./test.sh 1000000 -> NOT OK, got duplicated payload and received 2000000 from the upstream

for smaller payload, it was OK and the example ext_proc server is just a dummy service so I don't think it amplified the body anyway.

xr avatar Nov 05 '24 08:11 xr

@xr You mentioned the issue is seen on Envoy version 1.31.0. Do you see this issue in previous Envoy releases? Or do you have data which release this issue start to show up?

yanjunxiang-google avatar Nov 06 '24 17:11 yanjunxiang-google

@yanjunxiang-google haven't tested the earlier envoy version, iirc 1.30 the same behaviour.

xr avatar Nov 08 '24 07:11 xr

@xr Could you please re-run the same test with Envoy "trace" level logs enabled? Your Envoy logs has "debug" level logs enabled which is missing some important logs.

yanjunxiang-google avatar Nov 27 '24 22:11 yanjunxiang-google

@xr

Had you re-run the test with "trace" level logs enabled? I am interested to see below logs: https://github.com/envoyproxy/envoy/blob/eb0bce8c4f71af817bb75a91ca5d623cab1368e1/source/extensions/filters/http/ext_proc/processor_state.cc#L517 i.e, how much data Ext_proc filter is injecting to the router filter. It should not be 2097152 bytes, thus I am very surprised to see the router filter log is complaining it:

[2024-09-13 15:47:55.993][64800452][debug][ext_proc] [source/extensions/filters/http/ext_proc/processor_state.cc:392] Continuing processing [2024-09-13 15:47:55.993][64800452][debug][router] [source/common/router/router.cc:877] The request payload has at least 2097152 bytes data which exceeds buffer limit 1048576. Give up on the retry/shadow. [2024-09-13 15:47:55.993][64800452][debug][http] [source/common/http/conn_manager_impl.cc:1878] [Tags: "Connection

yanjunxiang-google avatar Dec 02 '24 19:12 yanjunxiang-google

@xr kind ping!

yanjunxiang-google avatar Dec 11 '24 18:12 yanjunxiang-google

@yanjunxiang-google thanks for taking a look, sorry was missed the ping before, here are some logs:

trace.log

I have triggered payload size and here is the results from this time.

./test.sh 1000 -> ok
./test.sh 10000 -> ok
./test.sh 100000 -> ok
./test.sh 200000 -> ok
./test.sh 300000 -> ok
./test.sh 400000 -> not ok, upstream receive 800000

And as a reminder, if you want to reproduce by yourself, please check this repo here: https://github.com/xr/envoy-issue

xr avatar Dec 16 '24 08:12 xr

it might be this line: https://github.com/envoyproxy/envoy/blob/b0d58be31c2d7fe3ea8fd620c7aedb6b09a4bb89/source/extensions/filters/http/ext_proc/ext_proc.cc#L589C1-L590C1

I guess your side stream grpc didn't return anything to Envoy? could you try in the response add/return 10 bytes for each rpc? if the above link is the culprit, I think you will see not 2x data, but the original data + 10bytes*No.-of-rpcs

stevenzzzz avatar Jan 03 '25 20:01 stevenzzzz

hello @stevenzzzz , the test side stream grpc just return continue back to the envoy: https://github.com/xr/envoy-issue/blob/d44520e983d7cfa19f14312e433f5bd73f43b7af/ext-proc/main.go#L88

could you try in the response add/return 10 bytes for each rpc? IIUC, i tried to return "something" from the side stream grpc to envoy like:

case *extproc.ProcessingRequest_ResponseHeaders:
				res.Response = &extproc.ProcessingResponse_ResponseHeaders{
					ResponseHeaders: &extproc.HeadersResponse{
						Response: &extproc.CommonResponse{
							Status: extproc.CommonResponse_CONTINUE,
							HeaderMutation: &extproc.HeaderMutation{
								SetHeaders: []*v3.HeaderValueOption{
									{
										Header: &v3.HeaderValue{
											Key:      "test",
											RawValue: []byte("1234567890"),
										},
										AppendAction: v3.HeaderValueOption_APPEND_IF_EXISTS_OR_ADD,
									},
								},
							},
						},
					},
				}

			case *extproc.ProcessingRequest_ResponseBody:
				res.Response = &extproc.ProcessingResponse_ResponseBody{
					ResponseBody: &extproc.BodyResponse{
						Response: &extproc.CommonResponse{
							Status: extproc.CommonResponse_CONTINUE,
							BodyMutation: &extproc.BodyMutation{
								Mutation: &extproc.BodyMutation_Body{
									Body: []byte("Hello, World!"),
								},
							},
						},
					},
				}

but still get the duplicated payload, interestingly, when i run:

./test.sh 700000

sometime i got 700000 and sometime i got 1400000 from the upstream.

xr avatar Jan 09 '25 14:01 xr

I was able to reproduce the issue locally and is working on a fix @xr

yanjunxiang-google avatar Jan 10 '25 03:01 yanjunxiang-google

@xr

Could you please patch below code change in your Envoy 1.31 repo and build the Envoy binary and re-run the test? diff --git a/source/extensions/filters/http/ext_proc/processor_state.cc b/source/extensions/filters/http/ext_proc/processor_state.cc index 88fb2b2e63..90a7e09cde 100644 --- a/source/extensions/filters/http/ext_proc/processor_state.cc +++ b/source/extensions/filters/http/ext_proc/processor_state.cc @@ -271,10 +271,8 @@ absl::Status ProcessorState::handleBodyResponse(const BodyResponse& response) { MutationUtils::applyBodyMutations(common_response.body_mutation(), chunk_data); } should_continue = chunk->end_stream; -- if (chunk_data.length() > 0) { ENVOY_LOG(trace, "Injecting {} bytes of data to filter stream", chunk_data.length()); injectDataToFilterChain(chunk_data, chunk->end_stream); -- }

   if (queueBelowLowLimit()) {
     clearWatermark();

i.e., just simply remove the above "if (chunk_data.length() > 0) {" check.

yanjunxiang-google avatar Jan 13 '25 15:01 yanjunxiang-google

I was able to reproduce the issue with an integration test as in #37994 , and verify the issue is indeed fixed with the above change.

yanjunxiang-google avatar Jan 16 '25 00:01 yanjunxiang-google