envoy icon indicating copy to clipboard operation
envoy copied to clipboard

WASM Calling `proxy_send_local_response` twice will stuck remote http client(e.g. curl) forever until timeout or interrupted

Open orangetangerine opened this issue 2 years ago • 14 comments

Title: WASM Calling proxy_send_local_response twice will stuck remote http client(e.g. curl) forever until timeout or interrupted

Description: I have a wasm running on envoy, and I found http request will be stuck if I call proxy_send_local_response both in on_http_request_headers and on_http_response_headers.

Repro steps: Here is the shortest wasm code:

use log::info;
use proxy_wasm::traits::*;
use proxy_wasm::types::*;

proxy_wasm::main! {{
    proxy_wasm::set_log_level(LogLevel::Debug);
    proxy_wasm::set_root_context(|_| -> Box<dyn RootContext> { Box::new(HttpAuthProxyRoot{})});
}}


struct HttpAuthProxyRoot {}

impl RootContext for HttpAuthProxyRoot {
    fn create_http_context(&self, context_id: u32) -> Option<Box<dyn HttpContext>> {
        Some(Box::new(HttpAuthProxy::new(context_id)))
    }

    fn get_type(&self) -> Option<ContextType> {
        Some(ContextType::HttpContext)
    }
}

impl Context for HttpAuthProxyRoot {}

#[derive(Default)]
struct HttpAuthProxy {
    context_id: u32,
}

impl HttpAuthProxy {
    fn new(context_id: u32) -> HttpAuthProxy {
        HttpAuthProxy { context_id }
    }

    fn test_response(&self) -> Action {
        self.send_http_response(401, vec![], None);
        Action::Continue
    }
}

impl HttpContext for HttpAuthProxy {
    fn on_http_request_headers(&mut self, _num_headers: usize, _end_of_stream: bool) -> Action {
        let ret = self.test_response();
        info!("on_http_request_headers done");
        ret
    }

    fn on_http_response_headers(&mut self, _num_headers: usize, _end_of_stream: bool) -> Action {
        let ret = self.test_response();
        info!("on_http_response_headers done");
        ret
    }

    fn on_log(&mut self) {
        info!("#{} completed", self.context_id);
    }
}

impl Context for HttpAuthProxy {}

envoy docker-compose configuration:

services:
  envoy:
    image: envoyproxy/envoy:v1.24.0
    command: -c /etc/envoy/envoy.yaml -l debug
    hostname: envoy
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./target/wasm32-wasi/release:/etc/envoy/proxy-wasm-plugins
    networks:
      - envoymesh
networks:
  envoymesh: {}

and envoy.yaml:

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 15000
static_resources:
  clusters:
    - name: test_cluster
      type: STATIC
      connect_timeout: 0.250s
      load_assignment:
        cluster_name: test_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 15000
  listeners:
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                codec_type: AUTO
                route_config:
                  name: local_routes
                  virtual_hosts:
                    - name: local_service
                      domains:
                        - "*"
                      routes:
                        - match:
                            prefix: "/"
                          direct_response:
                            status: 200
                            body:
                              inline_string: "Request /hello and be welcomed!\n"
                http_filters:
                  - name: envoy.filters.http.wasm
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: "http_headers"
                          vm_config:
                            runtime: "envoy.wasm.runtime.v8"
                            code:
                              local:
                                filename: "/etc/envoy/proxy-wasm-plugins/plugin.wasm"
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

Logs: image

And here is log on client side: image

I found out that it's caused by this PR https://github.com/envoyproxy/envoy/pull/23049

It occurs since envoy 1.24+ (also Istio 1.16+). If I revert these codes and build, it works again.

I don't know if this behaviour is expected or changed by accident.

It only occurs in my QA cluster (with a higher version of envoy, 1.24+)

orangetangerine avatar Aug 04 '23 07:08 orangetangerine

@mpwarres

htuch avatar Aug 04 '23 20:08 htuch

I can tackle this

keithmattix avatar Jul 30 '24 22:07 keithmattix

/assign @keithmattix

keithmattix avatar Jul 30 '24 22:07 keithmattix

/assign @krinkinmu

krinkinmu avatar Oct 15 '24 15:10 krinkinmu

krinkinmu is not allowed to assign users.

:cat:

Caused by: a https://github.com/envoyproxy/envoy/issues/28826#issuecomment-2414422342 was created by @krinkinmu.

see: more, trace.

/assign @krinkinmu

keithmattix avatar Oct 15 '24 16:10 keithmattix

@krinkinmu cannot be assigned to this issue.

:cat:

Caused by: a https://github.com/envoyproxy/envoy/issues/28826#issuecomment-2414542423 was created by @keithmattix.

see: more, trace.

@alyssawilk any idea what the issue is here?

keithmattix avatar Oct 15 '24 17:10 keithmattix

I was able to reproduce the issue on a recent head build and with C++ based filter. Going through the issue with debugger suggest that roughly the following is happening:

  1. onRequestHeaders triggers local reply
  2. local reply goes throgh the filter chain
  3. it triggers onResponseHeaders that also attempts to send a local reply again, while we are still processing the previous local reply
  4. Second local reply request gets ignored (as expected), but because the plugin tries to trigger a local reply, proxy wasm pauses filter chain processing (because that's how it always reacts to local replies, see https://github.com/proxy-wasm/proxy-wasm-cpp-host/blob/3212034ee6e75d54cbf91b65cc659772166442f4/src/exports.cc#L158 and https://github.com/proxy-wasm/proxy-wasm-cpp-host/blob/3212034ee6e75d54cbf91b65cc659772166442f4/src/context.cc#L513-L519) and at that point Envoy get stuck.

As for the fix, I think the filter itself can avoid triggering a second local reply if it's processing a local reply already. That being said, I don't think that Envoy hanging in this case is desirable behavior, regardless of the filter logic, so I think Envoy could do better in this case.

I see a couple of options:

  1. Given that second local reply was ignored, maybe we should just avoid pausing filter chain processing in this case, given that we didn't actually trigger a local send reply
  2. Actually trigger the second local reply and not ignore it.

NOTE: mostly for myself, a third option would be to not send local replies through the filter chain, but I don't think we should change the behavior that was around for years so I'm dismissing this option right away.

I find option 2 rather confusing, what is the desired behaviour be in this case? The second local reply overwrites the first one? Because of this I prefer option 1, so I'm starting to work on the code change to implement option 1.

krinkinmu avatar Oct 16 '24 13:10 krinkinmu

One more thing to mention, the fix should be on the proxy-wasm-cpp-host side, I think.

krinkinmu avatar Oct 16 '24 13:10 krinkinmu

/assign @krinkinmu

krinkinmu avatar Oct 16 '24 14:10 krinkinmu

krinkinmu is not allowed to assign users.

:cat:

Caused by: a https://github.com/envoyproxy/envoy/issues/28826#issuecomment-2417006232 was created by @krinkinmu.

see: more, trace.

@alyssawilk any idea what the issue is here?

@keithmattix I thought that it was a propagation delay of some sort, but it does not seem to be the case anymore. Looking at what repokitteh does, I think that it is supposed to just call github API (see https://github.com/repokitteh/modules/blob/master/assign.star#L8-L10 and https://docs.github.com/en/rest/issues/assignees?apiVersion=2022-11-28#check-if-a-user-can-be-assigned).

However, running the API check locally works for me:

$ gh api \                                     
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/envoyproxy/envoy/assignees/krinkinmu
$ echo $?
0

and for the reference, this is how I think it looks when I don't have permissions

$ gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/proxy-wasm/proxy-wasm-cpp-sdk/assignees/krinkinmu
{
  "message": "Not Found",
  "documentation_url": "https://docs.github.com/rest/issues/assignees#check-if-a-user-can-be-assigned",
  "status": "404"
}
gh: Not Found (HTTP 404)
$ echo $?
1

I'm confused why it does not work, but maybe we should discuss the repo access issues outside of this bug?

krinkinmu avatar Oct 16 '24 14:10 krinkinmu

I prepared a fix for proxy-wasm-host-cpp that resolves the issue: https://github.com/proxy-wasm/proxy-wasm-cpp-host/pull/423. Will see if we can get consensus on the approach and get it merged there.

krinkinmu avatar Oct 16 '24 17:10 krinkinmu

/assign @krinkinmu

krinkinmu avatar Oct 30 '24 14:10 krinkinmu