envoy icon indicating copy to clipboard operation
envoy copied to clipboard

Support upstream http filters with tcp tunneling

Open vikaschoudhary16 opened this issue 1 year ago • 14 comments

As suggested in this comment, Router::UpstreamRequest is being initialized and owned by TcpProxy::HttpUpstream.

Commit Message: Additional Description: Risk Level: Testing: Docs Changes: Release Notes: Platform Specific Features: [Optional Runtime guard:] [Optional Fixes #Issue] [Optional Fixes commit #PR or SHA] [Optional Deprecated:] [Optional API Considerations:]

vikaschoudhary16 avatar May 04 '23 13:05 vikaschoudhary16

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/27183 was opened by vikaschoudhary16.

see: more, trace.

@vikaschoudhary16 please avoid force pushing in the future as it removes comment anchors to source code. It is ok to just keep pushing new commits to the PR.

Conceptually looks good to me. Please also add tests.

yanavlasov avatar May 04 '23 14:05 yanavlasov

/wait

yanavlasov avatar May 05 '23 14:05 yanavlasov

Thanks @vikaschoudhary16, it's nice to see that there's more demand for this feature and awesome you were able to make progress with this!

One thing I'm wondering about is - I understand that since TcpProxy is the router itself as well as the downstream filter (kind of HCM + router combined in terms of responsibilities), the HTTP filters that can be applied to tunneled sessions can only be upstream HTTP filters (which is fine). However as far as I can see, there are two occurrences where these filters could be applied in the context:

  1. In the upstream cluster HTTP filters definition
  2. In the downstream tcp_proxy configuration

The difference between these two is that the first will be applied to all tcp_proxies that choose that cluster, so if there are many listeners that are all using the same upstream cluster definition, they will all go through this filter. The second is allowing a more permissive granularity, where only some of the tcp_proxies get that HTTP filter applied to, even if all are targeting the same cluster.

In terms of API, the second could be something like this:

- name: tcp_proxy
  typed_config:
    type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
    tunneling_config:
      hostname: host.com:443
      upstream_http_filters:
        - f1...

My question is, which approach did you take in this implementation (or both), and if it's one of them, is it currently implemented in a way that could be extended to support the additional approach sometime in the future?

Thanks again for contributing on this, that's awesome.

ohadvano avatar May 05 '23 15:05 ohadvano

Thanks @vikaschoudhary16, it's nice to see that there's more demand for this feature and awesome you were able to make progress with this!

One thing I'm wondering about is - I understand that since TcpProxy is the router itself as well as the downstream filter (kind of HCM + router combined in terms of responsibilities), the HTTP filters that can be applied to tunneled sessions can only be upstream HTTP filters (which is fine). However as far as I can see, there are two occurrences where these filters could be applied in the context:

  1. In the upstream cluster HTTP filters definition
  2. In the downstream tcp_proxy configuration

The difference between these two is that the first will be applied to all tcp_proxies that choose that cluster, so if there are many listeners that are all using the same upstream cluster definition, they will all go through this filter. The second is allowing a more permissive granularity, where only some of the tcp_proxies get that HTTP filter applied to, even if all are targeting the same cluster.

In terms of API, the second could be something like this:

- name: tcp_proxy
  typed_config:
    type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
    tunneling_config:
      hostname: host.com:443
      upstream_http_filters:
        - f1...

My question is, which approach did you take in this implementation (or both), and if it's one of them, is it currently implemented in a way that could be extended to support the additional approach sometime in the future?

Thanks again for contributing on this, that's awesome.

Hi, I took first approach i.e upstream cluster and no api change. Though I think it should be possible to extend it and let filters, if specified, in the tunneling config override the upstream cluster filters.

vikaschoudhary16 avatar May 05 '23 16:05 vikaschoudhary16

Hi, I took first approach i.e upstream cluster and no api change. Though I think it should be possible to extend it and let filters, if specified, in the tunneling config override the upstream cluster filters.

Thanks, how about a use case where I want them to be combined? Let's say there's some upstream HTTP filter that I want to be applied to all flows, so I put it in the upstream cluster config, and on some of the tcp_proxies I want to add another HTTP filter. Would it be supportable? I could try to extend this feature sometime after you merge this, I think I'll have capacity for this next month, I just wanted to make sure if this implementation could be the base for this.

ohadvano avatar May 05 '23 16:05 ohadvano

I'd get things working e2e with the tcp proxy integration test as a next step.

  1. tcp_proxy_test.cc are working with feature disabled.
  2. ~access logging related few tests are failing in tcp_proxy_test.cc when feature is enabled. Looking into those.~ tcp_proxy_test.cc are working with feature enabled.
  3. upstream_test.cc tests are working with feature disabled.
  4. WIP to enable feature and add more required tests in the upstream_test.cc

vikaschoudhary16 avatar May 09 '23 23:05 vikaschoudhary16

sorry, had to force push because forgot to use -s in one of the older commits

vikaschoudhary16 avatar May 15 '23 08:05 vikaschoudhary16

/retest

vikaschoudhary16 avatar May 29 '23 14:05 vikaschoudhary16

Is the coverage failure real? /wait on split out PRs ;-)

coverage failure is because of unused apis that are beind added, though unused, to implement RouterFilterInterface.

vikaschoudhary16 avatar Jun 13 '23 23:06 vikaschoudhary16

yeah I think we're going to have to hook some of those up (e.g. flow control really important) but I think we can go through function by function as we're closer in. @RyanTheOptimist and @KBaichoo can hopefully help go through those while I'm out

alyssawilk avatar Jun 14 '23 17:06 alyssawilk

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar Jul 14 '23 20:07 github-actions[bot]

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

github-actions[bot] avatar Jul 22 '23 00:07 github-actions[bot]

This is not stale. I have been working on adding smaller changes first in separate PRs.

vikaschoudhary16 avatar Jul 25 '23 03:07 vikaschoudhary16

@vikaschoudhary16 can you check if this change caused coverage to decrease, please?

/wait-any

yanavlasov avatar Feb 28 '24 16:02 yanavlasov

@vikaschoudhary16 can you check if this change caused coverage to decrease, please?

/wait-any

yeah overall it dropped from 96.1 to 96. I am working to enable feature in tunneling integration tests. May be that will improve coverage too

vikaschoudhary16 avatar Feb 29 '24 06:02 vikaschoudhary16

@yanavlasov ci green now. Can you please take a look at changes?

vikaschoudhary16 avatar Mar 01 '24 12:03 vikaschoudhary16

Looks good to me. Defer to @alyssawilk for final pass

yanavlasov avatar Mar 06 '24 18:03 yanavlasov

friendly ping @alyssawilk

nezdolik avatar Mar 12 '24 18:03 nezdolik

argh sorry this wasn't assigned to me so wasn't showing up on my slide. Looking now

alyssawilk avatar Mar 13 '24 16:03 alyssawilk

My only two remaining comments are nits - if you prefer I'm happy to merge as is (so sorry about review delay!) and you can fix in your next follow-up.

Along with addressing the TODOs I think we're going to want to to add a BUNCH of e2e tests testing filter state but those can all be follow-up as well.

Hey, thanks for taking a look. Anyways I will be working on follow-up PRs to add e2e tests and address TODOs, so I would address these two nits also in follow-up if you are ok.

vikaschoudhary16 avatar Mar 15 '24 02:03 vikaschoudhary16

ah, will have to rebase so will address nits as well

vikaschoudhary16 avatar Mar 16 '24 04:03 vikaschoudhary16