grpcproxy: use metadata instead of context.WithValue in withClientAuthToken

Open krijohs opened this issue 1 year ago • 22 comments

Use gRPC metadata instead of context.WithValue so that each proxy watcher client gets a new stream created with its own token. Previously, context.WithValue caused streamKeyFromCtx to return an empty string in the clientv3 watcher, so streams were reused. When new clients connected to the proxy after the first client's token had expired, the reused stream's context still contained the expired token. Auth then failed when isWatchPermitted on the cluster checked the stream's context, which left the proxy watcher clients hanging.

The issue can be reproduced by setting a low --auth-token-ttl on the cluster, connecting one client to the proxy, and then connecting a second client after the first client's token has expired.

krijohs avatar Dec 09 '24 15:12 krijohs

Hi @krijohs. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Dec 09 '24 15:12 k8s-ci-robot

Added a test case that reproduces the issue: it passes with the included change and fails without it.

krijohs avatar Dec 19 '24 10:12 krijohs

Has anyone had a chance to look at this? Pinging reviewers from the OWNERS file: @fuweid @ivanvc

krijohs avatar Jan 16 '25 10:01 krijohs

Hi @krijohs, thanks for your pull request. Ideally, we would want to discuss the issue and possible solutions before a pull request. Could you please open an issue so other members with more expertise in this area can jump in?

Thanks again.

ivanvc avatar Jan 18 '25 05:01 ivanvc

Hello @ivanvc, OK, got it. I will open an issue so possible solutions can be discussed. Thanks.

krijohs avatar Jan 22 '25 09:01 krijohs

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 26 '25 05:04 stale[bot]

/reopen

siyuanfoundation avatar Aug 14 '25 18:08 siyuanfoundation

/ok-to-test

ivanvc avatar Aug 14 '25 18:08 ivanvc

Codecov Report

:x: Patch coverage is 40.00000% with 3 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 69.33%. Comparing base (431a65a) to head (1f5402b). :warning: Report is 461 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| server/proxy/grpcproxy/util.go | 40.00% | 3 Missing :warning: |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
|---|---|
| server/proxy/grpcproxy/util.go | 41.93% <40.00%> (+21.93%) :arrow_up: |

... and 58 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #19033      +/-   ##
==========================================
+ Coverage   69.21%   69.33%   +0.12%     
==========================================
  Files         419      422       +3     
  Lines       34745    34842      +97     
==========================================
+ Hits        24049    24158     +109     
+ Misses       9300     9292       -8     
+ Partials     1396     1392       -4     

codecov[bot] avatar Aug 16 '25 08:08 codecov[bot]

/retest

krijohs avatar Aug 18 '25 10:08 krijohs

@krijohs, could you please rebase your branch with the latest upstream main branch? I'm having issues running the tests because the base is outdated. Thanks :)

ivanvc avatar Aug 20 '25 20:08 ivanvc

@ivanvc Sure, no problem. Just rebased and pushed.

krijohs avatar Aug 21 '25 08:08 krijohs

Hi @ivanvc, just checking whether anyone has had time to look at this PR.

krijohs avatar Nov 18 '25 12:11 krijohs

Pls squash the commit, thx

ahrtr avatar Nov 30 '25 11:11 ahrtr

> Pls squash the commit, thx

Sure, just squashed and pushed.

krijohs avatar Nov 30 '25 12:11 krijohs

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, ivanvc, krijohs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Nov 30 '25 12:11 k8s-ci-robot

@mitake would you be able to take a look at this PR as well? Thanks. It's related to your PR https://github.com/etcd-io/etcd/pull/8289.

ahrtr avatar Nov 30 '25 12:11 ahrtr

@krijohs can you update the e2e test to verify that stream requests (e.g. watch) via grpcproxy also work when auth is enabled?

ahrtr avatar Dec 02 '25 09:12 ahrtr

> @krijohs can you update the e2e test to verify that stream requests (e.g. watch) via grpcproxy also work when auth is enabled?

If I'm understanding you correctly, the added e2e test TestGRPCProxyWatchersAfterTokenExpiry in this PR already uses authenticated watch streams through the grpc-proxy: all three watches use WithAuth("root", "rootPassword") and WithEndpoints(proxyClientURL) while auth is enabled on the server.

It verifies that the proxy forwards auth for streaming requests. If you like, I can add a comment to make that more obvious.

krijohs avatar Dec 04 '25 12:12 krijohs

> The added e2e test TestGRPCProxyWatchersAfterTokenExpiry in this PR already uses authenticated watch streams through

Why did the test not see any issue before you resolved https://github.com/etcd-io/etcd/pull/19033#discussion_r2541813945?

ahrtr avatar Dec 04 '25 15:12 ahrtr

> The added e2e test TestGRPCProxyWatchersAfterTokenExpiry in this PR already uses authenticated watch streams through
>
> Why did the test not see any issue before you resolved #19033 (comment)?

When the gRPC proxy sets up a new watch broadcast, it calls withClientAuthToken in newWatchBroadcast, so from what I can see the AuthStreamClientInterceptor is not used for watchers. I can try to create an e2e test that verifies the latest changes made to AuthStreamClientInterceptor.

krijohs avatar Dec 07 '25 12:12 krijohs

I think that if the token expires, the test should fail (before you resolved my comment). It should work now, since you have already resolved it. Can you double-check this with a test case or manually?

Refer to https://github.com/etcd-io/etcd/issues/11954

ahrtr avatar Dec 07 '25 14:12 ahrtr