etcd
etcd copied to clipboard
clientv3: allow setting JWT directly
etcd supports using signed JWTs in a verify-only mode where the server has access to only a public key and therefore can not create tokens but can validate them. For this to work a client must not call Authenticate and must instead submit a pre-signed JWT with their request. The server will validate this token, extract the username from it, and may allow the client access.
This change allows setting the JWT directly and not setting a username and password. If a JWT is provided the client will no longer call Authenticate, which would not work anyhow. It also provides a public method UpdateAuthToken to allow a user of the client to update their auth token, for example, if it expires.
In this flow all token lifecycle management is handled outside of the client as a concern of the client user.
I also added support for passing a token rather than a username and password in etcdctl. I can split this into a separate PR that depends on this one if needed; leaving it here for now though since it's a relatively small change.
The failing test here appears to be unrelated to this change:
2023-10-22T08:17:25.0015167Z /__w/etcd/etcd/bin/etcd (TestWatchQuorumLastVersion-test-2) (54187): {"level":"error","ts":"2023-10-22T08:17:24.98355Z","caller":"embed/etcd.go:855","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:20011: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:855\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func3\n\tgo.etcd.io/etcd/server/v3/embed/etcd.go:607"}
Hi @jmhbnz, sure thing! I've added some tests for the functionality. Let me know if there's anything else needed to get this merged.
Thanks @mcrute - Quick heads up the etcd team are pretty busy at kubecon this week so there might be a minor delay getting this new feature reviewed. I will take another look when I get some quiet focus time hopefully end of the week 🙏🏻
Thanks @jmhbnz, I appreciate it. Would really like to get this merged so let me know if there's anything I can do to help move it along. Have a nice KubeCon!
Thanks for the review, I've rebased main under this so hopefully we'll get a green test run.
/ok-to-test
/test pull-etcd-unit-test
Looking at the failing test. I don't see how it's immediately related to this change but it's an auth test so I'm suspicious.
Found the issue. Unconditionally setting authTokenBundle to something non-nil breaks the no-auth tests. Revised to fix this.
Gentle ping @serathius & @ahrtr, I think this one is ready for maintainer review.
@mcrute If we go ahead with this we will probably want to update some documentation on github.com/etcd-io/website , perhaps under:
- https://etcd.io/docs/v3.5/op-guide/authentication/authentication
- https://etcd.io/docs/v3.5/learning/design-auth-v3
Once this is merged I'll create a PR for the docs. I learned a ton digging through this functionality and definitely would like to make sure others can use it with less effort required.
I think we need to add an e2e test as well.
Have had some time constraints but I'll post an update soon. Adding validation + test case for credential mixes as well as an e2e test validating that standalone token auth works.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
Discussed during sig-etcd triage meeting @mcrute do you have time to rebase this pr and add the e2e test? Would be great to get this merged. Thanks!
@jmhbnz I'm planning to pick this back up this week, have been distracted by other priorities but would very much like to get this finished and merged.
@jmhbnz sorry for the delay on this but I've now added an e2e test for this functionality. Let me know if there's anything else needed to get this merged. Thanks!
Ping @serathius & @ahrtr. Do you have a moment to review (and hopefully merge) this PR? I think that it's complete and incorporates all of the feedback.
For this to work a client must not call Authenticate
What will happen if such a client call Authenticate?
For this to work a client must not call Authenticate
What will happen if such a client call Authenticate?
There will be no change in behavior from the current state of the code. If the server has only a public key it will fail with an error because it will not be able to assign a token to the user. If it has a private key then it will return a new token to the user that they will need to assign to the client themselves.
Although my reading of the code is that this would be a pretty odd case for someone to do this since Authenticate is called automatically in the client getToken method. It is also not called if the client was configured with a Token.
Codecov Report
Attention: Patch coverage is 50.00000% with 9 lines in your changes missing coverage. Please review.
Project coverage is 68.85%. Comparing base (
7f8da61) to head (24b9a32). Report is 6 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| etcdctl/ctlv3/command/global.go | 0.00% | 7 Missing :warning: |
| client/v3/config.go | 50.00% | 1 Missing :warning: |
| etcdctl/ctlv3/ctl.go | 0.00% | 1 Missing :warning: |
Additional details and impacted files
| Files with missing lines | Coverage Δ | |
|---|---|---|
| client/v3/client.go | 85.29% <100.00%> (+0.35%) |
:arrow_up: |
| client/v3/config.go | 85.71% <50.00%> (+0.25%) |
:arrow_up: |
| etcdctl/ctlv3/ctl.go | 0.00% <0.00%> (ø) |
|
| etcdctl/ctlv3/command/global.go | 0.00% <0.00%> (ø) |
... and 25 files with indirect coverage changes
@@ Coverage Diff @@
## main #16803 +/- ##
==========================================
- Coverage 68.90% 68.85% -0.06%
==========================================
Files 424 424
Lines 35862 35878 +16
==========================================
- Hits 24712 24703 -9
- Misses 9730 9750 +20
- Partials 1420 1425 +5
Continue to review full report in Codecov by Sentry.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 7f8da61...24b9a32. Read the comment docs.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Rebasing this was a terrible idea 😭. Undoing that bad decision and hopefully these new, unrelated, tests failures disappear.
Rebasing this was a terrible idea
Please rebase this PR, because it is based on a commit of 3 weeks ago. We ran into conflict which wasn't detected by github.
Two generic comments/questions,
- Do you have a real or strong production use case for this change?
- Users need to manage the token themselves. In that case, etcd (client sdk) should never automatically retry when it gets an invalid auth failure response. But unfortunately we don't support disabling the retry for now. Refer to the "Proposal" section in https://github.com/etcd-io/etcd/issues/18424.
Rebasing this was a terrible idea
Please rebase this PR, because it is based on a commit of 3 weeks ago. We ran into conflict which wasn't detected by github.
Once I have a clean run of all the tests I'll rebase. The first rebase introduced a lot of seemingly spurious failures and I would like to make sure I've got completely working and tested code before digging into those.
Two generic comments/questions,
- Do you have a real or strong production use case for this change?
Yes, we use some code similar to this for application configuration and secret bootstrapping. The application runner injects a pre-signed etcd token into the application environment when it starts then a library within the application discovers etcd (using DNS SRV records), connects with the injected token, and reads its configuration and secrets from etcd. There is a separate controller that handles token lifetimes by creating new tokens and passing them to applications over a pre-defined key in etcd so the application can update the token for their etcd connection before expiration (it's a little more complicated than that but that's the general workflow).
- Users need to manage the token themselves. In that case, etcd (client sdk) should never automatically retry when it gets an invalid auth failure response. But unfortunately we don't support disabling the retry for now. Refer to the "Proposal" section in etcd Go & Java client SDK's retry mechanism may break
Serializable#18424.
At least for our use-case that should be okay. We swap tokens well in advance of their expiration. Although having this functionality would be a good safety measure.
Looks like the build was clean. Going to push the rebase now for testing.
@ahrtr all right, I've got all the requested changes incorporated and we've got a clean build with the rebase! Let me know if there's anything else you'd like to see or if we're good to merge this. Thanks.
There is a separate controller that handles token lifetimes by creating new tokens and passing them to applications over a pre-defined key in etcd so the application can update the token for their etcd connection before expiration
If the application doesn't update the token before expiration in time for whatever reason, then you won't have any chance to know the authRevision anymore (of course you can get it by hack way) because you don't have a valid token; but you won't be able to generate a new JWT token because you don't know the latest authRevision. You will run into a deadlock.
One possible solution is to loosen the access control on AuthStatus so that you can always get the latest authRevision no matter whether or not you have a valid token, https://github.com/etcd-io/etcd/blob/3aa56a76bd7cda7a573bdc858af705810466dfb6/client/v3/auth.go#L70
Refer to, https://github.com/etcd-io/etcd/blob/3aa56a76bd7cda7a573bdc858af705810466dfb6/server/etcdserver/apply/apply_auth.go#L175-L176
Also we need to update the doc to clarify the use case/workflow for this change.
Just checking back in on this. I'm open to removing loosening the access control on AuthStatus if you think that's a valid approach. I don't think it would reveal anything security sensitive other than possibly the timing of auth store changes. If you're +1 on this idea then I'll implement it as another commit on this PR.
I will also update the docs but since those are in a different repo I was planning to do that after this was merged. Do you want me to open a PR against the docs as well and cross-link them?
I'd really like to get this shipped in September, if possible, so we can stop vendoring a fork of this library.
Overall this is a valid & minor change (only client side change) to me. Followups,
- Loosen the access control on AuthStatus as mentioned in https://github.com/etcd-io/etcd/pull/16803#issuecomment-2290808401. Personally I think that we should loosen it even without this PR. The purpose of
AuthStatusis to check whether auth is enabled or not, it shouldn't require a valid token to access it, otherwise it's a chicken-egg issue. - Once the JWT token is managed by users themselves, then it's out of etcd's control, so the client (SDK) should never automatically retry when it gets an invalid auth failure response as mentioned in https://github.com/etcd-io/etcd/pull/16803#issuecomment-2286260521. This can be resolved separately. We just need to expose an API to let users to disable the retry mechanism.
- Update doc (yes, please raise a separate ticket once we have a consensus)
It's hard to track all these tasks across PRs. Could you raise a ticket to track them?
cc @serathius @jmhbnz @ivanvc