etcd icon indicating copy to clipboard operation
etcd copied to clipboard

watch stream return "permission denied" if token expired

Open cfz opened this issue 5 years ago • 16 comments

i ran into the similar problem as #8914

i'm using v3.4.13 server and v3.4.12 client. i can reproduce the error using following steps:

on one terminal , start etcd server and enable auth and shroten the default token ttl:

etcd --log-level warn --logger zap --auth-token-ttl 10 & 
etcdctl user add root:root
etcdctl user grant-role root root
etcdctl auth enable

on the other termianal, run a the demo program:

cfz@cfz-desktop:~/ws/etcd-watch-demo$ cat main.go
package main

import (
        "context"
        "fmt"
        "sync"
        "time"

        "go.etcd.io/etcd/clientv3"
)

var wg sync.WaitGroup

func main() {
        cfg := clientv3.Config{
                Endpoints: []string{"http://127.0.0.1:2379"},
                Username:  "root",
                Password:  "root",
                LogConfig: nil,
        }

        cli, err := clientv3.New(cfg)
        if err != nil {
                panic(err)
        }

        watch(cli, 1)
        time.Sleep(1 * time.Second)
        watch(cli, 2)
        time.Sleep(11 * time.Second)
        watch(cli, 3)
        time.Sleep(1 * time.Second)
        watch(cli, 4)

        wg.Wait()
}

func watch(cli *clientv3.Client, num int) {
        wg.Add(1)

        go func() {
                defer wg.Done()
                for wr := range cli.Watch(context.Background(), "") {
                        if wr.Err() != nil {
                                fmt.Printf("%#v\n", wr.Err())
                        }
                }

                fmt.Printf("#%v watch existed\n", num)
        }()
}
cfz@cfz-desktop:~/ws/etcd-watch-demo$ go run main.go
&errors.errorString{s:"rpc error: code = PermissionDenied desc = etcdserver: permission denied"}
#3 watch exited
&errors.errorString{s:"rpc error: code = PermissionDenied desc = etcdserver: permission denied"}
#4 watch exited

as shown in the log , the 1st and 2nd watch is ok, while the 3rd and 4th report "permission denied"

meanwile, etcd server will log following:

{"level":"warn","ts":"2020-10-10T20:34:45.010+0800","caller":"auth/store.go:1301","msg":"invalid auth token","token":"JWourXlCoFPTxMHj.8"}
{"level":"warn","ts":"2020-10-10T20:34:46.010+0800","caller":"auth/store.go:1301","msg":"invalid auth token","token":"JWourXlCoFPTxMHj.8"}

i think the causes are:

  • the grpc stream's authentication token is provided at the time when the stream is initialized
  • the client is trying to reusing the same grpc stream(if avaliable) to setup new watch substream.
  • the token attached to the stream would be expired if no other requests to extend its ttl.

cfz avatar Oct 10 '20 13:10 cfz

i think the causes are:

  • the grpc stream's authentication token is provided at the time when the stream is initialized
  • the client is trying to reusing the same grpc stream(if avaliable) to setup new watch substream.
  • the token attached to the stream would be expired if no other requests to extend its ttl.

Yes,you are right. and you can get more information about the token being deleted with command $ bin/etcd --log-level debug --logger zap --auth-token-ttl 10

{"level":"debug","ts":"2020-10-11T00:22:06.647+0800","caller":"auth/store.go:335","msg":"authenticated a user","user-name":"root","token":"xNgkSwPjKojuppwM.11"}
{"level":"debug","ts":"2020-10-11T00:22:06.647+0800","caller":"v3rpc/interceptor.go:181","msg":"request stats","start time":"2020-10-11T00:22:06.569+0800","time spent":"77.981547ms","remote":"127.0.0.1:63080","response type":"/etcdserverpb.Auth/Authenticate","request count":-1,"request size":-1,"response count":-1,"response size":-1,"request content":""}
{"level":"info","ts":"2020-10-11T00:22:06.648+0800","caller":"grpclog/grpclog.go:45","msg":"transport: loopyWriter.run returning. connection error: desc = \"transport is closing\""}
{"level":"info","ts":"2020-10-11T00:22:17.950+0800","caller":"auth/simple_token.go:169","msg":"deleted a simple token","user-name":"root","token":"xNgkSwPjKojuppwM.11"}
{"level":"warn","ts":"2020-10-11T00:22:18.655+0800","caller":"auth/store.go:1301","msg":"invalid auth token","token":"xNgkSwPjKojuppwM.11"}
{"level":"warn","ts":"2020-10-11T00:22:19.658+0800","caller":"auth/store.go:1301","msg":"invalid auth token","token":"xNgkSwPjKojuppwM.11"}

So, I think that is correct. and you?

cfc4n avatar Oct 10 '20 16:10 cfc4n

acutally, i enabled debug log during my investigation :). i saw the log at very beginning, but it takes me a while to understand why the stale token is still being referenced.

cfz avatar Oct 10 '20 17:10 cfz

i noticed a related PR #12264, but i think this issure still exists on master branch. wondering if it's possible to detect the "perssion denied" message from serveWatchClient() and force the client to re-create a watch stream? @cfc4n

cfz avatar Oct 26 '20 09:10 cfz

and for anyone has the same problem: JWT token has the same problem. Using CommonName in certificate as username is a walk around

cfz avatar Oct 26 '20 09:10 cfz

i noticed a related PR #12264, but i think this issure still exists on master branch. @cfz thanks for checking with the master branch, let me check again.

mitake avatar Oct 26 '20 12:10 mitake

This ticket is closed, but I can still reproduce this exactly as described by OP (except import path is /v3 now) using both client and server at v3.5.4. Are the assumptions in the original report not valid?

I just want to watch a key indefinitely, but this seems currently not to work in conjuction with authentication tokens.

jwebb avatar Jun 14 '22 14:06 jwebb

This bug should still be reproducible, due to the problem I described in #13577:

... trying to do clientv3.Watch with an expired token would result in ErrGRPCPermissionDenied, due to the failing authorization check in isWatchPermitted. Furthermore, the client can't auto recover, since shouldRefreshToken rightly returns false for the permission denied error.

The commit from that PR only fixed a special case of the problem, where all the tokens expired after disabling authentication. It doesn't fix the problem where a token expired naturally.

Last I checked, the root cause is that isWatchPermitted is wearing both authorization and authentication hats. The code has to be modified such that isWatchPermitted is only called with a legitimate token and thus only has to worry about authorization.

sayap avatar Jun 15 '22 04:06 sayap

Hmm I see... let me reproduce the issue on my side.

mitake avatar Jun 15 '22 14:06 mitake

Sorry for delayed update, I'll be able to check the issue sometime this week.

mitake avatar Jul 03 '22 14:07 mitake

I could reproduce the bug on the latest main branch... Let me reopen the issue and handle it.

mitake avatar Jul 09 '22 14:07 mitake

This is the branch for the fix: https://github.com/mitake/etcd/tree/watch-auth-err I'll open a PR after finalizing it.

mitake avatar Jul 31 '22 14:07 mitake

The context on server side never change after stream created, so the token won't refresh, subsequent watches will fail after the token expires.

PerRPCCredentials only called once when the stream is created: https://stackoverflow.com/questions/69551307/go-grpc-refresh-token-for-a-bidirectional-stream

  1. We only get the token once when we create the stream.
  2. Client is trying to reusing the same grpc stream(if avaliable) to setup new watch substream.
  3. The token cannot be refreshed on the server, then when the token expires, subsequent watches will fail.

I think the solutions for the future:

  1. Create new watch stream when the token expired.
  2. Do refresh token in user logic, not rely on grpc.

Temporary solutions:

  1. Create a new stream with a new ctxKey, like this: ctx = metadata.AppendToOutgoingContext(context.Background(), "key", time.Now().String())

  2. Set a very long TTL.

kafuu-chino avatar Aug 03 '22 07:08 kafuu-chino

Thanks for commenting @kafuu-chino , actually I think Create new watch stream when the token expired. is a better way.

mitake avatar Aug 07 '22 14:08 mitake

@jwebb @sayap @kafuu-chino I opened this PR: https://github.com/etcd-io/etcd/pull/14322 I checked the above test case can pass now. Could you cross check if you have time?

mitake avatar Aug 07 '22 14:08 mitake

@kafuu-chino I'll also check your PR https://github.com/etcd-io/etcd/pull/14296 later, thanks for opening this!

mitake avatar Aug 07 '22 14:08 mitake

@mitake ok, I will check PR: https://github.com/etcd-io/etcd/pull/14322 later.

kafuu-chino avatar Aug 08 '22 03:08 kafuu-chino

After I update etcd to v3.5.7, the err becomes to this:

rpc error: code = Unauthenticated desc = etcdserver: invalid auth token

It could be 100% reproduced after the client has not been used for a few minutes.

saddit avatar Feb 05 '23 07:02 saddit

@838239178 This is expected because retry behavior was reverted. I'll push an alternative way to handle this as discussed in https://github.com/etcd-io/etcd/issues/15058

lavacat avatar Feb 05 '23 08:02 lavacat

Refer to https://github.com/etcd-io/etcd/pull/17384#issuecomment-1929280671

ahrtr avatar Mar 01 '24 11:03 ahrtr