grpc-proxy: subsequent client watchers hang after auth token expires
### Bug report criteria
- [x] This bug report is not security related, security issues should be disclosed privately via etcd maintainers.
- [x] This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
- [x] You have read the etcd bug reporting guidelines.
- [x] Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
### What happened?
When clients connect through the grpc-proxy to an etcd cluster that has authentication enabled, clients that connect after the auth token TTL has expired do not receive any watch events.
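From the client's point of view, the failure looks like a watch that opens but never fires. A minimal sketch (assuming the proxy listens on 127.0.0.1:42379 and a root/rootPassword user exists; both are illustrative, not taken from the report):

```go
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Connect through the grpc-proxy with username/password auth.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"127.0.0.1:42379"},
		Username:  "root",
		Password:  "rootPassword",
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// If an earlier client's auth token has already expired on the proxy,
	// this watch is established but never delivers any events.
	for resp := range cli.Watch(context.Background(), "/test", clientv3.WithPrefix()) {
		if err := resp.Err(); err != nil {
			log.Printf("watch response error: %v", err)
			continue
		}
		for _, ev := range resp.Events {
			log.Printf("event: %s %q", ev.Type, ev.Kv.Key)
		}
	}
}
```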
### What did you expect to happen?

Clients connecting to the proxy after the first client's token has expired should be able to establish their watch successfully and receive events.
### How can we reproduce it (as minimally and precisely as possible)?
This test reproduces the issue:

```diff
diff --git a/tests/e2e/etcd_grpcproxy_test.go b/tests/e2e/etcd_grpcproxy_test.go
index 02174e89f..0e109779d 100644
--- a/tests/e2e/etcd_grpcproxy_test.go
+++ b/tests/e2e/etcd_grpcproxy_test.go
@@ -17,6 +17,8 @@ package e2e
 import (
 	"context"
 	"strings"
+	"sync"
+	"sync/atomic"
 	"testing"
 	"time"
 
@@ -142,3 +144,101 @@ func waitForEndpointInLog(ctx context.Context, proxyProc *expect.ExpectProcess,
 	return err
 }
+
+func TestGRPCProxyWatchersAfterTokenExpiry(t *testing.T) {
+	ctx, cancel := context.WithCancel(context.Background())
+	defer cancel()
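+	// Single-node cluster with simple auth tokens that expire after one second.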
+	cluster, err := e2e.NewEtcdProcessCluster(ctx, t,
+		e2e.WithClusterSize(1),
+		e2e.WithAuthTokenOpts("simple"),
+		e2e.WithAuthTokenTTL(1),
+	)
+	require.NoError(t, err)
+	t.Cleanup(func() { require.NoError(t, cluster.Stop()) })
+
+	cli := cluster.Etcdctl()
+
+	createUsers(ctx, t, cli)
+
+	require.NoError(t, cli.AuthEnable(ctx))
+
+	var (
+		node1ClientURL = cluster.Procs[0].Config().ClientURL
+		proxyClientURL = "127.0.0.1:42379"
+	)
+
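+	// Start a grpc-proxy in front of the node; the watch clients below go through it.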
+	proxyProc, err := e2e.SpawnCmd([]string{
+		e2e.BinPath.Etcd, "grpc-proxy", "start",
+		"--advertise-client-url", proxyClientURL,
+		"--listen-addr", proxyClientURL,
+		"--endpoints", node1ClientURL,
+	}, nil)
+	require.NoError(t, err)
+	t.Cleanup(func() { require.NoError(t, proxyProc.Stop()) })
+
+	var totalEventsCount int64
+
+	handler := func(events clientv3.WatchChan) {
+		for {
+			select {
+			case ev, open := <-events:
+				if !open {
+					return
+				}
+				if ev.Err() != nil {
+					t.Logf("watch response error: %s", ev.Err())
+					continue
+				}
+				atomic.AddInt64(&totalEventsCount, 1)
+			case <-ctx.Done():
+				return
+			}
+		}
+	}
+
+	withAuth := e2e.WithAuth("root", "rootPassword")
+	withEndpoint := e2e.WithEndpoints([]string{proxyClientURL})
+
+	events := cluster.Etcdctl(withAuth, withEndpoint).Watch(ctx, "/test", config.WatchOptions{Prefix: true, Revision: 1})
+
+	wg := sync.WaitGroup{}
+
+	wg.Add(1)
+	go func() {
+		defer wg.Done()
+		handler(events)
+	}()
+
+	clusterCli := cluster.Etcdctl(withAuth)
+	require.NoError(t, clusterCli.Put(ctx, "/test/1", "test", config.PutOptions{}))
+
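+	// Sleep past the one-second token TTL so the first client's token expires.
+	// The watchers created below should still replay the put from revision 1.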
+	time.Sleep(time.Second * 2)
+
+	events2 := cluster.Etcdctl(withAuth, withEndpoint).Watch(ctx, "/test", config.WatchOptions{Prefix: true, Revision: 1})
+
+	wg.Add(1)
+	go func() {
+		defer wg.Done()
+		handler(events2)
+	}()
+
+	events3 := cluster.Etcdctl(withAuth, withEndpoint).Watch(ctx, "/test", config.WatchOptions{Prefix: true, Revision: 1})
+
+	wg.Add(1)
+	go func() {
+		defer wg.Done()
+		handler(events3)
+	}()
+
+	time.Sleep(time.Second)
+
+	cancel()
+	wg.Wait()
+
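+	// One put was made; each of the three watchers should observe exactly one event.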
+	assert.Equal(t, int64(3), atomic.LoadInt64(&totalEventsCount))
+}
diff --git a/tests/framework/e2e/cluster.go b/tests/framework/e2e/cluster.go
index 3a2f83888..ef7e257f0 100644
--- a/tests/framework/e2e/cluster.go
+++ b/tests/framework/e2e/cluster.go
@@ -296,6 +296,10 @@ func WithRollingStart(rolling bool) EPClusterOption {
 	return func(c *EtcdProcessClusterConfig) { c.RollingStart = rolling }
 }
 
+func WithAuthTokenTTL(ttl uint) EPClusterOption {
+	return func(c *EtcdProcessClusterConfig) { c.ServerConfig.AuthTokenTTL = ttl }
+}
+
 func WithDiscovery(discovery string) EPClusterOption {
 	return func(c *EtcdProcessClusterConfig) { c.Discovery = discovery }
 }
```
### Anything else we need to know?

I proposed a potential fix in PR: https://github.com/etcd-io/etcd/pull/19033
### Etcd version (please run commands below)

```console
$ etcd --version
etcd Version: 3.5.17
Git SHA: 762e93874
Go Version: go1.22.10
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.5.17
API version: 3.5
```
### Etcd configuration (command line flags or environment variables)

```console
paste your configuration here
```
### Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

```console
$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
```
### Relevant log output

---

We covered this in our previous triage meeting, and again today. Judging by the description, trying to use an expired token to establish a new connection doesn't sound like a bug. Or at least, that's what we understand from the description of your bug. Can you please clarify further?
I can see now that the description might be a bit vague. I will try to clarify the issue with more details.
The problem can be reproduced like this:
- Client A connects to proxy using user/pass authentication
- Client A starts watching a key
- Client A's token expires
- Client B connects to proxy using user/pass authentication
- Client B starts watching same key as client A
- Both client A's and client B's watchers now hang

The e2e test included above reproduces this; the sketch below shows the same sequence with plain clientv3 clients.
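For readers without the e2e framework, roughly the same flow as a standalone program (a sketch, assuming a grpc-proxy on 127.0.0.1:42379 in front of a cluster started with `--auth-token simple` and a short `--auth-token-ttl`; the endpoint and credentials are illustrative):

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// newClient dials the grpc-proxy with username/password auth.
func newClient() (*clientv3.Client, error) {
	return clientv3.New(clientv3.Config{
		Endpoints: []string{"127.0.0.1:42379"},
		Username:  "root",
		Password:  "rootPassword",
	})
}

func main() {
	ctx := context.Background()

	// Client A connects and starts watching.
	a, err := newClient()
	if err != nil {
		panic(err)
	}
	defer a.Close()
	chA := a.Watch(ctx, "/test", clientv3.WithPrefix())

	// Wait past the token TTL so client A's token expires.
	time.Sleep(2 * time.Second)

	// Client B connects and watches the same key range.
	b, err := newClient()
	if err != nil {
		panic(err)
	}
	defer b.Close()
	chB := b.Watch(ctx, "/test", clientv3.WithPrefix())

	// Trigger an event; with the bug present, neither watcher sees it.
	if _, err := b.Put(ctx, "/test/1", "v"); err != nil {
		panic(err)
	}

	select {
	case resp := <-chA:
		fmt.Println("A got", len(resp.Events), "events")
	case resp := <-chB:
		fmt.Println("B got", len(resp.Events), "events")
	case <-time.After(5 * time.Second):
		fmt.Println("both watchers hung: no events within 5s")
	}
}
```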
Thanks for clarifying, @krijohs. It sounds like a valid issue.
cc. @ahrtr
Hi, just checking if anyone has had time to take a look at this? @ivanvc
Hi, @krijohs, unfortunately, the grpcproxy has few contributors. I'll bring this topic/issue to the next triage meeting.
@ivanvc I can have a look at this if you want. I was able to reproduce it locally.
@nwnt I've added a PR with a possible solution in https://github.com/etcd-io/etcd/pull/19033 which addresses this issue; it would be great to get your feedback on the approach.
@krijohs thanks for letting me know. Let me find some time to look at the PR. You should hear back from me in a couple of days.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.