go-redis/v9: Incr and Ping commands return EOF constantly after about a day of successful periodic (once every few seconds) calls

Open chitturs opened this issue 1 year ago • 2 comments

Issue tracker is used for reporting bugs and discussing new features. Please use stackoverflow for supporting issues.

Client: Go, using go-redis v9
Server: Azure Redis Cache

The client reads messages from Azure Service Bus and then updates the Redis cache using Incr(). This repeats indefinitely, with calls made once every few seconds. Everything works for about a day. After that, Incr() returns EOF and never succeeds again. At that point, Ping() also returns EOF.

Expected Behavior

No failures in Incr() or Ping().

Current Behavior

Incr() and Ping() fail with EOF.

Possible Solution

It is unclear what the problem is. I suspect a broken connection. Since calls are made every few seconds, the connection should never be idle long enough for the idle timeout to kick in. The server has an idle timeout of 10 minutes.
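
To illustrate the suspicion above: when the other side closes a connection, the next read on the client side returns io.EOF, which matches the error text seen here. The sketch below uses only the standard library, with an in-memory net.Pipe standing in for a real Redis connection. The function names are hypothetical, and this is not go-redis's pool code.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// readAfterPeerClose simulates a client holding a connection that the
// server side has already closed, then tries to read a reply from it.
func readAfterPeerClose() error {
	client, server := net.Pipe() // in-memory stand-in for a TCP connection
	defer client.Close()

	server.Close() // the "server" drops the connection, e.g. on a timeout

	buf := make([]byte, 16)
	_, err := client.Read(buf)
	return err
}

// peerClosed reports whether err means the other side closed the connection.
func peerClosed(err error) bool {
	return errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF)
}

func main() {
	err := readAfterPeerClose()
	fmt.Println("read error:", err, "peer closed:", peerClosed(err))
}
```

A pooled connection that the server has closed would behave the same way on its next use, whatever the reason for the close.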

Steps to Reproduce

import (
	"context"
	"crypto/tls"
	"errors"
	"fmt"
	"os"
	"strings"

	"github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	"github.com/Azure/azure-sdk-for-go/sdk/messaging/azservicebus"
	"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/redis/armredis/v3"
	"github.com/redis/go-redis/v9"
)

// redisHostName, getRes, msiObjectId, and accessToken are set up earlier (not shown).
op := &redis.Options{
	Addr:      redisHostName + ":" + fmt.Sprint(*getRes.Properties.SSLPort),
	Username:  msiObjectId,
	Password:  accessToken.Token,
	TLSConfig: &tls.Config{MinVersion: tls.VersionTLS12},
}
redisclient := redis.NewClient(op)

	for _, message := range messages {
		body := message.Body
		redisCacheKey := strings.Trim(string(body), `"`)
		result, err := redisclient.Incr(ctx, redisCacheKey).Result()
		if err != nil {
			if errors.Is(err, context.Canceled) {
				fmt.Println("context was canceled while incrementing cache, return")
				return
			}
			fmt.Println(fmt.Errorf("redis cache increment failed for key %s, err %v", redisCacheKey, err))

			// Check if the connection is still alive, https://redis.io/commands/ping/.
			pong, err := redisclient.Ping(ctx).Result()
			fmt.Println(fmt.Errorf("redis cache ping response %s, err %v, timeouts %d", pong, err, redisclient.PoolStats().Timeouts))
			continue
		}

		fmt.Println("Key:", redisCacheKey, "Result:", result)
	}
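
One possible way to make a loop like the one above react to a client that keeps failing is to count consecutive EOF results and recreate the client once a threshold is reached. This is only a workaround idea, not a confirmed fix for this issue. The names eofTracker, fakeClient, and runLoop are hypothetical, and the fake client replaces a real Redis client so the sketch runs with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// eofTracker counts consecutive EOF errors (hypothetical helper).
type eofTracker struct {
	threshold   int // consecutive EOFs before a rebuild is suggested
	consecutive int
}

// observe records the result of one call and reports whether the caller
// should discard the client and create a new one.
func (t *eofTracker) observe(err error) bool {
	if !errors.Is(err, io.EOF) {
		t.consecutive = 0 // success or a non-EOF error ends the streak
		return false
	}
	t.consecutive++
	if t.consecutive >= t.threshold {
		t.consecutive = 0
		return true
	}
	return false
}

// fakeClient stands in for a Redis client: it succeeds for `healthy`
// calls and then returns io.EOF on every call, as reported in this issue.
type fakeClient struct{ healthy, calls int }

func (c *fakeClient) Incr() error {
	c.calls++
	if c.calls > c.healthy {
		return io.EOF
	}
	return nil
}

// runLoop makes n calls, replacing the client whenever the tracker asks.
// It returns the number of rebuilds and the number of failed calls.
func runLoop(n, healthy, threshold int) (rebuilds, failures int) {
	client := &fakeClient{healthy: healthy}
	t := &eofTracker{threshold: threshold}
	for i := 0; i < n; i++ {
		err := client.Incr()
		if err != nil {
			failures++
		}
		if t.observe(err) {
			// With go-redis this would mean closing the old client
			// and calling redis.NewClient again.
			client = &fakeClient{healthy: healthy}
			rebuilds++
		}
	}
	return rebuilds, failures
}

func main() {
	rebuilds, failures := runLoop(10, 3, 2)
	fmt.Printf("rebuilds=%d failures=%d\n", rebuilds, failures)
}
```

Only EOF errors count toward the threshold, so a one-off timeout or a canceled context does not trigger a rebuild.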

Context (Environment)

My AKS cluster app is partially broken. The Redis cache is used to collect stats on other activities in the cluster, and the stats stop working after about a day.

chitturs, Mar 13 '24 17:03

We are seeing similar behavior. In our case, a status check on our service does a redisClient.Set(ctx, "status-key", 1, 0), and eventually that starts returning EOF. This also affects our other Redis calls once the EOF is encountered, but things seem to recover on their own without a restart of the service.

I'm not sure how to debug this, but I also suspect a connection in the pool that is in a bad state.

Should it be possible for a connection in the pool to be unusable?

Update: we have this service running in two environments, one with Redis 7.0.11 and one with Redis 6.2.11. This is only happening in the Redis 7.0.11 environment.

chriscasola, Jun 10 '24 12:06

What's your go-redis version? Maybe you could try v9.4.0. It works for me, but v9.5.x and v9.6.x do not. From what I have investigated so far, you may be able to fix this by setting DisableIndentity: true in your config. I believe the code here causes a bad connection and thus EOF: we use Codis proxy, which does NOT support the CLIENT command and closes the connection to warn you.
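
A minimal sketch of the setting mentioned above, assuming a go-redis v9 release that has the DisableIndentity option (the field name is spelled that way in the library). The address is a placeholder, and whether this removes the EOF errors depends on the server or proxy in front of Redis.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	client := redis.NewClient(&redis.Options{
		Addr: "localhost:6379", // placeholder address, replace with your own

		// Skips the CLIENT SETINFO calls that go-redis otherwise sends
		// when it sets up a new connection.
		DisableIndentity: true,
	})
	defer client.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// Ping once to check that a freshly created connection is usable.
	fmt.Println("ping error:", client.Ping(ctx).Err())
}
```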

ccbhj, Jul 22 '24 10:07