valkey-py icon indicating copy to clipboard operation
valkey-py copied to clipboard

Random ConnectionError bursts after idle on AWS ElastiCache Valkey (serverless) with valkey-py 6.1

Open raimi-solorzano-mw opened this issue 2 months ago • 5 comments

We’re running valkey-py against AWS ElastiCache Valkey (serverless, Valkey 8) and see a burst of ConnectionError exceptions when traffic resumes after a longer idle period. Connections appear to be dropped while idle, and when load picks up and many new connections get created in a short window, we hit repeated ConnectionErrors for a while before things stabilize.

Client configuration

import os import valkey.asyncio as aiovalkey from valkey.retry import Retry from valkey.backoff import ExponentialBackoff

valkey_client = aiovalkey.Valkey( host=os.getenv("CACHE_URL"), port=os.getenv("CACHE_PORT"), db=0, ssl=True, decode_responses=True, health_check_interval=5, retry=Retry(ExponentialBackoff(), 3), retry_on_timeout=True, )

Expected behavior

After idle, the client should reconnect gracefully and handle a burst of concurrent operations without surfacing many ConnectionErrors to the application (given retries/backoff are configured).

Observed behavior

A wave of ConnectionError exceptions for a short period right after traffic resumes. It’s difficult to reproduce in a controlled environment, but the pattern in production is fairly consistent: idle → burst → repeated ConnectionErrors → eventual recovery. Errors include connection resets/timeouts during initial command execution.

Steps to reproduce (approximate)

Environment

  • Client: valkey-py v6.1.0
  • Server: Valkey 8 (AWS ElastiCache, serverless)
  1. Create the asyncio client with the configuration above.
  2. Leave the process idle with no Valkey commands for an extended period.
  3. Suddenly ramp up to many concurrent GET/SET operations (e.g., fifty tasks).
  4. Observe a spike of ConnectionErrors before connections settle.

Questions

  • Could this be a client-side issue in how the pool reconnects after idle, or is this expected behavior with serverless Valkey closing idle connections?
  • Are there recommended client settings for this scenario?

If there are specific logs or debug flags that would help, I can try to collect them.

Thanks for any pointers on whether this looks like a client bug, a configuration issue, or an expected interaction with AWS ElastiCache serverless, and for recommendations on hardening the client for this idle-to-burst pattern.

raimi-solorzano-mw avatar Sep 25 '25 09:09 raimi-solorzano-mw