Random ConnectionError bursts after idle on AWS ElastiCache Valkey (serverless) with valkey-py 6.1
We’re running valkey-py against AWS ElastiCache Valkey (serverless, Valkey 8) and see a burst of ConnectionError exceptions when traffic resumes after a long idle period. Connections appear to be dropped while idle. When load picks up and many new connections are created in a short window, we hit repeated ConnectionErrors for a while before things stabilize.
Client configuration
```python
import os
import valkey.asyncio as aiovalkey
from valkey.retry import Retry
from valkey.backoff import ExponentialBackoff

valkey_client = aiovalkey.Valkey(
    host=os.getenv("CACHE_URL"),
    port=os.getenv("CACHE_PORT"),
    db=0,
    ssl=True,
    decode_responses=True,
    health_check_interval=5,
    retry=Retry(ExponentialBackoff(), 3),
    retry_on_timeout=True,
)
```
Expected behavior
After idle, the client should reconnect gracefully and handle a burst of concurrent operations without surfacing many ConnectionErrors to the application (given retries/backoff are configured).
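To make that expectation concrete: with `Retry(ExponentialBackoff(), 3)` we assumed each command would behave roughly like the application-level sketch below. This is not valkey-py's implementation, only an illustration of the semantics we expected; `call_with_backoff`, its parameters, and the delay values are hypothetical.

```python
import asyncio
import random


async def call_with_backoff(op, retries=3, base=0.008, cap=0.512,
                            retry_on=(ConnectionError, TimeoutError)):
    """Await op(), retrying up to `retries` times on the given exception types.

    Hypothetical helper: base/cap are illustrative delays in seconds,
    not values taken from valkey-py.
    """
    for attempt in range(retries + 1):
        try:
            return await op()
        except retry_on:
            if attempt == retries:
                # Retries exhausted: surface the error to the caller.
                raise
            # Exponential backoff with full jitter, capped at `cap` seconds.
            delay = min(cap, base * 2 ** attempt)
            await asyncio.sleep(random.uniform(0, delay))
```

As far as I can tell, `valkey.exceptions.ConnectionError` does not derive from the built-in `ConnectionError`, so with the real client the library's exception classes would have to be passed via `retry_on`, e.g. `await call_with_backoff(lambda: valkey_client.get("key"), retry_on=(valkey.exceptions.ConnectionError, valkey.exceptions.TimeoutError))`.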
Observed behavior
A wave of ConnectionError exceptions for a short period right after traffic resumes. It’s difficult to reproduce in a controlled environment, but the pattern in production is fairly consistent: idle → burst → repeated ConnectionErrors → eventual recovery. Errors include connection resets/timeouts during initial command execution.
Environment
- Client: valkey-py v6.1.0
- Server: Valkey 8 (AWS ElastiCache, serverless)
Steps to reproduce (approximate)
1. Create the asyncio client with the configuration above.
2. Leave the process idle with no Valkey commands for an extended period.
3. Suddenly ramp up to many concurrent GET/SET operations (e.g., 50 tasks).
4. Observe a spike of ConnectionErrors before connections settle.
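The steps above can be sketched as a small harness. It is a minimal approximation, not a confirmed reproducer: `run_burst`, `idle_then_burst`, the key prefix, and the idle duration are hypothetical names and values, and `client` is expected to be the asyncio client from the configuration above.

```python
import asyncio
import time
from collections import Counter


async def run_burst(client, n_tasks=50, key_prefix="burst-test"):
    """Run n_tasks concurrent SET+GET pairs and tally outcomes by exception type."""
    outcomes = Counter()

    async def one_op(i):
        key = f"{key_prefix}:{i}"
        try:
            await client.set(key, str(i))
            await client.get(key)
            outcomes["ok"] += 1
        except Exception as exc:
            # Count every failure by class name, not only ConnectionError.
            outcomes[type(exc).__name__] += 1

    started = time.monotonic()
    await asyncio.gather(*(one_op(i) for i in range(n_tasks)))
    return outcomes, time.monotonic() - started


async def idle_then_burst(client, idle_seconds, n_tasks=50):
    """Warm up one connection, stay idle, then fire the burst."""
    await client.ping()
    await asyncio.sleep(idle_seconds)
    return await run_burst(client, n_tasks)
```

For example, `asyncio.run(idle_then_burst(valkey_client, idle_seconds=900))` returns the outcome counts and the elapsed time of the burst. The 900 seconds is a guess; we do not know the exact idle threshold after which connections are dropped.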
Questions
- Could this be a client-side issue in how the pool reconnects after idle, or is this expected behavior with serverless Valkey closing idle connections?
- Are there recommended client settings for this scenario?
If there are specific logs or debug flags that would help, I can try to collect them.
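Unless there is a better-suited debug flag, this is what I would enable to collect logs. It uses only the standard `logging` module and assumes valkey-py emits records under the `valkey` logger namespace, which I have not confirmed; `enable_client_debug_logging` is a hypothetical helper name.

```python
import logging


def enable_client_debug_logging(logger_names=("valkey", "asyncio")):
    """Turn on DEBUG output for the named loggers with millisecond timestamps.

    Assumption: the client library logs under the "valkey" namespace.
    """
    # Root stays at INFO so unrelated libraries are not flooded with DEBUG output.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s.%(msecs)03d %(levelname)s %(name)s: %(message)s",
        datefmt="%H:%M:%S",
    )
    loggers = [logging.getLogger(name) for name in logger_names]
    for logger in loggers:
        # DEBUG records from these loggers still reach the root handler.
        logger.setLevel(logging.DEBUG)
    return loggers
```

Calling `enable_client_debug_logging()` once at startup, before the idle period, should be enough to capture whatever the client logs when traffic resumes.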
Thanks for any pointers on whether this looks like a client bug, a configuration issue, or an expected interaction with AWS ElastiCache serverless, and for recommendations on hardening the client for this idle-to-burst pattern.