[BUG] API Error (Connection error.) with TypeError (fetch failed)
Preflight Checklist
- [x] I have searched existing issues and this hasn't been reported yet
- [x] This is a single bug report (please file separate reports for different bugs)
- [x] I am using the latest version of Claude Code
What's Wrong?
When making calls to the Claude API, the system is failing with a connection error. The underlying error is reported as a TypeError (fetch failed). The client-side SDK attempts to retry the request with an exponential backoff strategy, but the connection continues to fail across multiple attempts.
Steps to Reproduce
1. Start Claude from scratch, or execute a script or application that makes a request to the Claude API.
2. Observe the console or log output for API errors.
The issue appears to be intermittent and may be related to network instability or a client-side configuration issue.
Expected Behavior
The API request should complete successfully on the first attempt, or succeed after one or two retries if a transient network issue occurs.
Actual Behavior
The API call fails repeatedly, triggering the retry mechanism. Each attempt logs an API Error (Connection error.) followed by a TypeError (fetch failed). The issue persists for at least five attempts, with the retry delay increasing each time.
Environment Details
OS: [Please specify your operating system, e.g., macOS Sonoma, Windows 11, Ubuntu 22.04]
Runtime: [Please specify your runtime, e.g., Node.js v20.11.0, Python 3.10]
SDK/Library: [Please specify the library and version, e.g., @anthropic-ai/sdk v0.20.8]
What Should Happen?
Claude should operate normally.
Error Messages/Logs
```text
⎿ API Error (Connection error.) · Retrying in 4 seconds… (attempt 4/10)
⎿ TypeError (fetch failed)
⎿ API Error (Connection error.) · Retrying in 8 seconds… (attempt 5/10)
⎿ TypeError (fetch failed)
```
Steps to Reproduce
Just start Claude from the console.
Claude Model
None
Is this a regression?
Yes, this worked in a previous version
Last Working Version
No response
Claude Code Version
Current version: 1.0.120
Platform
Anthropic API
Operating System
macOS
Terminal/Shell
Terminal.app (macOS)
Additional Information
No response
Same issue here, and it still persists. Nobody is looking into it, despite it being reported about 6 hours ago?
My experience is that when Anthropic fixed previous Claude app and API connectivity errors, certain users were required to re-authenticate their accounts, but this API error didn't prompt for that. The workaround is to run claude /login.
The issue in my case is that the behavior is random. It works sometimes, then all of a sudden it does not, and regardless of what trick or workaround I apply, including the above, it stays the same until, say, an hour later.
Same here on Windows.
Same here on an M1 Mac.
Installing Claude through npm instead of Nix solved this issue for me; bug report filed in nixpkgs.
Thank you.
I installed it in Docker, using npm:

    RUN npm install -g @anthropic-ai/claude-code
This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.
I've been battling this for the last few hours. A summary of what seems to have fixed it (summary by Claude, obvs):
Problem: Claude Code SDK timeouts (10 seconds) when attempting IPv6 connections to api.anthropic.com, even though IPv4 connections succeed in 2-4 seconds.
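The asymmetry is easy to confirm with curl, which can force the address family (a quick diagnostic sketch, assuming curl is available in the container):

```bash
# Compare connect times over IPv4 vs IPv6; when IPv6 is unreliable,
# the second command hangs or fails while the first returns quickly.
curl -4 -s -o /dev/null -w 'IPv4 connect: %{time_connect}s\n' https://api.anthropic.com/
curl -6 -s -o /dev/null -w 'IPv6 connect: %{time_connect}s\n' https://api.anthropic.com/
```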
Solution:
1. Kernel-level IPv6 disable, applied via the docker run command or docker-compose.yml: --sysctl net.ipv6.conf.all.disable_ipv6=1
2. Force IPv4-only DNS servers, applied via Docker DNS configuration: --dns 8.8.8.8 --dns 8.8.4.4
3. Node.js IPv4 preference, applied via an environment variable: -e NODE_OPTIONS=--dns-result-order=ipv4first
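Put together, a minimal sketch of all three layers in a single docker run invocation (the image name is a placeholder for whatever runs your Node.js app or the Claude Code CLI):

```bash
# Layer 1: disable IPv6 in the container's kernel namespace.
# Layer 2: force IPv4-only DNS resolvers.
# Layer 3: make Node.js prefer IPv4 results.
docker run \
  --sysctl net.ipv6.conf.all.disable_ipv6=1 \
  --dns 8.8.8.8 \
  --dns 8.8.4.4 \
  -e NODE_OPTIONS=--dns-result-order=ipv4first \
  my-node-app:latest
```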
Why all three are needed:
- Layer 1 (kernel): Disables IPv6 at OS level but doesn't prevent DNS from resolving IPv6 addresses
- Layer 2 (DNS): Uses Google's IPv4-only DNS servers to prevent IPv6 address resolution entirely
- Layer 3 (Node.js): Forces Node.js runtime to prefer IPv4 addresses in case any IPv6 addresses leak through
Result: Eliminates intermittent SDK timeouts by ensuring IPv4 is used at every network stack layer.
Context: Applies to any Docker containerized Node.js application experiencing intermittent connection timeouts when IPv6 is unavailable or unreliable.
Additionally, even though I initially saw much improvement, things got worse as the yolo development continued:
Symptoms Observed
Pattern in Build Logs:
- Build starts with successful API requests (1-16): ✅ All succeed in 1-4 seconds
- Intermittent failures begin (17-41): ⚠️ Mix of successes and timeouts
- Complete failure (42-81): ❌ All requests time out at ~10 seconds
- Total: 33 successes, 48 consecutive timeouts in a single fresh container
Timeout Characteristics:
- Consistent 10-second timeout (10308ms, 10486ms, 10415ms, etc.)
- Error: fetch failed / connection timed out - error; no more retries left
- Pattern repeats on every build, regardless of fresh container restarts
Why This Pointed to WSL2 Port Exhaustion
Initial False Leads (Ruled Out):
- ❌ Connection pool exhaustion in Node.js - Fixed shared Anthropic client, but issue persisted
- ❌ Connection pool in Claude CLI - Fresh container didn't help
- ❌ Firewall blocking - Firewall script was disabled (devcontainer.json:23-24)
- ❌ IPv6 timeouts - Already fixed with kernel/DNS/Node.js IPv4-only configuration
- ❌ DNS resolution - api.anthropic.com resolves correctly to 160.79.104.10
- ❌ Cloudflare IP rotation - Curl tests succeeded (405 = API reachable)
- ❌ Linux conntrack limits - 262k limit with only 150 entries used
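For anyone retracing these rulings, the checks behind a few of them look roughly like this (standard tools; exact paths may vary by distro):

```bash
# DNS resolution for the API host
dig +short api.anthropic.com

# Reachability: a 405 status still proves the TCP/TLS path works
curl -s -o /dev/null -w '%{http_code}\n' https://api.anthropic.com/

# Conntrack usage vs. limit
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
```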
The Smoking Gun:
On the WSL2 host:

    $ cat /proc/sys/net/ipv4/ip_local_port_range
    60700   61000

Only 300 ephemeral ports available (61000 - 60700 = 300).
Why This Causes the Pattern:
1. Port Exhaustion Math:
   - Each API request = 1 outbound connection through WSL2 NAT
   - Closed connections remain in TIME-WAIT state for 60 seconds
   - With 300 ports available:
     - ~40 requests at 2-3s each = 80-120 seconds of requests
     - First ~15 requests complete before TIME-WAIT accumulates
     - After that, ports start getting exhausted
     - By request #42, all 300 ports are in TIME-WAIT → complete failure
2. Connection Flow (Why the WSL Port Range Matters):

       Container Process (Claude CLI)
         ↓ container port (ephemeral from container's 32768-60999)
       Docker Bridge NAT
         ↓ host port (ephemeral from WSL's 60700-61000) ← BOTTLENECK
       WSL2 Network Stack
         ↓
       Windows Host
         ↓
       Internet (api.anthropic.com)

   The WSL2 → Windows NAT is the bottleneck, not container → Docker NAT.
3. Why Fresh Containers Don't Help:
   - Port exhaustion is at the WSL kernel level, not the container level
   - Containers share the same WSL kernel and its port range
   - Restarting the container doesn't clear WSL's TIME-WAIT table
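A way to watch the exhaustion happen in real time (a diagnostic sketch; assumes the ss utility is available inside WSL):

```bash
# Count sockets in TIME-WAIT; the number climbs toward ~300 as requests
# are made, after which new outbound connections start failing.
watch -n 5 "ss -tan state time-wait | tail -n +2 | wc -l"
```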
The WSL2 Issue Explained
Normal Linux Behavior:
- Standard ephemeral port range: 32768-60999 (28,232 ports)
- Supports thousands of concurrent connections
- TIME-WAIT connections (60s) don't cause issues with this many ports
WSL2 Default Behavior (Pre-Fix):
- Windows default range: 49152-65535 (16,384 ports) - reasonable
- But somehow WSL kernel had: 60700-61000 (300 ports) - catastrophic
- This may be due to:
- WSL2 inheriting a restricted Windows configuration
- Previous network configuration changes
- WSL2 version-specific bug/default
Impact:
- With only 300 ports, after ~40-50 requests all ports are exhausted
- New connections fail immediately (connection refused)
- Anthropic SDK waits 10 seconds then times out
- Pattern: Works fine → gradual degradation → complete failure
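A quick experiment worth noting (my suggestion, not part of the fix below): the range can also be widened from inside WSL to test whether the symptoms disappear, though this may not persist across restarts:

```bash
# Temporarily restore the standard Linux ephemeral range inside WSL.
sudo sysctl -w net.ipv4.ip_local_port_range="32768 60999"
```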
The Fix Applied
Step 1: Increase Windows Ephemeral Port Range
Executed in PowerShell (Administrator):
Before (insufficient):

    Start Port      : 49152
    Number of Ports : 16384

Applied fix:

    netsh int ipv4 set dynamicport tcp start=32768 numberofports=28232

After (standard Linux range):

    Start Port      : 32768
    Number of Ports : 28232
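For reference, the Before/After values above can be read back with the matching query command, netsh int ipv4 show dynamicport tcp.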
Why These Numbers:
- Start: 32768 - Standard Linux/IANA ephemeral port range start
- Count: 28232 - Gives range 32768-60999 (the standard Linux range)
- Matches Docker container defaults for consistency
What to Do After WSL Restart
1. Restart WSL (already done if you restarted your terminal/WSL). In Windows PowerShell:

       wsl --shutdown

   Then restart WSL by opening a terminal or running: wsl
2. Verify the fix is applied. Inside WSL, this should now show 32768-61000 (or a similarly wide range):

       cat /proc/sys/net/ipv4/ip_local_port_range

   Expected output: 32768 61000
Restarted the service that runs the Docker image with the CC CLI on it; now working perfectly.