Fix: DOs take 2 minutes to shut down after using TCP sockets
Hello 👋🏼
Normally, Durable Objects shut down after 10 seconds of inactivity. I noticed that if I used the TCP sockets API, that number jumped to 2 minutes, both locally and in production.
Root Cause
When a socket is created, `setupSocket()` creates a `watchForDisconnectTask` that waits on `connection.whenWriteDisconnected()`. This task detects unexpected disconnections (network failures, the remote peer dropping while idle). However, the promise doesn't resolve until the TCP connection fully terminates at the OS level, which can take up to 2 minutes due to TCP TIME_WAIT.
When `close()` was called (or when the remote closed and EOF was detected), the socket resolved its `closed` promise but left this background task running. The still-pending task kept the `IoContext` alive, which kept an actor reference alive, preventing the DO from becoming "inactive" and starting its 10-second eviction timer.
The Fix
Cancel `watchForDisconnectTask` when the socket is closed, either:
- Explicitly, via `close()`
- Automatically, via `maybeCloseWriteSide()` when the remote closes and `allowHalfOpen` is false
The task was already designed to handle cancellation gracefully: a `kj::defer` fulfills the disconnect promise with a "cancelled" flag, and the downstream handler ignores cancelled notifications. This is the same cleanup path that runs when the Socket is garbage collected.
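The cancellation flow can be sketched with another toy TypeScript model (all names here are invented for illustration; the real shared cleanup is the `kj::defer` path inside workerd's task): both close paths funnel into one cancel routine that drops the context reference immediately and flags the notification as cancelled so downstream handlers ignore it:

```typescript
// Toy model only: sketches the cancellation path, not workerd's actual code.
type DisconnectNotice = { cancelled: boolean };

class FixedSocketModel {
  private watcherActive = true;

  constructor(
    private ctx: { refs: number },
    private onDisconnect: (n: DisconnectNotice) => void,
  ) {
    this.ctx.refs++; // the watcher holds a context reference while active
  }

  // Shared cleanup, analogous to the kj::defer in the real task: fulfills
  // the disconnect notification with cancelled=true and drops the reference.
  private cancelWatcher(): void {
    if (!this.watcherActive) return;
    this.watcherActive = false;
    this.ctx.refs--;
    this.onDisconnect({ cancelled: true }); // downstream ignores cancelled notices
  }

  close(): void {
    this.cancelWatcher(); // fix, path 1: explicit close cancels the watcher
  }

  remoteClosedWithEof(allowHalfOpen: boolean): void {
    // Fix, path 2: the maybeCloseWriteSide() analogue cancels the watcher
    // when the remote closes and allowHalfOpen is false.
    if (!allowHalfOpen) this.cancelWatcher();
  }
}

const refCell = { refs: 0 };
const notices: DisconnectNotice[] = [];
const sock = new FixedSocketModel(refCell, (n) => notices.push(n));
sock.close();
console.log(refCell.refs);          // 0: the context is released immediately
console.log(notices[0].cancelled);  // true: the handler knows to ignore it
```

Note that with `allowHalfOpen` true, a remote EOF alone does not cancel the watcher, matching the half-open semantics described above.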
What This Doesn't Affect
- Remote close detection still works (handled via read-stream EOF)
- The `closed` promise behavior is unchanged
- TCP still goes through its proper shutdown sequence; we're just not waiting for OS-level confirmation
- Sockets that aren't explicitly closed still have the disconnect watcher active for detecting unexpected drops
Testing
- Manually verified: a DO with a socket now evicts in ~10 seconds after `close()` instead of ~2 minutes
- Manually verified: remote close (killing the peer while reading) also allows normal eviction timing
- Existing socket tests should pass (no API behavior changes)