edgedb-python
edgedb-python copied to clipboard
Ping first if conn is idle for too long
The blocking sockets don't have active connection_lost() events, if the user is trying to execute a non-read-only query over a disconnected connection, is_closed() cannot capture the disconnection and the retry loop wouldn't work. In this case, we are sending a SYNC before the query, so that we could capture the connection error and reconnect.
This is tailored to handle the server-side session_idle_timeout; for regular network interruptions within the session_idle_timeout, we don't want to sacrifice performance by inserting more frequent SYNCs just to avoid connection errors on non-read-only queries.
This is tailored to handle the server-side
session_idle_timeout; for regular network interruptions within thesession_idle_timeout, we don't want to sacrifice performance by inserting more frequent SYNCs just to avoid connection errors on non-read-only queries.
Or maybe we do? We could also lazily select() for READ (and reconnect if necessary) every - say - (more than) 1 second per connection.
UPDATE: we don't, it's a rather marginal case and we cannot guarantee this doesn't happen at all.
Actually Elvis suggested over the meeting that, a proper fix should be that the server also sends a Terminate message before shutting down (idle) connections, so that the client won't have a vague state about the disconnection. I'll also do that, but this PR I think is still needed for EdgeDB 2.2 and lower.
- On unixes you can do just
pollwithPOLLOUT. If OS thinks connection is down, it will return an error. I think you can achieve the same withselecton windows. - As
Terminateis a subject to race condition, you have to handle it in every receive loop, which is quite inconvenient. And TCP RST packet sent by OS (which is caught by (1)) is almost as reliable as application specificTerminatepacket. Syncis probably the only reliable way (it covers more than (1) and (2) combined), although costly. But that cost is probably not critical as it is only incurred when connection was idle for the long time.
Also we could probably skip Sync if we send Parse instead of cached query, but that is probably over-optimizing too niche case.
I would go with (1) or (3), the latter is in current PR. As I don't see a significant difference between (1) and (2), whereas (2) complicates implementations of all blocking and async clients.
Sync is probably the only reliable way
Yeah.