edgedb-python Ping first if conn is idle for too long

The blocking sockets don't have active connection_lost() events, if the user is trying to execute a non-read-only query over a disconnected connection, is_closed() cannot capture the disconnection and the retry loop wouldn't work. In this case, we are sending a SYNC before the query, so that we could capture the connection error and reconnect.

This is tailored to handle the server-side session_idle_timeout; for regular network interruptions within the session_idle_timeout, we don't want to sacrifice performance by inserting more frequent SYNCs just to avoid connection errors on non-read-only queries.

Sep 12 '22 16:09 fantix

This is tailored to handle the server-side session_idle_timeout; for regular network interruptions within the session_idle_timeout, we don't want to sacrifice performance by inserting more frequent SYNCs just to avoid connection errors on non-read-only queries.

Or maybe we do? We could also lazily select() for READ (and reconnect if necessary) every - say - (more than) 1 second per connection.

UPDATE: we don't, it's a rather marginal case and we cannot guarantee this doesn't happen at all.

Sep 12 '22 16:09 fantix

Actually Elvis suggested over the meeting that, a proper fix should be that the server also sends a Terminate message before shutting down (idle) connections, so that the client won't have a vague state about the disconnection. I'll also do that, but this PR I think is still needed for EdgeDB 2.2 and lower.

Sep 14 '22 15:09 fantix

On unixes you can do just poll with POLLOUT. If OS thinks connection is down, it will return an error. I think you can achieve the same with select on windows.
As Terminate is a subject to race condition, you have to handle it in every receive loop, which is quite inconvenient. And TCP RST packet sent by OS (which is caught by (1)) is almost as reliable as application specific Terminate packet.
Sync is probably the only reliable way (it covers more than (1) and (2) combined), although costly. But that cost is probably not critical as it is only incurred when connection was idle for the long time.

Also we could probably skip Sync if we send Parse instead of cached query, but that is probably over-optimizing too niche case.

I would go with (1) or (3), the latter is in current PR. As I don't see a significant difference between (1) and (2), whereas (2) complicates implementations of all blocking and async clients.

Sep 15 '22 14:09 tailhook

Sync is probably the only reliable way

Yeah.

Oct 20 '22 18:10 1st1

edgedb-python edgedb-python copied to clipboard

Ping first if conn is idle for too long

edgedb-python
edgedb-python copied to clipboard