Inefficient use of Flush in extended query protocol
When executing a prepared statement, Skunk currently uses the following protocol flow:
- Parse
- Flush
- Await response and check for errors
- Describe statement
- Flush
- Await response and check for errors
- Bind
- Flush
- Await response and check for errors
- Execute
- Flush
- Await response and check for errors
- Close portal
- Flush
- Await response and check for errors
- Close statement
- Flush
- Await response and check for errors
This is needlessly inefficient, as it prevents batching of request commands and requires a full network round-trip on every step.
In addition, this currently fails to work—and instead hangs indefinitely—with a Pgpool-II server, which does not fully honour Flush. (This is currently being discussed on the pgpool-general mailing list.)
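To make the cost concrete, here is a minimal sketch of that per-command pattern. The Command, Wire, send, flush and awaitResponse names below are hypothetical stand-ins, not Skunk's actual API:

```scala
import cats.effect.IO
import cats.syntax.all._

// Hypothetical stand-ins for the protocol layer, for illustration only (not Skunk's API).
sealed trait Command
case object Parse          extends Command
case object Describe       extends Command
case object Bind           extends Command
case object Execute        extends Command
case object ClosePortal    extends Command
case object CloseStatement extends Command

trait Wire {
  def send(c: Command): IO[Unit]
  def flush: IO[Unit]
  def awaitResponse: IO[Unit] // suspends until the server replies, fails on ErrorResponse
}

// Current shape: six commands, six Flushes, six network round trips per prepared-statement execution.
def runPrepared(wire: Wire): IO[Unit] =
  List(Parse, Describe, Bind, Execute, ClosePortal, CloseStatement)
    .traverse_(c => wire.send(c) *> wire.flush *> wire.awaitResponse)
```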
As far as I can tell, the only reason it is done this way is that MessageSocket does not allow queueing up response processing asynchronously and submitting more commands in the meantime. For example, skunk.net.protocol.Parse currently does this:
https://github.com/tpolecat/skunk/blob/9e0744d00f6876f1eda5016ea34d7f0ecac40736/modules/core/shared/src/main/scala/net/protocol/Parse.scala#L38-L55
where flatExpect suspends the fibre handling the current database connection until a response message is received. In an ideal world, it would schedule the block to be run whenever the next message arrives, but let the sending fibre continue so more protocol commands could be queued and the Flush could be omitted.
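In sketch form (illustrative types only, nothing here is the real skunk.net API), the limitation is that the handler given to flatExpect only runs once the calling fibre has received the next message, so that fibre cannot send anything else in the meantime:

```scala
import cats.effect.IO

// Illustrative only; these are not Skunk's real types or signatures.
sealed trait BackendMessage
case object ParseComplete                                extends BackendMessage
final case class ErrorResponse(info: Map[Char, String])  extends BackendMessage

trait MessageSocketSketch {
  def send(msg: String): IO[Unit]
  // Suspends the calling fibre until the next backend message arrives, then handles it.
  def flatExpect[A](handler: PartialFunction[BackendMessage, IO[A]]): IO[A]
}

// Shape of the current Parse exchange: send Parse, send Flush, then block for the reply.
def parseExchange(ms: MessageSocketSketch): IO[Unit] =
  ms.send("Parse") *>
    ms.send("Flush") *>
    ms.flatExpect {
      case ParseComplete       => IO.unit
      case ErrorResponse(info) => IO.raiseError(new Exception(s"Parse failed: $info"))
    }
```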
Any idea as to how this could be allowed in a clean way? I’m thinking that the sending and receiving should be separate fibres, but I’m not well-versed enough to see immediately how to achieve that.
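One rough sketch of that shape, assuming a strict one-reply-per-command pairing (which the real protocol doesn't have, so this is only to illustrate the idea): commands are sent immediately, and each one is paired with a Deferred that a dedicated reader fibre completes when its response arrives.

```scala
import cats.effect.{Deferred, IO}
import cats.effect.std.Queue

// Sketch only, not Skunk's API: pair every sent command with a Deferred, and let a
// dedicated reader fibre complete them in order as backend messages arrive. Real
// pipelining would also need to handle multi-message responses and errors.
def pipelined(
    sendRaw: String => IO[Unit], // writes one frontend message to the socket
    receiveRaw: IO[String]       // reads the next backend message
): IO[(String => IO[IO[String]], IO[Unit])] =
  Queue.unbounded[IO, Deferred[IO, String]].map { pending =>

    // Send immediately and return an IO that awaits the reply, without blocking the sender.
    val submit: String => IO[IO[String]] = cmd =>
      for {
        reply <- Deferred[IO, String]
        _     <- pending.offer(reply)
        _     <- sendRaw(cmd)
      } yield reply.get

    // Run this on its own fibre (e.g. readLoop.start): each backend message completes
    // the oldest outstanding Deferred.
    val readLoop: IO[Unit] =
      (for {
        d   <- pending.take
        msg <- receiveRaw
        _   <- d.complete(msg)
      } yield ()).foreverM

    (submit, readLoop)
  }
```

With readLoop started on a background fibre, the sender never blocks; only the call sites that actually need a reply wait on one.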
Hi, thanks for the issue.
The big design goal for Skunk is transparent semantics: when things happen in code they happen on the server, so errors happen in the expected place, on the expected fiber (sessions can be used concurrently and we don't want an error on one fiber to be detected on another). So exchanges with the database must be self-contained (i.e., they need to end with ReadyForQuery, which sometimes requires a Flush).
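Concretely, a self-contained exchange can be pictured as a critical section over the socket; a minimal sketch, not the actual Exchange implementation:

```scala
import cats.effect.IO
import cats.effect.std.Semaphore

// Sketch only: the body of an exchange owns the socket exclusively and must finish by
// consuming ReadyForQuery, so its messages never interleave with another fiber's and
// any error is raised on the fiber that started the exchange.
// A session would hold one such mutex, created with Semaphore[IO](1).
def exchange[A](mutex: Semaphore[IO])(body: IO[A]): IO[A] =
  mutex.permit.use(_ => body)
```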
Having said that, we're doing more exchanges than we need to do, and aren't caching everything that can be cached:
- Parse and Describe are a single operation at the API level and can thus be pipelined into a single exchange (see the sketch after this list). In addition, the results can normally be cached and the resulting statement can be reused. Right now we're only caching the Describe exchange.
- Bind and the initial related Execute are also a single operation at the API level and can thus be pipelined.
- Portals need to be closed eagerly, but not as eagerly as it's done now. We can save them up and close them asynchronously after the session is returned to the pool.
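For the first bullet, a hedged sketch of the pipelined exchange, with send, flush and expect as hypothetical helpers (the backend replies with ParseComplete for Parse, then ParameterDescription and RowDescription, or NoData, for Describe):

```scala
import cats.effect.IO

// Sketch only: Parse and Describe are written back to back with a single Flush, so one
// round trip returns ParseComplete plus the parameter and row descriptions needed to
// encode arguments and decode results.
def parseAndDescribe(
    send: String => IO[Unit],   // hypothetical: writes one frontend message
    flush: IO[Unit],            // hypothetical: flushes the send buffer
    expect: String => IO[Unit]  // hypothetical: awaits a named backend message, failing on ErrorResponse
): IO[Unit] =
  send("Parse") *>
    send("Describe") *>
    flush *>
    expect("ParseComplete") *>
    expect("ParameterDescription") *>
    expect("RowDescription") // or NoData for statements that return no rows
```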
Once this is implemented, the current 6+ exchanges would be reduced to 2+ on the first use:
- Parse+Describe
- Flush (may not be necessary)
- Await response and check for errors
- Bind+Execute
- Flush (may not be necessary)
- Await response and check for errors
- Execute*
- Flush (may not be necessary)
- Await response and check for errors
On subsequent use the initial exchange would be unnecessary, so interactions that fetch a single page of results would require only one exchange:
- Bind+Execute
- Flush (may not be necessary)
- Await response and check for errors
- Execute*
- Flush (may not be necessary)
- Await response and check for errors
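The subsequent-use case falls out of caching the result of the first exchange. A minimal sketch, assuming a parseAndDescribe exchange like the one above and a made-up StatementId:

```scala
import cats.effect.{IO, Ref}
import cats.syntax.all._

// Sketch only: cache prepared statements by their SQL text so Parse+Describe runs once
// per statement per session, and later executions can go straight to Bind+Execute.
// (A real cache would also need to avoid racing prepares, but the shape is the same.)
final case class StatementId(value: String)

def prepareCached(
    cache: Ref[IO, Map[String, StatementId]],
    parseAndDescribe: String => IO[StatementId] // the first-use exchange
)(sql: String): IO[StatementId] =
  cache.get.flatMap { m =>
    m.get(sql) match {
      case Some(id) => IO.pure(id)
      case None     => parseAndDescribe(sql).flatTap(id => cache.update(_ + (sql -> id)))
    }
  }
```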
It probably makes sense to pre-fetch the next page of results when processing the result stream, so you can be reading the next page while you're streaming the current page back to the client or whatever.
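At the stream level that pre-fetch could be expressed with fs2's prefetchN; fetchPage below is a hypothetical per-page Execute exchange:

```scala
import cats.effect.IO
import fs2.Stream

// Sketch only: prefetchN(1) keeps one page in flight, so the next Execute exchange runs
// while the current page is still being emitted downstream.
def resultStream[A](fetchPage: Int => IO[List[A]], pages: Int): Stream[IO, A] =
  Stream.range(0, pages)
    .evalMap(fetchPage)
    .prefetchN(1)
    .flatMap(Stream.emits)
```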
In any case there is a plan. Right now I'm working on replacing the session queue, which will make caching easier.