experiment: use pipeline mode

Open robx opened this issue 2 years ago • 4 comments

This pulls in an exploratory implementation of pipeline mode (#2295). The bulk of the change is in the supporting libraries:

  • postgresql-libpq is extended to wrap the libpq pipeline mode API
  • hasql is hacked to allow queueing up pipelined statements, with pipeline synchronization and result reading (ignoring) deferred until the next result is required
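
To make the "deferred synchronization" idea concrete, here is a toy model that only counts network round trips. It does not talk to PostgreSQL and is not the hasql change itself: `ToyConnection`, `send`, `sync`, `query` and the statements in `handle_request` are all made up for illustration.

```python
class ToyConnection:
    """Hypothetical stand-in for a database connection; counts round trips only."""

    def __init__(self, pipeline):
        self.pipeline = pipeline
        self.round_trips = 0
        self.queued = []  # statements sent but not yet synchronized

    def send(self, sql):
        """Send a statement whose result the caller does not need."""
        if self.pipeline:
            self.queued.append(sql)  # queue it, do not wait for the server
        else:
            self.round_trips += 1    # wait for the reply before continuing

    def sync(self):
        """Synchronize: wait once for everything queued so far."""
        if self.queued:
            self.queued.clear()
            self.round_trips += 1

    def query(self, sql):
        """Send a statement whose result is needed now."""
        self.send(sql)
        self.sync()  # in pipeline mode, earlier queued statements share this wait
        return f"rows for: {sql}"


def handle_request(conn):
    # Illustrative statement sequence, not what postgrest actually sends.
    conn.send("BEGIN")
    conn.send("SET LOCAL role = 'web_anon'")
    rows = conn.query("SELECT * FROM items")
    conn.send("COMMIT")
    conn.sync()
    return rows


def round_trips_per_request(pipeline):
    conn = ToyConnection(pipeline)
    handle_request(conn)
    return conn.round_trips


print(round_trips_per_request(pipeline=False))  # 4
print(round_trips_per_request(pipeline=True))   # 2
```

In this model the saving comes purely from not waiting on statements whose results are ignored, which is why any benefit should grow with latency towards postgres.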

Some notes as to the state of this:

  • the hasql change is very much just a minimal hack to allow us to evaluate pipeline mode in postgrest -- it makes little sense as some kind of "pipeline mode for hasql" feature
  • the implementation seems mostly correct, but error handling is at least a bit broken (we don't treat aborted pipelines quite right), and there's a decent chance that some failure scenarios actually mess up the connection state, though I haven't seen that
  • this includes #2682 to allow simulating slow postgres

robx avatar Mar 16 '23 15:03 robx

Some performance results, using postgrest-loadtest.

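| scenario                     | pipeline=yes | pipeline=no | main branch |
|------------------------------|--------------|-------------|-------------|
| pgdelay=0                    | 300          | 287         | 294         |
| pgdelay=1ms                  | 62           | 55          | 55          |
| pgdelay=5ms pgrst_delay=5ms  | 17.3         | 14.5        | 14.5        |
| pgdelay=1ms pgrst_delay=10ms | 26.6         | 24.8        | 24.7        |
| pgdelay=10ms                 | 12.1         | 9.6         | 9.6         |
| pgdelay=50ms                 | 2.7          | 2.2         | 2.2         |
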
  • the number is the request rate from the loadtest output -- it isn't particularly stable, e.g. I wouldn't trust the 300 > 294 in the first row to signal an improvement. But the overall improvement between the columns seems consistent.
  • pipeline=yes/no is on this branch, with usePipeline set to True or False; main branch is with unmodified dependencies
  • command line is e.g. PGRST_BUILD_CABAL=1 PGDELAY=1ms PGRST_DELAY=10ms postgrest-loadtest
  • PGRST_BUILD_CABAL=1 says to build using postgrest-build for quicker iteration (there's a supporting change in this PR)

Regarding the results:

  • there's some overhead in hasql that comes with supporting pipeline mode; this seems to be noticeable as a slight performance cost in the undelayed scenario
  • as soon as there's a bit of latency towards postgresql (regardless of how that latency compares to the http client latency), pipeline mode does seem to provide a measurable benefit
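
For reading the numbers, a quick sketch that turns the request rates from the table above into relative changes against the main branch. The rates are copied from the results as posted; `rates` and `percent_change` are just names used here.

```python
# Request rates per scenario: (pipeline=yes, pipeline=no, main branch).
rates = {
    "pgdelay=0": (300, 287, 294),
    "pgdelay=1ms": (62, 55, 55),
    "pgdelay=5ms pgrst_delay=5ms": (17.3, 14.5, 14.5),
    "pgdelay=1ms pgrst_delay=10ms": (26.6, 24.8, 24.7),
    "pgdelay=10ms": (12.1, 9.6, 9.6),
    "pgdelay=50ms": (2.7, 2.2, 2.2),
}


def percent_change(new, base):
    """Relative change of `new` against `base`, in percent."""
    return 100.0 * (new - base) / base


for scenario, (pipe_yes, pipe_no, main) in rates.items():
    print(
        f"{scenario:30s} "
        f"pipeline=yes vs main: {percent_change(pipe_yes, main):+6.1f}%   "
        f"pipeline=no vs main: {percent_change(pipe_no, main):+6.1f}%"
    )
```

With these inputs the delayed scenarios come out at roughly 8% to 26% above main for pipeline=yes, while the undelayed row stays within a few percent either way, which is within the noise mentioned above.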

robx avatar Mar 16 '23 15:03 robx

The library changes:

  • https://github.com/nikita-volkov/hasql/compare/master...robx:hasql:pipeline2
  • https://github.com/nikita-volkov/hasql-transaction/compare/master...robx:hasql-transaction:pipeline
  • https://github.com/PostgREST/postgresql-libpq/compare/master...robx:postgresql-libpq:pipeline

The postgresql-libpq change is essentially good to go upstream, but I haven't filed it yet.

robx avatar Mar 16 '23 16:03 robx

Very cool!

wolfgangwalther avatar Mar 16 '23 16:03 wolfgangwalther

Just FYI, I'm leaving the pgbench pipeline test on https://github.com/steve-chavez/postgrest/commit/682930d7b81116728f9c941b978cb758b73a3780.

steve-chavez avatar Mar 21 '23 08:03 steve-chavez