FrameworkBenchmarks
FrameworkBenchmarks copied to clipboard
Drogon and Postgres Batch Mode?
Looks like there's something fishy going on with drogon's build process seen in line https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/C%2B%2B/drogon/drogon-core.dockerfile#L34 and below?
It contains https://github.com/an-tao/postgres/releases/tag/batch_mode_ubuntu which is by the same author as the drogon framework...? I can't quite trace how exactly this differs but seems questionable?
@an-tao, I don't want to go behind your back bringing this up. I only just want a conversation surrounding what you've done here.
@cdbattags I use the batch mode patch (optionally) of libpq in drogon. I'm not the auther of it, the repository of postgres is forked from here Batch mode has a higher utilization of connections, so it can provide higher concurrent performance. I think there are many other frameworks that use this mode.
It's interessting to see that at these levels of performances, you need to patch the official sql drivers to go further. Did anybody know why they never merged this patch ? and why mysql did not implement it ? I'd like to use it for lithium but recompiling libpq from an outdated fork from 2016 to enable batch mode is tough. I'd prefer the user of my framework to use an up to date libpq that they don't have to recompile.
Just to link the libpq batch mode thread. Lots of information about it there:
https://www.postgresql.org/message-id/flat/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com
Keep in mind that batching is expressly disallowed by the TFB rules (except for the database updates test and only for the updates themselves). However, pipelining is permitted (and used by actix and others) and indeed I firmly believe that implementations that do not use it are simply not competitive (for the top spots) in the database tests. It just so happens that this unofficial fork of libpq exposes both functionalities behind the same interface, but I believe that using batches of one query (in order to leverage pipelining) would not be a violation of the rules, and I hope that's precisely what drogon is doing.
I am also really interested in using this for h2o, but coding against what is essentially a work-in-progress patch for libpq is not my idea of providing a production-grade implementation (another one of the TFB guidelines). However, I wouldn't hold anyone else to the same standard.
I won't do it for lithium neither. @volyrique Do you have info about how to pipeline multiple sql requests ? Do you mean we can just call PQsendPrepare several time before starting to call PQgetResult ?
It's impossible to do it with the upstream libpq - PQsendQueryPrepared() (or any other function) is going to return an error if it's called a second time while the first command hasn't completed. As for the fork - it should be possible according to the documentation, but I haven't bothered figuring out how because until those patches are merged, the interface may change significantly, so I don't see the point.
So do you mean that actix is also based on this fork ?
No, actix, Vert.x and other frameworks that support pipelining don't use libpq (or any variant of) at all, which is expected because if your implementation language is not C, then most probably you would like to avoid calls to a C interface such as libpq; heck, some people feel the same way when coding in C++.
So at the end, the only solution to boost c++ frameworks is to start a postgres c++ driver from scratch that support pipelining (I may be too lasy for this...), or just wait that next versions of libpq supports it...
If you are willing to write a database driver from scratch, you might as well just use the libpq fork - you are going to encounter the same issues. That is unless you really want a proper C++ PostgreSQL interface, in which case starting from scratch is justified (assuming that wrapping the libpq interfaces in classes and so on is not acceptable).
The only real solution for me is to wait for the upstream. Since pipelining support has been on the TODO list for a while, I am hopeful that we are going to get it eventually. In the meantime I will just have to live with unimpressive performance numbers.
Agree. Anyway without pipelining, perfs are still pretty good.
For info I just pinged [email protected] about this
Mariadb may also work on it someday, this may close the gab between mysql and postgresql: https://jira.mariadb.org/browse/CONC-478
Seems like the "outdated fork problem" just become a "up to date fork" problem: https://github.com/an-tao/drogon/issues/526 Hope this will land in master soon. I've implemented a pipelined version of lithium, so at least it will be ready when the libpq team merges this patch.
@an-tao you can check my script if you want to upgrade your libpq fork (note 1: that they changed the API function names so you'll have to update your framework, note 2: it gives a nice performance boost :) ): https://github.com/matt-42/FrameworkBenchmarks/blob/lithium-upgrade8/frameworks/C%2B%2B/lithium/compile_clang-pipeline.sh#L11-L18
@matt-42 Thanks so much. I'll try.
I took over the libpq batch mode patch and fixed the remaining issues. The patch is still waiting for review but hopefully it will get released with postgres 14...
https://commitfest.postgresql.org/29/2724/
The biggest issue I've found with pipelining and the reason I gave up caring for the time being is that you must take extra care with transactions. Two HTTP clients cannot share the same connection if one is to send a BEGIN command, but since my API above libpq creates a db pool that won't be hard to keep the db object if one started a transcation. Hoping for this to get in soon :)
It's true that it can cause very cryptic bugs if there is a yield (allowing another http connection to send an overlapping transaction) between a begin and a commit. In the current lithium implementation , if the user read the transaction results after committing the transaction, it's not possible that another coroutine send an overlapping transaction.
I finished the patch few weeks ago so it should be merged before the next pgsql release
@dantti @volyrique @an-tao @cdbattags the patch seems ok but the postgres team need some insight from potential users of the postgresql batch API. Could any of you have a look and give some feedback ? Thanks!
Here is the last patch: https://www.postgresql.org/message-id/attachment/115511/v23-0001-libpq-batch.patch The page where you can post your feedback (if you post it here I'll forward it): https://commitfest.postgresql.org/30/2724/
WOW lipq batch/pipelining support is finally going to be merged this month!! my
feeling is that this is committable, so I'll be looking at getting this pushed early next weeks, barring opinions from others. https://www.postgresql.org/message-id/flat/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com
@volyrique you'll be able to use it for h2o!
@LifeIsStrange it is already merged in master :) https://github.com/postgres/postgres/commit/acb7e4eb6b1c614c68a62fb3a6a5bba1af0a2659 pinging @an-tao
Recently I have worked on a further improvement to libpq's pipeline mode, which now has been merged upstream. You might be able to use it in your own code.
cc @pavelmash