optimize js-postgres
The currently published benchmarks show js-pg performing at about 25% of the performance of py-asyncpg, and as a Node+Postgres aficionado, I took that personally.
This PR swaps the current js-pg strategy of wrapping simple queries in a transaction for a single SQL query per benchmarked operation. This is achieved with the json_agg aggregate function for returning multiple rows in sub-selects, the json_build_object function for creating nested data structures, and CTEs for the insert_movie/insert_movie_plus queries (all three of which have been supported by the node-postgres library and by Postgres itself for years).
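For illustration, the single-query shape looks roughly like the sketch below. The table and column names (movies, persons, directors, reviews) and the function names are placeholders standing in for the benchmark's actual schema and code, not copied from it:

```js
// Sketch only: table/column names are placeholders, not the benchmark's real schema.
const { Pool } = require('pg');
const pool = new Pool();

// Fetch a movie and its nested relations in one round trip; Postgres folds the
// related rows into JSON via json_agg/json_build_object.
async function getMovie(id) {
  const { rows } = await pool.query(
    `SELECT
       movie.id,
       movie.title,
       (SELECT json_agg(json_build_object('id', person.id, 'name', person.name))
          FROM directors
          JOIN persons AS person ON person.id = directors.person_id
         WHERE directors.movie_id = movie.id) AS directors,
       (SELECT json_agg(json_build_object('id', review.id, 'body', review.body))
          FROM reviews AS review
         WHERE review.movie_id = movie.id) AS reviews
     FROM movies AS movie
     WHERE movie.id = $1`,
    [id]
  );
  return rows[0];
}

// insert_movie collapses into a single statement with a data-modifying CTE
// instead of several statements wrapped in an explicit transaction.
async function insertMovie(title, directorIds) {
  const { rows } = await pool.query(
    `WITH new_movie AS (
       INSERT INTO movies (title) VALUES ($1) RETURNING id, title
     ), new_directors AS (
       INSERT INTO directors (movie_id, person_id)
       SELECT new_movie.id, d FROM new_movie, unnest($2::int[]) AS d
       RETURNING person_id
     )
     SELECT id, title FROM new_movie`,
    [title, directorIds]
  );
  return rows[0];
}
```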
An added benefit is the removal of any post-query data mangling in JavaScript, making this benchmark push Postgres itself to do as much work as possible. Note that this should cause a (small) regression in get_user, but it did not seem fair to pick and choose strategies to optimize individual benchmarks.
We can add this as an alternative benchmark, but not replace the original approach. The point here is not to benchmark the driver doing almost nothing (fetching one JSON value is trivial), but to benchmark how fast it can unpack complex data.
Merging this would make the benchmark unfair to asyncpg (which can also be modified to use JSON queries and would likely still be faster than pg).
@1st1 That makes sense to me. Would you be interested if I brought js-pg to parity with py-asyncpg using identical approaches (primarily array sub-selects, as opposed to relying on transactions)? The node-postgres driver is certainly more capable than it's getting credit for.
Would you be interested if I brought js-pg to parity with py-asyncpg using identical approaches (primarily array sub-selects, as opposed to relying on transactions)?
Sure, let's try.
The node-postgres driver is certainly more capable than it's getting credit for.
Last time we checked it wasn't capable enough. Maybe things have changed.
As of the most recent push, the behavior should be as identical as possible to the strategy employed by py-asyncpg.
While it's true that json_agg is needed instead of array_agg, Postgres isn't generating any of the actual JSON that gets returned; that all still happens in index.js. The rows arrive from node-postgres as [{"f1": val1, "f2": val2, "f3": val3}], as opposed to [val1, val2, val3] in the case of py-asyncpg.
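To make that concrete, here's a rough sketch of where the serialization happens (table, column, and function names are again placeholders, not the benchmark's actual schema):

```js
const { Pool } = require('pg');
const pool = new Pool();

// Placeholders only, not the benchmark's actual schema or code.
async function getMovieJson(id) {
  const { rows } = await pool.query(
    `SELECT movie.title,
            (SELECT json_agg(json_build_object('id', p.id, 'name', p.name))
               FROM directors d
               JOIN persons p ON p.id = d.person_id
              WHERE d.movie_id = movie.id) AS directors
       FROM movies AS movie
      WHERE movie.id = $1`,
    [id]
  );
  // node-postgres hands the aggregated column back as a JS array of objects:
  //   rows[0].directors === [{ id: 1, name: 'A' }, { id: 2, name: 'B' }]
  // asyncpg's array_agg over ROW(...) sub-selects yields bare value tuples instead.
  // Either way, the JSON string sent back to the client is still built here, in JS:
  return JSON.stringify(rows[0]);
}
```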