Benchmarks Pool commands

Rather than creating them per request

Sep 27 '20 20:09 benaadams

No change on fortunes, so tried on updates:

db	updates-baseline	updates-pool
CPU Usage (%)	96	97	+1.04%
Raw CPU Usage (%)	2,698.85	2,706.41	+0.28%
Working Set (MB)	524	524	0.00%
Build Time (ms)	1,856	1,762	-5.06%
Start Time (ms)	326	322	-1.23%

application	updates-baseline	updates-pool
CPU Usage (%)	70	70	0.00%
Raw CPU Usage (%)	1,952.18	1,960.89	+0.45%
Working Set (MB)	481	480	-0.21%
Build Time (ms)	4,750	4,964	+4.51%
Start Time (ms)	1,595	1,561	-2.13%
Published Size (KB)	98,006	98,007	+0.00%

load	updates-baseline	updates-pool
CPU Usage (%)	2	2	0.00%
Raw CPU Usage (%)	43.98	45.07	+2.49%
Working Set (MB)	7	7	0.00%
Build Time (ms)	3,316	3,317	+0.03%
Start Time (ms)	0	0
Published Size (KB)	76,389	76,389	0.00%
First Request (ms)	81	81	0.00%
Requests/sec	12,737	12,913	+1.38%
Requests	191,923	194,986	+1.60%
Mean latency (ms)	22.53	22.23	-1.33%
Max latency (ms)	383.46	285.45	-25.56%
Bad responses	0	0
Socket errors	0	0
Read throughput (MB/s)	9.20	9.33	+1.41%
Latency 50th (ms)	17.54	17.24	-1.71%
Latency 75th (ms)	28.31	27.93	-1.34%
Latency 90th (ms)	43.44	43.06	-0.87%
Latency 99th (ms)	91.41	91.37	-0.04%

Sep 29 '20 23:09 sebastienros

@roji I think we have tested this approach by the past

Sep 29 '20 23:09 sebastienros

@roji does the command reparse the cmd string (to convert to from ado => postgres format, e.g. @parm to $1) on each execute or only if it has changed?

Sep 29 '20 23:09 benaadams

Raised issue https://github.com/npgsql/npgsql/issues/3200; it reduces allocations but reparses the query and generates a new one in pg format for each execution

Sep 30 '20 21:09 benaadams

SQL is very small and doesn't even contain parameters

All the SQL other than the fortunes benchmark contains parameters?

Oct 15 '20 10:10 benaadams

All the SQL other than the fortunes benchmark contains parameters?

IIRC yeah (though whether a parameter exists or not doesn't matter that much for SQL parsing, just a little bit).

Oct 15 '20 10:10 roji

Every little helps; top query per second is 1.1M and aspnet is 472k

1,185,480 = 20 * 59,274 472,120 = 20 * 23,606

Oct 15 '20 11:10 benaadams

@benaadams @roji is this old PR still relevant?

Mar 17 '23 20:03 DamianEdwards

@benaadams @roji is this old PR still relevant?

@roji said he was introducing a better way of doing it in a newer version of the driver; not sure of status of that

Mar 17 '23 21:03 benaadams

Things have changed quite a lot since this was done... Here are some thoughts.

Re SQL parsing, Npgsql 6.0 did introduce support for using (native) positional parameters and not parsing SQL (@p -> $1); for more details, see this write-up. This is automatically triggered when the command parameters are unnamed, but when there are no parameters, we do have to parse for backwards compat. IIRC the only benchmark that doesn't have parameters is fortunes, so we currently do parse there, which is unneeded overhead.

We do have an app context switch which allows disabling SQL parsing/rewriting globally in the application, even where there are no parameters. Since that switch is global and since Dapper and EF don't work with positional parameters, turning it on would break them. But unlike our implementation here, our TechEmpower platforms implementation has only raw (no Dapper/EF), so I'm doing that there (in https://github.com/TechEmpower/FrameworkBenchmarks/pull/8005). Though we'll have to figure out what to do if we unify the implementations (#1815).

Mar 18 '23 16:03 roji

The 2nd thing here is pooling the ADO.NET objects (e.g. NpgsqlCommand). With the introduction of NpgsqlDataSource, we'll soon be switching to creating commands directly from that instead of instantiating connections (https://github.com/aspnet/Benchmarks/pull/1816/files#r1139401127):

// instead of:
using var connection = new NpgsqlConnection(...);
using var command = new NpgsqlCommand("SQL", connection);
// we'll just do this:
using var command = dataSource.CreateCommand();

(We can do this since the benchmarks don't involve any connection state (e.g. transactions), and this models multiplexing much more correctly, i.e. just execute a command against the database, without needing to care about which connection it goes throw or how. This will also likely bring some optimizations later.)

Currently, NpgsqlDataSource.CreateCommand() doesn't pool. If it's really beneficial to do so, this is an optimization we can and should implement inside Npgsql itself; opened https://github.com/npgsql/npgsql/issues/5001 to track this.

/cc @vonzshik @NinoFloris

Mar 18 '23 17:03 roji

So beyond the above two things, I think this can be closed... We should definitely experiment with command pooling in Npgsql and see what happens.

Mar 18 '23 17:03 roji

@roji

We do have an app context switch which allows disabling SQL parsing/rewriting globally

Are there plans to enable setting this via a property on NpgsqlCommand directly?

Mar 19 '23 15:03 DamianEdwards

Not really... This whole thing is tricky and somewhat complex, and comes from the fact that someone in Npgsql's history decided to accept named parameter placeholders (@p) instead of positional ones ($1), and also to support batching by parsing the command's SQL for semicolons and splitting that to multiple batched statements at the wire protocol level. Neither of these are natively supported by PG, so Npgsql has to parse/rewrite in order to suppotr them (here's a writeup).

Now, if the command has parameters, we check whether they're named on or not (is DbParameter.ParameterName set). If they're unnamed, we take that as a signal that SQL parsing isn't required, i.e. positional parameters are being used. Since we already have a user gesture here (unset parameter name), we don't need an additional flag on NpgsqlCommand. Note that if you're using unnamed positional parameters, Npgsql also doesn't support semicolons for batching: you must instead use the DbBatch abstraction we introduced in Npgsql 6.0 (partially for this).

The only corner case is when there's no parameters at all. For this case, there's still the problem of semicolons inside the SQL (batching) - we must parse since there's no user gesture here. We could in theory introduce a bool property command just to skip parsing/rewriting in the no-parameters case, but that seems really excessive... I'd rather we made EF (and Dapper) compatible with positional parameters and DbBatch (yet another thing on my list...)

For now we can probably have #if FORTUNES or similar to enable this AppContext switch only when running fortunes...

(it's all been quite a long journey...)

Mar 19 '23 15:03 roji

The idea was that the parsed state would remain in the command if the command text didn't change; so reusing the command would skip the reparsing. Alas that isn't what happens and it reparses each time even though its the same command object with an unchanged command

Mar 19 '23 16:03 benaadams

@benaadams right. Parsing/rewriting was already disabled in all benchmarks with parameters, since we switched to positional placeholders a while ago; https://github.com/TechEmpower/FrameworkBenchmarks/pull/8005#issuecomment-1474907969 does that for Fortunes as well. So I don't think we need to worry about that part any more.

There's another kind of "parsing" which happens every time: to look up the PostgreSQL prepared statement in an internal dictionary. We're planning to add a data-source level API for "globally-available" prepared statements, that would skip this step (https://github.com/npgsql/npgsql/issues/4509).

In the meantime, we could in theory pool commands and assume that a command rented from the pool already has the correct SQL. That assumption would hold only in a single-statement benchmark, so it seems a bit unrealistic/problematic.

Mar 19 '23 17:03 roji

@benaadams Any reuse that we might introduce on the DbDataSource will likely release all query specific resources during return. I would personally like to see ADO.NET support for concurrent executions on a DbDataSource/connectionless command to be able to store one on a static (this would need new ExecuteReader methods accepting parameters as an argument and quite some rework in drivers, so it's not realistic any time soon).

If we really need fast pooling we could store relevant instances on the kestrel connection and pass them down.

Mar 19 '23 17:03 NinoFloris

Benchmarks Benchmarks copied to clipboard

Pool commands

Benchmarks
Benchmarks copied to clipboard