Ingest regression
Running the ingest-tx-overhead benchmark on dc2c9dd02dc9b5133e2d2923e5141d94374eeff9 (warmed up against a local node) gives:
{"stage":"ingest-batch-1000","time-taken-ms":133,"bench-id":"edfa4d55-29f5-4811-a091-f459c68ca174","jvm-id":"fin"}
{"stage":"ingest-batch-100","time-taken-ms":404,"bench-id":"edfa4d55-29f5-4811-a091-f459c68ca174","jvm-id":"fin"}
{"stage":"ingest-batch-10","time-taken-ms":1999,"bench-id":"edfa4d55-29f5-4811-a091-f459c68ca174","jvm-id":"fin"}
and on main it currently gives:
{"stage":"ingest-batch-1000","time-taken-ms":3458,"bench-id":"1db61cd0-8e3f-4408-acea-c9c8328d633e","jvm-id":"fin"}
{"stage":"ingest-batch-100","time-taken-ms":20795,"bench-id":"1db61cd0-8e3f-4408-acea-c9c8328d633e","jvm-id":"fin"}
I expect this will be almost entirely due to the pgwire overheads. Assuming so, it would be interesting to know whether moving binary Arrow over pgwire could close most of the gap.
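To put a number on the gap, here is a small sketch that parses the log lines from the two runs above and computes the per-stage slowdown. The helper names (`timings`, `slowdown`) are made up for illustration and are not part of the benchmark harness; the log lines are trimmed to the two fields used.

```python
import json

# Log lines from the runs quoted above, trimmed to the two fields used here.
BASELINE = [  # dc2c9dd02dc9b5133e2d2923e5141d94374eeff9
    '{"stage":"ingest-batch-1000","time-taken-ms":133}',
    '{"stage":"ingest-batch-100","time-taken-ms":404}',
    '{"stage":"ingest-batch-10","time-taken-ms":1999}',
]
MAIN = [
    '{"stage":"ingest-batch-1000","time-taken-ms":3458}',
    '{"stage":"ingest-batch-100","time-taken-ms":20795}',
]


def timings(lines):
    """Map stage name -> time-taken-ms for a list of JSON log lines."""
    return {rec["stage"]: rec["time-taken-ms"] for rec in map(json.loads, lines)}


def slowdown(baseline, current):
    """Ratio current/baseline for every stage present in both runs."""
    return {
        stage: current[stage] / baseline[stage]
        for stage in baseline
        if stage in current
    }


ratios = slowdown(timings(BASELINE), timings(MAIN))
for stage, ratio in ratios.items():
    print(f"{stage}: {ratio:.1f}x slower")
```

For the numbers above this works out to roughly 26x for batches of 1000 and 51.5x for batches of 100; there is no ingest-batch-10 result on main to compare against.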
Added commit 5eadead2 to reuse the pgwire connection per ingest task, resulting in:
{"stage":"ingest-batch-1000","time-taken-ms":3240,"bench-id":"fa8b01ab-f692-437c-adb1-932fe1b9cdca","jvm-id":"fin"}
{"stage":"ingest-batch-100","time-taken-ms":7761,"bench-id":"fa8b01ab-f692-437c-adb1-932fe1b9cdca","jvm-id":"fin"}
Two sources jump out:
- xtdb.sql/->env, particularly xform-table-info and table-chains (13%)
- snapshot creation on commit (12%)
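For a sense of how much these two could buy back, here is a back-of-the-envelope sketch. It assumes the percentages are shares of total wall-clock time in the profile and that both costs could be removed entirely, neither of which is established here. `max_speedup` is a made-up helper name.

```python
def max_speedup(fractions):
    """Upper bound on speedup if the given shares of runtime drop to zero."""
    removed = sum(fractions)
    if not 0 <= removed < 1:
        raise ValueError("fractions must sum to a value in [0, 1)")
    return 1 / (1 - removed)


# 13% for xtdb.sql/->env, 12% for snapshot creation on commit.
bound = max_speedup([0.13, 0.12])
print(f"at most {bound:.2f}x faster")
```

Together the two account for about a quarter of the profile, so eliminating both would give at most roughly a 1.33x speedup. That helps, but on its own it would not close the gap to the earlier numbers.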
Since commit 3159ca07269629a3e01885a64cba24f0fad8e096 and our move fully to pgwire (https://github.com/xtdb/xtdb/pull/4485), there has been some regression.
{"stage":"ingest-batch-1000","time-taken-ms":317,"bench-id":"419b95cb-e219-4ca3-9c2d-8859e359050a","jvm-id":"fin"}
{"stage":"ingest-batch-100","time-taken-ms":1393,"bench-id":"419b95cb-e219-4ca3-9c2d-8859e359050a","jvm-id":"fin"}
That was the commit mentioned above (still HTTP over the wire).
{"stage":"ingest-batch-1000","time-taken-ms":2707,"bench-id":"44ce7f54-8cac-46e8-9920-16b9b4eecfc7","jvm-id":"fin"}
{"stage":"ingest-batch-100","time-taken-ms":4608,"bench-id":"44ce7f54-8cac-46e8-9920-16b9b4eecfc7","jvm-id":"fin"}
That was pgwire on main.
Flamegraph of the issue: