
Batch should flush every so often to reduce memory usage

Open csnewman opened this issue 1 year ago • 1 comments

At present, sendBatchExtendedWithDescription (https://github.com/jackc/pgx/blob/master/conn.go#L1157) queues all queries to be written and performs a single flush at the end. This causes the write buffer in the pgproto3 connection to become unnecessarily large.

Ideally, the buffer should be flushed to the server every x statements, or once it passes a certain size. This allows the server to start processing earlier and also reduces client-side memory usage.
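To make the idea concrete, here is a minimal standalone sketch of threshold-based flushing. It is not pgx code and does not touch pgproto3: `thresholdWriter`, `countingSink`, `maxStatements`, and `maxBytes` are hypothetical names invented for illustration, and the sink only stands in for the network connection.

```go
package main

import "io"

// countingSink is a hypothetical stand-in for the server connection.
// It records how many writes (flushes) it received and how many bytes in total.
type countingSink struct {
	writes int
	bytes  int
}

// Write implements io.Writer by counting the call and the payload size.
func (s *countingSink) Write(p []byte) (int, error) {
	s.writes++
	s.bytes += len(p)
	return len(p), nil
}

// thresholdWriter is a hypothetical helper (not part of pgx). It queues
// already-encoded statements in memory and flushes them to dst once either
// maxStatements statements or maxBytes bytes are pending.
type thresholdWriter struct {
	dst           io.Writer
	buf           []byte
	pending       int // statements queued since the last flush
	maxStatements int
	maxBytes      int
}

func newThresholdWriter(dst io.Writer, maxStatements, maxBytes int) *thresholdWriter {
	return &thresholdWriter{dst: dst, maxStatements: maxStatements, maxBytes: maxBytes}
}

// Queue appends one encoded statement and flushes if a threshold is reached.
func (t *thresholdWriter) Queue(msg []byte) error {
	t.buf = append(t.buf, msg...)
	t.pending++
	if t.pending >= t.maxStatements || len(t.buf) >= t.maxBytes {
		return t.Flush()
	}
	return nil
}

// Flush writes everything pending to dst and resets the buffer.
// It must still be called once at the end to send any remainder.
func (t *thresholdWriter) Flush() error {
	if len(t.buf) == 0 {
		return nil
	}
	if _, err := t.dst.Write(t.buf); err != nil {
		return err // a real connection would likely be unusable at this point
	}
	t.buf = t.buf[:0] // keep the capacity, drop the contents
	t.pending = 0
	return nil
}
```

With `maxStatements` set to 3, queuing seven statements would trigger two intermediate flushes and leave one statement for the final `Flush()`, so the buffer never holds more than three statements at a time. A real implementation would also have to decide what happens to the connection if a later statement fails to encode after earlier ones were already sent.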

The libpq documentation appears to suggest that the ideal approach is to process results while still queuing queries (https://www.postgresql.org/docs/current/libpq-pipeline-mode.html); however, that would be a larger change.

csnewman avatar Apr 02 '24 22:04 csnewman

That would add a bit of complexity. It would mean some of the queries could be sent and then a failure occurs during c.eqb.Build() that prevents the rest of the batch from being sent. It's not clear what the state of the connection would or should be then. I guess if there were some benchmarks that showed a significant improvement it might be worth it, but my initial opinion is that the improvement is not worth the increased complexity.

If you need that level of control you can use pipeline mode: https://pkg.go.dev/github.com/jackc/pgx/v5/pgconn#Pipeline (presumably with https://pkg.go.dev/github.com/jackc/pgx/v5#RowsFromResultReader).

jackc avatar Apr 13 '24 14:04 jackc