node-postgres icon indicating copy to clipboard operation
node-postgres copied to clipboard

Docs: document how to proper make batch inserts/updates

Open abenhamdine opened this issue 8 years ago • 4 comments

Hello,

Regularly there some questions about how to make batch inserts/updates and how to escape values safely. Eg see : #1400 https://github.com/brianc/node-postgres/issues/1387 #957 https://github.com/brianc/node-postgres/issues/880

I don't find any reference to this topic in the new doc site, but a user of this library can legitimately wonder how to handle this :

  • with a multistatements query ? (no since not supporting parameters)
  • with multiple queries and flow control library (eg : async.js async.parallel or Promise.all) ?
  • with a multistatements query formatted with a third-party formatting module ? (eg like https://github.com/datalanche/node-pg-format, see example https://github.com/brianc/node-postgres/issues/957#issuecomment-200000070)
  • with a not-so-obvious query like https://github.com/brianc/node-postgres/issues/957#issuecomment-295583050 ?
  • within transaction or not ?
  • and probably an additional question : when to use pg-copy ?

abenhamdine avatar Jul 28 '17 09:07 abenhamdine

👍

qas avatar Aug 02 '17 21:08 qas

Any guidance on this? I'm thinking of this in comparison with the way JDBC's PostgreSQL driver works:

  1. JDBC batch API. That is PreparedStatement#addBatch() enables driver to send multiple "query executions" in a single network roundtrip. Current implementation, however would still split large batches into smaller ones to avoid TCP deadlock.

    The actions would be much better:

    execute query
    ...
    execute query
    execute query
    execute query
    sync <-- wait for the response from the DB
    

It seems that using Promise.all() might be the equivalent in node-pg. However:

  1. Does node-pg client state get corrupted if a client.query() call occurs before another client.query() call has resolved, on the same Client?
  2. Does the second client.query() wait until the first client.query() has received a response from the server, or only until the first client.query() has finished sending the query to the server?
  3. [Especially if we don't wrap time-overlapping client.query() calls in a transaction], does node-pg queue server responses appropriately to resolve the correct Promise with the correct response?
  4. Is there any performance advantage to wrapping the calls in a transaction?

I am planning to migrate a PostgreSQL JDBC microservice that uses 1000-batched inserts to achieve desired performance to node-pg, which is why I am trying to figure this out.

ariofrio avatar May 08 '20 06:05 ariofrio

@ariofrio Batched queries – pipelining – is broader than batch insert. pg doesn’t have pipelining yet. I think @brianc is working on it.

I am planning to migrate a PostgreSQL JDBC microservice that uses 1000-batched inserts to achieve desired performance to node-pg, which is why I am trying to figure this out.

You should probably try to do it in fewer queries anyway, like https://github.com/brianc/node-postgres/issues/957#issuecomment-295583050.

charmander avatar May 08 '20 18:05 charmander

case it still helps: temp tables can achieve it - see this comment: https://github.com/brianc/node-postgres/issues/956#issuecomment-1046197203

sebastianrothbucher avatar Feb 20 '22 09:02 sebastianrothbucher