node-postgres
node-postgres copied to clipboard
Query streams break with `query_timeout`
Bug:
If a query_timeout
is reached when using a submit
-able query object that doesn't have a callback
method on it (such as all streaming query types, and probably most user-crafted submit
-ables), then the Client will throw a queryCallback is not a function
error. General repro would be:
const pg = require('pg')
const QueryStream = require('pg-query-stream')
const db = new pg.Client({ ..., query_timeout: 1000 })
db.connect(err => {
if (err) throw new Error('Connection error')
const qs = new QueryStream('SELECT pg_sleep(2)')
db.query(qs)
.on('data', d => console.log('data', d))
.on('end', d => console.log('end', d))
.on('error', d => (qs.close(), console.log('error', d)))
})
Analysis:
Although the node-postgres
Query
object has a callback
method for handling bulk results, other streaming objects, such as pg-cursor
and pg-query-stream
, do not have such a method. Looking at the code in client.js
, we see: https://github.com/brianc/node-postgres/blob/master/lib/client.js#L431-L457
If the Query object has a submit
method on it, and no callback
method, and if no callback is passed in as the second (values
) argument to Client#query
, then query.callback
will be undefined. Subsequently if a readTimeout
is specified (either as a query_timeout
on the Query object, or as part of the database configuration), then the read timeout assumes that query.callback
is defined and attempts to call it, resulting in a TypeError: queryCallback is not a function
.
Fix:
Since the callback
attribute appears to be an implementation detail of the node-postgres
Query
object, and the documentation for Client#query
does not indicate that this method should be present, or that a second parameter should be passed, it's unclear whether the fix is to make callback
a required attribute on all Query
-like objects, or if it's to fail gracefully if such a callback does not exist.
Furthermore, it's debatable whether a query_timeout
is even appropriate on non-bulk query types (timeout to results ready? timeout until entire result set has been retrieved?), and whether that timeout code should move into the typical bulk Query
object itself, dodging the issue entirely.
Because that's unclear, I'm leaving it to you to decide. Happy to submit a pull-request with a fix if you can tell me where/what you think the appropriate fix is.
I think that query_timeout
is useful in other situations. For example when you have a expensive query (like a gruoup by of joins on huge tables or a function that actually calculates some complicated thing).
With the query.on('row'
we can send results as they ocurr.
I need the query_timeout
be posible in these kind of call.
FYI, a monkey-patch fix for those using pg-promise
and the .stream
interface:
function pgpQueryStream(pgpDb, sql, params, initCb, opts) {
const qs = new QueryStream(sql, params, opts)
// FYI: If `query_timeout` is defined on the Client, it's impossible to
// disable the query_timeout by setting to undefined, or 0,
// which would negate the need for the hackiness that follows.
qs.query_timeout = opts && opts.queryTimeout
// Work around a bug in node-postgres -- not all Query objects have
// a `callback` method, but node-postgres is expecting it to exist.
// If it doesn't exist, then query_timeouts will start causing things
// to blow up.
qs.callback = () => undefined
// The following treats treats a read-timeout as time-to-first-data-result on
// query streams, rather than time to complete processing
const oldHandleDataRow = qs.handleDataRow
qs.handleDataRow = function () {
this.callback() // kills the readdb timeout
this.handleDataRow = oldHandleDataRow
oldHandleDataRow.apply(this, arguments)
}
// We have to patch this method too if the result is 0 rows, otherwise
// the read timeout on an older query can cancel the current active query
const oldHandleReadyForQuery = qs.handleReadyForQuery
qs.handleReadyForQuery = function () {
this.callback() // kills the read timeout
this.handleReadyForQuery = oldHandleReadyForQuery
oldHandleReadyForQuery.apply(this, arguments)
}
return pgpDb.stream(qs, initCb)
}
// Use:
pgpQueryStream(pgp, 'SELECT * FROM users WHERE id = $1', [123], qs => {
qs
.on('data', row => {
// do something with row
})
.on('end', () => { /* etc */ })
.on('error', () => { /* etc */ })
})
// Or perhaps more usefully:
async function * queryStreamIter(pgpDb, sql, params, opts) {
const stream = await new Promise(resolve => {
pgpQueryStream(pgpDb, sql, params, resolve, opts)
})
return yield * stream
}
// Then:
for await (const row of queryStreamIter(db, 'SELECT * FROM users WHERE id = $1', [123])) {
// do something with row
}
Is there any news on this? I am following the example at https://node-postgres.com/api/cursor and I am getting this error:
TypeError: queryCallback is not a function
at Timeout.setTimeout (node_modules/pg/lib/client.js:490:7)
at ontimeout (timers.js:436:11)
at tryOnTimeout (timers.js:300:5)
at listOnTimeout (timers.js:263:5)
at Timer.processTimers (timers.js:223:10)
I'm not even sure why I'm getting the timeout, because I'm getting rows returned. Is it because I am taking too long to process the rows returned by the cursor?
EDIT: This does indeed seem to be the case - I am half way through processing my rows when the code aborts with the above error. Is there any way to have the timeout only active while the call to cursor.read()
is in progress, rather than leaving it active between calls?
Any news so far? Still experiencing this
@brianc The issue still exists in the latest version. Could you have a look at it, please?
Any update? Error is still there:(
any update?
@brianc, Please provide ETA for this fix - to prevent Fatal exception using "query_timeout" with queryStream.