postgres icon indicating copy to clipboard operation
postgres copied to clipboard

Truncated call stack for PostgresError

Open pesterhazy opened this issue 1 year ago • 5 comments

The call stack from PostgresError (for things like sql syntax errors or unique constraint violations) doesn't include any of my stack frames; instead it bottoms out in TCP.onStreamRead. This makes debugging database errors much more difficult. Here's an example from a test in node:test:

  Error [PostgresError]: update or delete on table "contacts" violates foreign key constraint "calls_customer_id_fkey" on table "calls"
      at ErrorResponse (/Users/paulusesterhazy/prg/telli/backend2/node_modules/postgres/cjs/src/connection.js:788:26)
      at handle (/Users/paulusesterhazy/prg/telli/backend2/node_modules/postgres/cjs/src/connection.js:474:6)
      at Socket.data (/Users/paulusesterhazy/prg/telli/backend2/node_modules/postgres/cjs/src/connection.js:315:9)
      at Socket.emit (node:events:519:28)
      at addChunk (node:internal/streams/readable:561:12)
      at readableAddChunkPushByteMode (node:internal/streams/readable:512:3)
      at Readable.push (node:internal/streams/readable:392:5)
      at TCP.onStreamRead (node:internal/stream_base_commons:189:23) {
    severity_local: 'ERROR',
    severity: 'ERROR',
    code: '23503',
    detail: 'Key (id)=(99cf2b0f-791d-41df-a911-d62ef51103cd) is still referenced from table "calls".',
    schema_name: 'public',
    table_name: 'calls',
    constraint_name: 'calls_customer_id_fkey',
    file: 'ri_triggers.c',
    line: '2478',
    routine: 'ri_ReportViolation'
  }

My understanding of the issue (which may not be accurate) is that generally v8's async stack traces should address this issue, but postgres doesn't use async/await internally for this interaction with the database so there's no coherent stack trace leading to the function call in my code.

Using postgres 3.4.4 and drizzle-orm 0.33.0.

pesterhazy avatar Oct 17 '24 17:10 pesterhazy

For context, here's a similar issue in node-postgres https://github.com/brianc/node-postgres/issues/2470#issuecomment-917635821

pesterhazy avatar Oct 17 '24 17:10 pesterhazy

I just tried switching to node-postgres, and indeed it has coherent stackframes. E.g.:

✖ getContactsGroupedByReached (7.579ms)
  error: could not determine data type of parameter $2
      at /Users/paulusesterhazy/prg/telli/backend2/node_modules/pg-pool/index.js:45:11
      at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
      at async getReached (/Users/paulusesterhazy/prg/telli/backend2/api/src/analytics.ts:122:15)
      at async getContactsGroupedByReached (/Users/paulusesterhazy/prg/telli/backend2/api/src/analytics.ts:151:20)
      at async TestContext.<anonymous> (/Users/paulusesterhazy/prg/telli/backend2/api/src/analytics.test.ts:73:18)
      at async Test.run (node:internal/test_runner/test:931:9)
      at async Test.processPendingSubtests (node:internal/test_runner/test:629:7) {

pesterhazy avatar Oct 21 '24 17:10 pesterhazy

I got an incomplete stack trace on query failure too, I know what is wrong with my query. Why I get an incomplete stack trace, where are the async parts of it?

Error: UNDEFINED_VALUE: Undefined values are not allowed
    at handleValue (file:///Users/user/project/node_modules/postgres/src/types.js:83:20)
    at file:///Users/user/project/node_modules/postgres/src/connection.js:225:38
    at Array.forEach (<anonymous>)
    at build (file:///Users/user/project/node_modules/postgres/src/connection.js:225:25)
    at Object.execute (file:///Users/user/project/node_modules/postgres/src/connection.js:167:7)
    at go (file:///Users/user/project/node_modules/postgres/src/index.js:344:14)
    at Query.handler (file:///Users/user/project/node_modules/postgres/src/index.js:333:14)
    at Query.handle (file:///Users/user/project/node_modules/postgres/src/query.js:140:65)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

franciscomerdot avatar Dec 29 '24 21:12 franciscomerdot

(Apologies for the long comment -- @porsager I have a question at the end, if you think this catch-based approach is worth adding for sql.unsafe)

Fwiw @pesterhazy it looks like:

a) these "truncated stacks" only happen when using sql.unsafe(...)?

(For me this is true; when using the regular sql tagged literal, the stack traces are correct/full, using some neat caching that uses the heuristic that "each sql invocation passes a SQL statement that is unique across the codebase")

b) setting postgres({ debug: true }) fixes this for sql.unsafe calls (although #290 notes this is likely too expensive to leave enabled in production),

I.e. with debug: true set I get "my code" added back to the stack trace:

PostgresError: syntax error at or near ")"
    at ErrorResponse (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/connection.js:790:26)
    at handle (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/connection.js:476:6)
    at Socket.data (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/connection.js:317:9)
    at Socket.emit (node:events:507:28)
    at addChunk (node:internal/streams/readable:559:12)
    at readableAddChunkPushByteMode (node:internal/streams/readable:510:3)
    at Socket.Readable.push (node:internal/streams/readable:390:5)
    at TCP.onStreamRead (node:internal/stream_base_commons:189:23)
    at new Query (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/query.js:34:41)
    at Function.unsafe (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/index.js:123:21)
    at PostgresDriver.executeQuery (/home/stephen/other/joist-orm/packages/orm/src/drivers/PostgresDriver.ts:64:39)
    at PostgresDriver.executeFind (/home/stephen/other/joist-orm/packages/orm/src/drivers/PostgresDriver.ts:59:17)
    at DataLoader._batchLoadFn (/home/stephen/other/joist-orm/packages/orm/src/dataloaders/loadDataLoader.ts:37:34)
    at dispatchBatch (/home/stephen/other/joist-orm/node_modules/dataloader/index.js:288:27)

Which is great! ...except for the "probably shouldn't use in production" wrinkle. :-/

What's interesting is that the node-pg version of this fix is done via a .catch and so able to run all the time (not just during debug mode):

https://github.com/brianc/node-postgres/pull/2983/files#diff-c63141091dba80ae61344624c42b69ce45c2a37db9774250df42093a40ad6613R523

If I apply this same trick, at the very end of postgres.js's Query constructor:

    // if tagged or debug enabled, return as-is, otherwise use catch trick
    return handler.debug || this.tagged ? this : this.catch(err => {
      Object.defineProperties(err, {
        stack: { value: err.stack + new Error().stack.replace(/.*\n/, '\n') }
      })
      throw err;
    })

Then I get stack traces that at least have the "first await" back in my code:

syntax error at or near ")"
PostgresError: syntax error at or near ")"
    at ErrorResponse (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/connection.js:790:26)
    at handle (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/connection.js:476:6)
    at Socket.data (/home/stephen/other/joist-orm/node_modules/postgres/cjs/src/connection.js:317:9)
    at Socket.emit (node:events:507:28)
    at addChunk (node:internal/streams/readable:559:12)
    at readableAddChunkPushByteMode (node:internal/streams/readable:510:3)
    at Socket.Readable.push (node:internal/streams/readable:390:5)
    at TCP.onStreamRead (node:internal/stream_base_commons:189:23)
    at /home/stephen/other/joist-orm/node_modules/postgres/cjs/src/query.js:43:37
    at processTicksAndRejections (node:internal/process/task_queues:105:5) <-- previous stack trace
    at DataLoader._batchLoadFn (/home/stephen/other/joist-orm/packages/orm/src/dataloaders/loadDataLoader.ts:37:18) <-- my code here is new!

Note that technically there are a few lines of stack missing, i.e. executeFind and executeQuery are present in the debug-based stack trace (which is most correct), because it looks like the catch-driven traces only picks up frames that are specifically using the await keyword (which loadDataLoader does, but the others do not).

So, it's not perfect, because it assumes the user-land code has an await somewhere, but that seems like a safe assumption.

@porsager wdyt? Are you open to a catch-based addition (only for non-debug / non-tagged queries) to help with sql.unsafe production stack traces?

Thanks!

stephenh avatar Feb 22 '25 20:02 stephenh

Any updates on this, makes it hard to debug when using Postgres in production 🙏

jczstudios avatar Jul 02 '25 22:07 jczstudios