
Weird stuff happens in the background worker sometimes

Open JelteF opened this issue 7 months ago • 1 comments

Description

There seems to be an issue with the background worker's syncing. It appears to be related to the background worker trying to drop non-existent tables for which it only knows the OIDs. My guess is that this is because duckdb.tables stores OIDs, and these tables somehow got removed without their entries being removed from duckdb.tables. How that happens, I don't know.

That in turn resulted in QueryCancelHoldoffCount being 0 when we did not expect it to be. I made a temporary workaround for this in #745. To be clear, I think the two are related, but I'm definitely not convinced it's a single bug. It could very well be a first bug triggering a second one. Log output and backtrace from the sync worker:

Duck catalog for database 'my_db' in 'postgres': {'uuid': 2432bf03-f7f8-4067-afc6-8ef476b11e15, 'oid': 2353, 'version': 1}
2025-04-25 15:14:33.384 CEST [724922] WARNING:  syntax error at or near "24577" at character 12
2025-04-25 15:14:33.384 CEST [724922] QUERY:  DROP TABLE 24577
2025-04-25 15:14:33.384 CEST [724922] WARNING:  Failed to drop deleted MotherDuck table 24577
2025-04-25 15:14:33.384 CEST [724922] DETAIL:  While executing command: DROP TABLE 24577
2025-04-25 15:14:33.384 CEST [724922] HINT:  See previous WARNING for details
2025-04-25 15:14:33.384 CEST [724922] WARNING:  syntax error at or near "24580" at character 12
2025-04-25 15:14:33.384 CEST [724922] QUERY:  DROP TABLE 24580
2025-04-25 15:14:33.384 CEST [724922] WARNING:  Failed to drop deleted MotherDuck table 24580
2025-04-25 15:14:33.384 CEST [724922] DETAIL:  While executing command: DROP TABLE 24580
2025-04-25 15:14:33.384 CEST [724922] HINT:  See previous WARNING for details
TRAP: failed Assert("QueryCancelHoldoffCount > 0"), File: "src/pgduckdb_node.cpp", Line: 299, PID: 724922
postgres: pg_duckdb sync worker (ExceptionalCondition+0x6e)[0x55e6df61c962]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x40112)[0x7cccd03e1112]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x41be7)[0x7cccd03e2be7]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x40150)[0x7cccd03e1150]
postgres: pg_duckdb sync worker (ExecEndCustomScan+0x1a)[0x55e6df2f8e67]
postgres: pg_duckdb sync worker (ExecEndNode+0x167)[0x55e6df2e49b7]
postgres: pg_duckdb sync worker (+0x3022ce)[0x55e6df2de2ce]
postgres: pg_duckdb sync worker (standard_ExecutorEnd+0x66)[0x55e6df2de3a4]
postgres: pg_duckdb sync worker (ExecutorEnd+0x1d)[0x55e6df2de457]
postgres: pg_duckdb sync worker (PortalCleanup+0x64)[0x55e6df275146]
postgres: pg_duckdb sync worker (PortalDrop+0x3f)[0x55e6df652051]
postgres: pg_duckdb sync worker (SPI_cursor_close+0x17)[0x55e6df3233c2]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x229ed)[0x7cccd03c39ed]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x1f200)[0x7cccd03c0200]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x22cb6)[0x7cccd03c3cb6]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x1f36a)[0x7cccd03c036a]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(+0x1f5f5)[0x7cccd03c05f5]
/home/jelte/.pgenv/pgsql-17beta9/lib/pg_duckdb.so(pgduckdb_background_worker_main+0x18e)[0x7cccd03c0824]
postgres: pg_duckdb sync worker (BackgroundWorkerMain+0x291)[0x55e6df41b24e]
postgres: pg_duckdb sync worker (postmaster_child_launch+0xc7)[0x55e6df41d419]
postgres: pg_duckdb sync worker (+0x444c21)[0x55e6df420c21]
postgres: pg_duckdb sync worker (+0x444ed4)[0x55e6df420ed4]
postgres: pg_duckdb sync worker (+0x44507d)[0x55e6df42107d]
postgres: pg_duckdb sync worker (+0x445e45)[0x55e6df421e45]
postgres: pg_duckdb sync worker (BackgroundWorkerInitializeConnection+0x0)[0x55e6df4234ce]
postgres: pg_duckdb sync worker (main+0x219)[0x55e6df33e2ac]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7ccccf82a1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7ccccf82a28b]
postgres: pg_duckdb sync worker (_start+0x25)[0x55e6df0b9ef5]
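For readers unfamiliar with the assertion in the trace: PostgreSQL's `HOLD_CANCEL_INTERRUPTS()` increments `QueryCancelHoldoffCount` and `RESUME_CANCEL_INTERRUPTS()` asserts that it is positive before decrementing it. The sketch below models that invariant with a stand-in counter. It is not pg_duckdb's code and not a diagnosis of this bug; `hold_count`, `HoldCancel`, `ResumeCancel` and `ResetCounter` are hypothetical names used only to show how a resume without a matching hold (for example after something zeroed the counter in between) ends up at 0.

```cpp
#include <cassert>

// Stand-in for PostgreSQL's QueryCancelHoldoffCount (hypothetical local copy).
static int hold_count = 0;

// Models HOLD_CANCEL_INTERRUPTS(): enter a section where query cancel is held off.
void HoldCancel() {
    ++hold_count;
}

// Models RESUME_CANCEL_INTERRUPTS(), but reports the broken invariant
// instead of aborting the process: returns false when there is no
// outstanding hold to release.
bool ResumeCancel() {
    if (hold_count <= 0) {
        return false;  // this is the case the Assert in the trace catches
    }
    --hold_count;
    return true;
}

// Hypothetical: anything that zeroes the counter between a hold and its resume.
void ResetCounter() {
    hold_count = 0;
}
```

With balanced calls, `HoldCancel()` followed by `ResumeCancel()` returns the counter to 0 and succeeds. If `ResetCounter()` runs between the two, the later `ResumeCancel()` finds the counter at 0, which corresponds to the failed `Assert("QueryCancelHoldoffCount > 0")` above.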

JelteF, Apr 25 '25 14:04

I found out why the OIDs were there: the regclass queries were being sent through DuckDB. That problem is fixed by #770, but I still don't know what causes QueryCancelHoldoffCount to become 0, so I'm leaving this issue open for now.
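To illustrate why statements like `DROP TABLE 24577` in the log fail with a syntax error: a bare number is not a valid unquoted identifier, so a stored OID cannot be used where a table name is expected. The sketch below is purely illustrative and is not how pg_duckdb or #770 handles this; `LooksLikeOid`, `QuoteIdentifier` and `BuildDropTable` are hypothetical helpers, and the sketch assumes an unqualified table name and C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Hypothetical check: true when the stored value consists only of digits,
// i.e. it looks like a raw OID rather than a table name.
bool LooksLikeOid(const std::string &s) {
    return !s.empty() &&
           std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Wraps a name in double quotes and doubles any embedded double quote,
// following the SQL rule for quoted identifiers.
std::string QuoteIdentifier(const std::string &name) {
    std::string out = "\"";
    for (char c : name) {
        if (c == '"') {
            out += '"';
        }
        out += c;
    }
    out += '"';
    return out;
}

// Builds a DROP TABLE statement, or returns nothing when the stored value
// looks like an OID and therefore cannot be used as a name.
std::optional<std::string> BuildDropTable(const std::string &stored) {
    if (LooksLikeOid(stored)) {
        return std::nullopt;
    }
    return "DROP TABLE " + QuoteIdentifier(stored);
}
```

The digits-only check is a heuristic chosen for the illustration: a table can legitimately be named `"24577"` when quoted, so a real fix has to ensure names rather than OIDs are stored in the first place, which is what the regclass change addresses.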

JelteF, May 07 '25 17:05