PgCat Gets Confused With Multiple Unnamed Prepared Statements
Describe the bug There's a bug in PgCat around parsing and binding of multiple unnamed prepared statements.
PgCat saves prepared statements in a HashMap<String, (Arc<Parse>, u64)>, where the key is the name of the prepared statement. It also parses and buffers all extended queries until it receives a Sync, at which point the queries are sent down to the server.
Because PgCat renames and dedupes the prepared statements it sends to the server, it first must make sure the server has all the prepared statements it needs. This is problematic in the case of unnamed prepared statements.
Take the following extended query:
- Parse("", "CREATE TABLE my_table")
- Bind("")
- Execute
- Parse("", "SELECT * FROM my_table")
- Bind("")
- Execute
- Sync
When PgCat receives this query it renames and buffers the first parse into its hashmap: { "": ("PGCAT_0", "CREATE TABLE my_table") }.
On the second parse it renames the parse, buffers it, and overwrites its map: { "": ("PGCAT_1", "SELECT * FROM my_table") }.
Now we get the sync and start working through the buffer.
On the first parse: we check to see if the server already has PGCAT_0. In our case it does not and buffers a Parse("PGCAT_0", "CREATE TABLE my_table") to send down.
On the first bind: we need to make sure the prepared statement we're using is already on the server. We look up "" in our hashmap and find ("PGCAT_1", "SELECT * FROM my_table"). PGCAT_1 is not on the server, so we first do a Parse("PGCAT_1", "SELECT * FROM my_table").
But wait! We just sent down the second prepared statement when we were still working on the first. Now the server will fail to parse PGCAT_1 (the table doesn't exist yet!), and the whole extended query will fail.
Now if the second query wasn't dependent on the first (or just didn't fail to parse in general), we should still get correct results. This is because we still buffer the correct Bind("PGCAT_1") and Bind("PGCAT_1") messages to send down to the server.
To Reproduce Send an extended protocol query with the following:
- Parse("", "CREATE TABLE my_table")
- Bind("")
- Execute
- Parse("", "SELECT * FROM my_table")
- Bind("")
- Execute
- Sync
Read the response and see that it fails.
Expected behavior PgCat handles unnamed prepared statements correctly and parses / binds them in the correct order.
Screenshots N/A
Desktop (please complete the following information):
- OS: aarch64 / AL2
- Browser: N/A
- Version: Postgres 16 / PgCat 1.2.0
Additional context Add any other context about the problem here.
Hi Taylor, this pattern of using multiple Parse statements before a single sync is not something I've seen in the wild. Is this something your ORM is doing? or are you specifically sending? are you able to send syncs in between the parses?
I've seen it used by JDBC specifically in the context of Liquibase update. This is with autocommit on:
PostgreSQL
Type: Parse
Length: 13
Statement:
Query: BEGIN
Parameters: 0
PostgreSQL
Type: Bind
Length: 12
Portal:
Statement:
Parameter formats: 0
Parameter values: 0
Result formats: 0
PostgreSQL
Type: Execute
Length: 9
Portal:
Returns: all rows
PostgreSQL
Type: Parse
Length: 4542
Statement:
Query [truncated]: select string_agg(word, ',') from pg_catalog.pg_get_keywords() where word <> ALL ('{a,abs,absolute,action,ada,add,admin,after,all,allocate,alter,always,and,any,are,array,as,asc,asensitive,assertion,assignment,asymmetric
Parameters: 0
PostgreSQL
Type: Bind
Length: 12
Portal:
Statement:
Parameter formats: 0
Parameter values: 0
Result formats: 0
PostgreSQL
Type: Describe
Length: 6
Portal:
PostgreSQL
Type: Execute
Length: 9
Portal:
Returns: all rows
PostgreSQL
Type: Sync
Length: 4
You can see it does an unnamed parse for BEGIN followed by the select.