Ignore ERROR: prepared statement already exists (SQLSTATE 42P05) error when preparing
Describe the bug
It's not clear what happened, but we had a bunch of the following errors:
expected statement description, got *pgconn.ResultReader
expected statement description, got *pgconn.PipelineSync
unexpected pipeline result: *pgconn.PipelineSync
unexpected pipeline result: *pgconn.StatementDescription
Then the connection was forever throwing the error ERROR: prepared statement 'stmtcache_2fdb49e13e67d391f9a11b5965c6bdc9427a2b5e65613919' already exists (SQLSTATE 42P05).
To Reproduce
Unfortunately, I have no idea how to reproduce this. There was a bunch of packet loss in GCP at the time this occurred, so it could have been something related to that.
We issue this particular SQL like:
b := new(pgx.Batch)
b.Queue("BEGIN READ ONLY")
b.Queue(query, arg)
b.Queue("COMMIT")
pool.SendBatch(ctx, b)
Expected behavior
Despite the pipeline parsing failing, I would expect future prepares to handle 42P05, especially since Prepare says it's idempotent.
Alternatively, if we get an unexpected pipeline type, we should maybe call asyncClose so the connection can be thrown away?
Actual behavior
Instead, that query failed forever on the connection with the error ERROR: prepared statement 'stmtcache_2fdb49e13e67d391f9a11b5965c6bdc9427a2b5e65613919' already exists (SQLSTATE 42P05).
Version
- Go: $ go version -> 1.21.5
- PostgreSQL: $ psql --no-psqlrc --tuples-only -c 'select version()' -> PostgreSQL 11.2-YB-2.18.2.2-b0 on x86_64-pc-linux-gnu, compiled by clang version 15.0.3 (https://github.com/yugabyte/llvm-project.git 0b8d1183745fd3998d8beffeec8cbe99c1b20529), 64-bit
- pgx: $ grep 'github.com/jackc/pgx/v[0-9]' go.mod -> v5.5.0
Additional context
My best guess is that there was maybe an error here and we failed to close the connection:
diff --git a/pgconn/pgconn.go b/pgconn/pgconn.go
index d5a67bea..49319350 100644
--- a/pgconn/pgconn.go
+++ b/pgconn/pgconn.go
@@ -2117,6 +2117,7 @@ func (p *Pipeline) getResults() (results any, err error) {
     case *pgproto3.ParseComplete:
         peekedMsg, err := p.conn.peekMessage()
         if err != nil {
+            p.conn.asyncClose()
             return nil, err
         }
         if _, ok := peekedMsg.(*pgproto3.ParameterDescription); ok {
If I happen to get a network error right at that peekMessage() then I can reproduce the issue.
I think you are correct about the peekMessage error. Fixed in cbc5a705 along with normalizing the error.
Not sure if that actually will resolve the original issue though.
The idempotency of Prepare is based on it keeping track of all prepared statements and being a no-op on repeats. It doesn't handle getting out of sync with the server. I suppose we could handle that case, but it seems like it might be masking an underlying issue.
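As a small illustration of that bookkeeping (the statement name and SQL here are made up), a second Prepare with the same name and SQL never reaches the server, so the 42P05 check is never hit on a healthy connection:

func prepareTwice(ctx context.Context, conn *pgx.Conn) error {
    const sql = "SELECT $1::int"
    // First call sends Parse/Describe and records the statement in the
    // connection's preparedStatements map.
    if _, err := conn.Prepare(ctx, "my_stmt", sql); err != nil {
        return err
    }
    // Second call with the same name and SQL is a client-side no-op; pgx does
    // not re-send Parse, which is what "idempotent" means here.
    _, err := conn.Prepare(ctx, "my_stmt", sql)
    return err
}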
The idempotency of Prepare is based on it keeping track of all prepared statements and being a no-op on repeats. It doesn't handle getting out of sync with the server. I suppose we could handle that case, but it seems like it might be masking an underlying issue.
That makes sense; I'm just trying to figure out how to recover from this scenario. The solution might be to just kill the connection using the new OnPgError. I could try to call Deallocate, but it isn't simple for me to generate the hash for the SQL since that's an internal detail.
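For reference, here is a minimal sketch of that idea; the POSTGRES_URL variable is made up, and it assumes pgx v5.5+ where OnPgError is available:

package main

import (
    "context"
    "log"
    "os"
    "strings"

    "github.com/jackc/pgx/v5/pgconn"
    "github.com/jackc/pgx/v5/pgxpool"
)

func main() {
    ctx := context.Background()
    cfg, err := pgxpool.ParseConfig(os.Getenv("POSTGRES_URL"))
    if err != nil {
        log.Fatal(err)
    }
    cfg.ConnConfig.OnPgError = func(_ *pgconn.PgConn, pgErr *pgconn.PgError) bool {
        // Returning false tells pgconn to close the connection, so a connection
        // whose statement cache got out of sync with the server is discarded
        // instead of failing the same query forever. Keep closing on FATAL
        // errors as well, matching the default behavior.
        if pgErr.Code == "42P05" || strings.EqualFold(pgErr.Severity, "FATAL") {
            return false
        }
        return true
    }
    pool, err := pgxpool.NewWithConfig(ctx, cfg)
    if err != nil {
        log.Fatal(err)
    }
    defer pool.Close()
    if err := pool.Ping(ctx); err != nil {
        log.Fatal(err)
    }
}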
@jackc We just saw this error on pgx 5.5.3! ERROR: prepared statement "stmtcache_e3bbdda88e6fb7fb4a24ad6c6d84a6a52aba9fd5c8667677" already exists (SQLSTATE 42P05)
For more information, we have a pgxpool that runs out of statement cache space relatively frequently (a few times a day), and we also have very strict context timeouts, so we often have conns getting timed out. Not sure if that info helps.
We also experienced this error intermittently and had to roll back to 5.4, but unfortunately it's quite hard to track down the situation that leads to this. I will add that we also use context timeouts extensively, so perhaps there could be a clue in there.
I think I found the underlying issue. If an error occurred when deallocating invalidated statements from the cache, they would be removed from the cache but not actually deallocated on the server. In particular, this could happen when a context was canceled before the query was even started.
This problem is fixed in 832b4f97718c2d9d2eb16bbd2fef1d05ede7aab5 and included in the just tagged v5.5.3.
@jackc Looking over that commit (and checking my PG knowledge) is it correct to say that the pipeline only succeeds or fails atomically? Or is it that deallocates are idempotent so potentially resending them is safe?
is it correct to say that the pipeline only succeeds or fails atomically?
Technically no, but practically yes. Preparing and deallocating statements at the protocol layer is not transactional. If I recall correctly it may even work in a broken transaction. (This might vary between real PostgreSQL and more or less compatible databases like CockroachDB.) But pretty much the only way it could partially succeed is if the network connection was interrupted. But if that happened the connection is dead anyway.
Or is it that deallocates are idempotent so potentially resending them is safe?
Protocol level deallocates are idempotent.
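A quick sketch of what that means in practice (assuming pgx v5.5+, where PgConn has a protocol-level Deallocate; the statement name is made up):

func deallocateTwice(ctx context.Context, conn *pgx.Conn) error {
    // PgConn.Deallocate sends the protocol Close message rather than a
    // DEALLOCATE SQL statement, so repeating it, or targeting a statement the
    // server never had, does not produce an error.
    if err := conn.PgConn().Deallocate(ctx, "stmtcache_does_not_exist"); err != nil {
        return err
    }
    return conn.PgConn().Deallocate(ctx, "stmtcache_does_not_exist")
}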
FYI @jackc, we just saw the same error again on 5.5.3. Interestingly, this time it was in a pgx.Batch execution that is part of a larger transaction.
@jacobmikesell The problem only occurred in very specific circumstances.
If the size of the statement cache is exceeded, a query is evicted from the cache, but it is only marked as invalid; it is not actually deallocated on the server.
Before the next query is sent all invalidated statements are deallocated.
However, deallocating invalidated statements did not occur inside of a transaction. I believe this was a relic of the older deallocate code that actually called a DEALLOCATE SQL statement instead of using the underlying protocol operation.
This wasn't a problem for regular queries, as the entry still exists in Conn.preparedStatements, so the call to Prepare() is idempotent. However, batch queries do not call Prepare(); they send all prepared statement requests in bulk.
So if a transaction was in progress, and a statement was evicted from the cache, and a batch then tried to send that exact query, the error would occur.
Should be resolved in 046f497efb4e92caa9575a0e9c351e4906af14c6. It allows deallocating invalidated statements inside of a transaction that is not in an error state.
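To make the sequence above concrete, here is a rough, hypothetical sketch of the shape of code that could hit the bug on affected versions (before the fix); the tiny cache size and the queries are made up for illustration:

func evictedStatementInTx(ctx context.Context, connString string) error {
    cfg, err := pgxpool.ParseConfig(connString)
    if err != nil {
        return err
    }
    cfg.ConnConfig.StatementCacheCapacity = 1 // make eviction trivial to trigger

    pool, err := pgxpool.NewWithConfig(ctx, cfg)
    if err != nil {
        return err
    }
    defer pool.Close()

    tx, err := pool.Begin(ctx)
    if err != nil {
        return err
    }
    defer tx.Rollback(ctx)

    // Prepared and cached as stmtcache_<hash of the SQL>.
    if _, err := tx.Exec(ctx, "SELECT $1::int", 1); err != nil {
        return err
    }
    // Caching a second statement evicts the first (capacity 1). Before the fix,
    // the deferred deallocation was skipped while inside a transaction.
    if _, err := tx.Exec(ctx, "SELECT $1::text", "x"); err != nil {
        return err
    }
    // The batch does not go through Prepare(), so it re-sends Parse for the
    // evicted-but-never-deallocated statement name, and on affected versions
    // the server answers with 42P05.
    b := &pgx.Batch{}
    b.Queue("SELECT $1::int", 2)
    return tx.SendBatch(ctx, b).Close()
}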
@jackc sorry for the comment spam but just wanted to give you kudos for the investigations in this issue! Hopefully this is the last instance of this and we can close it out.
Thanks for looking into and fixing this :D Is there an estimate for when this will hit a minor version? We bump into it about once or twice a week (no good repro on our end), but I'd love to pull the new minor version and test it out!
@jacobmikesell I expect a new patch release next week.
Thank you for working on a fix for that. We needed to roll it back after bumping into this problem as well. Is there any news on the patch release?
@jacobmikesell The fix was included in v5.5.5.
I was running pgx (v5.6.0) on AWS Lambda when the function timeout was exceeded and the context was cancelled. Now every other run of the function fails with the error:
{"errorMessage":"ERROR: prepared statement \"stmtcache_7fb03f1b51f17f5fd2622a2d28406b66f300cdef9f045c81\" already exists (SQLSTATE 42P05)","errorType":"PgError"}
In the lambda function I create a *pgxpool.Pool and only call pool.Exec on it.
EDIT: it turns out the context doesn't even have to be canceled; here is minimal reproduction code:
package main

import (
    "context"
    "os"

    "github.com/jackc/pgx/v5/pgxpool"
)

func main() {
    pgConfig, err := pgxpool.ParseConfig(os.Getenv("POSTGRES_URL"))
    if err != nil {
        panic(err)
    }

    postgresPool, err := pgxpool.NewWithConfig(context.Background(), pgConfig)
    if err != nil {
        panic(err)
    }
    defer postgresPool.Close()

    _, err = postgresPool.Exec(context.Background(), "SELECT $1", "hello world")
    if err != nil {
        panic(err)
    }
}
Run the code two times to get this error:
panic: ERROR: prepared statement "stmtcache_165a22d8eabe300a7430a1c63e8bd81b6fbc45f42f099dcb" already exists (SQLSTATE 42P05)
@vytskalt
run the code two times to get this error:
You mentioned that you are running on AWS Lambda. It runs fine locally, as would be expected. It should be impossible for two program executions to share the same connection, but that's what it sounds like is happening. It seems there is a naive load balancer between your program and the server. When your second execution connects, it actually gets the same underlying connection to PostgreSQL that already has the statement prepared from the first execution.
The simplest solution is to connect with QueryExecModeCacheDescribe to disable the automatic prepared statement cache. Or you may be able to configure the load balancer to properly handle prepared statements. But either way, I don't think this is a pgx issue.
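A minimal sketch of that suggestion (host, port, and credentials made up); if I recall the ParseConfig options correctly, the mode can also be set directly in the connection string rather than in code:

// default_query_exec_mode is the connection string equivalent of setting
// cfg.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeCacheDescribe in code.
pool, err := pgxpool.New(context.Background(),
    "postgres://user:pass@host:6543/db?default_query_exec_mode=cache_describe")
if err != nil {
    panic(err)
}
defer pool.Close()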
We have a normal Postgres server inside a VM (qemu-system-x86_64 + KVM).
I have the same issue right now:
go.mod:
github.com/jackc/pgerrcode v0.0.0-20220416144525-469b46aa5efa
github.com/jackc/pgx/v5 v5.5.5
Errors:
ERROR: prepared statement \"stmtcache_93cf7e50c125105caeb818738b522fdca92c652014a505c1\" already exists (SQLSTATE 42P05)
I can't share the whole code; it is too big.
How the connection is made:
conn, err := pgxpool.New(context.Background(), "postgres://user:pass@somehostname:1337/db")
// ...
conn.Query(ctxWithTimeout, query)
I am also running into this issue (v5.6.0) while using a Lambda connecting to a Supabase DB (with ?pgbouncer=true); my setup looks like:
var pool *pgxpool.Pool

func init() {
    ctx := context.Background()
    pgConn := os.Getenv("PG_CONN")
    if pgConn == "" {
        log.Fatalf("PG_CONN environment variable is not set")
    }

    var err error
    pool, err = pgxpool.New(ctx, pgConn)
    if err != nil {
        log.Fatalf("Unable to connect to database: %v", err)
    }
}

func handler(ctx context.Context, event events.SQSEvent) error {
    // Use pgxpool
    serv := service.New(pool)
    ...
    return nil
}

func main() {
    lambda.Start(handler)
}
It seems to happen intermittently. The only way to fix this right now is to restart my Supabase Postgres instance.
@reVrost I am running into the same issue. When I switched from Supabase's session mode (port 5432) to transaction mode (port 6543), I started hitting this error on almost every query.
EDIT: Restarting the Supabase instance doesn't help.
@CermakM You're right, restarting only helps for a moment, but it quickly fails again soon after. Changing to session mode (5432) fixes the prepared statement issue, although for me this led to a "FATAL: Max client connections reached (SQLSTATE XX000)" error, as I am using AWS Lambdas. It seems that Supabase recommends the following:
- use transaction mode (session mode limits concurrent connections to the Supavisor pool size)
- use port 6543 with your pooler connection string
- configure your client to disable prepared statements (all clients support this)
There might be a way to disable prepared statements using pgx, but I haven't thoroughly looked into the config settings. Currently I'm just reducing the number of concurrent Lambdas with session mode to alleviate it for now.
There might be a way to disable prepared statements using pgx but I haven't thoroughly looked into the config settings.
Any ConnConfig.DefaultQueryExecMode other than the default QueryExecModeCacheStatement.
Just want to give an update: connecting to Supabase on port 6543 (transaction mode) with QueryExecModeSimpleProtocol seems to work OK, although I'm not sure which is preferred: 5432 with the default exec mode or 6543 with the simple protocol. @jackc, a recommendation on which one is better would be appreciated :). Thanks!
While using the simple protocol I also ran into an issue when inserting jsonb bytes into the database, similar to the issue written up here: https://github.com/jackc/pgtype/issues/45
I managed to solve it using the :copyfrom insert, though.
@reVrost I've never used Supabase, so I can't really give any advice there. But QueryExecModeCacheDescribe is a good choice if you do not have schema changes occurring while your application is running.
Thanks @jackc! QueryExecModeCacheDescribe works great with supabase's transaction mode.
Here's the code for anyone interested:
config, err := pgxpool.ParseConfig(pgConnStr)
if err != nil {
    log.Fatalf("Unable to parse connection string: %v", err)
}
config.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeCacheDescribe

pool, err = pgxpool.NewWithConfig(ctx, config)
if err != nil {
    log.Fatalf("Unable to connect to database: %v", err)
}
In case someone runs into this with Supabase: use port 5432 (session mode) instead of the transaction-mode port 6543.
I had this issue using a batched query: the first execution hit a schema error and swallowed the error, and the second attempt of the same query produced this error.
@evanlurvey Can you provide example code demonstrating this?
Sure
Setup DB
docker run --rm -it -p 5432:5432 -e POSTGRES_PASSWORD=password postgres
main.go
package main

import (
    "context"
    "fmt"

    "github.com/jackc/pgx/v5"
)

func main() {
    ctx := context.Background()
    conn, err := pgx.Connect(ctx, "postgres://postgres:password@localhost:5432/postgres")
    if err != nil {
        fmt.Println(err)
        return
    }

    _, _ = conn.Exec(ctx, "CREATE TABLE foo (col1 TEXT PRIMARY KEY)")

    err = query(ctx, conn)
    fmt.Println(err)
    err = query(ctx, conn)
    fmt.Println(err)
}

func query(ctx context.Context, conn *pgx.Conn) error {
    b := &pgx.Batch{}
    b.Queue("SELECT col1 FROM foo").Query(func(rows pgx.Rows) error {
        for rows.Next() {
            var col1 string
            rows.Scan(&col1)
        }
        return nil
    })
    b.Queue("SELECT col1 FROM baz").Query(func(rows pgx.Rows) error {
        for rows.Next() {
            var col1 string
            rows.Scan(&col1)
        }
        return nil
    })
    err := conn.SendBatch(ctx, b).Close()
    if err != nil {
        return err
    }
    return nil
}
go mod init whatever
go mod tidy
go run .
ERROR: relation "baz" does not exist (SQLSTATE 42P01)
ERROR: prepared statement "stmtcache_abcef3bff3ee2c6d01eb3016fe43f23a45b61c9480c7ce77" already exists (SQLSTATE 42P05)
When you play with it and change the second query, you will notice the hash doesn't change, because it seems to be driven off the first query. Also, if you swap them so the first query bombs, you do not get this error; it handles it properly.
@evanlurvey Thanks for the clear reproduction. Should be fixed in fd0c65478e18be837b77c7ef24d7220f50540d49.