pgx icon indicating copy to clipboard operation
pgx copied to clipboard

`fatal error: concurrent map writes` in `EnumCodec.lookupAndCacheString`

Open jamesroutley opened this issue 1 year ago • 4 comments

Describe the bug

We've been seeing fatal error: concurrent map writes errors when running queries using PGX that queries data from tables that have a column with an enum type.

Here's an example stack trace, which should be read bottom to top:

image

I think this is because there's nothing in that object preventing concurrent map writes. I've had a stab at a fix in #2088

To Reproduce

I haven't created a minimal reproduction unfortunately (sorry!).

Expected behavior

I would expect to be able to concurrently query tables which contain an uncached enum value

Actual behavior

We get a fatal error: concurrent map writes

Version

  • Go: $ go version -> go version go1.22.5 darwin/arm64
  • PostgreSQL: $ psql --no-psqlrc --tuples-only -c 'select version()' -> PostgreSQL 14.9 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit
  • pgx: $ grep 'github.com/jackc/pgx/v[0-9]' go.mod -> v5.5.5

Additional context n/a

jamesroutley avatar Jul 16 '24 14:07 jamesroutley

EnumCodec (and any Codec for that matter) is not expected to be concurrency safe. How is this being used?

jackc avatar Jul 16 '24 17:07 jackc

Hi, sorry for the long delay here.

We're getting this error when reading from a table that has a column who's type is a custom enum (created with create type T as enum (X, Y, Z). Attached is a slightly longer stack trace showing that the error is originating from a DB read (ReadPaymentInstrument just does a select * from table where ...), which calls pgxpool.Pool.QueryRow.

From the trace, it looks like reading from a table that has a custom enum column calls EnumCodec.DecodeValue, which calls the non-concurrency-safe method lookupAndCacheString. Our service performs multiple reads concurrently - do you know of anything between QueryRow and lookupAndCacheString that would prevent concurrent calling of the latter?

Thanks

image

jamesroutley avatar Aug 09 '24 17:08 jamesroutley

Every connection gets its own type map and set of codecs. EnumCodec is not concurrency safe -- but I wouldn't have expected it to be possible for it to be used concurrently. 🤔

jackc avatar Aug 13 '24 23:08 jackc

Okay thanks that's interesting information. I'll dig a bit more into our setup and see if I can reproduce a minimal example

jamesroutley avatar Aug 15 '24 08:08 jamesroutley

I'm going to close this - I wasn't able to get to the root cause of this but it's stopped happening for us

jamesroutley avatar Nov 07 '24 17:11 jamesroutley

In case anybody else comes across this, one potential cause is loading custom types (e.g. as suggested here or here) once but using them for every new connection. Here's a sketch of the bad code:

customTypes := make([]*pgtype.Type, 0, len(customTypeNames))

// ... load custom types (only happens once, reused for every connection)

config.AfterConnect = func(ctx context.Context, conn *pgx.Conn) error {
	for _, t := range customTypes {
		conn.TypeMap().RegisterType(t)
	}

	return nil
}

And here's a sketch of the fixed code:

config.AfterConnect = func(ctx context.Context, conn *pgx.Conn) error {
        customTypes := make([]*pgtype.Type, 0, len(customTypeNames))

        // ... load custom types (now happens once per connection)

	for _, t := range customTypes {
		conn.TypeMap().RegisterType(t)
	}

	return nil
}

a-churchill avatar Nov 15 '24 21:11 a-churchill