pgx Use hash as key for statement cache

Currently, the LRU cache of prepared statements in pgx.Conn is using the query string as cache key. This makes it tricky in our project where a query builder is reusing a byte buffer and sending query strings to pgx directly from the allocated buffer.

I have two suggestions, and I'm happy to contribute with both.

1. Use a hash of the query as key for the statement cache

Instead of using the query string directly, hash it (just like stmtcache.StatementName) and use the hash as the key, effectively keeping no reference to the original string. A query might also be rather long, so this should decrease the memory footprint.
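A minimal sketch of the idea, using only the standard library. The names `cacheKey`, `keyFor` and `stmtCache` are hypothetical and are not part of pgx; the point is only that the map holds a fixed-size digest rather than the query text.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// cacheKey is a hypothetical fixed-size key derived from the SQL text.
type cacheKey [sha256.Size]byte

// keyFor hashes the query so the map never retains the query string itself.
func keyFor(sql string) cacheKey {
	return sha256.Sum256([]byte(sql))
}

// stmtCache is a stand-in for a statement cache (not the pgx type).
// It stores only a statement name per key, not the original SQL.
type stmtCache struct {
	names map[cacheKey]string
}

func newStmtCache() *stmtCache {
	return &stmtCache{names: make(map[cacheKey]string)}
}

// nameFor returns the statement name for sql and whether it was a cache hit.
func (c *stmtCache) nameFor(sql string) (string, bool) {
	k := keyFor(sql)
	if n, ok := c.names[k]; ok {
		return n, true
	}
	n := fmt.Sprintf("stmt_%x", k[:8])
	c.names[k] = n
	return n, false
}

func main() {
	c := newStmtCache()
	buf := []byte("select 1")
	_, hit1 := c.nameFor(string(buf))
	_, hit2 := c.nameFor(string(buf))
	fmt.Println(hit1, hit2) // false true
}
```

With a hashed key, two different queries could in principle collide; with a 256-bit digest that is negligible, but a shorter hash would need a collision strategy.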

2. Switch hashing algorithm to xxhash

For both stmtcache.StatementName and cache key, switch from sha256 to xxhash. This should:

- reduce allocations (not measured yet, but it looks like stmtcache.StatementName, for example, does three allocations, which would be reduced to one in this case)
- reduce memory footprint (when going from a 24-byte slice to an 8-byte integer)
- increase performance (as xxhash is much faster than sha256)
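The allocation claim can be checked roughly with `testing.AllocsPerRun`. The sketch below makes two assumptions: `sha256Name` only approximates what stmtcache.StatementName is described as doing (SHA-256, 24 bytes of digest, a string result), and FNV-1a is used as a standard-library stand-in for xxhash, which would be an external dependency. It measures allocations only, not speed.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"testing"
)

// sha256Name approximates a SHA-256 based statement name (assumed shape,
// not copied from pgx): hash, hex-encode 24 bytes, prepend a prefix.
func sha256Name(sql string) string {
	digest := sha256.Sum256([]byte(sql))
	return "stmtcache_" + hex.EncodeToString(digest[:24])
}

// fnv1a64 is a stand-in for a 64-bit non-cryptographic hash such as xxhash.
// It is written inline so it needs no hasher object and no allocation.
func fnv1a64(sql string) uint64 {
	const (
		offset = 14695981039346656037
		prime  = 1099511628211
	)
	h := uint64(offset)
	for i := 0; i < len(sql); i++ {
		h ^= uint64(sql[i])
		h *= prime
	}
	return h
}

// Package-level sinks keep the compiler from discarding the results.
var (
	sinkName string
	sinkKey  uint64
)

func main() {
	sql := "select id, name from users where id = $1"
	nameAllocs := testing.AllocsPerRun(1000, func() { sinkName = sha256Name(sql) })
	keyAllocs := testing.AllocsPerRun(1000, func() { sinkKey = fnv1a64(sql) })
	fmt.Printf("sha256 name: %.0f allocs/op, 64-bit key: %.0f allocs/op\n", nameAllocs, keyAllocs)
}
```

The exact count for the SHA-256 variant depends on the Go version and escape analysis, so it is worth measuring rather than assuming three.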

Alternatively: Add support for metadata on a `pgx.Conn`

If you don't agree with the suggestions above, it would at least be nice to have support for some arbitrary metadata on the pgx.Conn (either any or unsafe.Pointer) so that we can roll our own statement cache, without the need for synchronisation that an outside map would require.
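To illustrate the difference, here is a sketch with a hypothetical `conn` type standing in for a connection; it is not pgx.Conn and the `custom` field is an assumption made for the example. An outside map shared between connections needs a mutex, while data hanging off the connection itself does not, provided the connection is only used by one goroutine at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a database connection (not pgx.Conn).
type conn struct {
	custom map[string]any // arbitrary per-connection metadata
}

// outsideCache keeps per-connection state in a shared map,
// so every access has to take the lock.
type outsideCache struct {
	mu     sync.Mutex
	byConn map[*conn]map[string]string
}

func (o *outsideCache) stmts(c *conn) map[string]string {
	o.mu.Lock()
	defer o.mu.Unlock()
	m, ok := o.byConn[c]
	if !ok {
		m = make(map[string]string)
		o.byConn[c] = m
	}
	return m
}

// connStmts keeps the same state on the connection itself.
// No lock is needed as long as a conn is used by one goroutine at a time.
func connStmts(c *conn) map[string]string {
	m, ok := c.custom["stmts"].(map[string]string)
	if !ok {
		m = make(map[string]string)
		c.custom["stmts"] = m
	}
	return m
}

func main() {
	c := &conn{custom: map[string]any{}}

	shared := &outsideCache{byConn: map[*conn]map[string]string{}}
	shared.stmts(c)["select 1"] = "stmt_a"

	connStmts(c)["select 1"] = "stmt_a"

	fmt.Println(shared.stmts(c)["select 1"], connStmts(c)["select 1"]) // stmt_a stmt_a
}
```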

Mar 25 '24 09:03 Webbmekanikern

> Currently, the LRU cache of prepared statements in pgx.Conn is using the query string as cache key. This makes it tricky in our project where a query builder is reusing a byte buffer and sending query strings to pgx directly from the allocated buffer.

I'm not sure how this would make a difference. A string will get allocated and the data copied from the []byte regardless of what pgx does later.
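A small sketch of the copy being described: an ordinary `string(buf)` conversion takes a snapshot of the bytes, so reusing the buffer afterwards does not change a string that was already stored as a map key. The helper name `snapshot` is made up for the example. This does not cover strings built with an unsafe conversion that aliases the buffer, which behave differently.

```go
package main

import "fmt"

// snapshot converts the buffer to a string; the conversion copies the bytes.
func snapshot(buf []byte) string {
	return string(buf)
}

func main() {
	buf := []byte("select 1")
	cache := map[string]bool{}

	q := snapshot(buf)
	cache[q] = true

	// The query builder reuses the same buffer for the next query.
	copy(buf, "select 2")

	fmt.Println(q, cache["select 1"], cache[string(buf)]) // select 1 true false
}
```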

> Instead of using the query string directly, hash it (just like stmtcache.StatementName) and use the hash as the key, efficiently keeping no reference to the original string. A query might also be rather long, so this should decrease the memory footprint.

I don't think this will reduce memory usage. The statement cache still uses the normal pgx/pgconn prepared statement system, and that keeps a reference to the original SQL. See https://pkg.go.dev/github.com/jackc/pgx/v5/pgconn#StatementDescription.

> Switch hashing algorithm to xxhash

This would involve adding an external dependency. There would need to be a very significant performance increase to justify that.

> Alternatively: Add support for metadata on a pgx.Conn

I think we can do this. See https://github.com/jackc/pgx/issues/1896 for the current proposal.

Apr 14 '24 01:04 jackc

> Alternatively: Add support for metadata on a pgx.Conn

This has just been added in 6f0deff0156a7ffcd557eac1011ebb5ce73739d3.

May 09 '24 20:05 jackc
