Performance issues with pgxpool
Okay, so I just switched to using the pgx interface from database/sql for everything. I think database/sql has its own default connection pool, so I tried to use pgxpool (v4) with pgx. It is just a basic program that finds all video files in a folder and all its children. Anyway, it heavily uses goroutines. So here are the timings:
pgx with database/sql: 2 min 54 sec
pgxpool : 7 min 54 sec
The folder size is 412 GB with 1390 items. The documentation for pgxpool is pretty much non-existent. I'm currently using the Exec() method for write queries and Query() for the others. I'm a noob, so it is probably my fault. Here is the link to the repo. So, can anyone provide any suggestions or point to some sources on how to use pgx well?
The issue might be the connection pool size. It appears the code is starting a new goroutine for every item -- depending on the nature of your workload this could be ideal or it could be very inefficient. But regardless, pgxpool has a fairly conservative default max size. I don't think database/sql has any default max size. Try setting pgxpool.Config.MaxConns to 50 or 100 and see if it makes a difference.
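For example, something along these lines (a minimal sketch against the v4 API; the connection string and the limit of 50 are placeholders):

package main

import (
	"context"
	"log"

	"github.com/jackc/pgx/v4/pgxpool"
)

func main() {
	// Parse the connection string into a pool config so MaxConns can be tuned.
	config, err := pgxpool.ParseConfig("postgres://user:pass@localhost:5432/mydb")
	if err != nil {
		log.Fatal(err)
	}

	// The default max size is conservative (roughly max(4, NumCPU)), which can
	// starve a workload that fans out across many goroutines.
	config.MaxConns = 50

	pool, err := pgxpool.ConnectConfig(context.Background(), config)
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()
}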
@jackc Also having really poor performance with pgxpool.
Using just pgx, a select on a table with 1 row takes 60ms, but using pgxpool it takes 800ms.
Do you have a test case I could see?
I would be very interested in seeing this too!
It's a very interesting issue. Do you have more details, or is it a dead issue with no reply?
I am no longer working on the project since it was mostly something to get more familiar with Go. In the end, it was more or less solved by being smarter about when to start a new goroutine in my case, just like @jackc mentioned.
Is there a solution to this issue? I'm experiencing it right now; however, increasing MaxConns seems to have no effect on performance. With pgx, inserts and queries take ∼0.2ms, while with pgxpool they take anywhere from 30-60ms. Any ideas?
Perhaps try calling Acquire and then the query method on the conn instead of the query method directly on the pool. This will allow measuring the acquire and query time separately to more accurately determine where the delay is occurring.
Also, check out https://pkg.go.dev/github.com/jackc/pgx/v5/pgxpool#Pool.Stat. That may provide additional insight.
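For example (a sketch, here written against the v5 API; the helper name, DATABASE_URL, and the trivial select 1 query are placeholders):

package main

import (
	"context"
	"log"
	"os"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

// timedQuery measures pool acquisition and query execution separately,
// which shows whether the delay is in pgxpool or in the query itself.
func timedQuery(ctx context.Context, pool *pgxpool.Pool) error {
	start := time.Now()
	conn, err := pool.Acquire(ctx)
	if err != nil {
		return err
	}
	defer conn.Release()
	log.Printf("acquire took %s", time.Since(start))

	start = time.Now()
	var n int
	if err := conn.QueryRow(ctx, "select 1").Scan(&n); err != nil {
		return err
	}
	log.Printf("query took %s", time.Since(start))

	// Pool.Stat corroborates the numbers across the whole pool.
	stat := pool.Stat()
	log.Printf("total=%d idle=%d acquired=%d cumulative acquire=%s",
		stat.TotalConns(), stat.IdleConns(), stat.AcquiredConns(), stat.AcquireDuration())
	return nil
}

func main() {
	ctx := context.Background()
	pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	if err := timedQuery(ctx, pool); err != nil {
		log.Fatal(err)
	}
}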
For me too, trying to use Acquire from a pool is insanely slower than a single traditional connection. My go.mod:
github.com/gin-gonic/gin v1.9.0
github.com/jackc/pgx/v5 v5.3.1
I already made a benchmark with go-wrk, you can check it here
I still need to extract some numbers, but I am experiencing slowness too compared to the sql stdlib package. In both cases I set max connections to 50.
I just did the exercise of rolling back to the built-in pool in database/sql, and I have seen no clear performance advantage (measuring 99th-percentile latency on a specific gRPC call), so I'd say that so far the slowness might be circumstantial; please disregard my previous comment.
Thanks a lot!
I would like to add my 2 cents to this thread: I use this package in one of my little hobby projects which I made in my free time and wondered why basically every query takes 1ms.
Now, this does not sound like much. But the DB is EMPTY, apart from one product/customer/location. Indexes are created as well. And yet, no matter what, it takes 1ms, sometimes 980µs.
Here the benchmarks:
# go test ./internal/database/... -bench=. -benchmem
2025.01.08 20:54:09.260 | database | Connecting to PostgreSQL...
2025.01.08 20:54:09.275 | database | Successfully connected to database
2025.01.08 20:54:09.275 | database | Checking database schema...
2025.01.08 20:54:09.281 | database | Current DB schema version: 0
2025.01.08 20:54:09.281 | database | Required DB schema version: 2
2025.01.08 20:54:09.281 | database | Upgrading schema from version 0 to 2
2025.01.08 20:54:09.294 | database | Applied migration to version 1
2025.01.08 20:54:09.309 | database | Applied migration to version 2
2025.01.08 20:54:09.309 | database | Database schema is up to date
goos: linux
goarch: amd64
pkg: go_pgxv5_bench/internal/database
cpu: AMD EPYC 7702P 64-Core Processor
BenchmarkGetCustomerLocations-10                 7192    147392 ns/op     493 B/op    12 allocs/op
BenchmarkGetCustomersWithLocationCounts-10       7116    183781 ns/op     494 B/op    13 allocs/op
BenchmarkListCustomers-10                        7177    165152 ns/op     492 B/op    13 allocs/op
BenchmarkCreateCustomer-10                       2991    434387 ns/op     666 B/op    16 allocs/op
BenchmarkGetCustomer-10                          5218    203537 ns/op    1293 B/op    25 allocs/op
BenchmarkUpdateCustomer-10                       2586    476605 ns/op     314 B/op    10 allocs/op
BenchmarkDeleteCustomer-10                       1293    839399 ns/op     796 B/op    21 allocs/op
BenchmarkCreateLocation-10                        967   1371925 ns/op    2004 B/op    52 allocs/op
BenchmarkGetLocation-10                          5592    208487 ns/op    1405 B/op    24 allocs/op
BenchmarkUpdateLocation-10                       1890    562887 ns/op     347 B/op    13 allocs/op
BenchmarkDeleteLocation-10                        685   2001530 ns/op    2129 B/op    57 allocs/op
BenchmarkListLocations-10                        5373    200599 ns/op    1413 B/op    23 allocs/op
BenchmarkCreateProduct-10                        3412    442558 ns/op     600 B/op    16 allocs/op
BenchmarkGetProduct-10                           9864    157515 ns/op     956 B/op    19 allocs/op
BenchmarkUpdateProduct-10                        2752    504309 ns/op     234 B/op     8 allocs/op
BenchmarkDeleteProduct-10                        1501    889365 ns/op     728 B/op    21 allocs/op
BenchmarkListProducts-10                         8013    127872 ns/op     437 B/op     9 allocs/op
PASS
ok go_pgxv5_bench/internal/database 28.338s
These results do not line up with the ones provided here: https://gist.github.com/jackc/c402b42244d3390f26c6. But of course there is always a chance I am just using the package wrong. Is there a "best practice" tutorial or anything like that?
I have seen that some older versions had a "prepare statement" step, but newer versions seem to do this automatically. I think it would be cool if we could get some more insight into where exactly the time is spent:
- the DB is unoptimized (when large and without indexes)
- the TCP layer (when the DB is remote)
- the pgx package itself:
  - preparing statements
  - waiting for a free object in the pool
  - other stuff ...
@jackc I would be pretty happy if anyone could lead/guide me in a direction. Maybe also something like these settings:
- log = DEBUG (or TRACE)
- logFile = ./pgx_v5.log
would be helpful, to be able to store the detailed info in a file and analyse it later.
Best regards, Martin
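For what it's worth, pgx v5 does not take log settings like the above directly, but it does ship a tracelog adapter that can emit per-query timing logs. A sketch along these lines (the file name and the logger wiring are my assumptions, not project code):

package main

import (
	"context"
	"log"
	"os"

	"github.com/jackc/pgx/v5/pgxpool"
	"github.com/jackc/pgx/v5/tracelog"
)

func main() {
	f, err := os.Create("pgx_v5.log")
	if err != nil {
		log.Fatal(err)
	}
	fileLog := log.New(f, "", log.LstdFlags|log.Lmicroseconds)

	config, err := pgxpool.ParseConfig(os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	// Attach a tracer that logs every query, including its duration,
	// at debug level, into pgx_v5.log for later analysis.
	config.ConnConfig.Tracer = &tracelog.TraceLog{
		LogLevel: tracelog.LogLevelDebug,
		Logger: tracelog.LoggerFunc(func(ctx context.Context, level tracelog.LogLevel, msg string, data map[string]any) {
			fileLog.Printf("%v %s %v", level, msg, data)
		}),
	}

	pool, err := pgxpool.NewWithConfig(context.Background(), config)
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()
}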
Here are also some snippets from my code, so people reading this have some context:
Code Snippet
// ProductRequest represents the product form data
type ProductRequest struct {
	Name        string `form:"name" json:"name" validate:"required"`
	Version     string `form:"version" json:"version"`
	Description string `form:"description" json:"description"`
}

// setupTestDB creates a test database connection and cleans up tables
func setupTestDB(b *testing.B) *DB {
	cfg := config.DBConfig

	// Setup database connection
	db, err := Connect(cfg)
	if err != nil {
		b.Fatal(err)
	}

	// Migrate database
	err = db.InitSchema(cfg)
	if err != nil {
		b.Fatal(err)
	}

	// Clean up tables before each benchmark
	_, err = db.Pool.Exec(context.Background(), `
		TRUNCATE customers CASCADE;
		TRUNCATE products CASCADE;
		TRUNCATE locations CASCADE;
	`)
	if err != nil {
		b.Fatal(err)
	}

	return db
}

// CreateProduct creates a new product in the database
func (db *DB) CreateProduct(product *models.ProductRequest) (int, error) {
	var id int
	err := db.Pool.QueryRow(context.Background(), `
		INSERT INTO products (name, version, description)
		VALUES ($1, $2, $3)
		RETURNING id`,
		product.Name,
		product.Version,
		product.Description,
	).Scan(&id)
	if err != nil {
		return 0, fmt.Errorf("error creating product: %w", err)
	}
	return id, nil
}
Benchmark:
func BenchmarkCreateProduct(b *testing.B) {
	db := setupTestDB(b)
	defer db.Pool.Close()

	b.ResetTimer()
	for i := range b.N {
		product := &models.ProductRequest{
			Name:        fmt.Sprintf("Test Product %d", i),
			Version:     fmt.Sprintf("1.0.%d", i),
			Description: fmt.Sprintf("Test Description %d", i),
		}
		_, err := db.CreateProduct(product)
		if err != nil {
			b.Fatal(err)
		}
	}
}
I added "Aquire" and defer "Release" to the functions, now the benchmarks look like this:
goos: linux
goarch: amd64
pkg: go_pgxv5_bench/internal/database
cpu: AMD EPYC 7702P 64-Core Processor
BenchmarkGetCustomerLocations-10                 6226    164144 ns/op     466 B/op    14 allocs/op
BenchmarkGetCustomersWithLocationCounts-10       6709    195440 ns/op     451 B/op    13 allocs/op
BenchmarkListCustomers-10                        6006    178938 ns/op     451 B/op    13 allocs/op
BenchmarkCreateCustomer-10                       2290    572324 ns/op     627 B/op    16 allocs/op
BenchmarkGetCustomer-10                          6170    234969 ns/op    1250 B/op    25 allocs/op
BenchmarkUpdateCustomer-10                       2041    505314 ns/op     315 B/op    10 allocs/op
BenchmarkDeleteCustomer-10                       1153   1135982 ns/op     759 B/op    21 allocs/op
BenchmarkCreateLocation-10                        890   1783827 ns/op    1881 B/op    52 allocs/op
BenchmarkGetLocation-10                          5164    234506 ns/op    1362 B/op    24 allocs/op
BenchmarkUpdateLocation-10                       2004    748476 ns/op     347 B/op    13 allocs/op
BenchmarkDeleteLocation-10                        511   3636844 ns/op    2008 B/op    56 allocs/op
BenchmarkListLocations-10                        4180    245810 ns/op    1371 B/op    23 allocs/op
BenchmarkCreateProduct-10                        2757    708746 ns/op     558 B/op    16 allocs/op
BenchmarkGetProduct-10                           6277    177829 ns/op     914 B/op    19 allocs/op
BenchmarkUpdateProduct-10                        2271    717145 ns/op     236 B/op     8 allocs/op
BenchmarkDeleteProduct-10                        1200   1296335 ns/op     687 B/op    21 allocs/op
BenchmarkListProducts-10                         9129    143186 ns/op     394 B/op     9 allocs/op
PASS
ok go_pgxv5_bench/internal/database 30.586s
@the-hotmann
Just to clarify, the Acquire and Release should be done once for the entire benchmark. That is, a connection should be checked out and only that connection used for all database activity. While obviously you can't do that in production, that would isolate any pgxpool overhead from the underlying pgx.Conn.
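In benchmark form, roughly (a sketch built on the setupTestDB helper above, in the same _test.go file; select 1 stands in for the real queries):

// BenchmarkSingleConn checks one connection out of the pool up front and
// reuses it for every iteration, so pool acquisition is outside the timed path.
func BenchmarkSingleConn(b *testing.B) {
	db := setupTestDB(b)
	defer db.Pool.Close()

	conn, err := db.Pool.Acquire(context.Background())
	if err != nil {
		b.Fatal(err)
	}
	defer conn.Release()

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var n int
		if err := conn.QueryRow(context.Background(), "select 1").Scan(&n); err != nil {
			b.Fatal(err)
		}
	}
}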
That said, your original message doesn't seem to match the benchmark results you posted.
I use this package in one of my little hobby projects which I made in my free time and wondered why basically every query takes 1ms.
BenchmarkGetCustomerLocations-10                 7192    147392 ns/op     493 B/op    12 allocs/op
That is taking 147 microseconds, much less than 1 millisecond. The other read benchmarks seem similarly fast. I'm not sure how much faster you can expect a machine with relatively slow single-threaded performance to run. For reference, using an M3 Max on a Unix domain socket, the fastest I can do a minimal query is around 25 microseconds.
I doubt your issues are with pgxpool, but you can test by making a benchmark that runs the same query with a conn directly and with the pool. If there is no contention for the pool the overhead should be under 1 microsecond.
You also could run the benchmarks for the pgx driver itself to have a comparison.
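For example, a pair of benchmarks along these lines (DATABASE_URL and the trivial query are placeholders; with an uncontended pool the two results should be nearly identical):

package database

import (
	"context"
	"os"
	"testing"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

// BenchmarkQueryConn runs a minimal query on a single pgx.Conn.
func BenchmarkQueryConn(b *testing.B) {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		b.Fatal(err)
	}
	defer conn.Close(ctx)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var n int
		if err := conn.QueryRow(ctx, "select 1").Scan(&n); err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkQueryPool runs the same query through pgxpool.
func BenchmarkQueryPool(b *testing.B) {
	ctx := context.Background()
	pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		b.Fatal(err)
	}
	defer pool.Close()

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var n int
		if err := pool.QueryRow(ctx, "select 1").Scan(&n); err != nil {
			b.Fatal(err)
		}
	}
}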
@jackc thanks for your clear and valuable response!
That is taking 147 microseconds, much less than 1 millisecond. The other read benchmarks seem similarly fast. I'm not sure how much faster you can expect a machine with relatively slow single-threaded performance to run. For reference, using an M3 Max on a Unix domain socket, the fastest I can do a minimal query is around 25 microseconds.
I doubt your issues are with pgxpool, but you can test by making a benchmark that runs the same query with a conn directly and with the pool. If there is no contention for the pool the overhead should be under 1 microsecond.
I guess you're right! I will rerun the benchmarks without Acquire and Release and compare it to the time it takes with psql itself.
Thanks for your response :)