Application Hangs on `defer p.destructWG.Wait()` in Docker Environment

Open rubenhazelaar opened this issue 1 year ago • 6 comments

Description: As described in issue #25, my application hangs on the statement defer p.destructWG.Wait(). Any deferred statement or code that should execute after defer pool.Close() does not run. This issue occurs when running the application in a Docker container (FROM golang:1.22-alpine) on a Windows host machine through Docker Desktop. I have not tested this in other environments.

Steps to Reproduce:

  1. Run the application in a Docker container with the base image golang:1.22-alpine.
  2. Execute the provided code snippet.
  3. Observe that the application hangs on defer p.destructWG.Wait().

Expected Behavior: Deferred statements, including those after defer pool.Close(), should execute as expected.

Actual Behavior: Deferred statements after defer pool.Close() do not execute, causing the application to hang.

Environment:

  • Docker base image: golang:1.22-alpine
  • Host OS: Windows (via Docker Desktop)

Additional Context: This issue occurs in the context where an actual connection has not been made through the pool. Below is the simplified code used in my application:

package main

import (
	"context"
	"errors"
	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
	"os"
)

func main() {
	err := run(context.Background())
	if err != nil {
		os.Exit(1)
	}
}

func run(ctx context.Context) error {
	// Create the pool in a run func which is called by main func

	poolConfig, err := pgxpool.ParseConfig(/* applicationConfig.Dsn */)
	if err != nil {
		return err
	}
	poolConfig.AfterConnect = func(ctx context.Context, conn *pgx.Conn) error {
		// Here I register some custom types
		return nil
	}

	pool, err := pgxpool.NewWithConfig(ctx, poolConfig)
	if err != nil {
		return err
	}
	defer pool.Close()

	// More code where an error is returned from run to main func, like so:
	err = aCallWhichFails()
	if err != nil {
		return err
	}

	return nil
}

func aCallWhichFails() error {
	return errors.New("test")
}

rubenhazelaar avatar Oct 21 '24 10:10 rubenhazelaar

I can't reproduce with this example. But Pool.Close blocks until all resources are released. My guess is that a connection is not being released.
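To illustrate why an unreleased connection makes Close block, here is a minimal sketch using only the standard library. The toyPool type and the closeReturnsWithin helper are hypothetical and are not puddle's actual implementation; they only model the behavior described above, where Close waits on a WaitGroup until every resource has been released.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// toyPool is a simplified stand-in for a connection pool (an assumption for
// illustration, not puddle's real code). It counts checked-out resources
// with a WaitGroup.
type toyPool struct {
	live sync.WaitGroup
}

// Acquire checks out a resource and returns the function that releases it.
// The sync.Once makes a double release harmless.
func (p *toyPool) Acquire() (release func()) {
	p.live.Add(1)
	var once sync.Once
	return func() { once.Do(p.live.Done) }
}

// Close blocks until every acquired resource has been released.
func (p *toyPool) Close() {
	p.live.Wait()
}

// closeReturnsWithin reports whether Close returned before d elapsed.
func closeReturnsWithin(p *toyPool, d time.Duration) bool {
	done := make(chan struct{})
	go func() {
		p.Close()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

// probe is an arbitrary short wait used only for this demonstration.
const probe = 100 * time.Millisecond

func main() {
	p := &toyPool{}
	release := p.Acquire()

	// While the resource is held, Close cannot finish.
	fmt.Println("Close returned while resource held:", closeReturnsWithin(p, probe))

	// Once it is released, Close returns promptly.
	release()
	fmt.Println("Close returned after release:", closeReturnsWithin(p, probe))
}
```

In this model a single forgotten release is enough to keep Close waiting indefinitely, which matches the hang on the deferred Wait reported in this issue.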

jackc avatar Oct 23 '24 01:10 jackc

I also hit this issue in tests on a Mac. Postgres runs via testcontainers, with github.com/jackc/puddle/v2 v2.2.2.

runtime.gopark(proc.go:425)
runtime.goparkunlock(proc.go:430)
runtime.semacquire1(sema.go:178)
sync.runtime_Semacquire(sema.go:71)
sync.(*WaitGroup).Wait(waitgroup.go:118)
github.com/jackc/puddle/v2.(*Pool[go.shape.*uint8]).Close.deferwrap1(pool.go:180)
runtime.deferreturn(panic.go:605)
github.com/jackc/puddle/v2.(*Pool[go.shape.*uint8]).Close(pool.go:195)
github.com/jackc/pgx/v5/pgxpool.(*Pool).Close.func1(pool.go:387)
sync.(*Once).doSlow(once.go:76)
sync.(*Once).Do(once.go:67)
github.com/jackc/pgx/v5/pgxpool.(*Pool).Close(pool.go:385)

jackc wrote: "My guess is that a connection is not being released."

@jackc Can you please advise what I can do locally to debug why the connection is not being released?

These are the pool stats just before Close is called.

[Screenshot of the pool stats, 2024-11-28 13:38:00; image not preserved]

nikolayk812 avatar Nov 28 '24 11:11 nikolayk812

acquiredResources = 1 indicates that something has checked out a connection. The pool can't close until that connection is released.

Unfortunately, I don't know of an easy way to find where that connection was checked out or what it is doing.
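One possible way to find where a connection was checked out is to record the call stack at every acquire and drop the record on release. The sketch below uses only the standard library. The acquireTracker type and the leakyCaller function are hypothetical helpers for illustration and are not part of pgx or puddle.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"sync"
)

// acquireTracker remembers the call stack of every checkout that has not
// been released yet. Hypothetical debugging helper, not a pgx/puddle API.
type acquireTracker struct {
	mu     sync.Mutex
	nextID int
	stacks map[int]string
}

func newAcquireTracker() *acquireTracker {
	return &acquireTracker{stacks: make(map[int]string)}
}

// Track records the caller's stack and returns a function that must be
// called when the resource is released.
func (t *acquireTracker) Track() (untrack func()) {
	stack := string(debug.Stack())

	t.mu.Lock()
	id := t.nextID
	t.nextID++
	t.stacks[id] = stack
	t.mu.Unlock()

	return func() {
		t.mu.Lock()
		delete(t.stacks, id)
		t.mu.Unlock()
	}
}

// Outstanding returns the stacks of all checkouts that are still held.
func (t *acquireTracker) Outstanding() []string {
	t.mu.Lock()
	defer t.mu.Unlock()

	out := make([]string, 0, len(t.stacks))
	for _, s := range t.stacks {
		out = append(out, s)
	}
	return out
}

// leakyCaller simulates code that acquires a resource and never releases it.
func leakyCaller(t *acquireTracker) {
	_ = t.Track() // the untrack function is dropped: a simulated leak
}

func main() {
	tracker := newAcquireTracker()

	untrack := tracker.Track() // well-behaved checkout
	untrack()

	leakyCaller(tracker) // leaked checkout

	for _, s := range tracker.Outstanding() {
		fmt.Println("still held, acquired at:\n" + s)
	}
}
```

With pgxpool, Track could be called right after a successful pool.Acquire and the returned function next to conn.Release(). Printing Outstanding() just before pool.Close() would then show the stack of the caller that still holds the connection. This only covers acquire sites that are wrapped this way.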

jackc avatar Nov 29 '24 16:11 jackc

I am using the golang-migrate library with the code below to make it work with pgx v5. I suspect it leaks a connection here:

func (r repo) ApplyMigrations() error {
	db := stdlib.OpenDBFromPool(r.pool)
	defer db.Close()

	// (migration code omitted)

	return nil
}

nikolayk812 avatar Nov 29 '24 17:11 nikolayk812

So yes, for me this snippet was the root cause.

db := stdlib.OpenDBFromPool(r.pool)
defer db.Close()

Now I no longer take a connection from the pgx v5 pool. Instead I use the standard Go approach to create a connection and run the migrations.

nikolayk812 avatar Dec 17 '24 17:12 nikolayk812

In my case the deadlock was caused by sending a batch without closing it afterwards. I've made a repo with a minimal demonstration: https://github.com/areknoster/puddle-reproduce

I think that in cases like this it would help if closing the pool had a finite timeout and returned an error with information about the resource that wasn't freed.
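The timeout idea can be approximated on the caller's side today. The following is a minimal sketch using only the standard library. closeWithTimeout, errCloseTimeout and closeDeadline are hypothetical names, and this is not an existing pgx or puddle feature.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errCloseTimeout is returned when the close function does not finish in time.
var errCloseTimeout = errors.New("pool close timed out; a resource is probably still checked out")

// closeWithTimeout runs closeFn in a goroutine and waits at most d for it.
// Caveat: if closeFn never returns, that goroutine stays blocked. The helper
// only lets the caller stop waiting and report the problem.
func closeWithTimeout(closeFn func(), d time.Duration) error {
	done := make(chan struct{})
	go func() {
		defer close(done)
		closeFn()
	}()

	timer := time.NewTimer(d)
	defer timer.Stop()

	select {
	case <-done:
		return nil
	case <-timer.C:
		return errCloseTimeout
	}
}

// closeDeadline is an arbitrary short deadline for this demonstration.
const closeDeadline = 100 * time.Millisecond

func main() {
	// A close that finishes immediately.
	fmt.Println(closeWithTimeout(func() {}, closeDeadline))

	// A close that blocks forever, standing in for a pool with a leaked connection.
	neverReleased := make(chan struct{})
	fmt.Println(closeWithTimeout(func() { <-neverReleased }, closeDeadline))
}
```

With pgxpool, closeFn would be pool.Close, for example closeWithTimeout(pool.Close, 5*time.Second) in place of a bare defer pool.Close(). The wrapper cannot say which resource was left open, so it reports the symptom rather than the cause.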

areknoster avatar Dec 23 '24 13:12 areknoster