db icon indicating copy to clipboard operation
db copied to clipboard

Concurrent use of connection pool is surprisingly slow

Open Bogdanp opened this issue 1 month ago • 1 comments
trafficstars

This program:

#lang racket/base

(require db)

(define pool
  (connection-pool
   #:max-connections 64
   (lambda ()
     (postgresql-connect
      #:database "congame"
      #:user "congame"
      #:password "congame"))))

(define N 1000)
(define sema (make-semaphore 64))
(define (test q)
  (for-each
   thread-wait
   (for/list ([_ (in-range N)])
     (thread
      (lambda ()
        (call-with-semaphore sema
          (lambda ()
            (define conn #f)
            (dynamic-wind
              (lambda ()
                (set! conn (connection-pool-lease pool)))
              (lambda ()
                (query-row conn q))
              (lambda ()
                (disconnect conn))))))))))

(define-syntax-rule (do-test q)
  (begin
    (collect-garbage)
    (collect-garbage)
    (printf "=== ~.s~n" 'q)
    (time (test q))))

(do-test "SELECT 42")

is about 8x slower than the sequential version:

#lang racket/base

(require db)

(define pool
  (connection-pool
   #:max-connections 64
   (lambda ()
     (postgresql-connect
      #:database "congame"
      #:user "congame"
      #:password "congame"))))

(define N 1000)
(define (test q)
  (for ([_ (in-range N)])
    (define conn #f)
    (dynamic-wind
      (lambda ()
        (set! conn (connection-pool-lease pool)))
      (lambda ()
        (query-row conn q))
      (lambda ()
        (disconnect conn)))))

(define-syntax-rule (do-test q)
  (begin
    (collect-garbage)
    (collect-garbage)
    (printf "=== ~.s~n" 'q)
    (time (test q))))

(do-test "SELECT 42")

Whereas using a different pool implementation (from my resource-pool-lib package, also kill safe), results in about the same performance as the serial version:

#lang racket/base

(require data/pool
         db
         racket/promise)

(define pool
  (make-pool
   #:max-size 64
   (lambda ()
     (delay
       (postgresql-connect
        #:database "congame"
        #:user "congame"
        #:password "congame")))
   (compose1 disconnect force)))

(define N 1000)
(define sema (make-semaphore 64))
(define (test q)
  (for-each
   thread-wait
   (for/list ([_ (in-range N)])
     (thread
      (lambda ()
        (call-with-semaphore sema
          (lambda ()
            (define conn #f)
            (dynamic-wind
              (lambda ()
                (set! conn (pool-take! pool)))
              (lambda ()
                (query-row (force conn) q))
              (lambda ()
                (pool-release! pool conn))))))))))

(define-syntax-rule (do-test q)
  (begin
    (collect-garbage)
    (collect-garbage)
    (printf "=== ~.s~n" 'q)
    (time (test q))))

(do-test "SELECT 42")

I haven't dug in too deeply to figure out what's going wrong, but I do see that the pool implementation (and some of the other connection wrappers) mutate hash tables then iterate over them, which may be causing the same issue described here.

Bogdanp avatar Oct 10 '25 11:10 Bogdanp

After some experimenting, here's what I've discovered.

Things that don't seem to matter: Eliminating the hash tables entirely (since this example doesn't rely on them) doesn't improve performance much. Eliminating the proxy connection wrapper doesn't improve performance much.

Things that matter: Performance fell off a cliff with a concurrency factor of just over 10, which is the default value for max-idle-connections. If max-idle-connections is set to the concurrency factor, the performance cliff disappears. It seems that many releases would happen, excess connections would be discarded, and then the pool would have to create new ones when the lease requests came in. (I should fix this, at least.) Starting with a full idle list improves the performance further. I wonder if your implementation benefits from moving resource creation out of the actor thread; I need to look at how the Racket thread scheduler prioritizes threads.

After making the changes described above and eliminating every other significant difference I could think of, your version is still faster by about a factor of 2, and I'm not yet sure why.

rmculpepper avatar Oct 13 '25 06:10 rmculpepper

I'm going to go ahead and close this since it's been addressed by 016c4d5. Thank you!

Bogdanp avatar Nov 10 '25 13:11 Bogdanp