norm icon indicating copy to clipboard operation
norm copied to clipboard

#50 add connection pooling

Open PhilippMDoerner opened this issue 1 year ago • 2 comments

THIS IS NOT A FINISHED PR

PR for Issue #50

Note

I would like to cooperate on this with you @moigagoo though, to throw ideas back and forth for how exactly we implement this in norm. I've provided a first stab on how this could look like. Please give your general thoughts as well as your specific thoughts about the suggestions I'm making in the "How the pool works" section (marked with "Idea N", N being a number).

How the pool works

The pool is basically just a glorified private global variable of type seq[DbConn] with a lock for thread-safe access. This seq can not be interacted with outside of the specifically provided procs that make up the public API of this module.

Initially you basically just need to initialize the pool. Upon initialization it will create said seq and populate it with DbConn instances created via a proc that creates connections. That proc can be supplied by the user, or they supply nothing, in that case the current getDb proc is used to create connections. (Idea 1: We may want to add checks that if getDb is used, that the env variables must've been set and that we just crash the program if they haven't been).

The user then can interact with the pool in an identical manner to tinyPool and more:

  1. borrow/recycle connections
  2. use withDb (which now borrows/recycles connections instead of creating/closing them)
  3. use getDb to actually create a connection (Idea 2: I do not think it is a good idea to allow the user to circumvent the pool as there is no benefit to them and instead this puts them at risk if they forget to close a connection. I would like to make this proc private, even if that is a breaking API change)
  4. use dropDb to drop the database (Idea 3: I am currently uncertain about whether having this makes sense, I feel tempted to suggest that we just remove this proc as well to keep the API simpler)

Should the user call borrowConnection while the pool is empty it goes into burst mode (functionality is the same as in tinypool, so here the explanation from my readme there):

If more connections are needed than it has, the pool will temporarily go into "burst mode" and automatically refill with a new batch of connections, the amount of which is determined by the poolSize. While the pool is in burst mode it can hold an unlimited amount of connections. While the pool is not in burst mode, any superfluous connection that is returned to it gets closed. Burst mode ends after a specified duration (30 minutes), though that gets extended if the connections from the added batch are still needed.

Code organization

genericPool.nim is the center-piece for logic regarding handling connection pools. It also contains and exports all possible gobal variables needed for interacting with any kind of connection pool. It gets imported by individual pool.nim files that basically just contain the instance of the pool as well as the public API procs made specific for this particular pool type (aka whether it's for ndb/sqlite or ndb/postgres. It further exports the global variables needed to interact with this specific kind of connection pool. pool.nim modules then get imported and exported by the respective sqlite.nim/postgres.nim modules, making the API for interacting with the connection pool immediately accessible as part of importing sqlite.nim/postgres.nim.

I've further refactored out various things from the individual sqlite.nim and postgres.nim modules into genericPool.nim, since these are about handling the connection to the database and it made sense to me to centralize that into genericPool.nim.

If you have a different idea on how this should be structured please give feedback! As stated, this is just a first stab at it.

Questions

I have a question here though: I have some trouble understanding how the proc dropDb for postgres works:

proc dropDb* =
  ## Drop the database defined in environment variables.
  let dbConn = open(getEnv(dbHostEnv), getEnv(dbUserEnv), getEnv(dbPassEnv), "template1")
  dbConn.exec(sql "DROP DATABASE IF EXISTS $#" % getEnv(dbNameEnv))
  close dbConn

What exactly is "template1" and why is it inserted there instead of using the dbNameEnv variable?

PhilippMDoerner avatar Sep 15 '22 17:09 PhilippMDoerner

General Concept

I wonder if would be better or at all possible to not hide the global pool variable. I.e., what if we have:

  1. the Pool type
  2. acconmpanying new proc that
  3. proc to excplitly add connections to the pool, e.g. addConnections(n: Positive, conn: DbConn = getDb())
  4. iterator that gets the next free connectio from the pool, e.g. nextConn(p: Pool)
  5. withPool macro that is very similar to withDb but a) calls nextConn to get the actual connection and b) doesn't run close on the connection after it's done; withDb and withPool could probably share a common core implementation

This way, you'd define your pool in a shared module and pass it around as an explicit global variable. What do you think? We could have a call and discuss it if you want.

dropDb

I don't think this should be implemented here. Dropping databases has nothing to do with pooling.

Burst Mode

AFAIU burst mode happens when all connections are busy but a request for a new one is made. Is this correct?

If it is, I think it should be replaced with an explicit check and call to addConnections. Also, I think we should define several policies that regulate how the pool should react to this situation, e.g. addConnection, wait, cancel.

moigagoo avatar Sep 17 '22 19:09 moigagoo

General Concept

I wonder if would be better or at all possible to not hide the global pool variable. I.e., what if we have:

1. the `Pool` type

2. accompanying `new` proc that

3. proc to explicitly add connections to the pool, e.g. `addConnections(n: Positive, conn: DbConn = getDb())`

4. iterator that gets the next free connection from the pool, e.g. `nextConn(p: Pool)`

5. `withPool` macro that is very similar to `withDb` but a) calls `nextConn` to get the actual connection and b) doesn't run `close` on the connection after it's done; `withDb` and `withPool` could probably share a common core implementation

This way, you'd define your pool in a shared module and pass it around as an explicit global variable. What do you think? We could have a call and discuss it if you want.

That would be great! Mostly because I think there's quite a bit of rapid exchange of ideas and concepts around this that can be done and that just goes by faster as a call.

dropDb

I don't think this should be implemented here. Dropping databases has nothing to do with pooling.

Done

Burst Mode

AFAIU burst mode happens when all connections are busy but a request for a new one is made. Is this correct?

If it is, I think it should be replaced with an explicit check and call to addConnections. Also, I think we should define several policies that regulate how the pool should react to this situation, e.g. addConnection, wait, cancel.

That is entirely correct. I think we can also discuss this during the call, mostly so I get a better idea of the vision you have behind it.

PhilippMDoerner avatar Sep 18 '22 08:09 PhilippMDoerner