skunk
skunk copied to clipboard
Automatic reconnection
Right now, if the connection to the server is lost (i.e. restart PostgreSQL server), the application needs to be restarted to get reconnected. I think this would be a nice feature to have :)
Here's the exception I get when I stop and then start the PostgreSQL server:
root[ERROR] java.lang.Exception: Fatal: EOF before 5 bytes could be read.Bytes
root[ERROR] at skunk.net.BitVectorSocket$$anon$2.$anonfun$readBytes$1(BitVectorSocket.scala:72)
root[ERROR] at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:139)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:355)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:376)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:316)
root[ERROR] at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:136)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:355)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:376)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:316)
root[ERROR] at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:136)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:355)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:376)
root[ERROR] at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:316)
root[ERROR] at cats.effect.internals.IOShift$Tick.run(IOShift.scala:36)
root[ERROR] at cats.effect.internals.PoolUtils$$anon$2$$anon$3.run(PoolUtils.scala:51)
root[ERROR] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
root[ERROR] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
root[ERROR] at java.lang.Thread.run(Thread.java:748)
I'm not too worried about that with single-use connections but I think I will probably remove that ability and only allow pooled connections. They will have a health check when they're returned to the pool but it might make sense to do a liveness check again before they're handed out to handle this case.
FWIW this happened using a pool of connections, but maybe that's not fully implemented yet? I recall seeing some TODO comments around this code.
maybe that's not fully implemented yet
Yeah that pooling code is bad. I have a WIP for a new implementation but it doesn't have the pre-checkout phase yet.
Also to a first approximation nothing is fully implemented! I mean, you discovered that INSERT
wasn't supported yet because nobody had ever tried it ;-)
No worries :smile:
I'll leave this issue open hoping it gets fixed someday, no rush :) Once your new pooling code is out, I can give it a try and maybe come back with a PR if it's not yet fixed.
I've also encountered this issue.
An easy to use reproduction can be found in this repo, including the necessary scripts to start/restart the PSQL in docker. N.B. it only reproduces w/ pooled connections.
Now, I've hacked together quite a nice way to get pool reinitialization in userland, without full app restart (ping @gvolpe if you need it). But it's not a fully tested in terms of all the concurrency problems that can arise, but it works in production :joy:. A bunch of Ref
, Semaphore
and Resource.allocated
:grimacing:. And not generic enough to handle restart when you wanted a stream in your query. But in case anyone wants it, it can be found in the same repo
@tpolecat, I'll gladly take a look at solving this problem in skunk itself if this is wanted. As the solution here is much simpler than the userland pool reinit hack.
This should be fixed. Have you tried this with https://github.com/tpolecat/skunk/pull/388 ?
Nice, @tpolecat please feel free to close this, I completely forgot about this issue! :smile:
@tpolecat if that fix wound up in 0.0.24
then I did try it. And it's not fully fixed, not for the particular usecase in the repo I had linked.