Handle dying DB connections
Retry multiple times, then raise.
Recently we started to run into this issue. Seems to happen when dead connections are involved, we are using DigitalOceans Managed database
INFO [16:52:57.861 #1] [c1a2aeae] POST /atm/new 500 9.568ms
INFO [16:53:58.708 #1] Endpoints::Atm::New::Params::JSON(@machine_id="xxxxxxx-xxxxxx-xxxxxx")
DEBUG [16:53:58.708 #1] [postgresql] SELECT atms.id FROM atms WHERE (atms.machine_id = $1)
ERROR [16:53:58.713 #1] [2fe65c94] ERROR Error writing to socket: Broken pipe
Error writing to socket: Broken pipe (Errno)
from /usr/share/crystal/src/thread.cr:69:3 in 'unbuffered_write'
from /usr/share/crystal/src/openssl/bio.cr:24:7 in '->'
from BIO_write
from ???
from ???
from /usr/share/crystal/src/openssl/ssl/socket.cr:122:5 in 'unbuffered_write'
from /usr/share/crystal/src/io/buffered.cr:180:5 in 'send_sync_message'
from lib/pg/src/pg/statement.cr:18:5 in 'perform_query'
from lib/db/src/db/statement.cr:103:14 in 'query'
from lib/onyx-sql/src/onyx-sql/repository/query.cr:9:11 in 'query'
from lib/onyx-sql/src/onyx-sql/repository/query.cr:23:12 in 'query'
Sorry about that. Ideally, a DB connection should handle such errors itself. It should not be the responsibility of an ORM to handle those dying connections.
crystal-db has pool implemented, which is claimed to handle connection errors. I suggest you to try raise the problem there, @arubinofaux.
Try adding retry_attempts=x&retry_delay=y to your database URL (which is shitty design IMO, as these params being out of scope of URI specification and 12-factor violation).
@arubinofaux did you manage to make it work better?
@vladfaust we manage to control the issue by constructing the database url like this.
postgres://user:password@host:port/database?sslmode=require&initial_pool_size=32&max_pool_size=32&max_idle_pool_size=0&retry_attemps=5
If you notice all the magic happens in max_idle_pool_size when set to 0 we stopped getting Error writing to socket: Broken pipe
Due to the current state of crystal, i think this is the best fix we can expect without going in and rewriting crystal-pg.