X-Road
X-Road Proxy fails to restore DB Connection
Version: 7.0.1
Our external PostgreSQL database (AWS RDS Aurora) occasionally reboots or is restored, and as a result the X-Road Proxy enters a permanent failed state. The only way to recover is to restart the service.
Are there any configuration options we could tune to have the services automatically restore the DB connections?
Hey @autero1. We will look into this matter.
@autero1 Do you know how long the DB is usually unavailable? Short DB outages should be fully recoverable. By default, transactions wait for 30 seconds (which is also visible in the provided logs) before being killed. My guess is that during these outages a high transaction count overloads the connection pool and the application eventually locks up.
To verify this, you can monitor the HikariCP pool stats. To enable them, please add
<logger name="com.zaxxer.hikari" level="TRACE" />
<logger name="com.zaxxer.hikari.HikariConfig" level="DEBUG" />
to /etc/xroad/conf.d/proxy-logback.xml
Logs will look like this:
Pool stats (total=20, active=0, idle=20, waiting=0)
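If it helps, here is a rough sketch of how those lines could be scanned for pool exhaustion. It assumes only the `Pool stats (...)` fragment shown above; everything before it on the line (timestamp, thread, pool name) depends on your logback pattern and is ignored. The names `parse_pool_stats` and `is_exhausted` are made up for this example, and "no idle connections while threads are waiting" is my own working definition of exhaustion, not something HikariCP reports.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Matches only the stats fragment of the log line; the prefix of the line
# is not assumed to have any particular format.
POOL_STATS = re.compile(
    r"stats \(total=(\d+), active=(\d+), idle=(\d+), waiting=(\d+)\)"
)


class PoolStats(NamedTuple):
    total: int
    active: int
    idle: int
    waiting: int


def parse_pool_stats(lines: Iterable[str]) -> Iterator[PoolStats]:
    """Yield one PoolStats per log line that contains a stats fragment."""
    for line in lines:
        match = POOL_STATS.search(line)
        if match:
            yield PoolStats(*map(int, match.groups()))


def is_exhausted(stats: PoolStats) -> bool:
    """Assumed definition: no idle connection left and threads are queued."""
    return stats.idle == 0 and stats.waiting > 0
```

Feeding the proxy log through `parse_pool_stats` and filtering with `is_exhausted` should show whether `waiting` climbs during the DB outage window. The location of the proxy log file depends on your installation.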
Tuning the timeout configuration and increasing the pool size might help, but only if transactions are consumed faster than new ones are created.
Note: I was not able to reproduce this using the Security Server Sidecar and a remote database. I tried killing, restarting, and stalling it. It might be related to Security Server (SS) load.
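As a back-of-the-envelope illustration of that condition, Little's law estimates the average number of connections in use as arrival rate times the time each transaction holds a connection. The numbers below are invented for the example, not measurements from a Security Server, and `connections_needed` and `pool_keeps_up` are hypothetical helper names.

```python
def connections_needed(tx_per_second: float, seconds_per_tx: float) -> float:
    """Little's law: average connections in use = arrival rate * hold time."""
    return tx_per_second * seconds_per_tx


def pool_keeps_up(pool_size: int, tx_per_second: float, seconds_per_tx: float) -> bool:
    """True if the average demand fits inside the pool."""
    return connections_needed(tx_per_second, seconds_per_tx) <= pool_size


# Illustrative numbers only (assumptions, not measured values):
# healthy DB, 50 tx/s each holding a connection for 0.1 s
healthy = pool_keeps_up(pool_size=20, tx_per_second=50, seconds_per_tx=0.1)

# DB outage, the same 50 tx/s but each one blocks for the 30 s timeout
during_outage = pool_keeps_up(pool_size=20, tx_per_second=50, seconds_per_tx=30)

# doubling the pool does not change the outcome during the outage
bigger_pool = pool_keeps_up(pool_size=40, tx_per_second=50, seconds_per_tx=30)
```

With these made-up figures, the pool copes easily while the DB is healthy, but once each transaction blocks for the full 30 seconds the demand far exceeds any realistic pool size, which is why a shorter timeout is likely to matter more than a larger pool.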
This mainly took place during nightly backups. I also think this could be related to #1293. We haven't witnessed any issues since we tuned the message logging, so I'll close this issue and reopen it if something new surfaces.