infisical
Connection terminated unexpectedly
Describe the bug
When performing API calls to my self-hosted instance after a period of inactivity, I occasionally get the following error:
{
"level": 50,
"severity": "ERROR",
"err": {
"type": "DatabaseError",
"message": "Failed to execute db ops",
"stack": "Find one: Failed to execute db ops\n at Object.findOne (/backend/src/lib/knex/index.ts:106:13)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async Object.login (/backend/src/services/identity-ua/identity-ua-service.ts:53:24)\n at async Object.handler (/backend/src/server/routes/v1/identity-universal-auth-router.ts:48:9)",
"name": "Find one",
"error": {
"type": "Error",
"message": "select * from \"identity_universal_auths\" where \"clientId\" = $1 limit $2 - Connection terminated unexpectedly",
"stack": "Error: select * from \"identity_universal_auths\" where \"clientId\" = $1 limit $2 - Connection terminated unexpectedly\n at Connection.<anonymous> (/backend/node_modules/pg/lib/client.js:131:73)\n at Object.onceWrapper (node:events:632:28)\n at Connection.emit (node:events:518:28)\n at Connection.emit (node:domain:489:12)\n at Socket.<anonymous> (/backend/node_modules/pg/lib/connection.js:62:12)\n at Socket.emit (node:events:518:28)\n at Socket.emit (node:domain:489:12)\n at TCP.<anonymous> (node:net:343:12)\n at TCP.callbackTrampoline (node:internal/async_hooks:130:17)"
}
},
"msg": "Failed to execute db ops"
}
Only after the first one or two error responses does the API work as expected.
To Reproduce
Steps to reproduce:
- Deploy an Infisical instance.
- Set up a client ID and secret for API authentication.
- Wait some time (I have not been able to identify a more specific rule or pattern that triggers this behavior).
- Attempt to authenticate using a POST request to /api/v1/auth/universal-auth/login.
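For anyone trying to reproduce this, the last step can be sketched roughly as follows. This is only an illustration: `buildLoginRequest` is a hypothetical helper, and the `clientId` / `clientSecret` body field names are an assumption based on the `clientId` column in the failing query, so check them against the API reference for your version.

```typescript
// Hypothetical helper: builds the login request from the reproduction steps.
// The body field names (clientId, clientSecret) are an assumption.
interface LoginRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildLoginRequest(
  baseUrl: string,
  clientId: string,
  clientSecret: string
): LoginRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/auth/universal-auth/login`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ clientId, clientSecret }),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)` after leaving the instance idle for a while; in my case the first one or two calls fail and later ones succeed.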
Expected behavior
The API should immediately respond with the access token.
Additional context
The error can occur on any API call. I usually see it on the universal-auth login after a period of inactivity on Infisical, but I have had the same problem with other calls.
I suppose this has to do with the database connection configurations. In the instance.ts the pool parameter is not passed to knex, so the default values are used. As it is stated in the official documentation and in this knex issue (which addresses the same exact problem I am having), it is advised to change the pool's min value to 0. Is there any particular reason why the pool's min value is not being set? Wouldn't it be better to make it configurable via an environment variable?
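To make the suggestion concrete, here is a minimal sketch of what a configurable pool could look like. It is not the actual Infisical code: `poolConfigFromEnv` and the `DB_POOL_MIN` / `DB_POOL_MAX` variable names are hypothetical, and only the `min` and `max` pool options are covered. With no variables set it yields `min: 0` instead of the knex default of 2, so no idle connections are kept around to go stale.

```typescript
// Subset of the knex pool options used in this sketch.
interface PoolConfig {
  min: number;
  max: number;
}

// DB_POOL_MIN and DB_POOL_MAX are hypothetical names, not existing settings.
function poolConfigFromEnv(env: Record<string, string | undefined>): PoolConfig {
  // Parse a non-negative integer, falling back when unset or invalid.
  const parse = (raw: string | undefined, fallback: number): number => {
    const n = parseInt(raw === undefined ? "" : raw, 10);
    return !isNaN(n) && n >= 0 ? n : fallback;
  };
  // The pool needs at least one connection slot.
  const max = Math.max(1, parse(env.DB_POOL_MAX, 10));
  // min defaults to 0 and can never exceed max.
  const min = Math.min(parse(env.DB_POOL_MIN, 0), max);
  return { min, max };
}
```

The result would then be passed as the `pool` option when creating the knex instance, e.g. `knex({ client: "pg", connection: connectionString, pool: poolConfigFromEnv(process.env) })`, where `connectionString` stands for whatever the instance already uses.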
This seems to be a TLS issue with your database. Hard to say without debugging this further at the infrastructure level.
I have run some additional tests and I can reproduce the behavior locally with this stack compose file:
version: '3'
networks:
  infisical:
volumes:
  pg_data:
  redis_data:
services:
  backend:
    image: infisical/infisical:v0.113.0-postgres
    env_file: .env
    networks:
      - infisical
    ports:
      - 8080:8080
    deploy:
      replicas: 1
  redis:
    image: redis
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
    networks:
      - infisical
    volumes:
      - redis_data:/data
    deploy:
      replicas: 1
  db:
    image: postgres:14-alpine
    env_file: .env
    volumes:
      - pg_data:/var/lib/postgresql/data
    networks:
      - infisical
    ports:
      - 5432:5432
    deploy:
      replicas: 1
TLS is not configured, and I am not having any connection issues to the Postgres service from other clients. The connection string and Postgres authentication parameters are correct, as I can use the frontend to manage my Infisical projects. As I said, I only get the "Connection terminated unexpectedly" error from the APIs after a period of inactivity.
Are you sure it has nothing to do with the Knex pool configuration, as mentioned in the issue I linked in my first message?
Hi @7hael
Interesting. Let me check on this. Maybe we are not seeing this issue because our own deployment is never inactive.
I believe this happens for us too. I have Infisical deployed in K8s (tried both Helm and my own manifest), and after several minutes the application terminates, unfortunately with no logs! I've seen similar problems with a docker-compose deployment, so I assume the fault is similar.
I'm using AWS RDS Postgres, and TBH this is not the first time we've faced an app that struggles to maintain its pg connection and cannot recover without crashing the container.