twenty
Database becomes unreachable / name resolution fails with docker-compose.yml
Bug Description
When setting up twenty with the provided docker-compose.yml, everything initially works as expected. After some time, however, the database intermittently becomes unreachable (DNS name resolution seems to fail), which leads to error messages on the website. Most of the time, twenty recovers from this; the logs show no crash or restart of postgres, so the problem appears to be limited to DNS.
Eventually, these name resolution failures crash the entire service and the site becomes unreachable (the containers appear to have died at that point). I then have to run podman-compose down and podman-compose up to make it work again. So it does not look like a configuration error per se, but rather like something crashing at runtime.
Log Example:
ecc77284652f Exception Captured
ecc77284652f { user: undefined }
ecc77284652f [
ecc77284652f Error: getaddrinfo ENOTFOUND twenty-db
ecc77284652f at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:108:26) {
ecc77284652f errno: -3008,
ecc77284652f code: 'ENOTFOUND',
ecc77284652f syscall: 'getaddrinfo',
ecc77284652f hostname: 'twenty-db'
ecc77284652f }
ecc77284652f ]
Edit: the above exception is one that allows the service to keep running. Here is a log example for the unhandled case that causes the service to become unreachable:
50b7515e958b 2024-08-07 21:03:50.324 GMT [75] LOG: checkpoint complete: wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.308 s, sync=0.004 s, total=1.316 s; sync files=11, longest=0.002 s, average=0.001 s; distance=77 kB, estimate=99 kB
f8fc7d0e40ee node:internal/errors:496
f8fc7d0e40ee ErrorCaptureStackTrace(err);
f8fc7d0e40ee ^
f8fc7d0e40ee
f8fc7d0e40ee Error [ERR_UNHANDLED_ERROR]: Unhandled error. ({
f8fc7d0e40ee errno: -3008,
f8fc7d0e40ee code: 'ENOTFOUND',
f8fc7d0e40ee syscall: 'getaddrinfo',
f8fc7d0e40ee hostname: 'twenty-db',
f8fc7d0e40ee message: 'getaddrinfo ENOTFOUND twenty-db (Queue: __pgboss__send-it, Worker: 2ded992d-6008-47c0-80c1-c97a5a4637f0)',
f8fc7d0e40ee stack: 'Error: getaddrinfo ENOTFOUND twenty-db (Queue: __pgboss__send-it, Worker: 2ded992d-6008-47c0-80c1-c97a5a4637f0)\n' +
f8fc7d0e40ee ' at /app/node_modules/pg-pool/index.js:45:11\n' +
f8fc7d0e40ee ' at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
f8fc7d0e40ee ' at async Db.executeSql (/app/node_modules/pg-boss/src/db.js:28:14)\n' +
f8fc7d0e40ee ' at async Manager.fetch (/app/node_modules/pg-boss/src/manager.js:497:16)\n' +
f8fc7d0e40ee ' at async Worker.start (/app/node_modules/pg-boss/src/worker.js:49:22)',
f8fc7d0e40ee queue: '__pgboss__send-it',
f8fc7d0e40ee worker: '2ded992d-6008-47c0-80c1-c97a5a4637f0'
f8fc7d0e40ee })
f8fc7d0e40ee at new NodeError (node:internal/errors:405:5)
f8fc7d0e40ee at PgBoss.emit (node:events:503:17)
f8fc7d0e40ee at PgBoss.emit (node:domain:489:12)
f8fc7d0e40ee at Manager.<anonymous> (/app/node_modules/pg-boss/src/index.js:88:37)
f8fc7d0e40ee at Manager.emit (node:events:514:28)
f8fc7d0e40ee at Manager.emit (node:domain:489:12)
f8fc7d0e40ee at Worker.onError (/app/node_modules/pg-boss/src/manager.js:256:12)
f8fc7d0e40ee at Worker.start (/app/node_modules/pg-boss/src/worker.js:70:14)
f8fc7d0e40ee at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
f8fc7d0e40ee code: 'ERR_UNHANDLED_ERROR',
f8fc7d0e40ee context: {
f8fc7d0e40ee errno: -3008,
f8fc7d0e40ee code: 'ENOTFOUND',
f8fc7d0e40ee syscall: 'getaddrinfo',
f8fc7d0e40ee hostname: 'twenty-db',
f8fc7d0e40ee message: 'getaddrinfo ENOTFOUND twenty-db (Queue: __pgboss__send-it, Worker: 2ded992d-6008-47c0-80c1-c97a5a4637f0)',
f8fc7d0e40ee stack: 'Error: getaddrinfo ENOTFOUND twenty-db (Queue: __pgboss__send-it, Worker: 2ded992d-6008-47c0-80c1-c97a5a4637f0)\n' +
f8fc7d0e40ee ' at /app/node_modules/pg-pool/index.js:45:11\n' +
f8fc7d0e40ee ' at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
f8fc7d0e40ee ' at async Db.executeSql (/app/node_modules/pg-boss/src/db.js:28:14)\n' +
f8fc7d0e40ee ' at async Manager.fetch (/app/node_modules/pg-boss/src/manager.js:497:16)\n' +
f8fc7d0e40ee ' at async Worker.start (/app/node_modules/pg-boss/src/worker.js:49:22)',
f8fc7d0e40ee queue: '__pgboss__send-it',
f8fc7d0e40ee worker: '2ded992d-6008-47c0-80c1-c97a5a4637f0'
f8fc7d0e40ee }
f8fc7d0e40ee }
f8fc7d0e40ee
f8fc7d0e40ee Node.js v18.17.1
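For context, the shape of this second error (ERR_UNHANDLED_ERROR wrapping a plain object in `context`) matches what Node.js does when an EventEmitter emits an "error" event that has no listener attached. The sketch below reproduces only that mechanism. `FakeQueue` and `emitWithAndWithoutListener` are hypothetical names; this is not pg-boss code, and whether twenty registers such a listener is an assumption I have not verified.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a queue client that reports background failures via "error" events.
// This is NOT pg-boss; it only mimics the emit pattern visible in the stack trace.
class FakeQueue extends EventEmitter {
  failLookup(): void {
    // A plain object (not an Error instance), like the context printed in the log.
    this.emit("error", {
      errno: -3008,
      code: "ENOTFOUND",
      syscall: "getaddrinfo",
      hostname: "twenty-db",
    });
  }
}

// Emits the same failure on two emitters: one without and one with an "error" listener.
function emitWithAndWithoutListener(): { bare: string; listened: string } {
  const bare = new FakeQueue();
  let bareResult = "handled";
  try {
    bare.failLookup(); // no "error" listener: emit() throws
  } catch (err) {
    bareResult = (err as { code?: string }).code ?? "unknown";
  }

  const listened = new FakeQueue();
  let listenedResult = "unknown";
  listened.on("error", (e: { code: string }) => {
    listenedResult = `handled:${e.code}`; // listener absorbs the failure
  });
  listened.failLookup(); // listener present: nothing is thrown

  return { bare: bareResult, listened: listenedResult };
}

console.log(emitWithAndWithoutListener());
```

If this is indeed the mechanism, a single failed lookup inside a background worker would be enough to terminate the process, which would fit the difference between the two log excerpts.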
Expected behavior
DNS resolution works at all times, and temporary DNS resolution errors do not lead to a full crash or to the site becoming unreachable.
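One way to picture "temporary errors do not lead to a crash" is a retry with exponential backoff around the failing database call. This is a minimal sketch under assumptions, not twenty's or pg-boss's actual code: `withDnsRetry` and `isTransientDnsError` are hypothetical helpers, and treating ENOTFOUND as transient is an assumption that only makes sense for a hostname known to exist, such as a compose service name.

```typescript
type Attempt<T> = () => Promise<T>;

// Error codes treated as retryable here (an assumption for this sketch).
const TRANSIENT_DNS_CODES = new Set(["ENOTFOUND", "EAI_AGAIN"]);

function isTransientDnsError(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === "string" && TRANSIENT_DNS_CODES.has(code);
}

// Retries `attempt` on DNS lookup failures, doubling the delay each time.
async function withDnsRetry<T>(
  attempt: Attempt<T>,
  maxAttempts = 5,
  baseDelayMs = 200,
): Promise<T> {
  for (let i = 1; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      // Give up on non-DNS errors, or once the attempt budget is spent.
      if (!isTransientDnsError(err) || i >= maxAttempts) throw err;
      const delayMs = baseDelayMs * 2 ** (i - 1); // 200, 400, 800, ...
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A caller would wrap the operation that needs the database, for example `withDnsRetry(() => runQuery())`, where `runQuery` is a placeholder for whatever issues the query. Errors other than DNS lookup failures are rethrown immediately so that real faults are not hidden.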
Technical inputs
I am using podman instead of docker and run the services in a rootless environment. There are several other docker-compose-based services on the same server that use postgres or other databases. Their configuration is essentially the same, so this does not appear to be a fundamental error in my setup or in DNS resolution inside podman.
However, the other services use the postgres image directly (e.g., docker.io/postgres:13.1-alpine) and not a customized bitnami image like twenty.
The issue occurs with the latest tag of the docker images (i.e., v0.23) and with v0.22.
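To narrow down whether the lookups fail independently of twenty, a small probe could be run from inside the server or worker container. The sketch below is only a diagnostic idea: `probeDns` is a hypothetical helper, and the attempt count and interval are arbitrary. It uses `lookup` from `node:dns/promises`, which goes through getaddrinfo like the failing call in the logs.

```typescript
import { lookup } from "node:dns/promises";

type Resolver = (hostname: string) => Promise<unknown>;

interface ProbeResult {
  attempts: number;
  failures: { attempt: number; code: string }[];
}

// Resolves `hostname` repeatedly and records which attempts failed and why.
async function probeDns(
  hostname: string,
  attempts: number,
  intervalMs: number,
  resolve: Resolver = (h) => lookup(h),
): Promise<ProbeResult> {
  const failures: ProbeResult["failures"] = [];
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await resolve(hostname);
    } catch (err) {
      const code = (err as { code?: string }).code ?? "UNKNOWN";
      failures.push({ attempt, code });
      console.error(`${new Date().toISOString()} lookup ${hostname} failed: ${code}`);
    }
    if (attempt < attempts) {
      // Wait between lookups so the probe spreads over a longer time window.
      await new Promise<void>((done) => setTimeout(done, intervalMs));
    }
  }
  return { attempts, failures };
}
```

For example, `probeDns("twenty-db", 600, 1000)` would try one lookup per second for about ten minutes. Failures that show up here while the database container keeps running would point at the container network's DNS rather than at twenty itself.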
docker-compose.yml (redacted where necessary)
version: "3.9"
name: twenty
services:
  change-vol-ownership:
    image: docker.io/ubuntu
    user: root
    volumes:
      - /containers/twenty-crm/data:/data
      - /containers/twenty-crm/server-local-data:/tmp/server-local-data
      - /containers/twenty-crm/docker-data:/tmp/docker-data
      - /containers/twenty-crm/db-data:/tmp/db-data
    command: >
      bash -c "
      chown -R 1000:1000 /tmp/server-local-data
      && chown -R 1000:1000 /tmp/docker-data
      && chown -R 1001:1001 /tmp/db-data"
  server:
    image: docker.io/twentycrm/twenty:v0.22
    volumes:
      - /containers/twenty-crm/server-local-data:/app/packages/twenty-server/.local-storage
      - /containers/twenty-crm/docker-data:/app/docker-data
    ports:
      - "127.0.0.1:xxxx:3000"
    environment:
      PORT: 3000
      PG_DATABASE_URL: postgres://twenty:twenty@twenty-db:5432/default
      SERVER_URL: "https://crm.example.com"
      FRONT_BASE_URL: "https://crm.example.com"
      MESSAGE_QUEUE_TYPE: "pg-boss"
      ENABLE_DB_MIGRATIONS: "true"
      SIGN_IN_PREFILLED: "true"
      STORAGE_TYPE: "local"
      ACCESS_TOKEN_SECRET: "redacted"
      LOGIN_TOKEN_SECRET: "redacted"
      REFRESH_TOKEN_SECRET: "redacted"
      FILE_TOKEN_SECRET: "redacted"
    depends_on:
      change-vol-ownership:
        condition: service_completed_successfully
      db:
        condition: service_healthy
    healthcheck:
      test: curl --fail http://localhost:3000/healthz
      interval: 5s
      timeout: 10s
      retries: 20
    restart: always
  worker:
    image: docker.io/twentycrm/twenty:v0.22
    command: ["yarn", "worker:prod"]
    environment:
      PG_DATABASE_URL: postgres://twenty:twenty@twenty-db:5432/default
      SERVER_URL: "https://crm.example.com"
      FRONT_BASE_URL: "https://crm.example.com"
      MESSAGE_QUEUE_TYPE: "pg-boss"
      ENABLE_DB_MIGRATIONS: "false" # it already runs on the server
      STORAGE_TYPE: "local"
      ACCESS_TOKEN_SECRET: "redacted"
      LOGIN_TOKEN_SECRET: "redacted"
      REFRESH_TOKEN_SECRET: "redacted"
      FILE_TOKEN_SECRET: "redacted"
      EMAIL_SMTP_HOST: "redacted"
      EMAIL_SMTP_PORT: "redacted"
      EMAIL_SMTP_USER: "redacted"
      EMAIL_SMTP_PASSWORD: "redacted"
      EMAIL_FROM_NAME: "redacted"
      EMAIL_FROM_ADDRESS: "redacted"
    depends_on:
      twenty-db:
        condition: service_healthy
      server:
        condition: service_healthy
    restart: always
  twenty-db:
    image: docker.io/twentycrm/twenty-postgres:v0.22
    volumes:
      - /containers/twenty-crm/db-data:/bitnami/postgresql
    depends_on:
      change-vol-ownership:
        condition: service_completed_successfully
    environment:
      POSTGRES_PASSWORD: "redacted"
    healthcheck:
      test: pg_isready -U twenty -d default
      interval: 5s
      timeout: 10s
      retries: 20
    restart: always