Tolgee not able to connect to postgresql in docker container

Open targetingsnake opened this issue 1 year ago • 23 comments

Hey guys,

Since one of the recent updates, Tolgee hasn't been able to connect to the internal PostgreSQL database. We currently run into the issue shown in the attached log: tolgee-log.txt

With Kind Regards Targetingsnake / Antistasi Dev-Team

targetingsnake avatar Jan 20 '24 15:01 targetingsnake

We use the following config:

tolgee:
  authentication:
    initial-password: $Initial_Password$
    initial-username: $Initial_Username$
    enabled: true
  machine-translation:
    google:
      api-key: my_google_api_key
  smtp:
    auth: false
    from: Antistasi Translate <$Sender_Mail$>
    host: $Mailserver_IP$
    password: 'omg/my/password'
    port: 25
    ssl-enabled: false
    username: $Sender_Mail$
  rate-limits:
    enabled: false

in which these placeholders are replaced with the corresponding real values:

  • $Initial_Password$
  • $Initial_Username$
  • $Sender_Mail$
  • $Mailserver_IP$

All other values are as shown in the config above.

targetingsnake avatar Jan 20 '24 15:01 targetingsnake

Our docker compose file is the following:

version: '3'

services:
  app:
    image: tolgee/tolgee:latest
    volumes:
      - ./data:/data
      - ./config.yaml:/config.yaml # <--- this line
    ports:
      - '25432:25432'
      - '8080:8080'
    environment:
      spring.config.additional-location: file:///config.yaml # <--- this line

targetingsnake avatar Jan 20 '24 15:01 targetingsnake

Getting the same error

Flo4604 avatar Jan 21 '24 10:01 Flo4604

Hi, as you can see, I had the same issue. I investigated the postgres directory and saw that it had weird permissions: it was owned by user and group 70, which don't exist on my server.

I chowned the data/postgres directory to root again and started tolgee up and it worked afterwards. Did you check the permissions of the postgres folder?
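
For anyone wanting to run the same check, here is a minimal sketch. `ls -ldn` prints numeric owner and group IDs, which helps when the uid has no name on the host. The helper name `owner_ids` is made up for illustration, and the path `./data/postgres` assumes the `./data:/data` volume mapping from the compose file in this thread.

```shell
# Hypothetical helper: print "uid:gid" of a path using numeric IDs,
# so owners without a name on the host (such as 70) are shown as-is.
owner_ids() {
  ls -ldn -- "$1" | awk '{ print $3 ":" $4 }'
}

# Usage, run from the directory that holds docker-compose.yml:
#   owner_ids ./data/postgres
```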

Flo4604 avatar Jan 21 '24 11:01 Flo4604

that's kind of interesting can confirm that on my docker as well

total 132K
drwx------ 19   70 root 4.0K Jan 20 16:59 .
drwxr-xr-x  4 root root 4.0K Jul 29 00:39 ..
drwx------  6   70   70 4.0K Jul 29 00:41 base
drwx------  2   70   70 4.0K Jan 21 03:00 global
drwx------  2   70   70 4.0K Jul 27 17:16 pg_commit_ts
drwx------  2   70   70 4.0K Jul 27 17:16 pg_dynshmem
-rw-------  1   70   70 4.8K Jul 29 00:33 pg_hba.conf
-rw-------  1   70   70 1.6K Jul 27 17:16 pg_ident.conf
drwx------  4   70   70 4.0K Jan 20 16:59 pg_logical
drwx------  4   70   70 4.0K Jul 27 17:16 pg_multixact
drwx------  2   70   70 4.0K Jul 27 17:16 pg_notify
drwx------  2   70   70 4.0K Jul 27 17:16 pg_replslot
drwx------  2   70   70 4.0K Jul 27 17:16 pg_serial
drwx------  2   70   70 4.0K Jul 27 17:16 pg_snapshots
drwx------  2   70   70 4.0K Jan  8 13:42 pg_stat
drwx------  2   70   70 4.0K Jan 21 12:58 pg_stat_tmp
drwx------  2   70   70 4.0K Oct  1 13:01 pg_subtrans
drwx------  2   70   70 4.0K Jul 27 17:16 pg_tblspc
drwx------  2   70   70 4.0K Jul 27 17:16 pg_twophase
-rw-------  1   70   70    3 Jul 27 17:16 PG_VERSION
drwx------  3   70   70 4.0K Dec  8 00:13 pg_wal
drwx------  2   70   70 4.0K Jul 27 17:16 pg_xact
-rw-------  1   70   70   88 Jul 27 17:16 postgresql.auto.conf
-rw-------  1   70   70  28K Jul 29 00:33 postgresql.conf
-rw-------  1   70   70   24 Jan 20 16:59 postmaster.opts
-rw-------  1   70   70   85 Jan 20 16:59 postmaster.pid

targetingsnake avatar Jan 21 '24 12:01 targetingsnake

Run chown root:root ./data/postgres/ -R to fix it. From what I see, though, it breaks again once you restart the container, so something in the current release changes the permissions for whatever reason.
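
To see whether a restart really changed ownership again, one option is to list everything under the directory that is not owned by the expected uid. This is only a sketch with a made-up helper name (`not_owned_by`); it relies on find treating a numeric `-user` argument as a uid when no user has that name.

```shell
# Hypothetical helper: list entries under a directory ($1) that are
# not owned by the given numeric uid ($2).
not_owned_by() {
  find "$1" ! -user "$2" -print
}

# Usage: right after the chown to root this should print nothing;
# run it again after restarting the container to see what changed.
#   not_owned_by ./data/postgres 0
```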

Flo4604 avatar Jan 21 '24 12:01 Flo4604

Okay, I did the chown with chown root:root -Rv ./data/postgres/, which does the same. Directory permissions afterwards:

total 132K
drwx------ 19 root root 4.0K Jan 21 13:01 .
drwxr-xr-x  4 root root 4.0K Jul 29 00:39 ..
drwx------  6 root root 4.0K Jul 29 00:41 base
drwx------  2 root root 4.0K Jan 21 13:01 global
drwx------  2 root root 4.0K Jul 27 17:16 pg_commit_ts
drwx------  2 root root 4.0K Jul 27 17:16 pg_dynshmem
-rw-------  1 root root 4.8K Jul 29 00:33 pg_hba.conf
-rw-------  1 root root 1.6K Jul 27 17:16 pg_ident.conf
drwx------  4 root root 4.0K Jan 21 13:01 pg_logical
drwx------  4 root root 4.0K Jul 27 17:16 pg_multixact
drwx------  2 root root 4.0K Jul 27 17:16 pg_notify
drwx------  2 root root 4.0K Jul 27 17:16 pg_replslot
drwx------  2 root root 4.0K Jul 27 17:16 pg_serial
drwx------  2 root root 4.0K Jul 27 17:16 pg_snapshots
drwx------  2 root root 4.0K Jan  8 13:42 pg_stat
drwx------  2 root root 4.0K Jan 21 13:02 pg_stat_tmp
drwx------  2 root root 4.0K Oct  1 13:01 pg_subtrans
drwx------  2 root root 4.0K Jul 27 17:16 pg_tblspc
drwx------  2 root root 4.0K Jul 27 17:16 pg_twophase
-rw-------  1 root root    3 Jul 27 17:16 PG_VERSION
drwx------  3 root root 4.0K Dec  8 00:13 pg_wal
drwx------  2 root root 4.0K Jul 27 17:16 pg_xact
-rw-------  1 root root   88 Jul 27 17:16 postgresql.auto.conf
-rw-------  1 root root  28K Jul 29 00:33 postgresql.conf
-rw-------  1 root root   24 Jan 21 13:01 postmaster.opts
-rw-------  1 root root   85 Jan 21 13:01 postmaster.pid

and after i started it again:

total 132K
drwx------ 19   70 root 4.0K Jan 21 13:05 .
drwxr-xr-x  4 root root 4.0K Jul 29 00:39 ..
drwx------  6   70 root 4.0K Jul 29 00:41 base
drwx------  2   70 root 4.0K Jan 21 13:01 global
drwx------  2   70 root 4.0K Jul 27 17:16 pg_commit_ts
drwx------  2   70 root 4.0K Jul 27 17:16 pg_dynshmem
-rw-------  1   70 root 4.8K Jul 29 00:33 pg_hba.conf
-rw-------  1   70 root 1.6K Jul 27 17:16 pg_ident.conf
drwx------  4   70 root 4.0K Jan 21 13:05 pg_logical
drwx------  4   70 root 4.0K Jul 27 17:16 pg_multixact
drwx------  2   70 root 4.0K Jul 27 17:16 pg_notify
drwx------  2   70 root 4.0K Jul 27 17:16 pg_replslot
drwx------  2   70 root 4.0K Jul 27 17:16 pg_serial
drwx------  2   70 root 4.0K Jul 27 17:16 pg_snapshots
drwx------  2   70 root 4.0K Jan  8 13:42 pg_stat
drwx------  2   70 root 4.0K Jan 21 13:05 pg_stat_tmp
drwx------  2   70 root 4.0K Oct  1 13:01 pg_subtrans
drwx------  2   70 root 4.0K Jul 27 17:16 pg_tblspc
drwx------  2   70 root 4.0K Jul 27 17:16 pg_twophase
-rw-------  1   70 root    3 Jul 27 17:16 PG_VERSION
drwx------  3   70 root 4.0K Dec  8 00:13 pg_wal
drwx------  2   70 root 4.0K Jul 27 17:16 pg_xact
-rw-------  1   70 root   88 Jul 27 17:16 postgresql.auto.conf
-rw-------  1   70 root  28K Jul 29 00:33 postgresql.conf
-rw-------  1   70 root   24 Jan 21 13:05 postmaster.opts
-rw-------  1   70   70   85 Jan 21 13:05 postmaster.pid

and Tolgee didn't start again. But that might be because I run Docker in rootless mode. A log will follow in a minute.

targetingsnake avatar Jan 21 '24 12:01 targetingsnake

I think 70 might be a user in the Docker container? Also, here is the new log: new-tolgee-log.txt

Also tested with a copy of the files under the root user, doesn't work either.
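
One way to check that guess is to look the uid up in a passwd file, on the host and inside the container. This is a sketch with a made-up helper name (`name_for_uid`); the service name `app` comes from the compose file earlier in the thread, and the container has to be running for the exec line to work.

```shell
# Hypothetical helper: read passwd-format lines on stdin and print the
# user name that maps to the given numeric uid (prints nothing if none).
name_for_uid() {
  awk -F: -v uid="$1" '$3 == uid { print $1 }'
}

# On the host:
#   name_for_uid 70 < /etc/passwd
# Inside the running container (-T disables the pseudo-TTY so the pipe works):
#   docker compose exec -T app cat /etc/passwd | name_for_uid 70
```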

targetingsnake avatar Jan 21 '24 12:01 targetingsnake

what happens if you do docker compose up -d && chown root:root ./data/postgres/ -R

Flo4604 avatar Jan 21 '24 14:01 Flo4604

that doesn't help either

CONTAINER ID   IMAGE                  COMMAND                  CREATED         STATUS                            PORTS                                                                                                NAMES
803903a933c0   tolgee/tolgee:latest   "ash -c 'java -cp ap…"   2 minutes ago   Up 2 minutes (health: starting)   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 5432/tcp, 0.0.0.0:25432->25432/tcp, :::25432->25432/tcp   tolgee-app-1

and the webpage isn't reachable

targetingsnake avatar Jan 21 '24 15:01 targetingsnake

what do the logs say

Flo4604 avatar Jan 21 '24 18:01 Flo4604

@Flo4604 I actually can't get a log like the previous ones, as I created those by starting the container with docker compose up and piping the output into tee. So basically I did docker compose up | tee -a log.txt. Also, a manual check of the web address still shows that Tolgee isn't reachable.

I don't think it is an issue with the database itself, as I pulled a dump from it after Tolgee didn't want to start. The database seems to be fine. (I did that with the 70:70 permissions.)
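
The log capture described above can be wrapped in a small helper that also catches stderr, which a plain pipe into tee misses. This is a sketch only; the helper name `run_logged` is made up. For a container started with -d, `docker compose logs` is the usual way to get the output afterwards.

```shell
# Hypothetical helper: run a command, show its output, and append both
# stdout and stderr to a log file ($1). The remaining arguments are the command.
run_logged() {
  log_file=$1
  shift
  "$@" 2>&1 | tee -a "$log_file"
}

# Usage in the foreground:
#   run_logged log.txt docker compose up
# For a container started with "docker compose up -d":
#   docker compose logs --no-color app > log.txt
```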

targetingsnake avatar Jan 22 '24 12:01 targetingsnake

Hello!

Can you share this information?

  • OS
  • Docker version
  • From what version you were upgrading Tolgee
  • To what version of Tolgee

Thanks a lot!

JanCizmar avatar Jan 22 '24 14:01 JanCizmar

OS:

Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:        11
Codename:       bullseye

with kernel Linux vm-name 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux

Docker version: Docker version 24.0.7, build afdd53b

Version I think we're updating from: v3.41.4 with id 6f499ba09e73

Currently installed Tolgee Docker image id: c492434846d7

Just to be safe, I'll list the Docker images on the host below. I didn't remove any image, so that's basically our update history. We have had the problem for roughly 3 weeks:

REPOSITORY      TAG       IMAGE ID       CREATED        SIZE
tolgee/tolgee   latest    c492434846d7   3 days ago     707MB
tolgee/tolgee   <none>    9c691da90358   5 days ago     707MB
tolgee/tolgee   <none>    7dffcc02ab8d   2 weeks ago    707MB
tolgee/tolgee   v3.43.2   7b0137167f07   3 weeks ago    720MB
tolgee/tolgee   v3.42.3   1fe5c16bac20   3 weeks ago    720MB
tolgee/tolgee   v3.41.4   6f499ba09e73   4 weeks ago    719MB
tolgee/tolgee   <none>    62c265ee984c   6 weeks ago    672MB
tolgee/tolgee   <none>    f5e8c5dd63da   2 months ago   661MB
tolgee/tolgee   <none>    862fc1b7d585   3 months ago   661MB
tolgee/tolgee   <none>    237abc90aca0   3 months ago   661MB
tolgee/tolgee   <none>    58dee9e131be   6 months ago   661MB
hello-world     latest    9c7a54a9a43c   8 months ago   13.3kB

targetingsnake avatar Jan 22 '24 16:01 targetingsnake

Hi,

Small contribution to this issue. I've been playing with Tolgee for the first time today and encountered the same issue.

OS:

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy

Docker version:

Client: Docker Engine - Community
 Version:           25.0.1
 API version:       1.44
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          25.0.1
  API version:      1.44 (minimum version 1.24)
  Built:            Tue Jan 23 23:09:23 2024
  OS/Arch:          linux/amd64

From what / to what version of Tolgee: None as I started playing with it, it was just an empty shell.

I noticed that from version 3.40.x to the latest, which was posted a few hours ago, restarting the container ends up with the postgresql issue mentioned by @targetingsnake. So I'm currently playing with v3.39.0!

Hope this helps
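
Pinning the version mentioned above means replacing the `latest` tag on the image line of the compose file. Here is a minimal sketch, assuming the compose file from earlier in the thread; the helper name `pin_tolgee_tag` is made up, and the `v` prefix follows the tags shown in the image list above.

```shell
# Hypothetical helper: rewrite the tolgee/tolgee image tag in a compose file.
# $1 = compose file, $2 = tag to pin (for example v3.39.0).
pin_tolgee_tag() {
  compose_file=$1
  tag=$2
  tmp=$(mktemp) || return 1
  if sed "s|\(image: tolgee/tolgee:\).*|\1$tag|" "$compose_file" > "$tmp"; then
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$compose_file"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}

# Usage:
#   pin_tolgee_tag docker-compose.yml v3.39.0
#   docker compose up -d
```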

compi-tom avatar Jan 26 '24 14:01 compi-tom

@compi-tom thx for the suggestion of switching back to v3.39.0, it seems to work.

@JanCizmar do you need more information or was that information enough and the things you wanted to know?

targetingsnake avatar Jan 29 '24 16:01 targetingsnake

Hey! Thanks for the info! I'll try to look into it soon. Sorry for the delay.

JanCizmar avatar Jan 29 '24 16:01 JanCizmar

Sorry folks, I still don't have much time to look into that. Hope you've found some workarounds. I guess this happened with the update to Spring Boot 3, where we also updated our base Dockerfile to align it with the latest Postgres 13.

https://github.com/tolgee/tolgee-platform/commit/3147302b80d9accb5102a676590f5ac558e738b7#diff-a7a92e793e13161b58a26c6c1c0445fef52d1b6fd0cc081d71701d0a1ad5c089

However, at first look I haven't spotted anything that would affect the permissions 🤔

JanCizmar avatar Feb 02 '24 06:02 JanCizmar

I have the same issue. I tried version 3.38.3 at some point in dev environment when I was testing the platform and I didn't have any issues, I was able to start/restart the container without any problems.

When I started rolling to production I used version 3.43.0 and I was unable to start the container even once (unable to connect to postgres). I was under the impression there was some problem on my end, but after some trial and error I decided to downgrade to the version that I used (3.38.3) and I was able to start it without problem.

BUT after a week in production my translators reported that portal is down. I checked the logs, lo and behold, the same issue came up again. I was unable to restart container after that. Resetting permissions helped, but it's kinda troublesome to do that like every X days :(

If I understand this correctly, 3.38.3 is before spring boot 3 update, so maybe the problem is somewhere else?

OS:

Distributor ID:	CentOS
Description:	CentOS Linux release 7.9.2009 (Core)
Release:	7.9.2009
Codename:	Core

Docker container:

tolgee/tolgee   v3.38.3   3ddce2cb557e   2 months ago   672MB

dammitt avatar Feb 03 '24 06:02 dammitt

I'm running into the same problem with latest tag

OliverGeneser avatar Feb 11 '24 14:02 OliverGeneser

@OliverGeneser Do you mean that you updated to latest tag from some other version and you're getting the errors?

JanCizmar avatar Feb 12 '24 09:02 JanCizmar

@JanCizmar I created a new instance with latest as the tag and on first start the container starts without issue, but after a restart the DB doesn't mount correctly and the Tolgee web interface isn't accessible.

OliverGeneser avatar Feb 12 '24 09:02 OliverGeneser

Hey! Since I am super busy now with implementing this, I cannot really dig deeper to find what happens. I've added this documentation section which is a workaround for this.

https://tolgee.io/platform/self_hosting/running_with_docker#running-with-docker-compose-with-external-postgresql-database

So you can simply configure Tolgee to run with an external db, which is run separately by docker compose.

If someone is interested in digging deeper and finding a solution, the cause can probably be found in docker/base/Dockerfile. You can build the complete Tolgee image by running ./gradlew docker, which builds a tagged image tolgee/tolgee:latest on your local machine.

JanCizmar avatar Feb 12 '24 10:02 JanCizmar

It looks like there is no progress on this, so I am closing it. If there are more similar reports, we can reopen it.

JanCizmar avatar Apr 10 '24 09:04 JanCizmar