docker compose failed to start, csghub-server-1 is unhealthy

Open Tendo33 opened this issue 11 months ago • 8 comments

Describe the bug

root@fhv-devcsghub-01 /data/nfs/csghub/docker-compose-ori # bash startup.sh
[2025-01-10 15:34:15] [NORM] Current configured domain name is 10.0.36.67.
[2025-01-10 15:34:15] [NORM] Nginx Main:
[2025-01-10 15:34:15] [INFO] - render nginx configuration file.
[2025-01-10 15:34:15] [INFO] - generate temporal auth file.
[2025-01-10 15:34:16] [NORM] CoreDNS:
[2025-01-10 15:34:16] [INFO] - render coredns Corefile.
[2025-01-10 15:34:16] [INFO] - generate CoreDNS reverse parsing files.
[2025-01-10 15:34:16] [NORM] Minio:
[2025-01-10 15:34:16] [INFO] - create data directories.
[2025-01-10 15:34:16] [NORM] Registry:
[2025-01-10 15:34:16] [INFO] - create config directories.
[2025-01-10 15:34:16] [INFO] - generate registry auth file.
[2025-01-10 15:34:16] [NORM] Gitaly:
[2025-01-10 15:34:16] [INFO] - render configuration file.
[2025-01-10 15:34:16] [NORM] Gitlab-Shell:
[2025-01-10 15:34:16] [INFO] - generate gitaly auth file.
[2025-01-10 15:34:16] [INFO] - generate host key pairs for gitlab-shell.
[2025-01-10 15:34:16] [WARN] - ssh_host_rsa_key pair already exists.
[2025-01-10 15:34:16] [WARN] - ssh_host_ecdsa_key pair already exists.
[2025-01-10 15:34:16] [WARN] - ssh_host_ed25519_key pair already exists.
[2025-01-10 15:34:16] [NORM] Csghub_Space_Builder:
[2025-01-10 15:34:16] [INFO] - render docker daemon file.
[2025-01-10 15:34:16] [INFO] - render docker config file.
[2025-01-10 15:34:16] [NORM] Nats:
[2025-01-10 15:34:16] [INFO] - render nats config file.
[2025-01-10 15:34:16] [NORM] Casdoor:
[2025-01-10 15:34:16] [INFO] - render casdoor init_data file.
[2025-01-10 15:34:16] [INFO] - render casdoor config file.
[2025-01-10 15:34:16] [NORM] Csghub_Proxy_Nginx:
[2025-01-10 15:34:16] [INFO] - render proxy nginx config file.
[2025-01-10 15:34:16] [NORM] Starting services...
[+] Running 17/20
 ⠦ Network docker-compose-ori_opencsg                 Created                                                 13.6s 
 ✔ Container docker-compose-ori-nats-1                Started                                                  1.3s 
 ✔ Container docker-compose-ori-redis-1               Started                                                  1.5s 
 ✔ Container docker-compose-ori-registry-1            Started                                                  1.0s 
 ✔ Container docker-compose-ori-postgres-1            Started                                                  1.4s 
 ✔ Container docker-compose-ori-gitaly-1              Started                                                  1.0s 
 ✔ Container docker-compose-ori-minio-1               Started                                                  1.3s 
 ✔ Container docker-compose-ori-temporal-1            Started                                                  2.4s 
 ✔ Container docker-compose-ori-gitlab-shell-1        Started                                                  2.1s 
 ✔ Container docker-compose-ori-casdoor-1             Healthy                                                 13.4s 
 ✔ Container docker-compose-ori-csghub-portal-1       Starte...                                                2.8s 
 ✔ Container docker-compose-ori-temporal-ui-1         Started                                                  3.3s 
 ✘ Container docker-compose-ori-csghub-server-1       Error                                                    7.0s 
 ✔ Container docker-compose-ori-csghub-proxy-1        Started                                                  5.1s 
 ⠸ Container docker-compose-ori-nginx-1               Created                                                 13.3s 
 ✔ Container docker-compose-ori-csghub-mirror-repo-1  S...                                                     4.6s 
 ⠸ Container docker-compose-ori-csghub-db-init-1      Creat...                                                13.3s 
 ✔ Container docker-compose-ori-csghub-user-1         Started                                                  4.6s 
 ✔ Container docker-compose-ori-csghub-mirror-lfs-1   St...                                                    4.8s 
 ✔ Container docker-compose-ori-csghub-accounting-1   St...                                                    5.1s 
dependency failed to start: container docker-compose-ori-csghub-server-1 is unhealthy

root@fhv-devcsghub-01 /data/nfs/csghub/docker-compose-ori # docker logs -f docker-compose-ori-csghub-server-1
Database setup...
Migration init
init logger, level: INFO, format: json
Migration migrate
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:22.652590318Z","level":"INFO","msg":"there are no new migrations to run (database is up to date)"}
Trigger multisync once in background
Start server...
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:23.626796384Z","level":"INFO","msg":"FIFOScheduler run started"}
2025/01/10 07:34:23 INFO  No logger configured for temporal client. Created default one.
Error: unable to create workflow client, error: failed reaching server: last connection error: connection error: desc = "transport: Error while dialing: dial tcp 192.171.100.122:7233: connect: connection refused"
Database setup...
Migration init
init logger, level: INFO, format: json
Migration migrate
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:25.700792256Z","level":"INFO","msg":"there are no new migrations to run (database is up to date)"}
Trigger multisync once in background
Start server...
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:26.588432132Z","level":"INFO","msg":"FIFOScheduler run started"}
{"time":"2025-01-10T07:34:26.654411105Z","level":"ERROR","msg":"failed to get all service status","error":"Get \"http://csghub-runner:8082/api/v1/service/status-all\": dial tcp: lookup csghub-runner on 127.0.0.11:53: no such host"}
2025/01/10 07:34:26 INFO  No logger configured for temporal client. Created default one.
Error: failed to register cron jobs:  unable to create schedule, error:expected 0 args for function: SyncAsClientWorkflow but found 1
Database setup...
Migration init
init logger, level: INFO, format: json
Migration migrate
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:30.469571043Z","level":"INFO","msg":"there are no new migrations to run (database is up to date)"}
Trigger multisync once in background
Start server...
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:30.971572671Z","level":"INFO","msg":"FIFOScheduler run started"}
{"time":"2025-01-10T07:34:31.036007227Z","level":"ERROR","msg":"failed to get all service status","error":"Get \"http://csghub-runner:8082/api/v1/service/status-all\": dial tcp: lookup csghub-runner on 127.0.0.11:53: no such host"}
2025/01/10 07:34:31 INFO  No logger configured for temporal client. Created default one.
Error: failed to register cron jobs:  unable to create schedule, error:expected 0 args for function: SyncAsClientWorkflow but found 1
Database setup...
Migration init
init logger, level: INFO, format: json
Migration migrate
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:34.094583944Z","level":"INFO","msg":"there are no new migrations to run (database is up to date)"}
Trigger multisync once in background
Start server...
init logger, level: INFO, format: json
{"time":"2025-01-10T07:34:34.500603614Z","level":"INFO","msg":"FIFOScheduler run started"}
2025/01/10 07:34:34 INFO  No logger configured for temporal client. Created default one.
{"time":"2025-01-10T07:34:34.53463916Z","level":"ERROR","msg":"failed to get all service status","error":"Get \"http://csghub-runner:8082/api/v1/service/status-all\": dial tcp: lookup csghub-runner on 127.0.0.11:53: no such host"}
Error: failed to register cron jobs:  unable to create schedule, error:expected 0 args for function: SyncAsClientWorkflow but found 1
Database setup...
Migration init
init logger, level: INFO, format: json
Migration migrate
init logger, level: INFO, format: json

Then I looked at the temporal container's log:

time=2025-01-10T07:34:24.889 level=ERROR msg="failed reaching server: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 192.171.100.122:7233: connect: connection refused\""
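The first restart in the csghub-server log fails with "connection refused" on port 7233, while later restarts get past that point. One possible reading is that the server first started before Temporal was accepting connections. A minimal sketch for checking whether a TCP port is reachable yet is shown below. The function name `wait_for_port` and its parameters are hypothetical and not part of CSGHub or the installer.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not part of CSGHub.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (for example ConnectionRefusedError)
            # while nothing is listening on the target port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("192.171.100.122", 7233, timeout=60)` would poll the address from the log above for up to a minute. This only checks reachability and does not address the later "failed to register cron jobs" error.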

Environment

CSGHub Version: v0.7|v0.8|...
OS: Linux | Windows | MacOS..
Hardware: 2c4G | 4c8G |...
Launch: docker compose | helm chart

To Reproduce

I took the latest docker compose files, changed the host IP in the .env file, and ran bash startup.sh.

Tendo33 avatar Jan 10 '25 07:01 Tendo33

I found that the error seems to be:

{"time":"2025-01-10T08:41:55.275285734Z","level":"ERROR","msg":"failed to get all service status","error":"Get \"http://localhost:8082/api/v1/service/status-all\": dial tcp [::1]:8082: connect: connection refused"}

But I did not run any runner-related functions.

Tendo33 avatar Jan 10 '25 08:01 Tendo33

It seems you didn't start from release v1.2.3. Are you using the main branch?

MasonXon avatar Jan 10 '25 08:01 MasonXon

> It seems you didn't start by release v1.2.3, It's main branch?

Yes, I am using: https://github.com/OpenCSGs/csghub-installer/tree/main/docker-compose

Tendo33 avatar Jan 10 '25 09:01 Tendo33

@Tendo33 If your CSGHub is not configured to work with a Kubernetes cluster, you can ignore the error below:

{"time":"2025-01-10T08:41:55.275285734Z","level":"ERROR","msg":"failed to get all service status","error":"Get \"http://localhost:8082/api/v1/service/status-all\": dial tcp [::1]:8082: connect: connection refused"}

wayneliu0019 avatar Jan 10 '25 09:01 wayneliu0019

@Tendo33 Please use the package from release v1.2.3. There is a bug in main that we haven't fixed yet.

MasonXon avatar Jan 10 '25 09:01 MasonXon

@wayneliu0019 This is the error that prevents csghub-server from starting:

Error: failed to register cron jobs: unable to create schedule, error:expected 0 args for function: SyncAsClientWorkflow but found 1

This is a known bug.
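The distinction drawn in the last few comments (the runner status error can be ignored without Kubernetes, while the plain "Error:" line is what stops the server) can be sketched as a small log triage script. This is an illustration only: the function names, the category labels, and the choice to match on the `/api/v1/service/status-all` path are assumptions, not part of CSGHub.

```python
import json

# Assumption: an ERROR record mentioning this path is the runner status check
# that the thread says can be ignored when no Kubernetes cluster is configured.
IGNORABLE_MARKERS = ("/api/v1/service/status-all",)


def classify_line(line: str) -> str:
    """Classify one csghub-server log line as 'fatal', 'ignorable', 'error', or 'info'."""
    text = line.strip()
    if text.startswith("Error:"):
        # In the log above, plain-text "Error:" lines appear right before each restart.
        return "fatal"
    if text.startswith("{"):
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            return "info"
        if isinstance(record, dict) and record.get("level") == "ERROR":
            detail = f'{record.get("msg", "")} {record.get("error", "")}'
            if any(marker in detail for marker in IGNORABLE_MARKERS):
                return "ignorable"
            return "error"
    return "info"


def fatal_lines(log_text: str) -> list[str]:
    """Return only the lines classified as fatal, in their original order."""
    return [ln for ln in log_text.splitlines() if classify_line(ln) == "fatal"]
```

Feeding it the output of `docker logs docker-compose-ori-csghub-server-1` would surface the "failed to register cron jobs" and "unable to create workflow client" lines while setting the runner status errors aside.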

MasonXon avatar Jan 10 '25 09:01 MasonXon

@Yiling-J Please have a look.

Rader avatar Jan 10 '25 09:01 Rader

@Tendo33 The csghub-installer main branch installs the nightly version of CSGHub, which is built every night from the CSGHub main branch. The main branch is under active development, so we cannot guarantee its stability. To resolve the issue you encountered, please switch to the csghub-installer release branch and try again. Sorry for the inconvenience; we'll work on improving the README to clarify this.

Yiling-J avatar Jan 10 '25 09:01 Yiling-J