[Bug] Coordinator can't start up in Docker env.

Open tuhaihe opened this issue 2 years ago • 0 comments

Cloudberry Database version

master

What happened

The Coordinator log during startup:

2022-09-09 15:25:46.383478 HKT,,,p37615,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","starting PostgreSQL 13beta1 (Greenplum Database 8.0.0-alpha.0 build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-5ubuntu1) 9.4.0, 64-bit",,,,,,,0,,"postmaster.c",1320,
2022-09-09 15:25:46.383629 HKT,,,p37615,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","listening on IPv4 address ""0.0.0.0"", port 7000",,,,,,,0,,"pqcomm.c",626,
2022-09-09 15:25:46.383648 HKT,,,p37615,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","listening on IPv6 address ""::"", port 7000",,,,,,,0,,"pqcomm.c",626,
2022-09-09 15:25:46.383946 HKT,,,p37615,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","listening on Unix socket ""/tmp/.s.PGSQL.7000""",,,,,,,0,,"pqcomm.c",621,
2022-09-09 15:25:46.386061 HKT,,,p37617,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","database system was shut down at 2022-09-09 15:25:45 HKT",,,,,,,0,,"xlog.c",6545,
2022-09-09 15:25:46.386556 HKT,,,p37617,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","end of transaction log location is 0/4F2AAA8",,,,,,,0,,"xlog.c",7797,
2022-09-09 15:25:46.386792 HKT,,,p37617,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","latest completed transaction id is 490 and next transaction id is 491",,,,,,,0,,"xlog.c",8162,
2022-09-09 15:25:46.387079 HKT,,,p37617,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","database system is ready",,,,,,,0,,"xlog.c",8189,
2022-09-09 15:25:46.393443 HKT,,,p37615,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","PostgreSQL 13beta1 (Greenplum Database 8.0.0-alpha.0 build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-5ubuntu1) 9.4.0, 64-bit compiled on Sep  9 2022 15:22:15 (with assert checking)",,,,,,,0,,"postmaster.c",3539,
2022-09-09 15:25:46.393473 HKT,,,p37615,th-1933846144,,,,0,,,seg-1,,,,,"LOG","00000","database system is ready to accept connections","PostgreSQL 13beta1 (Greenplum Database 8.0.0-alpha.0 build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-5ubuntu1) 9.4.0, 64-bit compiled on Sep  9 2022 15:22:15 (with assert checking)",,,,,,0,,"postmaster.c",3541,
2022-09-09 15:25:46.414234 HKT,,,p37623,th-1933846144,,,,0,con2,,seg-1,,,,sx1,"LOG","00000","initialized 1 resource queues",,,,,,,0,,"resscheduler.c",265,
2022-09-09 15:25:47.520627 HKT,,,p37623,th-1933846144,,,,0,con2,,seg-1,,,,sx1,"ERROR","XX000","epoll_ctl() failed: No such file or directory",,,,,,,0,,"latch.c",938,"Stack trace:
1    0x5618b6f24706 postgres errstart + 0x3de
2    0x5618b6d10b6f postgres <symbol not found> + 0xb6d10b6f
3    0x5618b6d10a1b postgres ModifyWaitEvent + 0x174
4    0x5618b6ff16f5 postgres cdbgang_createGang_async + 0xa38
5    0x5618b6feedfd postgres cdbgang_createGang + 0x53
6    0x5618b6feef44 postgres AllocateGang + 0x145
7    0x5618b6fec982 postgres <symbol not found> + 0xb6fec982
8    0x5618b6fec7e1 postgres CdbDispatchCommandToSegments + 0xd2
9    0x5618b6fec70d postgres CdbDispatchCommand + 0x38
10   0x5618b7039225 postgres <symbol not found> + 0xb7039225
11   0x5618b7038fe1 postgres <symbol not found> + 0xb7038fe1
12   0x5618b7038dbd postgres <symbol not found> + 0xb7038dbd
13   0x5618b7039db6 postgres DtxRecoveryMain + 0x83
14   0x5618b6c62ca5 postgres StartBackgroundWorker + 0x2d2
15   0x5618b6c7bb9b postgres <symbol not found> + 0xb6c7bb9b
16   0x5618b6c7bff9 postgres <symbol not found> + 0xb6c7bff9
17   0x5618b6c781ea postgres <symbol not found> + 0xb6c781ea
18   0x7fa98f76d520 libc.so.6 <symbol not found> + 0x8f76d520
19   0x7fa98f84674d libc.so.6 __select + 0xbd
20   0x5618b6c75792 postgres <symbol not found> + 0xb6c75792
21   0x5618b6c74f80 postgres PostmasterMain + 0x14aa
22   0x5618b6b02814 postgres <symbol not found> + 0xb6b02814
23   0x7fa98f754d90 libc.so.6 <symbol not found> + 0x8f754d90
24   0x7fa98f754e40 libc.so.6 __libc_start_main + 0x80
25   0x5618b66c2bc5 postgres _start + 0x25
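
For context on the error text above: on Linux, epoll_ctl() returns ENOENT ("No such file or directory") when EPOLL_CTL_MOD or EPOLL_CTL_DEL targets a file descriptor that is not registered in that epoll instance. The trace shows the failure coming from ModifyWaitEvent in latch.c, which would fit that pattern, though this report does not confirm the root cause. The sketch below is a standalone illustration in Python and is not Cloudberry code; the function name modify_unregistered_fd is hypothetical.

```python
import errno
import os
import select


def modify_unregistered_fd():
    """Return the errno raised when modifying an fd that was never added to the epoll set."""
    ep = select.epoll()
    r, w = os.pipe()
    try:
        # The read end was never passed to ep.register(), so this call
        # corresponds to epoll_ctl(EPOLL_CTL_MOD) on an unknown descriptor.
        ep.modify(r, select.EPOLLIN)
    except OSError as exc:
        return exc.errno
    finally:
        os.close(r)
        os.close(w)
        ep.close()
    return None


if __name__ == "__main__":
    err = modify_unregistered_fd()
    # Compare against ENOENT and show the matching message text.
    print(err == errno.ENOENT, os.strerror(err))
```

If the same errno appears here as in the log, one thing worth checking is whether the descriptor handed to ModifyWaitEvent was closed or replaced after it was added to the wait event set; that is an assumption to verify, not a diagnosis.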

And the gpstart output after running cd ${working_dir}/gpAux/gpdemo && make:

20220909:13:06:05:019679 gpstart:85ca67d4de3c:hashdata-[CRITICAL]:-Error occurred: non-zero rc: 1
 Command was: 'env GPSESSID=0000000000 GPERA=c7675ce57929a8bb_220909121634 $GPHOME/bin/pg_ctl -D /home/hashdata/workspace/cbdb_dev/gpAux/gpdemo/datadirs/qddir/demoDataDir-1 -l /home/hashdata/workspace/cbdb_dev/gpAux/gpdemo/datadirs/qddir/demoDataDir-1/log/startup.log -w -t 600 -o " -p 7000 -c gp_role=dispatch " start'
rc=1, stdout='waiting for server to start........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... stopped waiting
', stderr='pg_ctl: server did not start in time
'

What you think should happen instead

No response

How to reproduce

  1. Update the Docker image to docker.artifactory.hashdata.xyz/docker/cbdb:devel-devtoolset-10-cbdb-docker-ubuntu-2019-20220908 (recommended)
    • or cherry-pick https://code.hashdata.xyz/cloudberry/cbdb/-/merge_requests/110/
  2. Run ./cbdb/deploy/docker_start.sh
    • or build cbdb and run make create-demo-cluster

After I reverted https://code.hashdata.xyz/cloudberry/cbdb/-/commit/33eaa5437b66d7f03021a62cdabe71de0b69393d, make create-demo-cluster works.

Operating System

Default

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes, I am willing to submit a PR!

Code of Conduct

tuhaihe, Jul 24 '23 07:07