[Bug] gpinitsystem creates the wrong number of segments when configuring multiple segments per segment host
Cloudberry Database version
1.6.0
What happened
See attached documents. We are creating a cluster on 12 segment hosts that is supposed to have 4 primaries per segment host, spread across 4 mounted disks:
/data1 /data2 /data3 /data4
What we ended up with instead is 2 primaries per segment host, with the choice of disks appearing fairly random.
As a workaround I had to change the config file to double up the disk list; gpinitsystem just seems to be cutting the list in half for some reason.
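To make the intent concrete, here is a rough sketch of the relevant config line, assuming the standard bash-array gpinitsystem_config format (the /data*/primary paths and the exact way the list was doubled are illustrative; the real values are in the attached files):

```shell
# Intended: 4 entries, one primary per mounted disk per segment host.
declare -a DATA_DIRECTORY=(/data1/primary /data2/primary /data3/primary /data4/primary)

# Workaround that actually yielded 4 primaries per host: double up the
# list, since gpinitsystem was effectively halving it.
declare -a DATA_DIRECTORY=(/data1/primary /data2/primary /data3/primary /data4/primary /data1/primary /data2/primary /data3/primary /data4/primary)
```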
What you think should happen instead
It should create one primary segment for each data drive in the DATA_DIRECTORY list.
How to reproduce
See attached gpinit_bug_1.txt gpinit_bug_workaround.txt
Operating System
rocky8
Anything else
No response
Are you willing to submit PR?
- [ ] Yes, I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct.
Hey, @lmugnano4537 welcome!🎊 Thanks for taking the time to point this out.🙌
Hey @RyanWei could you help assign engineers to have a look? Thanks.
Hi @lmugnano4537, looking at the log gpinit_bug_1.txt, you specified multiple hostnames, sdw1 and sdw12, for the same host ip-10-9-1-139.us-west-2.compute.internal, so gpinitsystem spread the segments evenly across sdw1 and sdw12. Could you please correct that and try again?
20240911:09:08:11:237749 gpinitsystem:ip-10-9-1-175:gpadmin-[INFO]:-ip-10-9-1-139.us-west-2.compute.internal 4000 sdw12 /data1/primary/gpseg0 2
20240911:09:08:11:237749 gpinitsystem:ip-10-9-1-175:gpadmin-[INFO]:-ip-10-9-1-139.us-west-2.compute.internal 4001 sdw12 /data2/primary/gpseg1 3
20240911:09:08:11:237749 gpinitsystem:ip-10-9-1-175:gpadmin-[INFO]:-ip-10-9-1-139.us-west-2.compute.internal 4002 sdw1 /data3/primary/gpseg2 4
20240911:09:08:11:237749 gpinitsystem:ip-10-9-1-175:gpadmin-[INFO]:-ip-10-9-1-139.us-west-2.compute.internal 4003 sdw1 /data4/primary/gpseg3 5
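For illustration only, a hypothetical hostfile excerpt (annotations added for clarity) showing how this plays out: two names resolving to one machine make gpinitsystem treat it as two hosts and split the planned four primaries two-and-two between them.

```shell
# Hypothetical hostfile excerpt:
sdw1     # resolves to ip-10-9-1-139.us-west-2.compute.internal
sdw12    # resolves to the same machine, so the 4 primaries split 2/2
```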
Hey @liang8283. Some more context, and I added some files to look at. It seems like the issue is the same thing that happens with gpssh. You pass these utilities a host file with a list of hosts for the cluster; the utilities then appear to ssh into each host in that file in the background, grab the hostname using something like hostname -f, and use that newly generated list of hostnames to perform their operations (see the sketch below). I don't fully understand why these utilities do this instead of just trusting that the hostnames we pass them are correct; I'd be interested in hearing if anyone knows more about that. I vaguely remember a mailing list discussion for GPDB about this years ago, but I don't know if those conversations are still available.
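My rough mental model of that step, as a sketch (this is inferred behavior, not the actual gpssh/gpinitsystem source; hostfile is the host list we pass in):

```shell
# Sketch of the inferred hostname-gathering step: for each host in the
# file we pass, ask the remote machine what it thinks its name is, and
# use that self-reported name for all later operations.
while read -r host; do
    remote_name=$(ssh "$host" hostname -f)
    echo "$host -> $remote_name"
done < hostfile
```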
At the very least, I would hope that gpssh and gpinitsystem (and any other utilities with this behavior) would run a duplicate-hostname check when they grab the hostnames off the machines they ssh into, and throw some sort of explicit exception if a duplicate is found. I'd be happy to implement this as a bug fix if you think there is good reason to do so; something like the sketch below. The real challenge is that duplicate hostnames are not at all obvious as the cause when you only see the side effects. For instance, gpinitsystem will not use all of the specified data directories and will create a weird configuration (I've attached files with a commented example of this; I manually set the hostname to be the same at the OS level on both sdw1 and sdw2). This also caused all kinds of weird rsync error messages when I was running gpcheckperf, which led me down a rabbit hole trying to find a bug in rsync. I should be able to replicate this too if needed.
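A minimal sketch of the check I have in mind (a hypothetical standalone script, not existing utility code; requires bash 4+ for associative arrays):

```shell
#!/usr/bin/env bash
# Fail fast if two hosts in the hostfile report the same hostname.
declare -A seen
while read -r host; do
    name=$(ssh "$host" hostname -f) || { echo "ERROR: cannot reach $host" >&2; exit 1; }
    if [[ -n "${seen[$name]:-}" ]]; then
        echo "ERROR: duplicate hostname '$name' reported by ${seen[$name]} and $host" >&2
        exit 1
    fi
    seen[$name]=$host
done < hostfile
echo "OK: all hostnames unique."
```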