DRAFT: Feature: Non-privileged hacluster

Open aleksei-burlakov opened this issue 2 years ago • 5 comments

It enables creating the cluster under the hacluster user. It requires some preparations. To use it, do:

Add the user to the haclient group. Assume the user is 'hackerman'.

user=hackerman
sudo sed -i "s|$user:x:1000:1000|$user:x:1000:90|g" /etc/passwd
reboot -h now
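
To make explicit what the sed substitution above is meant to do, here is a minimal Python sketch that rewrites the primary GID field of a single /etc/passwd line. The function name set_primary_gid is hypothetical, and 90 is taken from the command above as the GID of haclient; on a real system, usermod -g haclient "$user" is usually a safer way to get the same result than editing /etc/passwd directly.

```python
def set_primary_gid(passwd_line: str, user: str, gid: int) -> str:
    """Return passwd_line with its GID field replaced if it belongs to user."""
    fields = passwd_line.rstrip("\n").split(":")
    # /etc/passwd format: name:password:UID:GID:GECOS:home:shell
    if len(fields) == 7 and fields[0] == user:
        fields[3] = str(gid)
    return ":".join(fields)


# Example line; the home directory and shell are made up for illustration.
line = "hackerman:x:1000:1000::/home/hackerman:/bin/bash"
print(set_primary_gid(line, "hackerman", 90))
```

Lines that belong to other users are returned unchanged, which is the same effect the sed pattern achieves by matching on the user name.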

If you also want it to work with the -N option, the user must be in /etc/sudoers:

echo "$user ALL=(ALL:ALL) NOPASSWD: ALL" | sudo tee -a /etc/sudoers

Create links to the executables so that non-root users can run them

sudo ln -s /sbin/crm* /sbin/cibadmin /sbin/stonith \
        /sbin/corosync-keygen /sbin/corosync-cfgtool \
        /sbin/csync2 /sbin/sbd /usr/bin/

Give the haclient group permissions on the directories

sudo chmod g+w /etc/pacemaker/
sudo chown root:haclient -R /etc/csync2 /etc/corosync \
        /usr/share/doc/packages/corosync \
        /etc/sysconfig /var/lib/csync2 /etc/crm /etc/lvm \
        /etc/samba /var/lib/pacemaker
sudo chmod 777 -R /etc/csync2 /etc/corosync \
        /usr/share/doc/packages/corosync \
        /etc/sysconfig /var/lib/csync2 /etc/crm /etc/lvm \
        /etc/samba /var/lib/pacemaker

Pay ATTENTION: /run resets its permissions on reboot

sudo chown root:haclient /run /var/log
sudo chmod g+w /run /var/log

In addition, set up csync2 to run as hackerman by adding User=hackerman to the [Service] section of [email protected]:

echo "User=$user" | sudo tee -a /usr/lib/systemd/system/[email protected]
sudo systemctl daemon-reload

Now you can run the usual routines as hackerman:

crm cluster init ...
crm cluster join -c alice@host1 ...

or crm cluster init -N alice@node1 -N bob@node2 .. -N hackerman@nodeN (where nodeN is the local host; you can also simply write localhost.)

aleksei-burlakov avatar Aug 16 '22 07:08 aleksei-burlakov

Thanks @aleksei-burlakov !

In my environment

sudo mv /sbin/crm* /sbin/cibadmin /sbin/stonith \
        /sbin/corosync-keygen /sbin/corosync-cfgtool /usr/bin/

crm* cibadmin stonith corosync-keygen corosync-cfgtool are under /usr/sbin

My other concern is that it might impact customers' existing scripts if these commands change their path.

And I am not sure customers will allow changing the mode of directories like /etc/sysconfig, /etc/lvm and /etc/samba.

@zzhou1 @gao-yan @nicholasyang2022 What do you think?

liangxin1300 avatar Aug 22 '22 06:08 liangxin1300

Instead of moving these binaries, we may create symlinks.

Changing the mode of any directory to 777 is dangerous. And changing the owner of base system directories, e.g. /etc/sysconfig, /etc/lvm, /run, may break other programs.
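
As a small illustration of the concern, the following Python sketch checks whether a directory is writable by everyone, which is what mode 777 grants. The helper name is_world_writable is hypothetical and not part of crmsh; the demo runs against a temporary directory rather than any system path.

```python
import os
import stat
import tempfile


def is_world_writable(path: str) -> bool:
    """Return True if 'others' have write permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


with tempfile.TemporaryDirectory() as tmp:
    os.chmod(tmp, 0o777)             # what the proposal does to several directories
    print(is_world_writable(tmp))
    os.chmod(tmp, 0o775)             # group-writable only
    print(is_world_writable(tmp))
```

A check like this could be run over the directories listed in the proposal to see which of them would become writable by any local user.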

nicholasyang2022 avatar Aug 22 '22 07:08 nicholasyang2022

@liangxin1300 , @nicholasyang2022 , @zzhou1 Thank you for having a look. Indeed, one can create only links to the sbin-binaries instead of moving them.

In general, it looks to me that looping in SUDO_USER could help with usage such as "sudo crm cluster init xxx", for example.

I have created a function utils.user_of(host) that returns the username that should be used to access the host. The function takes the values from crm.conf in the form user@host. (Those values are stored when the cluster is created and csynced among the nodes.)
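
A minimal sketch of such a lookup, not the PR's actual implementation: it assumes the user@host entries have already been read from crm.conf into a list of strings (the parsing of crm.conf itself is left out), and the fallback to "root" for unknown hosts is an assumption made for illustration.

```python
from typing import Iterable


def user_of(host: str, entries: Iterable[str], default: str = "root") -> str:
    """Return the user recorded for host among 'user@host' entries."""
    for entry in entries:
        user, sep, entry_host = entry.strip().partition("@")
        if sep and user and entry_host == host:
            return user
    return default  # assumption: fall back when the host is not listed


# Hypothetical values, mirroring the -N user@host arguments used at init time.
stored = ["alice@node1", "bob@node2", "hackerman@node3"]
print(user_of("node2", stored))
print(user_of("node9", stored))
```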

Back then it was only a very small prototype that was barely working. I have improved it, and now it functions well. (Could you please give it a try?) However, there are still things to improve. They are marked with FIXME and TODO.

  • For example, there is hard-coding in configure_ssh_key.
  • Besides, csync complains that some files are not synced: WARNING: /etc/corosync/corosync.conf was not synced. However, when I check the md5sum, they are the same.
  • Besides, I have added a function remote_public_key_from that does almost the same as fetch_public_key_from_remote_node; they should be merged/refactored.
  • Tests are broken and will be adjusted and new tests created.

aleksei-burlakov avatar Sep 01 '22 08:09 aleksei-burlakov

Adding support for non-privileged users adds more parameters to remote command calls, creating some very complex cases, such as:

get_stdout_stderr('ssh {}@{} sudo cat {}'.format(user, node, path))

And different errors, such as ssh authentication failure, sudo authorization failure, and command failure, need to be handled.

I think we should build new utils to execute remote commands and unify credential passing and error handling.

I am working on another non-root user related issue, bsc#1201785, which requires ssh authentication failures to be caught. Maybe we can work together to refactor this remote-calling code.
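
One possible shape for such a utility, as a sketch only: every name here (RemoteResult, build_ssh_command, classify, run_remote) is hypothetical and not part of crmsh. The classification is a heuristic: ssh exits with 255 when ssh itself fails, although a remote command could also return 255, and the sudo check assumes sudo's error messages start with "sudo:".

```python
import shlex
import subprocess
from enum import Enum
from typing import List, Sequence, Tuple


class RemoteResult(Enum):
    OK = "ok"
    SSH_FAILURE = "ssh failure"
    SUDO_FAILURE = "sudo failure"
    COMMAND_FAILURE = "command failure"


def build_ssh_command(user: str, node: str, argv: Sequence[str],
                      use_sudo: bool = False) -> List[str]:
    """Build the local argv for running argv on node as user."""
    # Quote each argument so the remote shell sees it unchanged.
    remote = " ".join(shlex.quote(a) for a in argv)
    if use_sudo:
        remote = "sudo -n " + remote  # -n: fail instead of prompting
    return ["ssh", "-o", "BatchMode=yes", "{}@{}".format(user, node), remote]


def classify(returncode: int, stderr: str) -> RemoteResult:
    """Heuristically map an ssh result to one of the three failure kinds."""
    if returncode == 0:
        return RemoteResult.OK
    if returncode == 255:
        return RemoteResult.SSH_FAILURE
    if any(line.startswith("sudo:") for line in stderr.splitlines()):
        return RemoteResult.SUDO_FAILURE
    return RemoteResult.COMMAND_FAILURE


def run_remote(user: str, node: str, argv: Sequence[str],
               use_sudo: bool = False) -> Tuple[RemoteResult, str, str]:
    """Run the command over ssh and return (kind, stdout, stderr)."""
    proc = subprocess.run(build_ssh_command(user, node, argv, use_sudo),
                          capture_output=True, text=True)
    return classify(proc.returncode, proc.stderr), proc.stdout, proc.stderr


# Only builds the command; nothing is executed here.
print(build_ssh_command("alice", "node1", ["cat", "/etc/corosync/corosync.conf"], use_sudo=True))
```

Passing the command as a list and quoting it in one place avoids assembling shell strings with format() at every call site, and callers can branch on the returned kind instead of parsing stderr themselves.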

nicholasyang2022 avatar Sep 09 '22 09:09 nicholasyang2022

Yes, sure. I will look at bsc#1201785 next week ✌️

aleksei-burlakov avatar Sep 16 '22 10:09 aleksei-burlakov

@aleksei-burlakov,

BTW, I do want to clarify that the general purpose of the use case PED-139 is to maintain the full-fledged crmsh functionality. With that, we should call out support for non-root but still privileged users, a.k.a. sudoers, instead of non-privileged users (e.g. hacluster).

Or, put it this way, crmsh functionality with the "hacluster" user is very limited. I only see hawk webui use it in this context.

So, it makes much more sense to change the PR title/description accordingly, and the same for the commit log. Or, you may want to withdraw this PR and create a new PR instead.

zzhou1 avatar Oct 20 '22 08:10 zzhou1

That's actually correct, the caption is outdated.

aleksei-burlakov avatar Nov 01 '22 13:11 aleksei-burlakov

With the newest changes, the user needs neither to be in the haclient group, nor to have the permissions on the files that crm accesses granted to the haclient group. crm first tries as the current user, and if that fails it tries again as the superuser.
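
The retry behaviour described here could look roughly like the following sketch. It is an illustration under assumptions, not the PR's code: the name run_with_fallback is hypothetical, and escalating through sudo -n is assumed, since the comment does not say how the superuser attempt is made.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_with_fallback(argv: Sequence[str], runner: Callable = subprocess.run):
    """Run argv as the current user; on a non-zero exit, retry once via sudo."""
    result = runner(list(argv), capture_output=True, text=True)
    if result.returncode == 0:
        return result
    # Assumption: the superuser attempt goes through non-interactive sudo.
    return runner(["sudo", "-n", *argv], capture_output=True, text=True)


# This command succeeds as the current user, so sudo is never invoked.
done = run_with_fallback([sys.executable, "-c", "print('ok')"])
print(done.returncode, done.stdout.strip())
```

The runner parameter exists only so the retry logic can be exercised without real privileges. Note that retrying on any non-zero exit also re-runs commands that failed for reasons unrelated to permissions, so a real implementation may want to be more selective.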

aleksei-burlakov avatar Nov 01 '22 15:11 aleksei-burlakov

Hi @aleksei-burlakov

I suggest resolving the conflicts first so that the PR can trigger the CI process to see if it has any regressions.

I applied the current PR and found an error when joining a node (as root):

# crm cluster join -c 15sp4-1 -y
WARNING: chronyd.service is not configured to start at system boot.
INFO: SSH key for root does not exist, hence generate it now
INFO: Configuring SSH passwordless with root@15sp4-1
Password: 
Run "systemctl is-active pacemaker.service" on 15sp4-1
INFO: Configuring csync2...
ERROR: cluster.join: [Errno 2] No such file or directory: '/var/lib/csync2/15sp4-2.db3'

Thank you!:)

liangxin1300 avatar Nov 02 '22 02:11 liangxin1300

For sure!

aleksei-burlakov avatar Nov 02 '22 08:11 aleksei-burlakov

I do think you would like to tweak the commit title too, similarly like the PR title ;)

Good catch!)

aleksei-burlakov avatar Nov 02 '22 08:11 aleksei-burlakov

@aleksei-burlakov In case you might need to debug any behave test case locally, see https://github.com/ClusterLabs/crmsh#functional-tests

liangxin1300 avatar Nov 03 '22 14:11 liangxin1300

@aleksei-burlakov After #1033 was merged, there is a conflict that needs to be resolved

liangxin1300 avatar Nov 22 '22 14:11 liangxin1300

Hi @aleksei-burlakov Two points:

  • As more PRs got merged, there are conflicts again

  • Could you please add the whole process as a separate .feature file? Like non_root_bootstrap.feature under crmsh/test/features? Then we can enable functional tests for this PR:) In this way, we can make sure our further PRs won't break this non-root feature

    Thank you!

liangxin1300 avatar Dec 05 '22 03:12 liangxin1300