scylla-cluster-tests

feature(hydra.sh): add a command to update scylla db packages

Open dimakr opened this issue 1 year ago • 2 comments

The change adds a new hydra command that updates the DB binaries of the whole cluster or of selected nodes. This lets a developer update the Scylla version from their development machine and re-run a test, or run another test, without re-creating the cluster (i.e. using the reuse-cluster capability).

Testing

[x] :green_circle:

  • provisioned a cluster: pr-provision-test
  • scylla versions before updating the packages:
> for i in 34.245.158.223 52.208.56.22 34.243.47.154; do kitty +kitten ssh -i ~/.ssh/scylla_test_id_ed25519 -A ubuntu@$i "scylla --version"; done
6.0.1-0.20240612.bc89aac9d017
6.0.1-0.20240612.bc89aac9d017
6.0.1-0.20240612.bc89aac9d017
  • ran the new update-scylla-packages command for the test-id in question, selecting 2 of the 3 DB nodes for update and providing a local folder as the location of the new package versions:
> SCT_ENABLE_ARGUS=false ./sct.py update-scylla-packages --test-id ff89b3ea-77f5-474b-bb66-66bfe36ef666 -p /home/dmitriy/Work/Scylla/test_upd_pkg_dir
logged in as arn:aws:sts::797456418907:assumed-role/DeveloperAccessRole/[email protected]
? Select machine:  (Use arrow keys to move, <space> to select, <a> to toggle, <i> to invert)
 » ● aws - PR-provision-test-hydra-up-db-node-ff89b3ea-3 - 34.243.47.154 10.4.2.122 - eu-west-1
   ○ aws - PR-provision-test-hydra-up-db-node-ff89b3ea-2 - 52.208.56.22 10.4.3.64 - eu-west-1
   ● aws - PR-provision-test-hydra-up-db-node-ff89b3ea-1 - 34.245.158.223 10.4.2.230 - eu-west-1
   ○ aws - PR-provision-test-hydra-up-loader-node-ff89b3ea-1 - 52.211.172.210 10.4.3.197 - eu-west-1
   ○ aws - PR-provision-test-hydra-up-monitor-node-ff89b3ea-1 - 3.253.39.197 10.4.2.156 - eu-west-1
   ...
? Select machine:  done (2 selections)
Stop scylla service on the nodes
Update DB packages on the nodes
<34.243.47.154>: Command rsync not available -- disabled
<34.245.158.223>: Command rsync not available -- disabled
Start scylla service on the nodes
DB packages update duration -> 351 s
  • scylla versions after updating the packages:
> for i in 34.245.158.223 52.208.56.22 34.243.47.154; do kitty +kitten ssh -i ~/.ssh/scylla_test_id_ed25519 -A ubuntu@$i "scylla --version"; done
6.1.0~rc0-0.20240718.14222ad20520
6.0.1-0.20240612.bc89aac9d017
6.1.0~rc0-0.20240718.14222ad20520

Also tested with cloud storage (S3) as the location of the new packages.
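The three phases visible in the log above (stop the service on all selected nodes, replace the packages, start the service again, then report the duration) can be sketched roughly as follows. This is an illustration only, not the code from this PR: `update_packages`, the `run` callable, and the `package_dir` path are hypothetical names, and the `dpkg -i` install step is an assumption about how .deb files might be applied on an Ubuntu node.

```python
import time
from typing import Callable, Iterable

# Hypothetical runner type: executes a shell command on a host.
Runner = Callable[[str, str], None]


def update_packages(hosts: Iterable[str], run: Runner,
                    package_dir: str = "/tmp/scylla") -> float:
    """Stop Scylla, install new packages, start Scylla; return elapsed seconds."""
    hosts = list(hosts)
    started = time.monotonic()

    # Phase 1: stop the service on every selected node.
    for host in hosts:
        run(host, "sudo systemctl stop scylla-server")

    # Phase 2: install the new packages (assumed already copied to package_dir).
    for host in hosts:
        run(host, f"sudo dpkg -i {package_dir}/*.deb")

    # Phase 3: start the service again.
    for host in hosts:
        run(host, "sudo systemctl start scylla-server")

    return time.monotonic() - started
```

Passing the runner as a parameter keeps the sketch independent of the ssh transport, so the same flow can be exercised with a fake runner that only records the commands.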

PR pre-checks (self review)

  • [ ] I added the relevant backport labels
  • [x] I didn't leave commented-out/debugging code

Reminders

  • Add new configuration options and document them (in sdcm/sct_config.py)
  • Add unit tests to cover my changes (under unit-test/ folder)
  • Update the Readme/doc folder relevant to this change (if needed)

dimakr avatar Jul 21 '24 21:07 dimakr

Not sure why it failed, but the unit test failure is not related to this change.

dimakr avatar Jul 22 '24 17:07 dimakr

Not sure why it failed, but the unit test failure is not related to this change.

You have multiple things failing in the CI here. I would recommend rebasing and checking it again.

fruch avatar Jul 24 '24 15:07 fruch

@fruch As we discussed, when a cluster is deployed without the test_communication_public.yaml config, it is not possible to reach the DB nodes from a local dev machine. So this PR adds the ability to execute a RemoteCmdRunner command on a remote node via a jump host, but only for the fabric-based ssh transport. With the ssh2-python library I was not able to get it working, and I'm not sure it works in the lib itself: the API for it is not really documented, there are no examples in the lib repo, and I haven't found any online. So I suggest merging this version of the update-db-packages command when it is ready, and creating a separate task to implement ssh command execution via a proxy for the ssh2-python based client. Then we can switch this command to the libssh2 transport with a one-line change in the code.
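For readers unfamiliar with the jump-host idea, here is a minimal sketch of reaching a private node through an intermediate host using the OpenSSH client's `-J` (ProxyJump) option. It is not the fabric-based implementation from this PR: `build_ssh_argv`, the `ubuntu` default user, and the example host names are assumptions made for illustration.

```python
import shlex
from typing import List


def build_ssh_argv(target: str, jump_host: str, remote_cmd: str,
                   user: str = "ubuntu") -> List[str]:
    """Build an OpenSSH command line that reaches `target` via `jump_host`."""
    return [
        "ssh",
        "-J", f"{user}@{jump_host}",  # hop through the jump host first
        f"{user}@{target}",           # final destination (e.g. a private IP)
        remote_cmd,                   # command to run on the destination
    ]


# Hypothetical hosts, for illustration only.
argv = build_ssh_argv("10.4.2.122", "jump.example.com", "scylla --version")
print(shlex.join(argv))
```

The argument list can be handed to `subprocess.run(argv)` as is; building a list rather than a single string avoids quoting problems on the local side.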

dimakr avatar Aug 02 '24 12:08 dimakr

@dimakr any more things to check/fix here? Or we merge it?

soyacz avatar Aug 06 '24 12:08 soyacz

@dimakr any more things to check/fix here? Or we merge it?

Yes, please do not merge it yet. I've noticed that somewhere between changes, setting the proxy_cmd param in _make_rsync_cmd was lost. I will fix it quickly.

dimakr avatar Aug 06 '24 13:08 dimakr

@soyacz added the missing parameter in the CommandRunner._make_ssh_command method (otherwise sending new packages to DB nodes via rsync would fail in some conditions).

Re-executed checks to test the changes:

  • new CLI command that sends new packages via proxy host using scp
  • regular longevity test that updates packages (regression check)

The PR description is updated with these results.

Also checked sending new packages via the proxy host using rsync. As rsync is not installed on DB instances out of the box, I installed it manually during the test:

> self.run('hostname').stdout
Out[14]: 'PR-provision-test-master-db-node-f99eca5e-3\n'

> self.run('rsync --version').stdout
...
bash: line 1: rsync: command not found

> self.run('sudo apt install -y rsync').stdout
Out[16]: 'Reading package lists...
...

> self.run('rsync --version').stdout
Out[17]: 'rsync  version 3.2.7  
...
...

unzipping any tar.gz rpms
Detected Linux distribution: UBUNTU22
Installed .deb packages before replacing with new .DEB files
['scylla\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-conf\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-cqlsh\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-kernel-conf\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-machine-image\t6.0.1-20240613.5b94160-1', 'scylla-manager-agent\t3.3.0~0.20240627.772cc4df8', 'scylla-node-exporter\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-python3\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-server\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-server-dbg\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-tools\t']
Installed .deb packages after replacing with new .DEB files
['scylla\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-conf\t6.0.1-0.20240612.bc89aac9d017-1', 'scylla-cqlsh\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-jmx\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-kernel-conf\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-machine-image\t6.0.1-20240613.5b94160-1', 'scylla-manager-agent\t3.3.0~0.20240627.772cc4df8', 'scylla-node-exporter\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-python3\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-server\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-server-dbg\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-tools\t6.1.0~rc0-0.20240718.14222ad20520-1', 'scylla-tools-core\t6.1.0~rc0-0.20240718.14222ad20520-1']
...
Update DB packages duration -> 822 s
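The before/after package lists in the log above are easier to compare programmatically. The sketch below assumes each entry has the `name<TAB>version` shape shown in the log; `parse_packages` and `changed_packages` are hypothetical helpers and not part of the test framework.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_packages(entries: Iterable[str]) -> Dict[str, str]:
    """Turn 'name\\tversion' strings into a {name: version} mapping."""
    packages = {}
    for entry in entries:
        name, _, version = entry.partition("\t")
        packages[name] = version
    return packages


def changed_packages(before: Iterable[str], after: Iterable[str]
                     ) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {name: (old_version, new_version)} for new or changed packages."""
    old, new = parse_packages(before), parse_packages(after)
    return {
        name: (old.get(name), version)   # old version is None for new packages
        for name, version in new.items()
        if old.get(name) != version
    }


# A few entries copied from the log above.
before = ["scylla\t6.0.1-0.20240612.bc89aac9d017-1",
          "scylla-conf\t6.0.1-0.20240612.bc89aac9d017-1",
          "scylla-tools\t"]
after = ["scylla\t6.1.0~rc0-0.20240718.14222ad20520-1",
         "scylla-conf\t6.0.1-0.20240612.bc89aac9d017-1",
         "scylla-jmx\t6.1.0~rc0-0.20240718.14222ad20520-1",
         "scylla-tools\t6.1.0~rc0-0.20240718.14222ad20520-1"]

for name, (old_version, new_version) in sorted(changed_packages(before, after).items()):
    print(f"{name}: {old_version!r} -> {new_version!r}")
```

On this sample, scylla-conf is not reported because its version is the same in both lists, which matches the full log output.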

dimakr avatar Aug 06 '24 17:08 dimakr