
Research and develop method to scale up an existing cluster

Open akrzos opened this issue 1 year ago • 5 comments

One of the desired features of Jetlag is the ability to scale up a cluster after the initial install. We need to research scale-up methods for assisted-installer-based clusters and determine whether they can be driven by a playbook so that this becomes a simple task.

akrzos avatar Mar 07 '24 16:03 akrzos

A new feature that GA'd in 4.17 could be one way to implement this: https://docs.openshift.com/container-platform/4.17/nodes/nodes/nodes-nodes-adding-node-iso.html

From an initial read of the docs, the generate-discovery-iso and boot-iso Jetlag tasks already accomplish the steps required to add nodes with this method, and an httpd server to host the ISO is already in place.

afcollins avatar Jan 22 '25 22:01 afcollins

I've been working on this, using this doc as a template for the workflow: https://access.redhat.com/solutions/6968529

radez avatar Jan 23 '25 18:01 radez

Oh! Ok.

That makes sense for versions earlier than 4.17, before this feature was added.

Maybe we look to use this new approach for 4.17 and onward?

afcollins avatar Jan 23 '25 18:01 afcollins

Can someone work on this feature with priority? JetSki is still the only option for deploying a large cluster. We use JetSki to deploy a smaller cluster and then scale up the nodes, which helps in debugging hardware issues. However, JetSki is not actively maintained, whereas Jetlag has more users and contributors. Enhancing Jetlag with scale-up features would help people who are transitioning to Jetlag.

venkataanil avatar Feb 11 '25 05:02 venkataanil

https://github.com/radez/jetlag/tree/scale_out I used this branch this morning to scale out on 4.17

  • Deploy your cluster with only the initial workers in the [worker] inventory group.
  • Add the new worker records to the [worker] inventory group.
  • In vars/scale_out.yml, set current_worker_count to the initial number of workers and scale_out_count to the number of workers you've added to the [worker] group and intend to scale the cluster to.
  • Run ansible-playbook -i ansible/inventory/cloudX.local ansible/mno-scale-out.yml
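
As a rough illustration of the third step, a vars/scale_out.yml might look like the sketch below. Only the two variable names come from the steps above; the values are hypothetical, and the sketch assumes scale_out_count means the number of newly added workers rather than the final total, so check the branch before relying on it.

```yaml
# vars/scale_out.yml (hypothetical values, not taken from the branch)

# Number of workers the cluster was originally deployed with,
# i.e. the entries that were in the [worker] group at install time.
current_worker_count: 3

# Number of worker records appended to the [worker] group for this
# scale-out (assumed here to be the count of NEW workers only).
scale_out_count: 2
```

With these example values the [worker] group would hold five records in total: the three original workers followed by the two new ones.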

I'm working on adding these instructions to a doc file, and I'll put in a patch.

radez avatar Feb 19 '25 17:02 radez

@radez would you consider this issue complete now? Is there any other scale up work needed?

akrzos avatar Apr 16 '25 14:04 akrzos

Closing as #613 was merged

akrzos avatar Jun 06 '25 14:06 akrzos