Add warning banner to allocate the number of nodes + 1 vGPUs

Open IsaSih opened this issue 9 months ago • 4 comments

Setup

  - Rancher version: v2.8-head
  - Browser type & version: Chrome Version 124.0.6367.78
  - Harvester version: v1.3.0

To Reproduce

  1. Set up vGPU profiles (multiple) in Harvester
  2. Import Harvester into Rancher
  3. Go to Virtualization Management -> Harvester UI for cluster -> vGPU Devices and enable a vGPU with 2 allocatable resources.
  4. From Cluster Management, create a new 2-node RKE2 cluster with Harvester as the downstream provider. Under Advanced options, add the vGPU with 2 allocatable resources (the same number as the cluster nodes).
  5. After the cluster has been created, edit the cluster's config file so that the nodes are redeployed.
  6. Observe the logs of the failed provisioning process.

Result

Once the Harvester cluster is redeployed for any reason (the user edits the config, the nodes go into an error state, etc.), the new VMs spin up before the old ones are completely shut down. The old VMs still hold the vGPUs at that point, so none are available yet and the new VMs fail with the "un-schedulable" error.
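To illustrate why N allocatable vGPUs are not enough for N nodes, here is a minimal sketch of the replacement behavior described above. It is a simplified model, not Rancher or Harvester code: `canRollOutAllNodes` is a hypothetical name, and it assumes each node VM requests exactly one vGPU and that replacement VMs are created one at a time before the VM they replace is shut down.

```typescript
// Hypothetical model of the redeploy described in this issue.
// Assumptions: one vGPU per node VM, and each replacement VM is
// scheduled before the old VM has released its vGPU.
function canRollOutAllNodes(nodeCount: number, allocatableVgpus: number): boolean {
  let vgpusInUse = nodeCount; // every existing VM already holds one vGPU
  if (vgpusInUse > allocatableVgpus) {
    return false; // the cluster could not have been created at all
  }
  for (let replaced = 0; replaced < nodeCount; replaced++) {
    // The new VM needs a vGPU while the old VM still holds its own.
    if (vgpusInUse + 1 > allocatableVgpus) {
      return false; // no free vGPU, so the new VM cannot be scheduled
    }
    vgpusInUse += 1; // new VM starts and takes the spare vGPU
    vgpusInUse -= 1; // old VM shuts down and releases its vGPU
  }
  return true;
}

console.log(canRollOutAllNodes(2, 2)); // false
console.log(canRollOutAllNodes(2, 3)); // true
```

Under these assumptions, a 2-node cluster with 2 allocatable vGPUs fails on the first replacement, while 3 allocatable vGPUs leave one spare for the incoming VM.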

Expected Result

We could add a warning banner in the UI recommending that the user provision N+1 allocatable vGPUs, where N is the number of nodes.
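A rough sketch of the check such a banner could use is below. The function name `vgpuWarningMessage`, its parameters, and the message wording are all hypothetical; nothing here is taken from the dashboard code, and how the node count and allocatable count would be read from the form is left out.

```typescript
// Hypothetical helper for the proposed banner; names and message text
// are illustrative only and are not part of the Rancher dashboard.
function vgpuWarningMessage(nodeCount: number, allocatableVgpus: number): string | null {
  const recommended = nodeCount + 1; // the N+1 recommendation from this issue
  if (allocatableVgpus >= recommended) {
    return null; // enough headroom, no banner needed
  }
  return (
    `This cluster has ${nodeCount} node(s) but only ${allocatableVgpus} allocatable vGPU(s). ` +
    `Allocate at least ${recommended} so replacement VMs can be scheduled during a redeploy.`
  );
}

console.log(vgpuWarningMessage(2, 2)); // prints the warning text
console.log(vgpuWarningMessage(2, 3)); // null
```

Returning `null` when there is enough headroom would let the caller show the banner only when a message is present.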

IsaSih, May 09 '24 10:05

This isn't a blocker for the release but we'll need to release note this in Harvester and see if we can fix this later.

Moving to 2.9.0 but this may need a 2.8.x backport.

gaktive, May 10 '24 15:05

/backport v2.8.next2

gaktive, May 10 '24 23:05

> This isn't a blocker for the release but we'll need to release note this in Harvester and see if we can fix this later.
>
> Moving to 2.9.0 but this may need a 2.8.x backport.

@gaktive Should a release note be added to the Rancher 2.8.4 RNs? If so, should this be considered a known issue?

LucasSaintarbor, May 14 '24 17:05

@LucasSaintarbor I'd say it's a known issue, but it impacts Harvester more, so this may have to target their release notes instead.

@rebeccazzzz is there a historical pattern for release notes in Rancher that tie to Harvester directly?

gaktive, May 14 '24 21:05