'default' should not be allowed as a partition name in slurm.yaml
Problem Description
If default is used as a single partition name in slurm.yaml (under elastic_partitions:), the slurmctld controller fails to start. /var/log/slurm/slurmctld.log suggests that the PartitionName in slurm.conf is missing/invalid.
It turns out that default is not a valid partition name; Slurm reserves it for setting default values for all partitions (e.g. https://www.mail-archive.com/[email protected]/msg08392.html).
Batch Shipyard Version
3.9.0
Steps to Reproduce
Take a working configuration with a single Batch Pool and single partition, and change the elastic_partition in slurm.yaml to be named default.
Provision the cluster and attempt to run an sbatch job (should fail).
Log in to the controller node (shipyard slurm ssh controller), confirm that slurmctld isn't running, and check the log with sudo tail /var/log/slurm/slurmctld.log.
Expected Results
Expect shipyard cluster create to fail fast during schema validation if there are partitions named default.
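As a rough illustration of the kind of check that could run at validation time, here is a minimal Python sketch. It is not Batch Shipyard's actual validation code. It assumes the elastic_partitions section has already been parsed into a mapping keyed by partition name, and the function names and the RESERVED_PARTITION_NAMES constant are hypothetical. The comparison is case-insensitive because lowercase default already triggers the failure described above.

```python
# Hypothetical sketch: reject partition names that Slurm reserves.
# Assumes elastic_partitions is a dict keyed by partition name
# (an assumption about the parsed slurm.yaml layout).

RESERVED_PARTITION_NAMES = {"default"}


def find_reserved_partition_names(elastic_partitions):
    """Return a sorted list of partition names that are reserved."""
    return sorted(
        name
        for name in elastic_partitions
        if str(name).strip().lower() in RESERVED_PARTITION_NAMES
    )


def validate_partition_names(elastic_partitions):
    """Raise ValueError if any partition uses a reserved name."""
    bad = find_reserved_partition_names(elastic_partitions)
    if bad:
        raise ValueError(
            "invalid partition name(s) in elastic_partitions: "
            + ", ".join(bad)
            + " (reserved by Slurm for partition defaults)"
        )


if __name__ == "__main__":
    # Example input mirroring the failing configuration in this report.
    try:
        validate_partition_names({"default": {}})
    except ValueError as exc:
        print(f"validation failed: {exc}")
```

Running a check like this before any resources are provisioned would surface the problem at cluster create time instead of leaving slurmctld to fail on the controller.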
Actual Results
Cluster appears to provision successfully but slurmctld fails to start.
Thanks for the issue report, this will be fixed in the next release.