*Description of changes:*

- EFA Installer 1.35.0 (waiting for release)
- aws-ofi-nccl 1.12.0

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms...
In their current form, several files are not tied to a specific orchestrator. This issue is to organize them per orchestrator:

- kubernetes/train.yaml
- slurm/train.sbatch
The Dockerfile in FSDP does not specify versions. It is best practice to specify versions explicitly and not use `latest`.
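One way to catch this automatically is a small check over the `FROM` lines of a Dockerfile. The following is a minimal sketch, not part of this repository: the function name `unpinned_base_images` is hypothetical, and the image names in the usage example are purely illustrative. It flags base images that have no tag or use `latest`, skips references to earlier build stages, and treats a digest as pinned. It does not resolve `ARG` variables, so a line such as `FROM ${BASE}` would be flagged as unpinned.

```python
import re

# Matches "FROM [--flag=value ...] image [AS stage]". Dockerfile
# instructions are case-insensitive, so the pattern is too.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE
)


def unpinned_base_images(dockerfile_text):
    """Return base images that have no tag or use the 'latest' tag."""
    flagged = []
    stages = set()  # names of earlier build stages ("FROM x AS stage")
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image, alias = match.group(1), match.group(2)
        if alias:
            stages.add(alias.lower())
        # Skip 'scratch', references to earlier stages, and digests,
        # which are already pinned to exact content.
        if image.lower() == "scratch" or image.lower() in stages:
            continue
        if "@sha256:" in image:
            continue
        # Only a ':' after the last '/' separates a tag; an earlier ':'
        # belongs to a registry port such as host:5000/img.
        name = image.rsplit("/", 1)[-1]
        tag = name.split(":", 1)[1] if ":" in name else ""
        if tag in ("", "latest"):
            flagged.append(image)
    return flagged


if __name__ == "__main__":
    # Illustrative input only; not the contents of the FSDP Dockerfile.
    example = "FROM python:latest\nFROM ubuntu:22.04 AS build\nFROM build\n"
    print(unpinned_base_images(example))
```

A check like this could run in CI so that an unpinned base image fails review before it is merged.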
Every node that runs the install script changes the global Slurm configuration, which is shared across nodes.
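One possible mitigation is to let only one node apply the shared change. The sketch below is an assumption about how that could look, not how the install script works today: `run_once_across_nodes` and the marker path are hypothetical names, and the marker is assumed to live on the same shared filesystem as the configuration. It relies on exclusive file creation (`O_CREAT | O_EXCL`), which is atomic on local filesystems and on most modern shared filesystems, though that should be verified for the filesystem in use.

```python
import os
import socket


def run_once_across_nodes(marker_path, action):
    """Run action() only in the first process that creates marker_path.

    Returns True if this caller ran the action, False if another caller
    had already claimed it.
    """
    try:
        # Exclusive creation: exactly one caller succeeds, every other
        # caller gets FileExistsError.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    # Record who claimed the update, to help with debugging.
    with os.fdopen(fd, "w") as marker:
        marker.write(f"{socket.gethostname()} pid {os.getpid()}\n")
    action()
    return True
```

Each node would call `run_once_across_nodes` with the same marker path and a function that performs the shared configuration change, so the change is applied once and the remaining nodes skip it. Note that the marker stays in place if the action fails, so a failed update needs the marker removed before a retry.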