
Set segment for gb systems when nodes <= 18

Open • sudostock opened this issue 1 month ago • 1 comment

Make segment selection explicit for applicable systems.

The current logic relies on Slurm defaults to pick the right segment. However, on at least one internal cluster, admins have configured Slurm so that an unset segment becomes 'segment=2', which hurts performance. It is better to be explicit.
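
A minimal sketch of what explicit selection could look like, for illustration only. The helper names (`choose_segment`, `sbatch_args`), the system names in `GB_SYSTEMS`, and the choice of setting the segment equal to the node count are all assumptions, not taken from the PR diff. Only the 18-node threshold comes from the PR title, and `--segment` is the sbatch option used with block topology.

```python
from typing import List, Optional

# Hypothetical set of system names treated as "gb systems" (assumption).
GB_SYSTEMS = {"gb200", "gb300"}

# Threshold taken from the PR title: apply only when nodes <= 18.
MAX_NODES_FOR_EXPLICIT_SEGMENT = 18


def choose_segment(gpu: str, num_nodes: int) -> Optional[int]:
    """Return an explicit segment size, or None to leave it unset.

    Assumption: for small jobs on GB systems, the whole job should fit in
    a single segment, so the segment size equals the node count.
    """
    if gpu.lower() in GB_SYSTEMS and num_nodes <= MAX_NODES_FOR_EXPLICIT_SEGMENT:
        return num_nodes
    return None


def sbatch_args(gpu: str, num_nodes: int) -> List[str]:
    """Build sbatch arguments, adding --segment only when it applies."""
    args = [f"--nodes={num_nodes}"]
    segment = choose_segment(gpu, num_nodes)
    if segment is not None:
        # Setting this explicitly keeps a cluster-side default
        # (e.g. segment=2) from being applied silently.
        args.append(f"--segment={segment}")
    return args
```

With these assumptions, a 4-node job on a GB system would be submitted with `--segment=4`, while other systems and larger jobs keep whatever the cluster default is.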

Remake of #15062

sudostock, Nov 21 '25 20:11

@malay-nagda @guyueh1 this is a respin of https://github.com/NVIDIA-NeMo/NeMo/pull/15062. I cleaned up my fork before the previous PR was merged.

sudostock, Nov 21 '25 20:11