Curator icon indicating copy to clipboard operation
Curator copied to clipboard

Initial slurm deployment scripts

Open ayushdg opened this issue 3 months ago • 3 comments

Description

Adds initial example slurm scripts for single and multi-node runs.

Usage

N/A

Checklist

  • [X] I am familiar with the Contributing Guide.
  • [N/A] New or Existing tests cover these changes.
  • [X] The documentation is up to date with these changes.

ayushdg avatar Oct 03 '25 22:10 ayushdg

@lbliii Could you point me to where I should also update this in the docs?

ayushdg avatar Oct 03 '25 22:10 ayushdg

@sarahyurick jfyi one thing I observed from the timeouts is that it always hangs right after the url_generation tests for wiki before timing out vs in successful runs the suite takes 12 minutes. Maybe there's something about one of the tests that hangs on these nodes.

ayushdg avatar Oct 06 '25 20:10 ayushdg

@sarahyurick jfyi one thing I observed from the timeouts is that it always hangs right after the url_generation tests for wiki before timing out vs in successful runs the suite takes 12 minutes. Maybe there's something about one of the tests that hangs on these nodes.

Yes I noticed the same thing and was not able to determine the root cause. I did not see it when testing locally either. Maybe we can open an issue if it continues to be a blocker.

sarahyurick avatar Oct 06 '25 20:10 sarahyurick