determined
ci: extend experiment timeout for slurm test
Ticket
None
Description
The Slurm restart test fails on main because the underlying trials time out while waiting for the image pull. This PR does two things:
- Bypasses the top-level trial timeout config in the affected test, so the test waits long enough for image pulls to finish.
- Makes the affected test suite runnable on feature branches.
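
For reviewers, the first change amounts to a per-test timeout override. The following is a minimal sketch of that idea, not the actual diff: `wait_for_state`, `DEFAULT_TRIAL_TIMEOUT_S`, the 300-second default, and the state names are all hypothetical and only illustrate how a test-level value can take precedence over a suite-wide default.

```python
import time
from typing import Callable, Optional

# Hypothetical suite-wide default; the real value lives in the test config.
DEFAULT_TRIAL_TIMEOUT_S = 300.0


def wait_for_state(
    get_state: Callable[[], str],
    target: str,
    timeout_s: Optional[float] = None,
    poll_interval_s: float = 1.0,
) -> None:
    """Poll get_state() until it returns target or the deadline passes."""
    # A per-test timeout_s takes precedence over the suite-wide default.
    limit = DEFAULT_TRIAL_TIMEOUT_S if timeout_s is None else timeout_s
    deadline = time.monotonic() + limit
    while True:
        state = get_state()
        if state == target:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"state is {state!r}, expected {target!r} after {limit:.0f}s"
            )
        time.sleep(poll_interval_s)
```

A test that is known to hit slow image pulls could then pass a larger `timeout_s` at its call site, leaving the default untouched for every other test.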
Test Plan
CI passes, specifically: test-e2e-slurm-restart
Checklist
- [ ] Changes have been manually QA'd
- [ ] New features have been approved by the corresponding PM
- [ ] User-facing API changes have the "User-facing API Change" label
- [ ] Release notes have been added as a separate file under docs/release-notes/. See Release Note for details.
- [ ] Licenses have been included for new code which was copied and/or modified from any external code