feat: Log Signal Exp Config and Monitoring
Ticket
MD-493, MD-494
Description
- Update the `log_policies` experiment config. To avoid any disruption, both the legacy log policy syntax and the modern log policy syntax are supported.
- Store `signal` to the DB if a matching `pattern` is found in the log.
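As a rough illustration of the matching step described above, here is a minimal Python sketch. It is not the code in this PR: the function name `find_signal`, the in-memory policy layout, and the use of `re.search` are assumptions made only for the example.

```python
import re
from typing import Any, Dict, List, Optional

# Hypothetical in-memory form of the modern policy used in the test plan.
POLICIES: List[Dict[str, Any]] = [
    {"pattern": r".*complete.*", "actions": [{"signal": "Test Signal"}]},
]


def find_signal(log_line: str, policies: List[Dict[str, Any]]) -> Optional[str]:
    """Return the signal of the first policy whose pattern matches the log line.

    Assumption: patterns are treated as regular expressions and searched
    anywhere in the line; the real matching rules may differ.
    """
    for policy in policies:
        if re.search(policy["pattern"], log_line):
            for action in policy.get("actions", []):
                # Only signal actions carry a value to store.
                if isinstance(action, dict) and "signal" in action:
                    return action["signal"]
    return None
```

In this sketch, a caller would persist the returned value for the run whenever it is not `None`.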
Test Plan
At the release party, it may be easier to test this PR together with https://github.com/determined-ai/determined/pull/9959.
- Include the following in the experiment config and verify that `Test Signal` appears on the run details page and in the run table.
    log_policies:
      - pattern: .*complete.*
        actions:
          - signal: Test Signal
- Legacy log policy should still be accepted.
    log_policies:
      - pattern: .*complete.*
        action:
          type: cancel_retries
- If the experiment config doesn't specify any log policies, the default log policies are used.
- Default log policies can be unset with an empty list.

    log_policies: []
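
The three cases in the test plan (legacy syntax still accepted, defaults applied when `log_policies` is absent, defaults unset by an explicit empty list) can be sketched in Python as follows. This is a hedged illustration, not the PR's implementation: `normalize_policy`, `resolve_log_policies`, and the contents of `DEFAULT_LOG_POLICIES` are hypothetical, and wrapping the legacy `action` mapping unchanged into an `actions` list is an assumption about how the two forms relate.

```python
from typing import Any, Dict, List, Optional

Policy = Dict[str, Any]

# Hypothetical placeholder; the real default policies are not listed in this PR description.
DEFAULT_LOG_POLICIES: List[Policy] = [
    {"pattern": r".*example default.*", "actions": [{"signal": "Example Default"}]},
]


def normalize_policy(policy: Policy) -> Policy:
    """Return the policy in the modern shape (a list under `actions`)."""
    if "actions" in policy:
        return dict(policy)
    if "action" in policy:
        # Legacy form: a single `action` mapping. Assumption: keep it as-is
        # and wrap it in a one-element list.
        modern = {k: v for k, v in policy.items() if k != "action"}
        modern["actions"] = [policy["action"]]
        return modern
    raise ValueError("log policy needs either 'action' or 'actions'")


def resolve_log_policies(configured: Optional[List[Policy]]) -> List[Policy]:
    """None means the key was absent (use defaults); [] means defaults are unset."""
    if configured is None:
        return [dict(p) for p in DEFAULT_LOG_POLICIES]
    return [normalize_policy(p) for p in configured]
```

The key design point in the sketch is distinguishing an absent key (`None`) from an explicit empty list, since only the former falls back to the defaults.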
Checklist
- [ ] Changes have been manually QA'd
- [ ] New features have been approved by the corresponding PM
- [ ] User-facing API changes have the "User-facing API Change" label
- [ ] Release notes have been added as a separate file under `docs/release-notes/`. See Release Note for details.
- [ ] Licenses have been included for new code which was copied and/or modified from any external code