feat: Log Signal Exp Config and Monitoring

Open jgongd opened this issue 5 months ago • 3 comments

Ticket

MD-493, MD-494

Description

  • Update the log_policies experiment config. To avoid disruption, both the legacy and the modern log policy formats are supported.
  • Store a signal in the DB when a log line matches a policy's pattern.
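
As a rough illustration of the two points above, here is a minimal Python sketch. It is not the implementation in this PR: the helper names (`normalize_policy`, `signals_for_line`) and the internal `{"pattern", "actions"}` shape are hypothetical, and it uses Python's `re.search`, whose syntax and matching semantics may differ from the regex engine the master actually uses. Only the two config shapes (`actions` with a `signal`, and legacy `action` with a `type`) come from the test plan below.

```python
from __future__ import annotations

import re
from typing import Any


def normalize_policy(policy: dict[str, Any]) -> dict[str, Any]:
    """Map a legacy or modern policy onto one shape (hypothetical internal form)."""
    if "actions" in policy:
        # Modern form: a list of actions.
        actions = list(policy["actions"] or [])
    elif "action" in policy:
        # Legacy form: a single action mapping, wrapped in a list.
        actions = [policy["action"]]
    else:
        actions = []
    return {"pattern": policy["pattern"], "actions": actions}


def signals_for_line(line: str, policies: list[dict[str, Any]]) -> list[str]:
    """Return the signal of every policy whose pattern matches the log line."""
    signals: list[str] = []
    for policy in map(normalize_policy, policies):
        # Assumption: a policy fires if the pattern matches anywhere in the line.
        if re.search(policy["pattern"], line) is None:
            continue
        for action in policy["actions"]:
            if isinstance(action, dict) and "signal" in action:
                signals.append(action["signal"])
    return signals


policies = [
    {"pattern": ".*complete.*", "actions": [{"signal": "Test Signal"}]},  # modern
    {"pattern": ".*complete.*", "action": {"type": "cancel_retries"}},    # legacy
]
print(signals_for_line("training complete", policies))  # ['Test Signal']
```

In this sketch the legacy policy is accepted and normalized but contributes no signal, since it carries only a `cancel_retries` action.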

Test Plan

At the release party, testing this PR together with https://github.com/determined-ai/determined/pull/9959 could make things easier.

  1. Include the following in the experiment config and verify that "Test Signal" appears on the run details page and in the run table.
log_policies:
 - pattern: .*complete.*
   actions:
     - signal: Test Signal
  2. Legacy log policies should still be accepted.
log_policies:
 - pattern: .*complete.*
   action:
     type: cancel_retries
  3. If the experiment config doesn't specify any log policies, the default log policies are used.

  4. Default log policies can be unset.

log_policies: []
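
The difference between an unspecified `log_policies` key and an explicitly empty list can be sketched as follows. This is only an illustration under stated assumptions: `DEFAULT_LOG_POLICIES` holds a placeholder entry (the real defaults are not listed in this description), `effective_log_policies` is a hypothetical helper, and treating a null value the same as a missing key is a guess.

```python
from __future__ import annotations

from typing import Any

# Placeholder only: the actual default log policies are not given in this PR description.
DEFAULT_LOG_POLICIES: list[dict[str, Any]] = [
    {"pattern": ".*placeholder default.*", "actions": []},
]


def effective_log_policies(config: dict[str, Any]) -> list[dict[str, Any]]:
    """Resolve which log policies apply to an experiment config (hypothetical helper)."""
    policies = config.get("log_policies")
    if policies is None:
        # Key missing (or null, by assumption): fall back to the defaults.
        return list(DEFAULT_LOG_POLICIES)
    # Key present, including an empty list: use exactly what was given.
    return list(policies)


print(effective_log_policies({}) == DEFAULT_LOG_POLICIES)  # True
print(effective_log_policies({"log_policies": []}))        # []
```

The point of the sketch is that the check must distinguish "absent" from "empty"; a plain truthiness test such as `if not policies` would wrongly restore the defaults for `log_policies: []`.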

Checklist

  • [ ] Changes have been manually QA'd
  • [ ] New features have been approved by the corresponding PM
  • [ ] User-facing API changes have the "User-facing API Change" label
  • [ ] Release notes have been added as a separate file under docs/release-notes/. See Release Note for details.
  • [ ] Licenses have been included for new code which was copied and/or modified from any external code

jgongd, Sep 17 '24 19:09