
Improve reliability of waiting for a job to start before running steps.

Open TingluoHuang opened this issue 1 year ago • 0 comments

  • When a timeline record update fails, automatically retry the update with a short back-off instead of waiting for another update to the same record.
  • Collect how long it normally takes for the first timeline record to get updated.
  • Allow configuring how long we should wait for the first job record updates to finish.

The change will be controlled by 3 feature flags:

  • DistributedTask.EnableJobRecordUpdatedTelemetry
  • DistributedTask.EnableRecordUpdateAutoRetry
  • DistributedTask.FirstJobRecordUpdateWaitTimeInSeconds
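
For illustration only, the auto-retry behavior could look roughly like the sketch below. This is not the runner's actual code: the function name `update_with_retry`, the attempt limit, and the exponential back-off schedule are all assumptions made for the example, and the real change would presumably be gated by `DistributedTask.EnableRecordUpdateAutoRetry`.

```python
import time
from typing import Callable


def update_with_retry(
    update: Callable[[], None],
    max_attempts: int = 3,          # hypothetical limit, not from the issue
    base_delay: float = 0.5,        # hypothetical back-off base, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call update() until it succeeds; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            update()
            return attempt
        except Exception:
            # Give up and surface the error once the last attempt has failed.
            if attempt == max_attempts:
                raise
            # Short exponential back-off: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the back-off testable without real delays. The returned attempt count is the kind of value that could feed the telemetry described above.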

TingluoHuang, Sep 04 '24 20:09