azure-functions-host

Identifying faulty language workers to be recycled

Open surgupta-msft opened this issue 3 years ago • 0 comments

This PR is currently on hold while we discuss some design scenarios.

Issue describing the changes in this PR

Resolves #7292 (design doc)

This PR contains changes to identify faulty language workers, but not to recycle them yet. The plan is to merge this PR first, then analyze the impact on production function apps to see how many workers would be recycled once this feature is in. Once we finalize appropriate thresholds, part 2 of this feature will actually recycle the workers.

Summary of changes:

  1. For each worker, keep track of total invocations, successful invocations, and latency.
  2. Once every 5 minutes, check each worker's health by calculating its failure rate, and select the worst worker to be recycled (the one with the highest failure rate above the threshold).
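
The two steps above can be sketched roughly as follows. This is an illustrative Python sketch only, not the code in this PR (the host itself is not written in Python). The names `WorkerStats`, `record`, and `select_worst_worker` are hypothetical, and the threshold values in the usage note are placeholders, since the actual thresholds are still to be finalized. Only the 5-minute interval comes from the description.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Interval between health checks, taken from the PR description (5 minutes).
CHECK_INTERVAL_SECONDS = 5 * 60


@dataclass
class WorkerStats:
    """Per-worker counters. Hypothetical names, not the PR's actual types."""
    total_invocations: int = 0
    successful_invocations: int = 0
    total_latency_ms: float = 0.0

    def record(self, succeeded: bool, latency_ms: float) -> None:
        """Update the counters after one invocation completes."""
        self.total_invocations += 1
        if succeeded:
            self.successful_invocations += 1
        self.total_latency_ms += latency_ms

    @property
    def failure_rate(self) -> float:
        """Fraction of invocations that failed; 0.0 when there is no data."""
        if self.total_invocations == 0:
            return 0.0
        failed = self.total_invocations - self.successful_invocations
        return failed / self.total_invocations


def select_worst_worker(
    stats: Dict[str, WorkerStats], failure_threshold: float
) -> Optional[str]:
    """Return the id of the worker with the highest failure rate that is
    strictly above the threshold, or None if every worker is healthy."""
    worst_id: Optional[str] = None
    worst_rate = failure_threshold
    for worker_id, worker in stats.items():
        if worker.failure_rate > worst_rate:
            worst_id, worst_rate = worker_id, worker.failure_rate
    return worst_id
```

As a usage example, with two workers where `w1` fails 1 of 10 invocations and `w2` fails 6 of 10, calling `select_worst_worker(stats, 0.5)` would pick `w2`, while a placeholder threshold of `0.7` would select no worker. In this sketch the check would be scheduled every `CHECK_INTERVAL_SECONDS`, and the selected worker is only identified, not recycled, matching the scope of this PR.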

Pull request checklist

  • [ ] My changes do not require documentation changes
    • [x] Otherwise: Documentation issue linked to PR
  • [ ] My changes should not be added to the release notes for the next release
    • [ ] Otherwise: I've added my notes to release_notes.md
  • [ ] My changes do not need to be backported to a previous version
    • [ ] Otherwise: Backport tracked by issue/PR #issue_or_pr
  • [ ] My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • [ ] I have added all required tests (Unit tests, E2E tests)

Additional information

Additional PR information

surgupta-msft, Sep 28 '22 18:09