azure-functions-host
azure-functions-host copied to clipboard
Add compile errors to the functionerror collection
This PR surfaces compilation errors for customers using C# script (.csx files). This is part of an ongoing effort to make scale decisions in scale controller more intelligent -- if we see that a function is failing to compile, it should not be triggering scale out votes to begin with. The scale controller will poll the admin/functions/<name>/status API upon site metadata refreshes to check the health of each function in a given function app. If a function returns any errors from this endpoint, its scale out votes will be overridden with scale in votes, given it meets these other two conditions:
- It is not the only function in the app
- The current worker count is greater than 1
The scale controller PR is here.
A sample app on private stamp that surfaces compilation errors (simple HTTP trigger using C# script, with undefined global variable)
ScaleControllerEvents | where SiteName == "vchiangwcpsep1v1" | where PreciseTimeStamp between(datetime(2022-07-12 20:52:35.5662765) .. 1min) | project PreciseTimeStamp, Message
Sample API response returned from browser, using private site extension on production function app with broken HTTP trigger (undefined variable test in global scope):

Portal experience is not affected by compilation error surfacing:

Issue describing the changes in this PR
resolves #issue_for_this_pr
Pull request checklist
- [ ] My changes do not require documentation changes
- [ ] Otherwise: Documentation issue linked to PR
- [ ] My changes should not be added to the release notes for the next release
- [ ] Otherwise: I've added my notes to
release_notes.md
- [ ] Otherwise: I've added my notes to
- [ ] My changes do not need to be backported to a previous version
- [ ] Otherwise: Backport tracked by issue/PR #issue_or_pr
- [ ] My changes do not require diagnostic events changes
- Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
- [ ] I have added all required tests (Unit tests, E2E tests)
Additional information
Additional PR information Fixes #950
Pull request checklist
- [ ] My changes do not require documentation changes
- [ ] Otherwise: Documentation issue linked to PR
- [ ] My changes should not be added to the release notes for the next release
- [ ] Otherwise: I've added my notes to
release_notes.md
- [ ] Otherwise: I've added my notes to
- [ ] My changes do not need to be backported to a previous version
- [ ] Otherwise: Backport tracked by issue/PR #issue_or_pr
- [ ] My changes do not require diagnostic events changes
- Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
- [ ] I have added all required tests (Unit tests, E2E tests)
Additional information
Additional PR information
@chiangvincent can you share more details about what you're trying to address with the change? This would impact the portal behavior, and the compilation result information is already persisted in the stream using the mechanism expected to provide compilation feedback. I'm assuming your intention is to surface this for functions deployed with compilation errors (as opposed to interactive dev feedback), but it would be good to have an issue with the context (motivation, issues, current vs expected behavior, etc.)
@chiangvincent - please sign license/cla here - https://cla.dotnetfoundation.org/
@fabiocav - PR description above is updated. Any other concerns with the PR?
cc @glennamanns fyi
@fabiocav - gentle bump for review and any other concerns
Only note I have is that we might want these logs to be at least warning level in the scale controller table so that they're easy to see, otherwise lgtm. Suggest we still wait on @fabiocav's review as he has more context on the impact here
@fabiocav - gentle bump for review and any other concerns!
To the questions @fabiocav raised, addition of compilation errors seems to be inline with the function errors we already add. We're already adding errors to this collection for indexing errors, metadata validation errors, listener start errors, etc. I agree though that we should test with the portal to verify there are no unexpected issues, but Fabio is right that we'll be reporting compilation errors both in the log stream shown in the portal as well as in this error collection. Changes look good to me as long as portal testing doesn't show any issues.
I realize you're starting with just csx compilation, but the same issue exists for other languages. For example, a JS function might have syntax errors and fail on every invocation, but the fact that JS is interpreted complicates things.
Also: @paulbatum logged https://github.com/Azure/azure-functions-host/issues/950 in the distant past for this, FYI, so we should resolve that and other duplicate issues with this.
@mathewc - thanks for the additional info. We will post an update after testing this change on the portal via Private Site Extension.
I realize you're starting with just csx compilation, but the same issue exists for other languages. For example, a JS function might have syntax errors and fail on every invocation, but the fact that JS is interpreted complicates things.
Yes, we are starting with .csx as our logs indicate atleast ~4k apps throwing this error on a daily basis. We do not have evidence of the impact for out-of-proc function load errors yet. We are following up with Language Worker Owners to see if function load errors can be populated when a function fails to load , this will map closely to compilation exceptions.