PowerShell worker crashes with intermittent Grpc.Core.RpcException: Status(StatusCode=Unknown, Detail="Stream removed") exception
Investigative information
Please provide the following:
- Timestamp: 2023-01-10, 00:56:36.005 UTC
- Function App version: 4.14.0.19631
- Function App name: atsushin20220927func
- Function name(s) (as appropriate): N/A (the sole function in the app is a QueueTrigger function called ZoomMeetingReport, but the invocations themselves do not fail).
- Invocation ID: N/A. The exception occurs between function invocations.
- Region: Japan East
Repro steps
No specific repro steps. Queue items are added by a basic ConsoleApp. The issue occurs intermittently, over the span of days.
Expected behavior
The gRPC stream between the worker and the host should not be removed abruptly, at least not without a graceful worker shutdown. Kusto logs suggest that the worker is not the side removing the stream:
at Grpc.Core.Internal.ClientResponseStream`2.MoveNext(CancellationToken token)
at Microsoft.Azure.Functions.PowerShellWorker.Messaging.MessagingStream.MoveNext() in /mnt/vss/_work/1/s/src/Messaging/MessagingStream.cs:line 42
at Microsoft.Azure.Functions.PowerShellWorker.RequestProcessor.ProcessRequestLoop() in /mnt/vss/_work/1/s/src/RequestProcessor.cs:line 75
at Microsoft.Azure.Functions.PowerShellWorker.Worker.Main(String[] args) in /mnt/vss/_work/1/s/src/Worker.cs:line 57
at Microsoft.Azure.Functions.PowerShellWorker.Worker.<Main>(String[] args)
Actual behavior
The worker crashes intermittently in the middle of the request-processing loop (RequestProcessor.ProcessRequestLoop), between function invocations.
Known workarounds
None known. The issue resolves itself after some time.
Related information
Provide any related information
- Programming language used: PowerShell 7.2
- Links to source: source code attached below
- Bindings used: QueueTrigger
run.ps1
param([string] $QueueItem, $TriggerMetadata)
$ErrorActionPreference = "Stop"
Write-Host "Succeeded"
function.json
{
  "bindings": [
    {
      "name": "QueueItem",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "ps-zoom-meetings-queue-items",
      "connection": "AzureWebJobsStorage"
    }
  ],
  "retry": {
    "strategy": "fixedDelay",
    "maxRetryCount": 0,
    "delayInterval": "00:00:10"
  }
}
@michaelpeng36 assigning this for initial investigation, but curious to know if you have an app you can use to repro this. If so, we can have some additional instrumentation added to see if that helps us identify the root cause.
Thanks for the response, @fabiocav. Yes, we have a repro app set up for this. I will share the details privately.
@brettsam / @michaelpeng36 have you had a chance to sync on this? I'll move this to sprint 141, but please update/close if there is more information about this issue.
Pushing to sprint 143, but @brettsam and @michaelpeng36 are actively looking at this.
Leaving this open for validation. Let's run some queries to see if this issue still persists.