azure-functions-python-worker
azure-functions-python-worker copied to clipboard
[BUG] 502 Bad Gateway on Python Function
Opened new issue by request of @balag0
Investigative information
Please provide the following:
- Timestamp: Wed, 27 May 2020 18:35:38 GMT
- Function App name: covidqmdataimport
- Function name(s) (as appropriate): DataImport
- Core Tools version: azure-functions==1.2.0, func is 3.0.2245
Repro steps
Provide the steps required to reproduce the problem:
Issue POST request with correct key and body: {"target":"covid-staging"}
Expected behavior
Provide a description of the expected behavior.
As function has no return, would expect a 202 Accepted return immediately, as the function needs to process in the background. Function execution is expected to last approximately 30 minutes.
Actual behavior
Provide a description of the actual behavior observed.
After approximately 4 minutes (4 minutes 5 seconds this call), function instead returns a 502 Bad Gateway exception.
Known workarounds
Provide a description of any known workarounds.
Contents of the requirements.txt file:
Provide the requirements.txt file to help us find out module related issues.
pandas==1.0.3 azure-functions==1.2.0 azure-cosmos==3.1.1 requests==2.23.0 beautifulsoup4==4.9.0 xlrd==1.2.0
Related information
Provide any related information
- Links to source
- Bindings used
Thanks. I see the invocation is failing with the following exception
Result: Failure Exception: RuntimeError: function 'DataImport' without a $return binding returned a non-None value Stack: File '/azure-functions-host/workers/python/3.7/LINUX/X64/azure_functions_worker/dispatcher.py', line 319, in _handle__invocation_request f'function {fi.name!r} without a $return binding '
These issues provide more information on the execption https://github.com/Azure/azure-functions-python-worker/issues/74 https://github.com/Azure/azure-functions-python-worker/pull/75
This probably means the function.json and the actual function code do not match. you should also be able to repro this error running the app locally.
I have updated the python function to no longer return a variable, but the execution at Wed, 27 May 2020 19:35:52 GMT and the execution at Wednesday, May 27, 2020, 19:32:45 GMT had the same issue after the code update.
Is there any way for me to see the error that you're seeing? It's not showing up in Application Insights.
Can you check the "traces" table
https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/get-started-portal#write-and-run-basic-queries
running traces
as the kusto query with time range of 1 hr against application insights gives no results. Last 4 hours show successful deploys and invocations and some print statments, but nothing showing why the Postman query against the service is giving a 502 to me instead of a 202 or a 200. I can see the invocations triggered by the HTTP queries, and they look successful, so it's unclear why I would get an error back.
Hi @freeone3000,
In our most recent log, 2020-05-28 16:19:05,
The operation 'ScanLogs' with id '77fd14be-b8e5-47e0-af66-ad7bdb5231d6' did not complete in '00:02:00'.
In consumption plan, you cannot run a function with 30 minutes. It only allows the function app to run 10 minutes for each invocation. See if changing the setting in https://docs.microsoft.com/en-us/azure/azure-functions/functions-host-json#functiontimeout will help.
Since 2020-05-21, before initial raise of this issue, my host.json file looked like https://gist.github.com/freeone3000/7bd5189263bbc5dbf9381e883f432c59 which includes a functionTimeout of "01:30:00". It had already been set, so I don't think this is the issue.
I've ran the query again today, and am still getting 502 Gateway Timeouts. traces
still contains stdout from the service, but no useful logs.
Hi. Has there been any news on this issue? Do you need anything further from me?
Hi @freeone3000 how long are your function invocations expected to run ?
We've managed to get execution time down to around 250 seconds, and we see some successful invocations with it as a scheduled function, but we're still seeing issues when calling the function manually through postman.
@freeone3000 - is it consistently failing when using postman or intermittently - if it is consistently failing then there might just be an issue in how you are constructing the request. Also after reducing the execution time is it now consistently succeeding for the scheduled function ?
Consistently failing with postman, which I thought might be the issue so I also attempted to construct via requests, which also fails, unless it's within Azure Function Host, where it succeeds.
Reducing the execution time has caused 6/8 of the last runs to function, so it's improved, but not consistent - if one of our backend databases takes another minute or two longer than usual, we then fail.
If I try directly via curl:
curl --location --request POST 'http://SERVICE_NAME/api/DataImport'
--header 'Content-Type: application/json'
--header 'x-functions-key: XXXXXXXXX'
--data-raw '{"target":"staging"}'
I still get either a 404 or a 502. The above call should work reliably.
The header for the functions key looks correct. It is surprising that you get a 404 sometimes and 502 sometimes. What do you get if you try the "Test" in the portal ?
I see the following exceptions on the function app. @brettsam I found this issue from earlier with the exception. Looks like the host restarts when this exception is thrown and that causes further issues since the host doesn't start back up properly. Any ideas on what we can do avoid this exception?
@freeone3000 If possible could you split the functions into separate apps to see if we can isolate the failures.
t | Summary | Details | Level | |
2020-06-12 20:54:22.2660000 | An unhandled exception has occurred. Host is shutting down. | System.TimeoutException : The operation 'ScanContainer' with id '392f6cd0-0faf-43f9-9ad2-3a9d6fc34087' did not complete in '00:02:00'. at async Microsoft.Azure.WebJobs.Extensions.Storage.TimeoutHandler.ExecuteWithTimeoutT at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Extensions.Storage\TimeoutHandler.cs : 30 | 2 | |
2020-06-12 20:54:47.0800000 | A host error has occurred during startup operation '969372f7-4d20-4047-85fa-326fba4119bf'. | Microsoft.Azure.WebJobs.Script.HostInitializationException : Did not find functions with language [python]. at Microsoft.Azure.WebJobs.Script.Utility.VerifyFunctionsMatchSpecifiedLanguage(IEnumerable1 functions,String workerRuntime,Boolean isPlaceholderMode,Boolean isHttpWorker) at /src/azure-functions-host/src/WebJobs.Script/Utility.cs : 562 at async Microsoft.Azure.WebJobs.Script.ScriptHost.GetFunctionDescriptorsAsync(IEnumerable 1 functions,IEnumerable1 descriptorProviders) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 673 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.InitializeFunctionDescriptorsAsync(IEnumerable 1 functionMetadata) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 493 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.InitializeAsync(CancellationToken cancellationToken) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 295 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.StartAsyncCore(CancellationToken cancellationToken) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 248 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken) at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.WebHost.WebJobsScriptHostService.UnsynchronizedStartHostAsync(ScriptHostStartupOperation activeOperation,Int32 attemptCount,JobHostStartupMode startupMode) at /src/azure-functions-host/src/WebJobs.Script.WebHost/WebJobsScriptHostService.cs : 264 |
2 | |
2020-06-12 21:02:28.4200000 | An unhandled exception has occurred. Host is shutting down. | System.TimeoutException : The operation 'ScanContainer' with id '21282512-7cfc-4db1-a7a5-8a690e87955d' did not complete in '00:02:00'. at async Microsoft.Azure.WebJobs.Extensions.Storage.TimeoutHandler.ExecuteWithTimeoutT at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Extensions.Storage\TimeoutHandler.cs : 30 | 2 | |
2020-06-12 21:02:49.5870000 | A host error has occurred during startup operation 'ccb7e985-82cc-41bf-9efb-0034394e52d4'. | Microsoft.Azure.WebJobs.Script.HostInitializationException : Did not find functions with language [python]. at Microsoft.Azure.WebJobs.Script.Utility.VerifyFunctionsMatchSpecifiedLanguage(IEnumerable1 functions,String workerRuntime,Boolean isPlaceholderMode,Boolean isHttpWorker) at /src/azure-functions-host/src/WebJobs.Script/Utility.cs : 562 at async Microsoft.Azure.WebJobs.Script.ScriptHost.GetFunctionDescriptorsAsync(IEnumerable 1 functions,IEnumerable1 descriptorProviders) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 673 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.InitializeFunctionDescriptorsAsync(IEnumerable 1 functionMetadata) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 493 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.InitializeAsync(CancellationToken cancellationToken) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 295 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.StartAsyncCore(CancellationToken cancellationToken) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 248 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken) at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.WebHost.WebJobsScriptHostService.UnsynchronizedStartHostAsync(ScriptHostStartupOperation activeOperation,Int32 attemptCount,JobHostStartupMode startupMode) at /src/azure-functions-host/src/WebJobs.Script.WebHost/WebJobsScriptHostService.cs : 264 |
2 | |
2020-06-12 21:06:05.0800000 | An unhandled exception has occurred. Host is shutting down. | System.TimeoutException : The operation 'ScanLogs' with id '13f6975a-bd81-4448-b1cf-6e9c4c53b789' did not complete in '00:02:00'. at async Microsoft.Azure.WebJobs.Extensions.Storage.TimeoutHandler.ExecuteWithTimeoutT at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Extensions.Storage\TimeoutHandler.cs : 30 | 2 | |
2020-06-12 21:06:30.1560000 | A host error has occurred during startup operation 'fa16040c-20d9-485b-a3e8-372718490731'. | Microsoft.Azure.WebJobs.Script.HostInitializationException : Did not find functions with language [python]. at Microsoft.Azure.WebJobs.Script.Utility.VerifyFunctionsMatchSpecifiedLanguage(IEnumerable1 functions,String workerRuntime,Boolean isPlaceholderMode,Boolean isHttpWorker) at /src/azure-functions-host/src/WebJobs.Script/Utility.cs : 562 at async Microsoft.Azure.WebJobs.Script.ScriptHost.GetFunctionDescriptorsAsync(IEnumerable 1 functions,IEnumerable1 descriptorProviders) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 673 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.InitializeFunctionDescriptorsAsync(IEnumerable 1 functionMetadata) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 493 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.InitializeAsync(CancellationToken cancellationToken) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 295 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.ScriptHost.StartAsyncCore(CancellationToken cancellationToken) at /src/azure-functions-host/src/WebJobs.Script/Host/ScriptHost.cs : 248 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken) at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at async Microsoft.Azure.WebJobs.Script.WebHost.WebJobsScriptHostService.UnsynchronizedStartHostAsync(ScriptHostStartupOperation activeOperation,Int32 attemptCount,JobHostStartupMode startupMode) at /src/azure-functions-host/src/WebJobs.Script.WebHost/WebJobsScriptHostService.cs : 264 |
2 |
When each function is spun off into their own app function, and care is taken to never throw an exception from beyond the python boundary, execution appears to proceed, even with no code changes beyond adding an over-broad try/except to every invoked main(). It appears that this individual App Function has entered into some broken state.