[BUG]: Structured Streaming, Trigger Once, on Databricks never ends the job
Describe the bug When using Spark Structured Streaming on Azure Databricks with the micro-batch pattern (Trigger.Once), the Databricks job is never terminated, even though the .NET Spark application itself shuts down cleanly. Tested with Spark .NET 1.0 & 1.1.1.
The standard output shows
MyJob[0]
Executed batch id 1
MyJob[0]
[Job] Streaming query terminated.
Microsoft.Hosting.Lifetime[0]
Application is shutting down...
It ends with the following log entry in the standard error output
ERROR: Query termination received for [id=d7a4b9ba-5fee-49da-b7b3-b7a96ae8b05c, runId=7da48d0e-5cc5-4997-9647-3f0505220a04]
I think this happens because the error is caught but the end-of-job state is not propagated to Databricks. It is reproducible with any kind of source stream (Kafka or files).
To Reproduce Create a Spark Structured Streaming job using Trigger.Once and run it on Databricks; the job never ends and has to be stopped manually.
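A minimal sketch of such a job, using the Microsoft.Spark API (the input/output/checkpoint paths and the JSON file source are placeholder assumptions; per the report, any source such as Kafka or files reproduces the behavior):

```csharp
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Streaming;

class Program
{
    static void Main()
    {
        SparkSession spark = SparkSession.Builder().GetOrCreate();

        // Hypothetical file source; any streaming source reproduces the issue.
        DataFrame stream = spark
            .ReadStream()
            .Format("json")
            .Schema("value STRING")
            .Load("/mnt/input");

        StreamingQuery query = stream
            .WriteStream()
            .Format("parquet")
            .Option("checkpointLocation", "/mnt/checkpoints/myjob")
            .Trigger(Trigger.Once())   // process a single micro-batch, then stop
            .Start("/mnt/output");

        // Returns after the single micro-batch completes and the .NET app
        // exits, yet the Databricks job never transitions out of Running.
        query.AwaitTermination();
    }
}
```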
Expected behavior The job should stop and end in a Succeeded state.