azure-event-hubs-spark

Read stream from checkpoint when the code is updated.

Open udossa opened this issue 2 years ago • 0 comments

Hi,

We use Spark Streaming in Scala to continuously process incoming Event Hubs messages. Our code is packaged as a JAR and run by a Databricks job, with checkpointing enabled in Azure Blob Storage. Recovery works fine when the job stops for external reasons. But when we stop the job to update the code and then restart it, it cannot resume processing from where it left off. We have to point it at a new checkpoint folder; the job then restarts and reads from the end of the stream.

Is it possible to reuse the existing checkpoint after we update the JAR, so that the job resumes processing where it left off? If not, what options do we have?

Cheers,

udossa

udossa, Aug 01 '22 14:08