
Pyspark documentation error

Open laurencewells opened this issue 4 years ago • 3 comments

Hi, I'm logging this for anyone else who runs into the same problem.

The PySpark documentation here: azure-event-hubs-spark/docs/PySpark/structured-streaming-pyspark.md

says that the default starting position is the start of the stream:

image

The behaviour we saw does not match this. With no starting position configured, the stream started from the end of the stream. To read from the start, the start of stream has to be set explicitly.
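For anyone hitting the same thing, here is a minimal sketch of setting the starting position explicitly instead of relying on the default. It assumes the `eventhubs.startingPosition` option key and the JSON position format described in the connector's PySpark documentation, where an offset of `"-1"` means start of stream. The helper names and the connection string placeholder are hypothetical, and some connector versions also expect the connection string to be encrypted first, so treat this as an illustration rather than a verified configuration.

```python
import json


def start_of_stream_position():
    """Position dict meaning 'start of stream' (assumed format from the connector docs)."""
    return {
        "offset": "-1",        # "-1" is assumed to denote the start of the stream
        "seqNo": -1,           # not used when positioning by offset
        "enqueuedTime": None,  # not used when positioning by offset
        "isInclusive": True,
    }


def build_eh_conf(connection_string, starting_position):
    """Hypothetical helper: build the options dict passed to the eventhubs reader."""
    return {
        "eventhubs.connectionString": connection_string,
        # The connector expects the position serialized as a JSON string.
        "eventhubs.startingPosition": json.dumps(starting_position),
    }


# Placeholder value; substitute your own connection string.
eh_conf = build_eh_conf("<your-connection-string>", start_of_stream_position())

# With a live SparkSession and the connector on the classpath, the options
# would then be passed to the reader, for example:
# df = spark.readStream.format("eventhubs").options(**eh_conf).load()
```

Setting the position explicitly makes the job's behaviour independent of whatever the default turns out to be in a given connector version.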

laurencewells, May 06 '21 10:05

Thanks for pointing out the error in the documentation. I'll update the PySpark doc to fix this.

nyaghma, May 20 '21 21:05

Hi,

We ran into the same surprise yesterday and could not understand why the job kept starting at the end of the stream. We lost time debugging and building workarounds. Could this be prioritized a bit? It is really confusing for PySpark users.

Thanks! Best Regards, Stefan Prisca.

stefanprisca, Sep 02 '21 07:09

@stefanprisca Same here, we sank a good hour or two into debugging what was happening.

laurencewells, Sep 02 '21 07:09