azure-cosmosdb-spark

Support readchangefeed from a certain point in time

Open nomiero opened this issue 5 years ago • 2 comments

We currently support reading the change feed either from the beginning or from the current time. We also need to support starting from an arbitrary point in time.

nomiero, Mar 25 '19 18:03

I think the checkpointLocation feature could be used for this, but a few things are unclear to me. Where is the checkpoint stored, and in what format? Can we use the last saved checkpoint to restart from that point?

sandeepwww, Apr 06 '19 08:04

The checkpoint is currently saved in HDFS, though this will change soon. And yes, if you stop the job and start it again with the same checkpoint location, it will continue from the same point. The problem now is that if you want to read documents from, say, the last 3 days, you have to populate the checkpoint location yourself, which is not trivial. We are looking into making this experience better.

nomiero, Apr 07 '19 01:04