kafka-connect-cosmosdb

Add configuration option to support reading changes from Beginning of Time

Open ryankingston opened this issue 3 years ago • 2 comments

When configuring my Cosmos DB Source connector, it seems that the connector processes changes from the last time it was monitored by a source task. I would like to process these changes from the beginning of time in my Cosmos DB instance, but I can't find a way to do this. During configuration, the only two options for the "Use latest offset" attribute seem to be processing from the last recorded offset or processing from the last time it was monitored by a source task; there is nothing for beginning of time. Any suggestions here?

Note: Using Conduktor as the Kafka Platform.

ryankingston, Feb 17 '22 16:02

Correct, this doesn't appear to be supported today. We could add a startFromBeginning config option, or something similar to how Azure Functions handles this in the Cosmos DB trigger.

For now, you could look into deleting the lease collection in Cosmos DB and letting the connector recreate a new lease collection. I believe this will have the same effect as telling the connector to "start from the beginning".

ryancrawcour, Feb 17 '22 21:02

Another option would be to add a withStartTime configuration option that would allow a user to set a specific start date and time. To read from the beginning of the life of the container, you would set the config value to the minimum valid Java date and time.

similar to https://docs.microsoft.com/en-us/azure/cosmos-db/sql/change-feed-processor#reading-from-the-beginning
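To make the two proposals concrete, here is a minimal sketch of how a connector could turn them into a single start position. Nothing here exists in the connector today: the class `StartPositionSketch`, the method `resolveStartTime`, and the config keys `startFromBeginning` and `withStartTime` are hypothetical, with the key names simply borrowed from the suggestions in this thread. The sketch also assumes the start time is supplied as an ISO-8601 UTC string, and it uses `Instant.EPOCH` as a stand-in for "beginning of time" since no container can predate 1970; `Instant.MIN` would be the literal "min valid Java date & time".

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper; not part of kafka-connect-cosmosdb. */
public final class StartPositionSketch {

    // Hypothetical config keys, named after the options floated in this thread.
    public static final String START_FROM_BEGINNING = "startFromBeginning";
    public static final String WITH_START_TIME = "withStartTime";

    private StartPositionSketch() {
    }

    /**
     * Resolves where reading should start.
     *
     * @return the instant to start from, or empty to keep the current
     *         behaviour (resume from the stored offset or lease)
     */
    public static Optional<Instant> resolveStartTime(Map<String, String> config) {
        // startFromBeginning wins over an explicit start time.
        // Boolean.parseBoolean(null) is false, so a missing key is fine.
        if (Boolean.parseBoolean(config.get(START_FROM_BEGINNING))) {
            return Optional.of(Instant.EPOCH);
        }

        String raw = config.get(WITH_START_TIME);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }

        try {
            // Assumed format: ISO-8601 in UTC, e.g. 2022-02-17T16:02:00Z
            return Optional.of(Instant.parse(raw.trim()));
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException(
                    WITH_START_TIME + " must be an ISO-8601 UTC timestamp, got: " + raw, e);
        }
    }
}
```

With something like this, `startFromBeginning=true` and `withStartTime=1970-01-01T00:00:00Z` would resolve to the same start position, and leaving both unset keeps whatever the connector does now. The resolved instant would still have to be handed to the change feed reader, which is not shown here.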

ryancrawcour, Feb 17 '22 21:02