Correct way to read records from an epoch time

Open gogrisohil opened this issue 5 years ago • 1 comments

Hi,

We are trying to run the KCL multilang daemon using the properties file and have it read records from a certain point in time. We tried setting initialPositionInStream = AT_TIMESTAMP but then we got the error java.lang.IllegalArgumentException: Invalid InitialPosition: AT_TIMESTAMP. We then tried to set timestampAtInitialPositionInStream = 1617305352 and not set initialPositionInStream to anything. At that point the lease table for all shards pointed to LATEST instead of AT_TIMESTAMP. We were wondering what we're doing wrong to read records from a certain point.

We're using version 2.3.1 of the KCL.

Thank you.

gogrisohil, Apr 02 '21 20:04

I did a little digging in the repository to see how AT_TIMESTAMP could be parsed by the multilang daemon. Most of the configuration appears to be read and parsed in this file: https://github.com/awslabs/amazon-kinesis-client/blob/master/amazon-kinesis-client-multilang/src/main/java/software/amazon/kinesis/multilang/config/MultiLangDaemonConfiguration.java

I saw a member variable, InitialPositionInStreamExtended, which seems to give us what we want in terms of setting AT_TIMESTAMP and the date field. I couldn't see how to specify that variable in the properties file, so I'm proposing this change, https://github.com/awslabs/amazon-kinesis-client/pull/804, which parses the timestampAtInitialPositionInStream key in the properties file as a Long. The change works locally and allowed us to initialize a stream reader from a point in time.
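To illustrate the idea, here is a minimal, self-contained sketch of reading that key as a Long and turning it into a java.util.Date. It is not the code from the PR. The class and method names are hypothetical, and it assumes the value is in epoch seconds (as 1617305352 from the original question appears to be), so it is multiplied by 1000 because Date expects milliseconds.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Date;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the KCL.
public class TimestampPositionSketch {
    // Property key taken from the question above.
    static final String KEY = "timestampAtInitialPositionInStream";

    // Returns the configured start time, or null when the key is absent or blank.
    // Assumption: the value is epoch SECONDS; java.util.Date expects milliseconds.
    static Date parseStartTimestamp(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        long epochSeconds = Long.parseLong(raw.trim());
        return new Date(epochSeconds * 1000L);
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(KEY + " = 1617305352\n"));

        Date start = parseStartTimestamp(props);
        System.out.println("Start reading at: " + start.toInstant());

        // With the KCL on the classpath, this Date could then be passed to
        // InitialPositionInStreamExtended.newInitialPositionAtTimestamp(start).
    }
}
```

Returning null for a missing key lets the caller fall back to whatever initialPositionInStream specifies instead of silently picking a timestamp.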

I'm not sure if this is the best way to do it, but I would love some feedback or direction on whether there's a better alternative.

kevioke, Apr 05 '21 19:04