amazon-kinesis-client MillisBehindLatest metric across _all_ shards

Currently several metrics, including MillisBehindLatest, are reported to CloudWatch with the shard ID as one of their dimensions. We find it very convenient to set CloudWatch alarms on this metric so that we can react if any shard starts to lag behind. However, it is not possible to set up an alarm without specifying the exact name of the shard. This is a limiting factor: when you add and remove shards constantly, the shard names are very dynamic, and each time they change you need to change the alarms accordingly, which is frustrating. Since one generally wants to react to any shard lagging behind, it would be very nice to have a global MillisBehindLatest that does not include a shard in its dimensions. This could be the maximum across all shards, for example MaxMillisBehindLatest.

Oct 19 '17 08:10 usrenmae

Kinesis does emit a stream-level metric for iterator age, called GetRecords.IteratorAgeMilliseconds. You should be able to set up an alarm on that metric, which can be found under the Kinesis namespace in CloudWatch. If you set the statistic for that metric to Maximum, it will map to the maximum millisBehindLatest from all the shards for the given period. Please feel free to reopen the issue if you still have questions.
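As a rough sketch of what such an alarm could look like, the helper below builds the keyword arguments for CloudWatch's PutMetricAlarm on the stream-level metric with the Maximum statistic. The function name, the alarm naming scheme, the period and the number of evaluation periods are assumptions chosen for illustration, not recommendations.

```python
def iterator_age_alarm_params(stream_name: str, threshold_ms: int) -> dict:
    """Build keyword arguments for a CloudWatch PutMetricAlarm call.

    Hypothetical helper: the alarm name, period and evaluation periods
    below are illustrative assumptions.
    """
    return {
        "AlarmName": f"{stream_name}-iterator-age",  # assumed naming scheme
        "Namespace": "AWS/Kinesis",
        "MetricName": "GetRecords.IteratorAgeMilliseconds",
        # Stream-level metric: the only dimension is the stream name.
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        # Maximum gives the worst iterator age reported in each period.
        "Statistic": "Maximum",
        "Period": 60,            # seconds (assumed)
        "EvaluationPeriods": 5,  # assumed
        "Threshold": float(threshold_ms),
        "ComparisonOperator": "GreaterThanThreshold",
    }


params = iterator_age_alarm_params("orders", 60_000)
# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Note that this alarm covers the stream as a whole, so it reflects every GetRecords consumer reading from it.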

Oct 19 '17 20:10 sahilpalvia

Thanks for pointing out the GetRecords.IteratorAgeMilliseconds metric; I wasn't aware of it. After a closer look, I found that it is a global, per-stream metric of the Kinesis service. What I'm interested in is a per-consumer metric. We have multiple consumers running on the same stream; some of them may keep up with the event feed perfectly, while others lag. My idea was to have a metric that tells you which particular consumer is lagging behind. It's not possible to get this information from the GetRecords.IteratorAgeMilliseconds metric of the Kinesis stream itself, but KCL could provide such a metric in the same way it provides MillisBehindLatest, just without the shardId dimension. It is actually not convenient at all to build automation around shard-specific metrics: shards are very dynamic and may change over time, and it is not possible to have an alarm on a metric with dimensions without specifying the dimension value. Monitoring built on a per-consumer basis is much more useful: one can set up permanent alarms on it, and only in case of an incident trace back to the particular shard using the existing shard-specific metrics. Please reopen the issue, as suggested above.
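To make the idea concrete, here is a minimal sketch of the aggregation I have in mind, done on the application side. It assumes the consumer can collect its own latest per-shard MillisBehindLatest values; the function names, the use of the application name as namespace and the MaxMillisBehindLatest metric name are hypothetical, and KCL does not publish such a metric today. The returned dict is shaped as keyword arguments for CloudWatch's PutMetricData.

```python
def max_millis_behind(per_shard: dict) -> float:
    """Return the largest MillisBehindLatest across all shards.

    `per_shard` maps shard ID -> latest MillisBehindLatest value.
    An empty mapping is reported as 0.0 (an assumption: no shards, no lag).
    """
    if not per_shard:
        return 0.0
    return float(max(per_shard.values()))


def consumer_lag_metric_data(application_name: str, per_shard: dict) -> dict:
    """Build PutMetricData keyword arguments for one per-consumer datum.

    No ShardId dimension is attached, so an alarm on this metric keeps
    working when shards are split or merged.
    """
    return {
        "Namespace": application_name,  # assumed: one namespace per consumer
        "MetricData": [
            {
                "MetricName": "MaxMillisBehindLatest",  # hypothetical name
                "Value": max_millis_behind(per_shard),
                "Unit": "Milliseconds",
            }
        ],
    }


readings = {"shardId-000000000000": 1200, "shardId-000000000001": 45000}
payload = consumer_lag_metric_data("my-kcl-app", readings)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("cloudwatch").put_metric_data(**payload)
```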

Oct 20 '17 12:10 usrenmae

Thank you for the feedback. We agree with the change you have suggested, and will prioritize it accordingly against the other customer requests we receive.

Oct 20 '17 17:10 sahilpalvia

@sahilpalvia I have the same use case: we want to scale up/down based on how fast the KCL application consumes. This metric would be helpful.

Feb 06 '18 06:02 StevenYCChou

We have a similar use case and would like this metric as well. We have two KCL consumers on the same Kinesis stream. One has a low latency threshold requirement, while the other has a much higher latency threshold.

We've set the alarm at the lower threshold on the stream, but it fires once or twice a day because of the higher-latency KCL consumer. We have to treat it as an alarm situation each time, which obviously wastes a lot of time.

We've considered using the shard-level metric; however, since we are at the limit of maximum alarms allowed and have a 60-shard stream, that is not currently possible.

Mar 15 '18 15:03 ghost

@sahilpalvia we also have the exact same use case. Can you provide any update on this?

Oct 07 '18 18:10 akumariiit

We don't have an update at this time. This is a feature we are interested in adding, and we will prioritize it along with all other customer requests.

For all of those interested, please post a reaction on the parent post; this will assist us in prioritizing customer requests.

Oct 08 '18 19:10 pfifer

+1

Oct 08 '18 19:10 waffleshop

+1

Oct 09 '18 04:10 vinujan59

+1

Oct 09 '18 04:10 vik7

+1

Oct 31 '18 07:10 akumariiit

+1

Nov 28 '18 21:11 rkass

+1 We have more than 500 shards in Kinesis and more than 4 KCL applications using the same Kinesis stream. In the AWS CloudWatch console we cannot search all shards, because the console search result limit is 500, so we do not use the KCL metrics. In addition, the number of metrics we can graph at one time in the console is limited to 100. This feature is essential for me to check the lag of each KCL application.

Mar 21 '19 08:03 winty56

+1

Jun 14 '21 07:06 kaisermario

@pfifer Any update?

Jun 14 '21 07:06 kaisermario

+1

Jun 14 '21 08:06 MeisterMasi

+1

Jun 14 '21 08:06 CCBow-501

Hello,

There are service-side metrics emitted for monitoring stream-level lag. For consumers using GetRecords, the "GetRecords.IteratorAgeMilliseconds" metric is emitted, and all consumer applications contribute to it. Consumer applications using enhanced fan-out emit the "SubscribeToShardEvent.MillisBehindLatest" metric along with the consumer name, so the status of each consumer can be monitored individually.

Consider using these metrics as an alternative to client-side metrics for monitoring application health.

For more details please refer to: https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-cloudwatch.html
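As an illustration of monitoring one enhanced fan-out consumer individually, the sketch below builds the keyword arguments for a CloudWatch GetMetricStatistics query on that metric. The function name, the 15-minute window and the 60-second period are assumptions, and the StreamName and ConsumerName dimension names should be checked against the documentation linked above.

```python
from datetime import datetime, timedelta, timezone


def efo_lag_query(stream_name: str, consumer_name: str,
                  end_time: datetime, window_minutes: int = 15) -> dict:
    """Build GetMetricStatistics keyword arguments for one EFO consumer.

    Hypothetical helper: window and period are illustrative defaults.
    """
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "SubscribeToShardEvent.MillisBehindLatest",
        # One time series per registered consumer of the stream
        # (dimension names assumed from the monitoring documentation).
        "Dimensions": [
            {"Name": "StreamName", "Value": stream_name},
            {"Name": "ConsumerName", "Value": consumer_name},
        ],
        "StartTime": end_time - timedelta(minutes=window_minutes),
        "EndTime": end_time,
        "Period": 60,  # seconds (assumed)
        "Statistics": ["Maximum"],
    }


query = efo_lag_query("orders", "billing-consumer", datetime.now(timezone.utc))
# With boto3 installed and credentials configured, this could be run as:
#   boto3.client("cloudwatch").get_metric_statistics(**query)
```

The metric is only expected to appear for consumers that actually use enhanced fan-out, as described above.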

Jun 14 '21 17:06 yasemin-amzn

Hello @yasemin-amzn, "SubscribeToShardEvent.MillisBehindLatest" is a basic (stream-level) metric according to: https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-cloudwatch.html

Stream-level data is sent automatically every minute at no charge.

Unfortunately we can't see this metric in our account.

Jun 22 '21 07:06 kaisermario

+1

Jun 24 '21 07:06 leifbladt

+1

Nov 30 '21 06:11 QwertV2
