
Use Kinesis for EMR status processing

Open jezdez opened this issue 7 years ago • 4 comments

AWS CloudWatch recently gained the ability (thanks @robotblake!) to stream EMR events. That means cluster state changes can be forwarded to any of the available CloudWatch Events targets, including custom Lambda functions and SQS.

We should make use of this, since it would let us stop querying the AWS EMR API altogether and instead have AWS post status updates to ATMO.

Naturally, we could write a quick Lambda function that takes the JSON payload of an EMR status event and POSTs it to an ATMO API. But that has other side effects and may be brittle, since ATMO's HTTP endpoint can be unavailable during deploys.

Instead, I suggest using AWS SQS FIFO queues as the target for the CloudWatch events and having an ATMO-side SQS processor that periodically goes through the queued events and updates our database.
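To make the processor idea concrete, here is a rough sketch of the "apply queued events to our database" step. It is not ATMO code: `apply_emr_event`, `process_batch` and the plain dict standing in for the cluster table are hypothetical names, and the field names (`source`, `detail-type`, `detail.clusterId`, `detail.state`, `time`) assume the usual CloudWatch Events payload for EMR cluster state changes. Fetching and deleting messages from the queue is left out.

```python
import json


def apply_emr_event(raw_message, states):
    """Apply one EMR state-change event (a JSON string) to `states`.

    `states` maps cluster ID -> (state, event time) and is a hypothetical
    stand-in for ATMO's database. Returns True if the stored state changed.
    """
    event = json.loads(raw_message)
    # Skip anything that is not an EMR cluster state change.
    if event.get("source") != "aws.emr":
        return False
    if event.get("detail-type") != "EMR Cluster State Change":
        return False

    detail = event["detail"]
    cluster_id = detail["clusterId"]
    state = detail["state"]
    # Assumed to be a UTC ISO 8601 timestamp like "2017-03-22T08:03:00Z",
    # so plain string comparison orders events chronologically.
    when = event["time"]

    previous = states.get(cluster_id)
    if previous is not None and previous[1] > when:
        # An older event arrived late; keep the newer state we already have.
        return False
    states[cluster_id] = (state, when)
    return True


def process_batch(raw_messages, states):
    """Apply a batch of queued messages and return how many changed a state."""
    return sum(apply_emr_event(raw, states) for raw in raw_messages)
```

Comparing event timestamps means a late or out-of-order delivery cannot overwrite a newer cluster state, so the processor would not depend solely on the queue's ordering guarantees.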

jezdez • Mar 22 '17 08:03

@maurodoglio @robhudson @robotblake @jasonthomas Does that sound sensible to you? The goal here is to make sure the cluster state in ATMO is always up-to-date.

jezdez • Mar 22 '17 08:03

+1 on your proposal

vitillo • Mar 22 '17 09:03

@vitillo Thanks!

jezdez • Mar 22 '17 10:03

Turns out we can't use SQS in FIFO mode, as CloudWatch doesn't support it as a target (yet). Instead I'll use Kinesis.
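For the Kinesis side, a minimal sketch of turning stream records back into event dicts could look like this. It assumes CloudWatch writes each event to the stream as a JSON document, and that `records` is the `Records` list from a boto3 `get_records` response, where `Data` is already `bytes`. The function name `events_from_kinesis_records` is made up for illustration.

```python
import json


def events_from_kinesis_records(records):
    """Yield (sequence_number, event) pairs from Kinesis records.

    Records whose payload is not valid UTF-8 JSON are skipped rather than
    aborting the whole batch.
    """
    for record in records:
        try:
            event = json.loads(record["Data"].decode("utf-8"))
        except ValueError:
            # Covers both UnicodeDecodeError and json.JSONDecodeError,
            # which are subclasses of ValueError.
            continue
        yield record["SequenceNumber"], event
```

Keeping the sequence number next to each event would let the processor remember the last one it handled and resume from there after an ATMO deploy.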

jezdez avatar Apr 06 '17 09:04 jezdez