
Appsync integration with AWS IOT is really inefficient and uneconomical

Open jonsmirl opened this issue 3 years ago • 3 comments

It is simply not economical to run millions of IOT updates through AppSync where active GraphQL users scan for the 0.00001% they are interested in. A much more efficient solution has been proposed in this forum post: https://forums.aws.amazon.com/thread.jspa?threadID=271817&tstart=0

Re: adding subscription resolvers for extended pubsub queries

With the flexibility of GraphQL directives, you could support other types of subscriptions, for example, something like: type Subscription { onMachineStateUpdate(id: ID): MachineState @aws_iot_subscribe(topic: ["/MachineState/<id>"]) } Then a client could listen to IoT topics through that subscription.

This scheme is very efficient because it leaves all of the IOT updates that are irrelevant to the active GraphQL users inside the IOT subsystem. When the GraphQL subscription starts, AppSync would do the iot-subscribe, and then tear it down when the user stops listening. MQTT messages from the IOT subscription would be applied as mutations. This is not something an external user can implement; the AppSync team needs to build it.
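The subscribe-on-start, tear-down-on-stop lifecycle described above can be sketched roughly as follows. This is only an illustration of the idea, not anything AppSync provides: `FakeIotBroker` and `SubscriptionBridge` are hypothetical in-memory stand-ins, and the only detail taken from the proposal is the `/MachineState/<id>` topic template.

```python
from collections import defaultdict

TOPIC_TEMPLATE = "/MachineState/<id>"  # template from the directive quoted above


class FakeIotBroker:
    """Hypothetical in-memory stand-in for IoT Core; not a real AWS API."""

    def __init__(self):
        self.active_topics = set()

    def subscribe(self, topic):
        self.active_topics.add(topic)

    def unsubscribe(self, topic):
        self.active_topics.discard(topic)


class SubscriptionBridge:
    """Holds one broker subscription per topic while at least one listener exists."""

    def __init__(self, broker):
        self.broker = broker
        self.listeners = defaultdict(int)  # topic -> number of GraphQL listeners

    def start(self, device_id):
        topic = TOPIC_TEMPLATE.replace("<id>", device_id)
        if self.listeners[topic] == 0:
            self.broker.subscribe(topic)  # first listener: open the IoT subscription
        self.listeners[topic] += 1
        return topic

    def stop(self, topic):
        if self.listeners.get(topic, 0) == 0:
            return  # nothing to tear down
        self.listeners[topic] -= 1
        if self.listeners[topic] == 0:
            del self.listeners[topic]
            self.broker.unsubscribe(topic)  # last listener gone: tear it down


broker = FakeIotBroker()
bridge = SubscriptionBridge(broker)
first = bridge.start("42")
second = bridge.start("42")            # second listener reuses the same topic
bridge.stop(first)
still_active = set(broker.active_topics)  # one listener remains
bridge.stop(second)
after_teardown = set(broker.active_topics)
```

The point of the reference count is that traffic for a device only leaves the broker while someone is actually listening to it.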

jonsmirl avatar Feb 23 '21 16:02 jonsmirl

Just to clarify: the problem you're trying to solve is that you want a client to be able to specify filters for incoming subscription data, is that right? Can you tell us more about your use case so I can better understand?

jpignata avatar Mar 18 '21 02:03 jpignata

Summary: I want to filter the data in IOT-core before it arrives inside AppSync. This has to be done dynamically as people subscribe/unsubscribe in their AppSync session.

It is a question of pushing the data versus pulling the data. This example pushes the data: https://github.com/aws-samples/aws-appsync-iot-core-realtime-example So if you have millions of IOT updates arriving, all of them get pushed through AppSync, even if no one is listening to them.

Compare this to a pull model. A user uses a phone app to check some specific device via an AppSync subscription. AppSync then internally does an IOT-Subscribe over to IOT-core using the supplied pattern (this has to be done in the context of the IAM user, since there are policies in place in IOT-core restricting who can subscribe). Now only the IOT data that matches that subscription is copied and passed through AppSync. In my case that might be 10 data points instead of millions.
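To make the difference between the two models concrete, here is a small counting sketch. The numbers are illustrative only (100,000 simulated devices, one watched device), and `forwarded_counts` is a hypothetical helper, not part of any AWS SDK.

```python
def forwarded_counts(updates, watched_topics):
    """Return (push_count, pull_count) for one batch of IoT updates.

    push: every update is forwarded to AppSync and filtered there.
    pull: only updates on a topic someone subscribed to are forwarded.
    """
    push_count = len(updates)
    pull_count = sum(1 for topic, _payload in updates if topic in watched_topics)
    return push_count, pull_count


# Illustrative numbers only: each device publishes one update,
# and a single user watches one device.
updates = [(f"/MachineState/{i}", {"state": "ok"}) for i in range(100_000)]
watched = {"/MachineState/42"}

push_count, pull_count = forwarded_counts(updates, watched)
print(push_count, pull_count)  # 100000 1
```

In the push model the filter runs after the data has already been paid for and moved; in the pull model the filter decides what gets moved at all.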

Our current workaround is for the App to hold two parallel AWS connections, one to AppSync and one to IOT-Core, and then merge the data in the App. What I would prefer is a single connection to AppSync with the IOT data coming in via the AppSync subscription. That model leaves 99.99% of the IOT data inside IOT-core where it is efficiently being pushed into DynamoDB (maybe Timestream in the future).
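For readers wondering what "merge the data in the App" involves, a minimal sketch follows. It assumes each connection yields events already sorted by timestamp, and the event shape (`device`, `ts`, `value`) is made up for illustration; the real App's data model is not described in this thread.

```python
import heapq
from operator import itemgetter


def merge_feeds(appsync_events, iot_events):
    """Merge two timestamp-sorted event lists into one ordered, de-duplicated list."""
    merged = heapq.merge(appsync_events, iot_events, key=itemgetter("ts"))
    seen = set()
    result = []
    for event in merged:
        key = (event["device"], event["ts"])
        if key in seen:
            continue  # the same reading arrived on both connections
        seen.add(key)
        result.append(event)
    return result


# Hypothetical sample data: history from AppSync, live readings from IoT Core.
history = [
    {"device": "42", "ts": 1, "value": "idle"},
    {"device": "42", "ts": 2, "value": "running"},
]
live = [
    {"device": "42", "ts": 2, "value": "running"},  # overlaps with history
    {"device": "42", "ts": 3, "value": "stopped"},
]

timeline = merge_feeds(history, live)
```

This is the bookkeeping that a single AppSync connection carrying the IoT data would make unnecessary on the client.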

The basic problem here is that AppSync users only want to see a very tiny percentage of the real-time IOT data. Typically just one specific device out of millions of devices. So there needs to be a way to just pass that tiny sliver of data through AppSync instead of the complete set of all of the IOT data. In other words, do the filtering step inside IOT-core instead of inside AppSync.

Take Ring cameras as an example. There are tens of millions of Ring devices pushing data up to AWS. When a user opens their app, AppSync could query historical data and then give real-time updates on their camera. Users watch their cameras in live mode 0.001% of the time. Is it better to push the real-time data for millions of cameras through AppSync just so that the few live users can use AppSync to filter for events on their personal camera? Or would it be better for AppSync to internally do an IOT-subscribe and pull only the relevant data into AppSync? Move the filter step into IOT-Core.

If you look at this example app, you might say -- why don't you just make a rule that matches the user's request? https://github.com/aws-samples/aws-appsync-iot-core-realtime-example You could, but then you need AppSync events for start/stop subscription (those events don't exist) in order to dynamically build the rule and then to destroy the rule. And if that process has errors, those rules just pile up sending pointless data into AppSync so there needs to be a robust error recovery scheme to stop that. If you consider what you are doing, you are simply rebuilding the IOT-subscribe command. And IOT-subscribe already has robust error recovery since when the session ends all of the resources are automatically freed. So if AppSync handles these IOT-subscribes internally, and the AppSync server dies -- IOT-Core will automatically clean up.
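The cleanup argument above can be illustrated with a toy model. Both classes are hypothetical in-memory stand-ins written for this comment: `RuleStore` models rules that must be deleted explicitly, and `SessionBroker` models the behaviour described above, where subscriptions belong to a session and disappear with it.

```python
class RuleStore:
    """Rules live until someone explicitly deletes them."""

    def __init__(self):
        self.rules = set()

    def create(self, topic_filter):
        self.rules.add(topic_filter)

    def delete(self, topic_filter):
        self.rules.discard(topic_filter)


class SessionBroker:
    """Subscriptions are owned by a session and freed when it ends."""

    def __init__(self):
        self.by_session = {}  # session_id -> set of topic filters

    def subscribe(self, session_id, topic_filter):
        self.by_session.setdefault(session_id, set()).add(topic_filter)

    def session_closed(self, session_id):
        # Dropping the session drops every subscription it owned.
        self.by_session.pop(session_id, None)

    def active_filters(self):
        return set().union(*self.by_session.values())


# Rule-based approach: the process dies before delete() is ever called.
rules = RuleStore()
rules.create("/MachineState/42")
leaked = set(rules.rules)  # the rule keeps sending data with no listener

# Session-based approach: the broker notices the dropped connection.
broker = SessionBroker()
broker.subscribe("session-1", "/MachineState/42")
broker.session_closed("session-1")
remaining = broker.active_filters()
```

With rules, recovery logic has to find and delete the leftovers; with session-owned subscriptions, there is nothing left to find.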

jonsmirl avatar Mar 18 '21 12:03 jonsmirl

I'm not sure whether AWS has made any changes for this. As stated, AppSync will broadcast a message only if there are connected listeners; otherwise the data can go directly somewhere other than AppSync. I hope AppSync does not charge for messages that only pass through. I asked this question on the AWS forum with a reference to this issue: https://repost.aws/questions/QUu7SM82q1TiWKfXVAjsolrg/app-sync-client-filter-for-iot

ismail-ozyigit1 avatar Sep 11 '22 07:09 ismail-ozyigit1