
PubSub reconnect

Open leantorres73 opened this issue 5 years ago • 62 comments

Related to https://github.com/aws-amplify/amplify-js/issues/1844: if I close the WebSocket connection and then try to resubscribe, the connection is not re-established.

To Reproduce

Steps to reproduce the behavior:

  1. Subscribe to a topic, then close the connection and then try to resubscribe to the same topic again.

Describe the solution you'd like

There should be a way to reconnect to a closed WebSocket connection.

A method like removePluggable should be included so that we can unsubscribe and start over again.

leantorres73 avatar Apr 08 '19 14:04 leantorres73

We also need this. If you lose signal on a portable device you're left without a valid clean method of re-establishing a connection.

kirkryan avatar Jun 14 '19 12:06 kirkryan

Why are none of the maintainers acknowledging this and offering some support? We had to do some very complex reconnect functionality which is a little frail and always needs fine tuning. I believe the PubSub module should handle these scenarios that are very common (signal drops)

timoteialbu avatar Aug 21 '19 18:08 timoteialbu

Worst thing is that MQTT.js DOES have this functionality and that the AWS IoT client that is built on top of it simply ignores it for some reason. I’ve heard that the AWS team are working on a new release that will improve this but would like to see a comment here from the team.

As it stands, the smallest blip in connectivity lasting longer than a few seconds will cause the AWS IoT provider to disconnect, and there is no way to reconnect it or test its current state.

kirkryan avatar Aug 21 '19 18:08 kirkryan

@leantorres73 I tried this code on a react-native app

import React, { useEffect, useState } from 'react';
import { StyleSheet, Text, View, Button } from 'react-native';
import Amplify, { Analytics, API, PubSub } from 'aws-amplify';
import { AWSIoTProvider } from "@aws-amplify/pubsub/lib/Providers";

Analytics.disable();

Amplify.configure({
  Auth: {
    identityPoolId: "us-west-2:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",
    region: 'us-west-2'
  }
});
Amplify.addPluggable(new AWSIoTProvider({
  aws_pubsub_region: 'us-west-2',
  aws_pubsub_endpoint: 'wss://xxxxxxxxxxxx-ats.iot.us-west-2.amazonaws.com/mqtt',
}));

let subscription;

function App() {
  const [message, setMessage] = useState('Open up App.js to start working on your app!');
  useEffect(() => {
    connect();
  }, []);
  
  
  function connect() {
    subscription = PubSub.subscribe('myTopic').subscribe({
      next: data => setMessage(JSON.stringify(data, null, 2)),
      error: error => setMessage(JSON.stringify(error, null, 2)),
      close: () => setMessage('Done'),
    });
  };

  function disconnect() {
    subscription.unsubscribe();
  };

  return (
    <View style={styles.container}>
      <Button title="connect" onPress={connect}></Button>
      <Button title="disconnect" onPress={disconnect}></Button>
      <Text>{message}</Text>
    </View>
  );
}

export default App;

const styles = StyleSheet.create({
  container: {
    flex: 1,
    backgroundColor: '#fff',
    alignItems: 'center',
    justifyContent: 'center',
  },
});

I was able to subscribe to the same topic without issues.

elorzafe avatar Aug 30 '19 21:08 elorzafe

@timoteialbu @kirkryan sorry for the late response. For connection errors on the observer, I merged a PR this week that solves that problem (PR #3890), so you can reconnect from there. Can you try the latest unstable version?

elorzafe avatar Aug 30 '19 21:08 elorzafe

@timoteialbu @kirkryan sorry for the late response. For connection errors on the observer, I merged a PR this week that solves that problem (PR #3890), so you can reconnect from there. Can you try the latest unstable version?

Hi @elorzafe - we’ll try it out this week, thank you. I believe this will fix part 1 of the overall issue (detection of a disconnect); however, there is still no clean way of re-establishing a connection. Simply running addPluggable re-establishes the existing connection AND a new connection, resulting in two client IDs and duplicate messages. What we really need is a method of either:

1. An IoT pluggable reconnect, or
2. An IoT removePluggable, so that we can clear the old stale connection and then run addPluggable again

Does that clarify the issue?

kirkryan avatar Aug 31 '19 06:08 kirkryan

@kirkryan what happens if you try to subscribe to the same topic again (without adding the plugin)? It works for me; is there some reason this is not working for you? Let me know how it goes.

elorzafe avatar Sep 03 '19 18:09 elorzafe

@elorzafe - this doesn't work as the WebSocket itself is disconnected at this point and therefore there is no way to re-activate it, other than running an addPluggable - which then adds a new WebSocket but you then have 2 client ID's subscribed and start to receive duplicate messages!

What we really need is:

  1. A method to check the state of the WebSocket connection (pluggableState?)
  2. A method to reconnect (or auto-reconnect?) the WebSocket connection (reconnectPluggable?)
  3. A method to disconnect or remove the existing connection added by addPluggable, so that we can run it again and not have duplicate messages

Does that make sense?

kirkryan avatar Sep 03 '19 19:09 kirkryan

Hi @elorzafe - I upgraded our Amplify to the latest version as of today. I can confirm that when the client kills the WebSocket (after approx. 2 minutes if you lock your phone), the app shows a code 7: Socket error undefined when you bring it back to the foreground (expected, as the app is suspended, so the client hits its timeout and the connection becomes invalid).

If you resubscribe to a topic it looks like the WebSocket is re-connected (without having to run an addPluggable command).

Therefore this leaves the final piece of the puzzle, how do we check the state of the client so that we can cleanly handle having to resubscribe to all relevant topics?

kirkryan avatar Sep 09 '19 11:09 kirkryan

I think that we have similar issue (using aws-amplify v. 1.1.40 ). For example, if I subscribe to two topics:

subscriptions = PubSub.subscribe(['topic1', 'topic2']).subscribe({
      next: data => setMessage(JSON.stringify(data, null, 2)),
      error: error => setMessage(JSON.stringify(error, null, 2)),
      close: () => setMessage('Done'),
    })

And then

subscriptions.unsubscribe()

When I try to subscribe again to 'topic1', I can no longer receive its messages. What is the recommended way to subscribe to topics again with aws-amplify? @elorzafe ?

Edit: fixed typing error in my example.

mevert avatar Sep 17 '19 13:09 mevert

I think that we have similar issue (using aws-amplify v. 1.1.40 ). For example, if I subscribe to two topics:

subscriptions = PubSub.subscribe('topic1', 'topic2').subscribe({
      next: data => setMessage(JSON.stringify(data, null, 2)),
      error: error => setMessage(JSON.stringify(error, null, 2)),
      close: () => setMessage('Done'),
    })

And then

subscriptions.unsubscribe()

When I try to subscribe again to ‘topic1’, I can no longer receive its messages. What is the recommended way to subscribe topics again with aws-amplify? @elorzafe ?

I just noticed that subscribing again to old topics after unsubscribing used to work with aws-amplify v. 1.1.30 but does not work anymore with the latest v. 1.1.40. Is it possible that there is some breaking change between these versions? @elorzafe So I will have to use the old version. Edit: I did not get it working again even with 1.1.30 after I deleted my package-lock.json and node_modules and reinstalled everything. Perhaps I had some old version hanging around or something... It would be really nice to get this fixed soon.

mevert avatar Sep 17 '19 15:09 mevert

subscriptions = PubSub.subscribe('topic1', 'topic2').subscribe({
     next: data => setMessage(JSON.stringify(data, null, 2)),
     error: error => setMessage(JSON.stringify(error, null, 2)),
     close: () => setMessage('Done'),
   })

@mevert topic2 in that case is treated as providerOptions, not as a second topic. The first parameter of subscribe is a string or string[], so it should look like this:

subscriptions = PubSub.subscribe(['topic1', 'topic2']).subscribe({
      next: data => setMessage(JSON.stringify(data, null, 2)),
      error: error => setMessage(JSON.stringify(error, null, 2)),
      close: () => setMessage('Done'),
    })
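
To make the argument shape concrete, here is a minimal sketch of a wrapper that accepts topics as separate arguments and always forwards them as one array. subscribeToTopics is a hypothetical name (not an Amplify function), and pubsub is assumed to be any object whose subscribe takes a string or string[] as its first parameter, as described above.

```javascript
// Hypothetical wrapper, not an Amplify API: collects every topic into a
// single array so that a second topic is never mistaken for providerOptions.
function subscribeToTopics(pubsub, ...topics) {
  // flat() also allows callers to pass an array: subscribeToTopics(p, ['a', 'b'])
  return pubsub.subscribe(topics.flat());
}
```

With this, subscribeToTopics(PubSub, 'topic1', 'topic2') and subscribeToTopics(PubSub, ['topic1', 'topic2']) would both end up calling subscribe with ['topic1', 'topic2'].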

elorzafe avatar Sep 20 '19 17:09 elorzafe

@kirkryan why do you need the state if the observable is closing the subscription and sending the error? I think it is cleaner to handle reconnection logic there than to have a process that checks state. Another option could be sending events on Hub so that you can handle the resubscription logic.

How did you plan to check the state of the client?

elorzafe avatar Sep 20 '19 17:09 elorzafe

subscriptions = PubSub.subscribe('topic1', 'topic2').subscribe({
     next: data => setMessage(JSON.stringify(data, null, 2)),
     error: error => setMessage(JSON.stringify(error, null, 2)),
     close: () => setMessage('Done'),
   })

@mevert topic2 in that case is treated as providerOptions, not as a second topic. The first parameter of subscribe is a string or string[], so it should look like this:

subscriptions = PubSub.subscribe(['topic1', 'topic2']).subscribe({
      next: data => setMessage(JSON.stringify(data, null, 2)),
      error: error => setMessage(JSON.stringify(error, null, 2)),
      close: () => setMessage('Done'),
    })

Sorry, I had a typo when writing it here. However, we are using string[] in our real case. Any idea whether your previous changes to PubSub affect this subscription problem, since I don't have the same problem with "@aws-amplify/pubsub": "1.1.0"?

mevert avatar Sep 21 '19 07:09 mevert

@mevert I will look into this

elorzafe avatar Sep 23 '19 19:09 elorzafe

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 23 '19 19:10 stale[bot]

Hi,

Is there any progress on this? I am having a similar issue:

1. My applicationA subscribes to a few topics.
2. My applicationA opens my other applicationB.
3. My applicationB does not use Amplify and only performs specific tasks.
4. After about 1-2 minutes (but it could be more), applicationB reopens applicationA.
5. Upon foreground, applicationA's subscriptions call their error function with error: "Disconnected, error code 8"

react-native: 0.57.8
aws-amplify: ^1.2.2

How can I handle this? Is there a way to reconnect automatically?

Thank you

nayan27 avatar Oct 25 '19 15:10 nayan27

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Nov 30 '19 21:11 stale[bot]

I would love to see a clear example of how to reconnect on disconnect.

peterecofit avatar Dec 01 '19 03:12 peterecofit

Also struggling with this using GraphQL subscriptions within React Native.

API.graphql(
  graphqlOperation(onCreateMessage, {messageHouseId: this.props.house.id})
  ).subscribe({
    next: (message) => {
      const messageObj = message.value.data.onCreateMessage;
      const { user_id : sender_id } = messageObj.user;
      if(sender_id !== this.props.user.user_id) {
        this.props.sendMessage({message: [messageObj]});
      }
    },
    error: err => console.log(err),
    close: () => console.log('Done')
  });

When the app goes to the background, the subscription errors out after a couple of minutes. Should I resubscribe inside error: err => {...}??
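
Resubscribing from the error callback is one option that has been suggested in this thread. The following is a minimal sketch of that idea with exponential backoff, not a confirmed fix: subscribeWithRetry, baseDelayMs, maxDelayMs and schedule are hypothetical names introduced here, and subscribe is assumed to be a function that returns a fresh observable on every call, e.g. () => PubSub.subscribe('myTopic').

```javascript
// Hypothetical helper, not an Amplify API. Re-creates the subscription from
// the observer's error callback, waiting longer after each failed attempt.
function subscribeWithRetry(subscribe, onMessage, options = {}) {
  const {
    baseDelayMs = 1000,
    maxDelayMs = 30000,
    schedule = setTimeout, // injectable so the timing can be tested
  } = options;
  let attempt = 0;
  let current = null;
  let stopped = false;

  function connect() {
    if (stopped) return;
    current = subscribe().subscribe({
      next: (data) => {
        attempt = 0; // a delivered message means the link works again
        onMessage(data);
      },
      error: () => {
        // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
        const delay = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
        attempt += 1;
        schedule(connect, delay);
      },
    });
  }

  connect();
  return {
    // Stop retrying and drop the current subscription.
    stop() {
      stopped = true;
      if (current) current.unsubscribe();
    },
  };
}
```

Whether the resubscribe actually succeeds still depends on the provider; other comments in this thread report cases where the client never reconnects once the error callback has fired.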

camin-mccluskey avatar Dec 03 '19 22:12 camin-mccluskey

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jan 02 '20 23:01 stale[bot]

Still an issue, should not be closed!

houmark avatar Jan 02 '20 23:01 houmark

Very much agree. This needs a clear resolution documented!

peterecofit avatar Jan 02 '20 23:01 peterecofit

Bumping again. We need a clear solution!

peterecofit avatar Jan 23 '20 21:01 peterecofit

Keeping this from going stale. Need a fix soon

justinjoyn avatar Feb 01 '20 14:02 justinjoyn

Hi all, I came across this issue in mid-January and implemented my own reconnect logic using Amplify's PubSub.subscribe API, but recently I saw it not working because of an underlying issue in the library's internal implementation. I dug deep into it and noticed that if the client loses its network connection and reconnects before it runs into the error callback of PubSub.subscribe, it works; but if it runs into the error callback, it never reconnects. So, after researching for quite a long time, I used another approach with which I can now reconnect without any issue. This approach does not use PubSub from Amplify. I also tried contacting AWS support, but none of their suggestions worked.

Work Around Explanation :-

  • I tried to use mqtt which not only has re-connect mechanism but also sends all the messages if the connection is lost for short time(I mean before the socket disconnects and tries to establish a new socket connection). But if entirely disconnects and re-connects you can have a mechanism where you can call all the required api to stay updated.

  • But to use the above library I am generating and signing my own Socket url which is quite easy with SigV4Utils.

  • Please find attached AWS_MQTT_IOT_Reconnect repo where you can find the code for all that is said above. Please read inline comments for better understanding.

Dinesh5799 avatar Mar 16 '20 06:03 Dinesh5799

Hi everyone. Has this issue been resolved ?

Thembelani avatar Jun 03 '20 10:06 Thembelani

This issue should be a priority: it impacts DataStore (#6162) and it's giving us a tough time using it. We are also in the midst of dropping it. Imagine you are building a food delivery system for a franchise and none of the branches can receive any new orders because of this, amend to that.

This issue was raised in Oct 2018 (#1844) and has not been solved since. 3 more months and it reaches its 2-year anniversary.

Seriously? What is taking so long and why isn't this a priority? This just shows how unproven Amplify is and how no one should sink their feet into this blackhole if they'd really care for a stable and reliable app.

Come on, please do better than this. Please put in more resources into fixing bugs and getting Amplify to be as stable as possible. Hell, make apps with it and battle test this piece of thing. Does Amazon even use Amplify for their projects? You know what, don't use Discord. Build a chat communication app with Amplify. Let that be the testing ground.

If you guys aren't doing any of these, Amplify just looks like another pet project to me. Amplify team cranks out Amplify -> Only performs regular unit testing -> Developers do the integration -> Developers complain -> Amplify team doesn't respond or takes months to respond. How could we rely on such a fragile framework to take on any actual production projects when the Amplify team can't be bothered enough to ensure the essentials are working as advertised?

Either the Amplify team is not motivated enough to get this fixed because they don't necessarily have to, or they don't understand how big a problem this is causing.

nubpro avatar Jul 18 '20 20:07 nubpro

This issue is a significant blocker to customers. A library that's mobile and web centric with its bi-directional communication component not handling reconnects and for this long, after this many issues? Decide as a team / company if you want to support builders who will make actual products or if you want to keep pushing for flashy demos that go nowhere.


A workaround I've had to implement is to reach into the internals of PubSub and reset the pluggables myself here. You can take advantage of the code not caring about mutability and returning a reference to the array of pluggables itself by calling getProviders without a provider name (logic here).

To handle reconnects yourself, you'll need to remove the existing pluggable and create a new one, with a new client ID. The problem is on intermittent connection drops, the client id (either generated automatically if not passed in or the one you provided via options) is still considered connected, resulting in the mentioned Disconnected, error code 8 on socket close when you attempt to reconnect.

Even after cleaning up on client disconnect (here), on new subscribes, in the PubSub wrapper, the existing client id continues to be used here.

tl;dr:

  1. On intermittent disconnects (observable emits an error with Disconnected, error code 8) handle clearing existing pluggables with established client ids by reaching into the singleton PubSub implementation and clearing the _pluggables private value array.
  2. Create a new pluggable to generate / pass in a new client id and register it again.
  3. Re-attempt to subscribe to the observable with the topic and the underlying Paho client will reconnect with the new client id.
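
The first two steps above could be sketched roughly as follows. This is only an illustration of the workaround described in this comment, not a supported API: resetProvider and createProvider are hypothetical names, _pluggables is a private field that may change between releases, and it is assumed that the provider accepts a clientId in its options, e.g. (clientId) => new AWSIoTProvider({ ...options, clientId }).

```javascript
// Sketch of steps 1 and 2 of the workaround. `pubsub` is the PubSub
// singleton; `createProvider(clientId)` is a hypothetical factory you supply.
// WARNING: `_pluggables` is private and may change between releases.
let reconnectCount = 0;

function resetProvider(pubsub, createProvider, clientIdPrefix = 'client') {
  // Step 1: drop every registered pluggable, including the stale client id.
  // The array is emptied in place so existing references stay valid.
  pubsub._pluggables.length = 0;

  // Step 2: register a fresh provider under a client id not used before.
  reconnectCount += 1;
  const clientId = `${clientIdPrefix}-${Date.now()}-${reconnectCount}`;
  pubsub.addPluggable(createProvider(clientId));
  return clientId;
}
```

Step 3 is then an ordinary call to PubSub.subscribe for each topic you need, made after resetProvider returns.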

I'm happy to send in a pull request if the team wants to provide direction on how they think this should be solved knowing the above workaround and underlying issue.

cuongvo avatar Aug 04 '20 03:08 cuongvo

I can confirm this has been a long time problem for our field based units and we have had to implement workarounds.

AWS : your IoT implementation never reconnects if an app gets suspended (screen switched off) making it pretty much unusable.

kirkryan avatar Aug 08 '20 05:08 kirkryan