amazon-chime-sdk-js

Unable to catch failed websocket connections

Open kelvin2200 opened this issue 2 years ago • 4 comments

What happened and what did you expect to happen?

Certain clients are behind custom firewall and/or antivirus configurations. When this is the case and the Chime endpoints have not been whitelisted, usually the first thing that fails (among others) is the connection to the Chime signaling service. While there are hacks and workarounds to catch that error in the browser, it would be nice for the SDK to have a listener we can use to detect that specific scenario and show the user a custom message.

Maybe such a listener can already be added and we just haven't found out how. Please correct me if I am wrong.

Have you reviewed our existing documentation?

Reproduction steps

Add these 2 entries to /etc/hosts (Linux):

127.0.0.1 signal.m2.ec1.app.chime.aws
127.0.0.1 data.svc.ue1.ingest.chime.aws

Then start a meeting.

Amazon Chime SDK for JavaScript version

3.15.0

What browsers are you seeing the problem on?

all

Browser version

all

Meeting and Attendee ID Information.

No response

Browser console logs

[WARN] - Chime: stopped pinging (WebSocketFailed)
[WARN] - Chime: will retry due to status code TaskFailed and error: serial group task AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690 was canceled due to subtask AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690/Timeout15000ms error: serial group task AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690/Timeout15000ms/Peer was canceled due to subtask AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690/Timeout15000ms/Peer/SubscribeAndReceiveSubscribeAckTask (once) error: serial group task Signaling was canceled due to subtask Signaling/Timeout15000ms (once) error: WebSocket connection failed

WebSocket connection to 'wss://signal.m2.ec1.app.chime.aws/control/8bcbfa41-73d5-468e-9658-a3ae158d7979?X-Chime-Control-Protocol-Version=3&X-Amzn-Chime-Send-Close-On-Error=1&X-Amzn-Version=3.14.1&X-Amzn-User-Agent=chrome-114' failed:
create @ DefaultWebSocketAdapter.ts:15
serviceConnectionRequestQueue @ DefaultSignalingClient.ts:374
21:35:45.053 ConsoleLogger.ts:79 2023-07-27T18:35:45.053Z [ERROR] Chime - failed to connect

kelvin2200 • Jul 27 '23 18:07

Right now I think the preferred way to listen to failure events on the meeting is via the AudioVideoObserver interface: https://aws.github.io/amazon-chime-sdk-js/interfaces/audiovideoobserver.html

We also have the metricsDidReceive observer callback, which you should check out: https://github.com/aws/amazon-chime-sdk-js/blob/main/guides/17_Migration_to_3_0.md#:~:text=const%20observer%20%3D%20%7B%0A%20%20oldSendBandwidthKbs,%7D%0A%20%20%7D%2C%0A%7D%3B

michhyun1 • Jul 31 '23 19:07
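As a rough illustration of the observer approach suggested above, the sketch below builds an observer that fires a callback when the session stops with a failure before it ever started. It is an assumption-laden sketch, not SDK code: `SessionStatusLike` and `ConnectionObserver` are local stand-ins for the SDK's `MeetingSessionStatus` and `AudioVideoObserver` types, and `createStartFailureObserver` is a hypothetical helper name. It does not tell a blocked WebSocket apart from other start failures.

```typescript
// Stand-in for the SDK's MeetingSessionStatus; only the methods used here.
interface SessionStatusLike {
  statusCode(): number;
  isFailure(): boolean;
}

// Subset of the AudioVideoObserver callbacks relevant to start failures.
interface ConnectionObserver {
  audioVideoDidStart(): void;
  audioVideoDidStop(sessionStatus: SessionStatusLike): void;
}

// Hypothetical helper (not part of amazon-chime-sdk-js): reports a failure
// only if the session stopped with a failing status before it ever started.
function createStartFailureObserver(
  onStartFailure: (statusCode: number) => void,
): ConnectionObserver {
  let started = false;
  return {
    audioVideoDidStart(): void {
      started = true;
    },
    audioVideoDidStop(sessionStatus: SessionStatusLike): void {
      if (!started && sessionStatus.isFailure()) {
        onStartFailure(sessionStatus.statusCode());
      }
    },
  };
}
```

With the real SDK, an observer like this would be registered through `meetingSession.audioVideo.addObserver(observer)`, and the status code compared against `MeetingSessionStatusCode` values such as `TaskFailed`.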

@michhyun1 OK, we know about the AudioVideoObserver and client metrics, but:

  1. a user may not have a video device at all
  2. the observers behave the same way when the connection is poor and there is packet loss

What we need is something that says specifically that the client cannot connect to the WS.

kelvin2200 • Aug 07 '23 08:08

I think the closest thing we have to that is meeting events: https://aws.amazon.com/blogs/business-productivity/monitoring-and-troubleshooting-with-amazon-chime-sdk-meeting-events/

We throw a TaskFailed meeting event when we are unable to connect to the WS.

However, TaskFailed can mean multiple things, not just a WS connection failure. I'm not 100% sure whether there is metadata within a meeting event that shows whether it was caused by the OpenSignalingTask failing as opposed to some other task failing.

michhyun1 • Aug 07 '23 18:08
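One way to narrow a TaskFailed event down to the signaling case might be to inspect the error text carried by the meeting event. The sketch below assumes the event's `meetingErrorMessage` attribute contains the same "WebSocket connection failed" text that appears in the console log in the issue; that string is not a documented contract and may change between SDK versions. `MeetingEventAttributesLike` is a local stand-in for the SDK's attributes type, and `classifyMeetingEvent` is a hypothetical helper.

```typescript
// Stand-in for the attributes passed to eventDidReceive; the SDK's real
// type has many more fields. Only these two are declared here.
interface MeetingEventAttributesLike {
  meetingStatus?: string;
  meetingErrorMessage?: string;
}

type FailureKind = "signaling-websocket" | "other-failure" | "not-a-failure";

// Assumption: mirrors the text at the end of the console log in the issue.
const WEBSOCKET_FAILURE_TEXT = "WebSocket connection failed";

// Hypothetical helper: classifies a meeting event by name and error text.
function classifyMeetingEvent(
  name: string,
  attributes: MeetingEventAttributesLike,
): FailureKind {
  if (name !== "meetingStartFailed" && name !== "meetingFailed") {
    return "not-a-failure";
  }
  const message = attributes.meetingErrorMessage ?? "";
  return message.includes(WEBSOCKET_FAILURE_TEXT)
    ? "signaling-websocket"
    : "other-failure";
}
```

In SDK 3.x the classifier could be called from an `eventDidReceive(name, attributes)` observer registered on `meetingSession.eventController`. Because it relies on string matching, it is best treated as a heuristic for choosing a user-facing message rather than a reliable diagnosis.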