aws-appsync-community
RealTime WebSocket framing issue: frame ends with invalid UTF-8 character
Hello! I'm using AppSync RealTime over WebSocket connections and I'm experiencing intermittent problems that break the WebSocket connection and result in lost messages.
I'm experiencing this problem in a Qt program (using Qt WebSockets).
Messages may contain non-ASCII (multi-byte UTF-8) characters and can be large enough that the AppSync WebSocket server appears to split them into multiple frames. The problem seems to be incorrect framing of WebSocket messages from the AppSync RealTime endpoint: an individual frame ends with an incomplete UTF-8 character, which causes a decoding error on the client (in the Qt WebSocket code).
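To illustrate the failure mode, here is a minimal sketch in Python (not the Qt code). The sample message and the split offset are assumptions chosen to mimic what I see. The sketch shows that a fragment ending in the middle of a multi-byte character cannot be decoded on its own, while the concatenated bytes are valid UTF-8.

```python
import codecs

# Hypothetical message; "á" encodes to the two bytes C3 A1 in UTF-8.
message = '{"name":"Lĺáàä"}'
payload = message.encode("utf-8")

# Split right after the lead byte (C3) of "á", mimicking the reported frame boundary.
split_at = payload.index("á".encode("utf-8")) + 1
first, rest = payload[:split_at], payload[split_at:]


def decodes_alone(fragment: bytes) -> bool:
    """Return True if the fragment is valid UTF-8 on its own."""
    try:
        fragment.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# An incremental decoder buffers a trailing partial sequence until more bytes arrive.
decoder = codecs.getincrementaldecoder("utf-8")()
reassembled = decoder.decode(first) + decoder.decode(rest, final=True)

print(decodes_alone(first), decodes_alone(rest), reassembled == message)
```

The incremental decoder is only one way a client could tolerate such a split. I am not claiming that Qt WebSockets works this way.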
I believe this should be fixed in the AppSync server implementation, since it means the AppSync WebSocket server does not correctly implement the WebSocket RFC (RFC 6455, https://www.rfc-editor.org/rfc/rfc6455#section-1.2), which, as I understand it, specifies that a text frame should always contain valid UTF-8.
If it helps, here is a screenshot of the point inside the Qt WebSocket implementation where the frame decoding fails:
Memory dump of frame.payload():
{"id":"<redacted>","type":"data","payload":{"data":{"subscribeToDeviceReplies":{"mac":"<redacted>","type":"reply","data":"\"{\\\"id\\\":\\\"<redacted>\\\",\\\"list\\\":[{\\\"description\\\":\\\"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\\",\\\"id\\\":0,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384560,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\\",\\\"id\\\":1,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384560,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"aaaaaa\\\",\\\"id\\\":2,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384500,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"aaaaaa\\\",\\\"id\\\":3,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384500,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"aaaaaaaaaaa\\\",\\\"id\\\":4,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384446,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"aaaaaaaaaaa\\\",\\\"id\\\":5,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384446,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"blaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\\",\\\"id\\\":6,\\\"name\\\":\\\"Lĺáàäsŝśáàänñóòö\\\",\\\"timestamp\\\":1662384413,\\\"uuid\\\":\\\"e84b1e81-a4d3-4b62-ab21-c1c397558622\\\"},{\\\"description\\\":\\\"blaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\\",\\\"id\\\":7,\\\"name\\\":\\\"Lĺ�
Notice how the very last byte is C3, the first byte of the multi-byte character á (hex: C3 A1), which is invalid on its own. A UTF-8 multi-byte character should never be split.
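For anyone who wants to check a captured payload for this condition, here is a rough Python sketch that reports how many trailing bytes belong to a truncated UTF-8 sequence. The function name incomplete_tail_length is hypothetical. It inspects only the last few bytes and does not validate the rest of the buffer.

```python
def incomplete_tail_length(data: bytes) -> int:
    """Return how many trailing bytes belong to a truncated UTF-8 sequence (0 if none)."""
    # A UTF-8 sequence is at most 4 bytes long, so look back over at most 3 bytes.
    for back in range(1, min(3, len(data)) + 1):
        byte = data[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte (10xxxxxx): keep looking for the lead byte
        if byte & 0xE0 == 0xC0:
            expected = 2  # lead byte of a 2-byte sequence (110xxxxx)
        elif byte & 0xF0 == 0xE0:
            expected = 3  # lead byte of a 3-byte sequence (1110xxxx)
        elif byte & 0xF8 == 0xF0:
            expected = 4  # lead byte of a 4-byte sequence (11110xxx)
        else:
            return 0  # ASCII or an invalid lead byte: nothing is pending
        # The sequence is truncated if fewer bytes are present than the lead byte announces.
        return back if back < expected else 0
    return 0


# Tail of the dump above: "L", "ĺ" (C4 BA), then a lone C3.
print(incomplete_tail_length(b"L\xc4\xba\xc3"))
```

A non-zero result for a frame payload means the frame ends in the middle of a character, which matches what the dump shows.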
Hopefully this can be fixed! Please let me know if I was unclear or if you need any additional information.