go-socket.io

Special characters in text message lead to truncated message

Open xplodwild opened this issue 4 years ago • 1 comments

Hi

There seems to be an issue with using LimitedReader with UTF-8 characters. Take the following message:

64:42["JoinGame",{"Name":"tétééééééé","Team":"red","GameCode":"a"}]

It will be read as:

64:42["JoinGame",{"Name":"tétééééééé","Team":"red","GameCod

This happens because the 64 sent by socket.io is the number of characters, not the actual number of bytes (e.g. é is two bytes in UTF-8), so reading 64 bytes truncates the message. Bypassing LimitedReader and using a regular Reader reads the message properly. Something other than LimitedReader is needed for this...

...Or am I missing something huge? I've tried multiple socket.io client versions (2.x, 1.7.x and 1.2.x), and all of them compute the same length of 64.

xplodwild Mar 28 '20 18:03

The issue is that the length parameter sent by the JavaScript client isn't compatible with the server parser. JavaScript uses UTF-16 for its strings internally, so the length it sends counts UTF-16 code units, while the server reads that many bytes of UTF-8.

I started working on my own socket.io/engine.io implementation prior to using this one, and I tried to comply with the standard library as much as possible. We can probably implement some of it (after polishing of course; my code is not optimized and was just for testing).

https://github.com/adrianmxb/goseio/blob/master/pkg/eio/parser/parser.go Have a look, maybe you have a better idea on how we can solve this.

adrianmxb Mar 29 '20 16:03