web-server-chrome
wrong responseLength with text response containing multi-byte characters
In webapp.js, the write() function of BaseHandler calculates the length of data as (data.length || data.byteLength). If data is a string containing multi-byte characters, such as text in a language other than English, the result will be wrong: String.length returns the UTF-16 code-unit count rather than the byte length, so part of the response will be cut off during transmission and the client will receive an incomplete page.
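A minimal sketch of the mismatch, assuming the response is sent as UTF-8 (the variable names here are illustrative, not from webapp.js):

```javascript
// Demonstrates the undercount: String.prototype.length counts UTF-16
// code units, while Content-Length must be the encoded byte count.
const body = '你好, world';                                   // contains multi-byte characters
const reportedLength = body.length;                            // 9 code units
const actualBytes = new TextEncoder().encode(body).byteLength; // 13 bytes in UTF-8
console.log(reportedLength, actualBytes);                      // the header would be 4 bytes short
```

Any client that trusts the too-small Content-Length will stop reading before the end of the body, which is exactly the truncation described above.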
What makes this worse is that (AFAIK) JavaScript does not have a native function to get a string's byte length. Currently I work around this by omitting the response length and terminating the connection after all the data has been transmitted. Getting the real length of the data being sent may involve some heavy coding to convert the data into a byte array and then determine its length. Curse JavaScript :(
Hmm, what you're saying makes sense. For most of the applications I had in mind, the data put into the write function is actually an ArrayBuffer, so there should be no problem. (The .length was put in there for convenience when manually crafting small error responses like "404 File missing".)
However, you are right that if you put in a string with multi-byte characters there will be problems...
Where are you getting your string from? Why don't you use an ArrayBuffer instead?
If for some reason you want to pass in a string, you could use https://github.com/inexorabletash/text-encoding
I am trying to dynamically generate some pages from user configs (including URLs, text, and regexps), so the result can only be in text form. Using an encoding library will definitely work.
Also, in handlers.js, renderDirectoryListing() calls write() with a string constructed from file names. If the file names contain multi-byte characters, it will presumably also produce the wrong length (I haven't tested).
Hm, thanks for the tip. I guess I never noticed that it was cutting off responses from renderDirectoryListing because most of my filenames are Latin-based. I will definitely look into computing the length using a text-encoder library.
https://github.com/kzahel/web-server-chrome/commit/e66f6a969cb5a45a1a1003f1dbfa72a582448937 Tested on some files; it seems to work correctly now. I'll release a new version to the Web Store soon.
Hmm. What a mess this is... https://github.com/kzahel/web-server-chrome/commit/f5e1db087598b3994e24708ad872791aeedb9876 I am now sending charset=utf-8 when serving any text/plain or text/html ... I guess most people would expect that. I'm not really sure how I'm supposed to determine the encoding of an arbitrary file on disk.
@kzahel Should this be closed?