web-server-chrome
wrong responseLength with text response containing multi-byte characters
In webapp.js, the write() function of BaseHandler calculates the length of data as (data.length || data.byteLength). If data is a string containing multi-byte characters, such as text in a language other than English, the result will be wrong: String.length returns the UTF-16 code-unit count rather than the byte length, so part of the response will be cut off during transmission and the client will receive an incomplete page.
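A minimal sketch of the mismatch, assuming the response is sent as UTF-8 (the variable names here are illustrative, not from webapp.js):

```javascript
// Demonstrates the undercount: String.prototype.length counts UTF-16
// code units, while Content-Length must be the encoded byte count.
const body = '你好, world';                                   // contains multi-byte characters
const reportedLength = body.length;                            // 9 code units
const actualBytes = new TextEncoder().encode(body).byteLength; // 13 bytes in UTF-8
console.log(reportedLength, actualBytes);                      // the header would be 4 bytes short
```

Any client that trusts the too-small Content-Length will stop reading before the end of the body, which is exactly the truncation described above.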
What makes this worse is that (AFAIK) JavaScript does not have a native function to get a string's byte length. Currently I work around this by omitting the response length and terminating the connection after all the data has been transmitted. Getting the real length of the data being sent may involve some heavy coding to convert the data into a byte array and then determine its length. Curse JavaScript :(
Hmm, what you're saying makes sense. For most of the applications I had in mind, the data put into the write function is actually an ArrayBuffer, so there should be no problem. (The .length was put in there for convenience when manually crafting small error responses like "404 File missing".)
However, you are right that if you put in a string with multi-byte characters there will be problems...
Where are you getting your string from? Why don't you use an ArrayBuffer instead?
If for some reason you want to pass in a string, you could use https://github.com/inexorabletash/text-encoding
I am trying to dynamically generate some pages from user configs (including URLs, text, and regexps), so the result can only be in text form. Using an encoding library will definitely work.
Also, in handlers.js, renderDirectoryListing() calls write() with a string constructed from file names. If the file names contain multi-byte characters, it will presumably also produce the wrong length (I haven't tested).
Hm, thanks for the tip. I guess I never noticed that it was cutting off responses from renderDirectoryListing because most of my filenames are Latin-based. I will definitely look into computing the length using a text-encoder library.
https://github.com/kzahel/web-server-chrome/commit/e66f6a969cb5a45a1a1003f1dbfa72a582448937 Tested on some files; it seems to work correctly now. I'll release a new version to the Web Store soon.
Hmm. What a mess this is... https://github.com/kzahel/web-server-chrome/commit/f5e1db087598b3994e24708ad872791aeedb9876 I am now sending charset=utf-8 when serving any text/plain or text/html ... I guess most people would expect that. I'm not really sure how I'm supposed to determine the encoding of an arbitrary file on disk.
@kzahel Should this be closed?