
; charset=utf-8 in Content-Type header breaks compatibility with Mosaic 2.1.1

Open ssshake opened this issue 3 years ago • 3 comments

Hello, I know this has been asked before and I know the response is that CHARSET=UTF-8 is enforced in the package because of security concerns.

This has been discussed in this closed issue: https://github.com/expressjs/express/issues/3490

However, I have a particular use case. I run a site called theoldnet.com and I serve up 1990s internet content to people like me who collect vintage computers.

This means we are trying to access the site using old browsers like Mosaic 2.1.1.

These old browsers do not like the semi-colon suffix on modern Content-Type headers. It breaks the ability to render the page entirely.

Can you tell me where this is enforced in the source code? If need be, I can disable this myself and maintain my own fork. My service depends heavily on express, and the entire point of the service is to provide cross-browser compatibility dating back to the world's first web browsers.

I did a search for text/html; charset=utf-8 in the code base and the dependencies. I could not find the cause of this charset being injected on every request.

Your help would be appreciated, thanks!

ssshake avatar Jan 12 '21 03:01 ssshake

Hi @ssshake, you can disable this without altering the source code at all. Basically, the charset is there because you are telling Express.js to send a string, which it has to encode in some specific charset. That encoding is what gets added to the content-type header. BUT if you provide Express.js with the raw bytes, Express.js is no longer doing the encoding, so it does not add a charset parameter (since it didn't do any conversion).

Another option, of course, is to not use the .json / .send / etc. helpers and instead use the Node.js APIs directly (.write/.end).
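A minimal sketch of that second option, assuming a route handler that serves a fixed HTML string: the handler name `legacyPage` and the markup are made up for illustration. It only uses Node.js response methods (`setHeader`, `write`, `end`), so the Express helpers never get a chance to touch the header.

```javascript
// Hypothetical route handler: bypasses res.send and uses Node's own
// response methods, so the Content-Type header goes out exactly as written.
function legacyPage(req, res) {
  const html = '<html><body><h1>Hello, Mosaic</h1></body></html>';
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/html'); // no "; charset=..." suffix
  res.write(html);
  res.end();
}

// Usage inside an Express app (assumed to exist as `app`):
// app.get('/', legacyPage);
```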

And last but not least, another option is to use something like on-headers (https://www.npmjs.com/package/on-headers) to alter the content-type header to strip it off right before it is sent.
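As a rough sketch of the on-headers approach: on-headers exposes a single function, `onHeaders(res, listener)`, whose listener runs just before the headers are written. The helper names `stripCharset` and `createStripCharset` below are hypothetical, and the on-headers function is passed in as a parameter rather than imported so the snippet stays self-contained.

```javascript
// Remove only the charset parameter, keeping any other parameters intact.
function stripCharset(contentType) {
  return contentType.replace(/;\s*charset=[^;]*/i, '');
}

// Build an Express-style middleware. `onHeaders` is expected to be the
// function exported by the on-headers package.
function createStripCharset(onHeaders) {
  return function stripCharsetMiddleware(req, res, next) {
    // The listener fires right before the headers go out, i.e. after
    // res.send / res.json have already set Content-Type.
    onHeaders(res, function () {
      const value = res.getHeader('Content-Type');
      if (typeof value === 'string') {
        res.setHeader('Content-Type', stripCharset(value));
      }
    });
    next();
  };
}

// Usage (assumes an existing Express `app` and the on-headers package):
// app.use(createStripCharset(require('on-headers')));
```

Registering the listener in a middleware is fine because the header rewrite itself is deferred until the response is about to be sent.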

Which of these methods is the most straightforward will depend on how, exactly, you are writing your responses.

dougwilson avatar Jan 12 '21 03:01 dougwilson

This is great information, thank you very much!

Using write/end sounds like the most drop-in solution to me, but the first solution (sending the raw bytes) sounds appealing too. What I have is somewhat of a proxy, and I really don't need express trying to handle anything except in certain cases.

Could you tell me how I could get started understanding how to do the first option you suggested?

Regarding your last option, I was trying to use a middleware to rewrite the headers, but I'm guessing that's too early in the process.

ssshake avatar Jan 12 '21 14:01 ssshake

Hi @ssshake, sorry for the delay in response. It may help if you could provide an example of your use case, especially because proxies should not have this issue, as they typically use .write/.end (see https://www.npmjs.com/package/express-http-proxy and similar packages).

For the first option, you just provide a Buffer object as the argument to res.send. How to get a Buffer object will depend on what exactly you are doing, so I don't have a way to provide an example of that aspect unless you can provide some example code that is not working as you expect.
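Purely as an illustration of the general shape, and not tied to any particular setup: the sketch below assumes the page is available as a string and encodes it to a Buffer before calling res.send. The helper name `sendRawHtml` and the `latin1` encoding are assumptions. It sets the header with Node's `res.setHeader`, on the assumption (Express 4 behavior) that `res.set` and `res.type` may append a charset of their own.

```javascript
// Hypothetical helper: encode the markup ourselves and hand Express raw bytes.
function sendRawHtml(res, html) {
  // Node's setHeader stores the value verbatim. Without an explicit
  // Content-Type, res.send would label a Buffer as application/octet-stream.
  res.setHeader('Content-Type', 'text/html');
  // 'latin1' is an assumed encoding for vintage browsers; use whatever
  // encoding the content is actually meant to be served in.
  res.send(Buffer.from(html, 'latin1'));
}

// Usage inside an Express app (assumed to exist as `app`):
// app.get('/', (req, res) => sendRawHtml(res, '<h1>Hello</h1>'));
```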

For the last option, you have to use something like on-headers (https://www.npmjs.com/package/on-headers) to alter the content-type header to strip it off right before it is sent.

dougwilson avatar Jan 21 '21 15:01 dougwilson