body-parser
iso-8859-1/windows-1252 support?
Hi,
A while ago I developed a web app to replace an old cgi-bin script from the nineties, which means that it had to assume that POST requests with a Content-Type of application/x-www-form-urlencoded are in iso-8859-1 (and also support "smart quotes" etc., i.e. windows-1252). At the same time, it had to be possible to switch to utf-8 mode.
I accomplished that by adding support for a defaultCharset option, which can either be utf-8 or iso-8859-1, and for explicitly switching to utf-8 mode by adding utf8=%E2%9C%93 (utf8=✓ as percent-encoded utf-8 octets) to the query string, a trick that I've seen others use as well.
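To make the idea concrete, here is a minimal, self-contained sketch of the sentinel approach. It is not the code from my forks: the function names (`detectCharset`, `decodeComponent`, `parseUrlencoded`) are made up for illustration, it only handles flat `key=value` pairs, and the iso-8859-1 branch maps each octet straight to the same code point, so it does not cover the windows-1252 "smart quotes" range (0x80 to 0x9F).

```javascript
// Hypothetical helpers for illustration only; not the body-parser or qs API.
const UTF8_SENTINEL = 'utf8=%E2%9C%93'; // "✓" as percent-encoded utf-8 octets

// The sentinel forces utf-8; otherwise fall back to the configured default.
function detectCharset(body, defaultCharset = 'utf-8') {
  const hasSentinel = body.split('&').includes(UTF8_SENTINEL);
  return hasSentinel ? 'utf-8' : defaultCharset;
}

// Percent-decode one key or value according to the chosen charset.
function decodeComponent(str, charset) {
  const withSpaces = str.replace(/\+/g, ' ');
  if (charset === 'utf-8') {
    // Throws URIError on malformed utf-8 sequences.
    return decodeURIComponent(withSpaces);
  }
  // iso-8859-1: every %XX octet maps directly to the code point U+00XX.
  return withSpaces.replace(/%([0-9a-f]{2})/gi, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Parse a flat urlencoded body into an object, dropping the sentinel itself.
function parseUrlencoded(body, defaultCharset) {
  const charset = detectCharset(body, defaultCharset);
  const result = {};
  for (const pair of body.split('&')) {
    if (pair === '' || pair === UTF8_SENTINEL) continue;
    const eq = pair.indexOf('=');
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? '' : pair.slice(eq + 1);
    result[decodeComponent(key, charset)] = decodeComponent(value, charset);
  }
  return result;
}
```

With `defaultCharset` set to `iso-8859-1`, a body of `name=%E6` and a body of `utf8=%E2%9C%93&name=%C3%A6` would both yield `{ name: 'æ' }`: the first is decoded as a single iso-8859-1 octet, the second as two utf-8 octets because the sentinel is present.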
Here are the diffs between my forks of body-parser and the qs module:
https://github.com/expressjs/body-parser/compare/master...papandreou:iso-8859-1
https://github.com/ljharb/qs/compare/master...papandreou:interpretNumericEntities
Would anyone be interested in this work? If so, I can clean it up and open PRs.
@dougwilson Any update on this?
I'm interested in this feature, but @papandreou's branch is way behind the official master, so it seems like it's going to be hard to pull...
Is there any other proper way to handle ISO-8859-1 encoded data?
Hi @xtrem65 I haven't even commented on here, let alone done any work on it :) I don't have any updates.
@Xtrem65, I guess the only update is that you're expressing interest, which is also a good thing :)
If @dougwilson and @ljharb think the work looks good and they want to adopt this feature, I'd be willing to bring the branches up to date.
@Xtrem65, have you tried switching to my body-parser-papandreou fork to see if it solves your problem? It should still work with current versions of express.
I’d be happy to review a PR.
@ljharb, that sounds great. Before I try to rebase it on qs master, does this approach look good? https://github.com/papandreou/qs/compare/9250c4cda5102fcf72441445816e6d311fc6813d...interpretNumericEntities
I'd need to review it when rebased :-) it looks like a good direction tho.
Rebased branch: https://github.com/papandreou/body-parser/tree/feature/iso-8859-1/take2
Requires https://github.com/ljharb/qs/pull/268 to be npm linked in.
Wow guys! So much reactivity :D
I'd be happy to help if needed
@Xtrem65, you can help by trying it out :)
- Make a clone of https://github.com/papandreou/qs
- Check out the `feature/iso8859-1` branch
- `npm link`
- Make a clone of https://github.com/papandreou/body-parser
- Check out the `feature/iso-8859-1/take2` branch
- `npm link qs`
- `npm link`
- `cd` to your project
- `npm link body-parser`
... Then see if you can solve your problem. Depending on the requests that you need to support, you might have to specify the defaultCharset option.