Should http::Uri handle IDNA?

Open KiChjang opened this issue 5 years ago • 9 comments

Or should it delegate it to the url crate somehow?

KiChjang avatar Oct 06 '18 18:10 KiChjang

That's a good question. What would be involved if we were to say that this crate should handle it? What are the use cases? What should this crate do if it were decided to not handle IDNA?

I don't believe a server needs to handle this, as the received URI should already have been encoded correctly. So is it mostly for clients?

seanmonstar avatar Oct 08 '18 21:10 seanmonstar

See https://github.com/sagebind/isahc/discussions/381 and the related https://github.com/sagebind/isahc/issues/382 for why this is relevant. I like to use isahc, but as it is, isahc's reliance on http::Uri has turned out to be a problem, and I will probably need to convert URL strings into url::Url and then further into http::Uri to be able to handle many perfectly fine real-life URLs.

Non-ASCII domain names are becoming more and more widespread, and for good reason.

So I hope someone can find a good way to accept and parse non-ASCII strings as Uris.

troelsarvin avatar Mar 08 '22 19:03 troelsarvin

I tried to fix this issue by allowing Unicode in Uri: https://github.com/hyperium/http/pull/565

PTAL.

Xuanwo avatar Aug 19 '22 08:08 Xuanwo

@Xuanwo, it seems to me your pull request was closed but never accepted into the http crate, or am I misunderstanding GitHub?

troelsarvin avatar Apr 10 '23 09:04 troelsarvin

> @Xuanwo, it seems to me your pull request was closed but never accepted into the http crate, or am I misunderstanding GitHub?

Yes. I have given up working on this issue. Feel free to reuse my work.

Xuanwo avatar Apr 10 '23 09:04 Xuanwo

It looks like #565 didn't implement Nameprep (Unicode "lowercasing" and normalization), so that's not a complete implementation of IDN for some languages.
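To make the gap concrete, here is a minimal, std-only sketch of the Punycode step on its own (RFC 3492), with no Nameprep or other mapping applied first. The names `punycode_encode` and `to_ascii_host` are hypothetical and are not functions from the http, url, or idna crates, and this is not a description of what #565 does.

```rust
// Punycode parameters from RFC 3492, section 5.
const BASE: u32 = 36;
const T_MIN: u32 = 1;
const T_MAX: u32 = 26;
const SKEW: u32 = 38;
const DAMP: u32 = 700;
const INITIAL_BIAS: u32 = 72;
const INITIAL_N: u32 = 128;

// Bias adaptation (RFC 3492, section 6.1).
fn adapt(mut delta: u32, num_points: u32, first_time: bool) -> u32 {
    delta = if first_time { delta / DAMP } else { delta / 2 };
    delta += delta / num_points;
    let mut k = 0;
    while delta > ((BASE - T_MIN) * T_MAX) / 2 {
        delta /= BASE - T_MIN;
        k += BASE;
    }
    k + (BASE - T_MIN + 1) * delta / (delta + SKEW)
}

// Maps a digit value to its character: 0..=25 -> 'a'..='z', 26..=35 -> '0'..='9'.
fn digit(d: u32) -> char {
    if d < 26 { (b'a' + d as u8) as char } else { (b'0' + (d - 26) as u8) as char }
}

/// Hypothetical helper: Punycode-encodes one label. Returns None on overflow.
/// Assumption: the input is used as-is, with no case folding or normalization.
pub fn punycode_encode(input: &str) -> Option<String> {
    let cps: Vec<u32> = input.chars().map(|c| c as u32).collect();
    // Basic (ASCII) code points are copied first, followed by a delimiter.
    let mut out: String = input.chars().filter(|c| c.is_ascii()).collect();
    let basic = out.len() as u32;
    let mut h = basic;
    if basic > 0 {
        out.push('-');
    }
    let (mut n, mut delta, mut bias) = (INITIAL_N, 0u32, INITIAL_BIAS);
    while (h as usize) < cps.len() {
        // Smallest code point not yet handled.
        let m = *cps.iter().filter(|&&c| c >= n).min()?;
        delta = delta.checked_add((m - n).checked_mul(h + 1)?)?;
        n = m;
        for &c in &cps {
            if c < n {
                delta = delta.checked_add(1)?;
            }
            if c == n {
                // Emit delta as a generalized variable-length integer.
                let mut q = delta;
                let mut k = BASE;
                loop {
                    let t = if k <= bias {
                        T_MIN
                    } else if k >= bias + T_MAX {
                        T_MAX
                    } else {
                        k - bias
                    };
                    if q < t {
                        break;
                    }
                    out.push(digit(t + (q - t) % (BASE - t)));
                    q = (q - t) / (BASE - t);
                    k += BASE;
                }
                out.push(digit(q));
                bias = adapt(delta, h + 1, h == basic);
                delta = 0;
                h += 1;
            }
        }
        delta += 1;
        n += 1;
    }
    Some(out)
}

/// Hypothetical helper: converts each non-ASCII label of a host to "xn--" form.
/// ASCII labels pass through untouched; nothing is lowercased or normalized.
pub fn to_ascii_host(host: &str) -> Option<String> {
    let labels: Option<Vec<String>> = host
        .split('.')
        .map(|label| {
            if label.is_ascii() {
                Some(label.to_string())
            } else {
                punycode_encode(label).map(|p| format!("xn--{}", p))
            }
        })
        .collect();
    labels.map(|l| l.join("."))
}
```

With this sketch, `to_ascii_host("bücher.example")` yields `xn--bcher-kva.example`, but `BÜCHER.example` encodes to a different ASCII name because nothing folds the case first. A user would expect both spellings to name the same host, and closing that gap is what the mapping and normalization tables are for.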

Implementing Nameprep means shipping a Unicode normalization table, which has a significant impact on binary size. Some platforms have ways of dealing with this (wasm32 targets can use JavaScript's URL object, Windows has its own IDN APIs, and CoreFoundation has a CFURL type), but they all make things a bit complicated.

reqwest currently uses url::Url for its public APIs but http::Uri internally on non-wasm32 targets. Adding IDN support to http::Uri would mean that they no longer have to use url::Url, but they'd have to break API compatibility for non-wasm32 targets as well.

Having http::Request use a generic Url-like trait would make things more flexible: on wasm32, one could use web_sys::Url, which is much smaller than url::Url (with the idna and unicode_normalization crates).
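As a rough illustration of that idea, here is a std-only sketch. Everything in it is hypothetical: `UrlLike`, `Request`, and `PlainUrl` are made-up names for this comment, not types from http, url, or web_sys, and the real http::Request is not generic over its URI type today.

```rust
/// Hypothetical trait: the minimum a request needs from a URL type.
pub trait UrlLike {
    fn scheme(&self) -> &str;
    /// Host in ASCII form, already IDNA-converted by the implementation.
    fn ascii_host(&self) -> &str;
    fn path(&self) -> &str;
}

/// Hypothetical request type, generic over whichever URL implementation
/// the platform provides (a full parser, a browser-backed type, and so on).
pub struct Request<U: UrlLike> {
    pub method: &'static str,
    pub url: U,
}

impl<U: UrlLike> Request<U> {
    /// Renders the request line and Host header using only the trait methods.
    pub fn head(&self) -> String {
        format!(
            "{} {} HTTP/1.1\r\nHost: {}\r\n\r\n",
            self.method,
            self.url.path(),
            self.url.ascii_host()
        )
    }
}

/// Stand-in for a small URL type that stores pre-converted ASCII parts
/// and therefore needs no Unicode tables of its own.
pub struct PlainUrl {
    pub scheme: String,
    pub host: String,
    pub path: String,
}

impl UrlLike for PlainUrl {
    fn scheme(&self) -> &str {
        &self.scheme
    }
    fn ascii_host(&self) -> &str {
        &self.host
    }
    fn path(&self) -> &str {
        &self.path
    }
}
```

The point of the sketch is only that the request code touches the URL through the trait, so the IDN work (and its binary-size cost) lives in whichever implementation a target chooses.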

micolous avatar Nov 28 '23 03:11 micolous