isahc icon indicating copy to clipboard operation
isahc copied to clipboard

Use url::Url instead of http::Uri

Open troelsarvin opened this issue 2 years ago • 6 comments

isahc's reliance on http::Uri makes it cumbersome to work with URLs which have hostnames with non-ASCII characters. I suggest that isahc switch to using url::Url instead of http::Uri.

Motivations:

  • In 2022, it should be straight forward to work with URLs which are international.
  • Also, sematically, I claim it is more clear for an HTTP client to work with URLs rather than URIs. It's hard to imagine isahc being used with non-URL URIs.

The following code shows the current state of affairs in some crates:

/*
    In Carto.toml:

[dependencies]
url = "2"
http = "0.2"
reqwest = "0.11"

*/

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url_str = "https://www.fc-højvang.dk/";

                                                    // Crate: url
    let url = url::Url::parse(url_str)?;            //
    println!("url: {}", url);                       // Prints URL with IDNA

                                                    // Crate: reqwest
    let reqwest_url = reqwest::Url::parse(url_str)?;//
    println!("reqwest_uri: {}", reqwest_url);       // Prints URL with IDNA

                                                    // Crate: http
    let http_uri = url_str.parse::<http::Uri>()?;   //
    println!("http_uri: {}", http_uri);             // FAILS
    /*
      Error message:
      InvalidUri(InvalidUriChar)
    */

    Ok(())
}

troelsarvin avatar Mar 07 '22 21:03 troelsarvin

I appreciate your concern on this, and I agree that Isahc ought to handle this better. However, this would be a breaking change and would require a 2.0 release at the very least. The problem is that the http crate is part of Isahc's public API -- isahc::Request is just a re-export of http::Request, so with the current design we can't choose whether or not http::Uri is used. Only the http crate can make that determination.

For an example that demonstrates this better:

use isahc::Request; // remember this is just `http::Request`

fn main() {
    let request = Request::get("https://sejs-svejbækbiler.dk/")
        .body(())
        .unwrap(); // panics here with `http::Error(InvalidUri(InvalidUriChar))`
                   // before any Isahc code even runs

    isahc::send(request).unwrap();
}

I see only two possible ways that we could handle this:

  1. The http crate is updated upstream to handle IDNA, either by default or with a feature flag, which we would enable.
  2. We remove the http crate from the public API and replace it with our own types, which among many things use the url crate internally.

If we could push for the first approach to get resolved then that would be most ideal, since we could fix this without breaking any existing code out there using Isahc. The second approach may have its own benefits, but also some drawbacks as well as requiring a significant API redesign.

When the http crate was first introduced, it was under the hope of being a common "interface crate" that would be generic enough that both HTTP clients and servers could adopt it and allowing the use of interoperable types between otherwise independent crates. This theoretically allows for things such as middleware to be written in an entirely generic way that would allow it to be used with any HTTP client or server instead of being tied to a specific implementation. This sounded like a noble goal, and Isahc was an early adopter those years ago when it was still young. In practice, I am not sure how well this vision materialized and may be worth some investigation.


As far as the URI/URL distinction, last time I looked that was a crazy rabbit hole as to which is more technically correct per the HTTP spec. In syntax, URLs are strictly speaking, always absolute, but HTTP allows relative URLs (which is a misnomer), so it is a bit looser than just URLs, but it is stricter than URIs which are very broad indeed. Then there's a mostly semantic debate as to what we call these "URIs as allowed by the HTTP spec", which some opt to call URIs (technically correct but not very precise) and others call URLs (more colloquially used, but not strictly correct).

Personally I don't really care about the semantic debate, you can call them URLs or URIs, as long as you're following the HTTP spec. Which http::Uri does as far as I know, in the most strict reading of the spec.

sagebind avatar Mar 08 '22 03:03 sagebind

I've updated https://github.com/hyperium/http/issues/259 pointing here. Meanwhile, I wonder if it were possible to add some From/into functions to isahc, so that a url::Url could be used seamlessly with isahc through automatic conversoins between url:Url and http::Uri?

troelsarvin avatar Mar 08 '22 23:03 troelsarvin

Strictly speaking, it is not possible to implement conversions between url::Url and http::Uri from within Isahc due to the orphan rule; only the crate that defines a type can implement common traits for the type. So either the url crate or the http crate would have to implement a conversion from one to the other, though I'd doubt either crate would be willing to do that.

I'm not sure that would even help much in this scenario anyway, since Uri is still part of the public API and we can't change the http types to also accept a Url.

sagebind avatar Mar 08 '22 23:03 sagebind

First of all, replace http::Uri with url::Url is not acceptable.

The best solution for us is to address it from upstream. I submit PR https://github.com/hyperium/http/pull/565 for that.

Xuanwo avatar Aug 19 '22 08:08 Xuanwo

Xuanwo's attempt at having a fix accepted in the http crate was not successfull, unfortunately :-(

@sagebind, could it be time for a version 2 of isahc, allowing for the API change?

troelsarvin avatar Apr 10 '23 09:04 troelsarvin

@troelsarvin I'm not currently planning on making this breaking change, even though technically yes, it could be done as part of version 2. This change would not be insignificant as it would require removing the http crate from the Isahc public API, which currently forms the backbone of Isahc's API. I'm just not sure that the juice is worth the squeeze to upend the main crate API in order to resolve this issue right now, or if the effort required is available.

All I can say is that the changes for version 2 are already planned, and this isn't in that list.

sagebind avatar Apr 24 '23 02:04 sagebind