http-types icon indicating copy to clipboard operation
http-types copied to clipboard

Add support for languages (work towards `Accept-Language`)

Open vodik opened this issue 3 years ago • 7 comments

So I need good support for Accept-Language in a project I'm working on. I figured I'd give it a crack.

I haven't implemented the header itself just yet. The implementation should be straightforward enough, there is plenty of prior work to draw from. Where I have some uncertainty and would love some feedback and guidance is with the languages inside the header should be represented.

Is there any current thoughts on how languages should be represented? My PR is laughably naive - how conformant should it be to following RFC 4647? Should it implemented outside this crate like mime?

On this topic, from my searching there currently seems to be two crates that handle the RFC: locale_config and fluent-langneg, neither which seem suitable as dependencies. Maybe a new one has to be created?

vodik avatar Jan 25 '22 09:01 vodik

Thanks so much for opening this! Matching on language ranges was something we've wanted for a long time, but never quite made it to the front of the list. I really appreciate you taking this work on, and am excited for us to have this functionality!


Is there any current thoughts on how languages should be represented? My PR is laughably naive - how conformant should it be to following RFC 4647? Should it implemented outside this crate like mime?

That's a good question! Our mime types are actually implemented inside the crate, and ideally we'd do the same for language negotiation support as well. The closer we can stay to RFC4647, likely the better. That way we can implement the parsing /encoding once, and then only revisit it to fix bugs.

What do you think? - Do you reckon we could take that direction?

yoshuawuyts avatar Jan 25 '22 11:01 yoshuawuyts

I didn't realize the internal mime support is different than the mime crate. Okay, in that case, this I think might be a good plan:

Study the mime code and try to implement an equivalent parser/data structure. Start with just RFC 4647's language-range. Should be straightforward enough. Try to get that in, and then implement helpers for Accept-Language and Content-Language on top of that.

And with that, looking at the mime code - is it Cow based just for the convenience for mime::constants?

vodik avatar Jan 26 '22 01:01 vodik

And with that, looking at the mime code - is it Cow based just for the convenience for mime::constants?

Yeah, it indeed exists to allow mime types to be defined as constants. Though the motivation was more for performance, and less for convenience. We used to use Strings before, which meant having an extra N allocations per request.

yoshuawuyts avatar Jan 28 '22 13:01 yoshuawuyts

I've put up a new LanguageRange implementation. I wouldn't mind checking in now to see if I'm on the right track.

  1. I validate the makeup and length of subtags. I'm in favour of strict conformance for parsing, but I don't know if that's aligned with this project's goals. Too much? (Error messages not final)
  2. Sub-tags are case compared case insensitive. Should I preserve case or should I downcase to normalize? I lean towards preserve.

And if this is good, then I should be able to get the whole thing done this weekend.

vodik avatar Jan 28 '22 23:01 vodik

 ---- trace::server_timing::parse::test::decode_headers stdout ----
thread 'trace::server_timing::parse::test::decode_headers' panicked at 'assertion failed: `(left == right)`
  left: `Some(52.999999ms)`,
 right: `Some(53ms)`', src/trace/server_timing/parse.rs:162:9

Oh that looks like a bug.

vodik avatar Jan 30 '22 23:01 vodik

@vodik fixed the issue you reported in https://github.com/http-rs/http-types/pull/393 - hopefullly this unblocks you to make progress on this (:

yoshuawuyts avatar Feb 04 '22 17:02 yoshuawuyts

I think this is reasonable.

I'm going to glue it into my package and see how well I can glue LanguageRanges into fluent / unic_locale

vodik avatar Feb 06 '22 21:02 vodik