net/http: support content negotiation

Open kevinburke opened this issue 8 years ago • 69 comments

Content negotiation is, roughly, the process of figuring out what content type the response should take, based on an Accept header present in the request.

An example might be an image server that figures out which image format to send to the client, or an API that wants to return HTML to browsers but JSON to command line clients.

It's tricky to get right because the client may accept multiple content types, and the server may have multiple types available, and it can be difficult to match these correctly. I think this is a good fit for Go standard (or adjacent) libraries because:

  • there's a formal specification for how it should behave: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
  • it's annoying to implement yourself, correctly; you have to write a mini-parser.
  • it would take one function to implement, which makes it annoying to import an entire third party library for (assuming you find the right one)

I've seen people hack around this in various ways:

  • checking whether the Accept header contains a certain string
  • checking for the first matching value
  • returning different content types based on the User-Agent
  • requiring different URIs to get different content.

There's a sample implementation here with a pretty good API: https://godoc.org/github.com/golang/gddo/httputil#NegotiateContentType

// NegotiateContentType returns the best offered content type for the request's
// Accept header. If two offers match with equal weight, then the more specific
// offer is preferred.  For example, text/* trumps */*. If two offers match
// with equal weight and specificity, then the offer earlier in the list is
// preferred. If no offers match, then defaultOffer is returned.
func NegotiateContentType(r *http.Request, offers []string, defaultOffer string) string

offers are content-types that the server can respond with, and can include wildcards like text/* or */*. defaultOffer is the default content type, if nothing in offers matches. The returned value never has wildcards.

So you'd call it with something like

availableTypes := []string{"application/json", "text/plain", "text/html", "text/*"}
ctype := NegotiateContentType(req, availableTypes, "application/json")

If the first value in offers is text/* and the client requests text/plain, NegotiateContentType will return text/plain. This is why you have to have a default - you can't just return the first value in offers because it might include a wildcard.
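
To make the matching rules concrete, here is a minimal sketch of the selection logic. It is not the gddo implementation: negotiate and specificity are hypothetical helpers, it takes the raw Accept value rather than an *http.Request, it assumes the offers are concrete lowercase types with no wildcards, and it splits the header on commas without handling commas inside quoted parameter values.

```go
package main

import (
	"fmt"
	"mime"
	"strconv"
	"strings"
)

// negotiate is a hypothetical helper. For each offer it finds the most
// specific media range in the Accept value that matches, takes that range's
// q-value, and returns the offer with the highest q. Ties go to the earlier
// offer. If nothing matches with q > 0, defaultOffer is returned.
func negotiate(accept string, offers []string, defaultOffer string) string {
	best, bestQ := defaultOffer, 0.0
	for _, offer := range offers {
		q, spec := 0.0, -1
		for _, part := range strings.Split(accept, ",") {
			mediaRange, params, err := mime.ParseMediaType(part)
			if err != nil {
				continue // assumption: skip ranges that do not parse
			}
			s := specificity(mediaRange, offer)
			if s <= spec {
				continue // an equally or more specific range already matched
			}
			spec, q = s, 1.0 // q defaults to 1 when absent
			if v, ok := params["q"]; ok {
				if f, err := strconv.ParseFloat(v, 64); err == nil {
					q = f
				}
			}
		}
		if q > bestQ {
			best, bestQ = offer, q
		}
	}
	return best
}

// specificity reports how precisely mediaRange matches offer:
// 2 for an exact match, 1 for "type/*", 0 for "*/*", -1 for no match.
func specificity(mediaRange, offer string) int {
	switch {
	case mediaRange == offer:
		return 2
	case mediaRange == "*/*":
		return 0
	case strings.HasSuffix(mediaRange, "/*") &&
		strings.HasPrefix(offer, strings.TrimSuffix(mediaRange, "*")):
		return 1
	}
	return -1
}

func main() {
	offers := []string{"application/json", "text/plain", "text/html"}
	fmt.Println(negotiate("text/html;q=0.8, application/json", offers, "application/json"))
}
```

Taking the header value as a plain string keeps the sketch independent of net/http; a real API would also need to decide how to treat a missing Accept header, which this sketch maps to the default.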

In terms of where it could live, I'm guessing that net/http is frozen at this point. Maybe one of the packages in x/net would be a good fit? There is also a similar function for parsing Accept-Language headers in x/text/language. Open to ideas.

kevinburke avatar Feb 27 '17 18:02 kevinburke

Funny you should mention this, I was just looking for this for use in x/perf. Your proposed API is reasonable, but you should explicitly document what happens in exceptional circumstances:

  • What happens when multiple content types end up with the same q? I think I would expect the first one in offers to be chosen.
  • What happens if the Accept header(s) cannot be parsed?

quentinmit avatar Feb 27 '17 19:02 quentinmit

What happens when multiple content types end up with the same q? I think I would expect the first one in offers to be chosen.

That sounds reasonable to me.

What happens if the Accept header(s) cannot be parsed?

The spec includes a grammar but doesn't really say what happens if you can't match it, other than you should return a 406 if you can't match anything. I think we should just return the default?

kevinburke avatar Feb 28 '17 04:02 kevinburke

@bradfitz suggested (if we want to go forward with this proposal) it could live in net/http/httputil.

kevinburke avatar Mar 01 '17 18:03 kevinburke

I also mentioned that it seems like defaultOffer string could go away and we say that the first element in the slice was the default.

bradfitz avatar Mar 01 '17 18:03 bradfitz

Sorry, I tried to edit the description to cover that case.

I also mentioned that it seems like defaultOffer string could go away and we say that the first element in the slice was the default.

I think the problem with this would be, if you had "text/*" as the first element in offers, it would match if the client sent text/plain and text/plain would be returned from the function.

If we return the first value in offers as the default, then the response value would contain a wildcard ("text/*"), which it doesn't in any other circumstance. Maybe that's okay.

kevinburke avatar Mar 01 '17 19:03 kevinburke

@kevinburke What's wrong with returning text/plain if the first offer is text/*? I don't see why it has to return the exact string passed in—don't you need to know the negotiated type to set the header in the response?

Also, if negotiation fails does it return the empty string? I'd rather have an error to distinguish between expected failure, negotiation failed, and unexpected failure, bad request.
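
A rough sketch of what that distinction could look like, so a handler can answer 406 for "nothing matched" and 400 for "could not parse". The names pick, statusFor, and errNotAcceptable are hypothetical, and the matching is deliberately naive (exact match or */*, no q-values) because the point here is only the error shape.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
	"net/http"
	"strings"
)

// errNotAcceptable is a hypothetical sentinel for the expected failure:
// the header parsed fine but none of the offers satisfies it.
var errNotAcceptable = errors.New("no offer matches the Accept header")

// pick returns the first offer matched by any media range in accept.
// A range that does not parse is reported as a wrapped parse error,
// which callers can tell apart from errNotAcceptable.
func pick(accept string, offers []string) (string, error) {
	for _, part := range strings.Split(accept, ",") {
		mediaRange, _, err := mime.ParseMediaType(part)
		if err != nil {
			return "", fmt.Errorf("malformed Accept header: %w", err)
		}
		for _, offer := range offers {
			if mediaRange == offer || mediaRange == "*/*" {
				return offer, nil
			}
		}
	}
	return "", errNotAcceptable
}

// statusFor maps the two failure modes to HTTP status codes.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotAcceptable):
		return http.StatusNotAcceptable
	default:
		return http.StatusBadRequest
	}
}

func main() {
	_, err := pick("image/png", []string{"application/json"})
	fmt.Println(statusFor(err), err)
}
```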

jimmyfrasche avatar Mar 01 '17 19:03 jimmyfrasche

@kevinburke, any response to @jimmyfrasche? This seems fine if the simplified signature works. You want to prepare a CL?

bradfitz avatar Mar 13 '17 20:03 bradfitz

Also, we like to make sure we append "; charset=utf-8" to Content-Types to be explicit. Can we make sure that that's still automatic?

bradfitz avatar Mar 13 '17 20:03 bradfitz

offers are content-types that the server can respond with, and can include wildcards like text/* or */*

I don't think wildcards should be allowed in the offers. The server knows exactly what content types it supports and returning a wildcard content type seems weird. Nor do I think the RFC says wildcards are OK in the content-type header. Content-Type (https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.17) mentions media-type (https://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7) that does not have wild cards. Accept (https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1) mentions media-range which includes wild cards.

I agree with @jimmyfrasche that returning an error seems better, so that the caller can know the reason if a default (first offered) is returned. Maybe even don't return a default at all: if there is a default, the caller can handle that rather than having it built into this method. For example, there is a difference between not finding a match in an existing Accept header and the Accept header being missing, IMO.

Since the RFC for Accept supports multiple parameters (such as charset) how about returning something more than just a string like:

type NegotiatedContentType struct {
   ContentType string
   Params map[string]string
}

func NegotiateContentType(r *http.Request, offers []string) (NegotiatedContentType, error)

cellfish avatar Jul 20 '17 16:07 cellfish

@kevinburke, what's the status here?

It can also live in x/net if you want.

It's probably too late for Go 1.10, though.

bradfitz avatar Nov 15 '17 17:11 bradfitz

It's too late and I'm buried in work stuff unfortunately, I probably won't have time to try to offer an implementation. I remember when I tried to implement it I ran into non-trivial problems with the design and some kinds of inputs.

kevinburke avatar Nov 16 '17 19:11 kevinburke

I can do this. Any feedback on my proposal from 7/20 or do you wanna stick with just returning (string, error)?

cellfish avatar Nov 17 '17 05:11 cellfish

Any chance this could be decoupled from http? Content-Type (and arguably Accept) can be seen in non-http contexts, and nothing in the logic described above ties it to http (apart from the initial retrieval of the header value). Other (non-http) systems (e.g. messaging) may use headers, and this negotiator could be useful there.

ericbottard avatar Mar 07 '18 16:03 ericbottard

@cellfish, do Params happen in practice? I think just a string seems fine. People who care about more details can use https://golang.org/pkg/mime/#ParseMediaType

bradfitz avatar Mar 07 '18 16:03 bradfitz

@ericbottard, when you say decoupled from http, would you still consider httputil an OK place, given that having a function that just takes a string seems trivial? @bradfitz, the only time I've seen params used in practice, it was a pretty one-off case with application-specific logic. I just figured that if someone actually cares, the library has probably parsed the parts anyway, so why not return them structured? If you just care about the string, the String method on the struct would obviously return things "nicely". For example, if the Accept header contains text/html;q=0.8;charset=UTF-8 and that is the best match, I would assume you want the returned string to be just text/html;charset=UTF-8 (dropping the q-value). But I guess it is true that parsing can be simplified if params are only preserved and only the q-value is actually parsed (and dropped).
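
For what it's worth, dropping the q-value while preserving the other params can be sketched with the existing mime package. stripQ is a hypothetical helper name; mime.ParseMediaType and mime.FormatMediaType are the standard library calls. Note that FormatMediaType puts a space after the semicolon, so the result should read text/html; charset=UTF-8.

```go
package main

import (
	"fmt"
	"mime"
)

// stripQ is a hypothetical helper: it parses one media range, removes the
// q parameter, and re-serializes the type with the remaining parameters.
func stripQ(mediaRange string) (string, error) {
	mediaType, params, err := mime.ParseMediaType(mediaRange)
	if err != nil {
		return "", err
	}
	delete(params, "q") // q only matters during negotiation
	return mime.FormatMediaType(mediaType, params), nil
}

func main() {
	s, err := stripQ("text/html;q=0.8;charset=UTF-8")
	fmt.Println(s, err)
}
```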

cellfish avatar Mar 08 '18 05:03 cellfish

@cellfish Having functions not taking http related (like http.Request) parameters is a first step. The functions not residing in http related packages is a second, nice to have, step. How about https://golang.org/pkg/mime ?

ericbottard avatar Mar 08 '18 08:03 ericbottard

@bradfitz The new Signed Exchanges spec makes use of params during content negotiation [ref]. AMP Packager is a Go server that needs to perform that negotiation [ref].

twifkak avatar Sep 26 '18 20:09 twifkak

What's the status of this proposal? Is there any chance to have this implemented in the nearest future?

lokhman avatar Oct 24 '18 20:10 lokhman

@lokhman The proposal has been accepted and is marked "help wanted". Anyone can, and is encouraged to, send a CL that implements this.

ghost avatar Oct 26 '18 16:10 ghost

I tried implementing this and ran into some problems/complexity with the implementation/behavior that made me unhappy with the whole thing. I dropped it and don't have plans to work further on it.

kevinburke avatar Oct 26 '18 16:10 kevinburke

@bontibon @kevinburke I'm currently working on one of my open-source projects that requires exactly this functionality (sort of a comprehensive web framework), so I decided to read about the topic, analyse various existing solutions, and create my own variant. I hope it covers all edge cases and does the job quickly and memory-efficiently.

So if you find this implementation valuable, please use it in the proposal or contribute to the existing project. The current version of the code is here: gowl/header.go.

lokhman avatar Oct 31 '18 13:10 lokhman

@kevinburke said he doesn't plan to work further on this, so I'm unassigning him and clarifying that this needs a fix.

mvdan avatar Jan 22 '19 11:01 mvdan

If no one is currently working on this, I would be happy to take it up. 😄

kshitij10496 avatar Mar 08 '19 09:03 kshitij10496

Go for it. No need to ask for permission. :)

agnivade avatar Mar 08 '19 11:03 agnivade

Permit me to note that Accept is not the only header that needs this sort of parsing. Accept-Encoding and Accept-Language have nearly the same syntax and need virtually the same parsing. A generic implementation would be welcome.

rothskeller avatar Mar 09 '19 05:03 rothskeller

Just noticed that there is already such a negotiation in the x/text/language package, in the ParseAcceptLanguage function.

fgm avatar Mar 10 '19 18:03 fgm

Has anybody made any progress on this? I'd be interested in using some of this functionality. A suggestion for an incremental improvement that could be merged first: Make a variant of mime.ParseMediaType that just parses the first media type from a list, and returns a slice containing the unparsed portion. (Basically, check for , here.)
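
Roughly, the splitting half of that could look like the sketch below. splitFirst is a hypothetical name, not an existing mime function; it only finds the first top-level comma (ignoring commas inside quoted strings), and the caller would still pass the first piece to mime.ParseMediaType.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFirst is a hypothetical helper. It returns the first element of a
// comma-separated header value and the unparsed remainder. Commas inside
// double-quoted strings, and backslash-escaped characters within them,
// do not end the element.
func splitFirst(list string) (first, rest string) {
	inQuotes := false
	for i := 0; i < len(list); i++ {
		switch c := list[i]; {
		case c == '\\' && inQuotes:
			i++ // skip the escaped character
		case c == '"':
			inQuotes = !inQuotes
		case c == ',' && !inQuotes:
			return strings.TrimSpace(list[:i]), list[i+1:]
		}
	}
	return strings.TrimSpace(list), ""
}

func main() {
	first, rest := splitFirst(`text/html;title="a,b";q=0.8, application/json`)
	fmt.Printf("%q %q\n", first, rest)
}
```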

twifkak avatar May 03 '19 22:05 twifkak

I've just ported goautoneg which is used by Kubernetes and Prometheus to parse accept headers into a Github repository at https://github.com/markusthoemmes/goautoneg.

Would it be a valuable first step to add a ParseAccept function to httputil to get this going? We need to parse it first to get negotiation going anyway and it seems like these projects could take advantage of such an interim step.

markusthoemmes avatar Jul 13 '19 16:07 markusthoemmes

I'm interested in working on this, but I wonder if it makes sense to follow x/text/language's example, and provide a Matcher type, so that the provided CTs needn't be parsed for every HTTP request.

Any objections to this approach?

If so, what should we call it? AcceptMatcher?
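
To make the idea concrete, here is a minimal sketch of what such a type might look like. Everything in it is an assumption for discussion: the AcceptMatcher name, the NewAcceptMatcher constructor, and the Match method are hypothetical, the offers are assumed to be concrete types without wildcards, and the header is split on commas without handling quoted commas.

```go
package main

import (
	"fmt"
	"mime"
	"strconv"
	"strings"
)

// offer is one server-side content type, normalized once at construction.
type offer struct {
	full string // e.g. "text/html"
	typ  string // e.g. "text"
}

// AcceptMatcher is a hypothetical type holding pre-processed offers.
type AcceptMatcher struct {
	offers []offer
}

// NewAcceptMatcher normalizes the offers so Match does not redo that work.
func NewAcceptMatcher(offers ...string) *AcceptMatcher {
	m := &AcceptMatcher{}
	for _, o := range offers {
		o = strings.ToLower(strings.TrimSpace(o))
		typ, _, _ := strings.Cut(o, "/")
		m.offers = append(m.offers, offer{full: o, typ: typ})
	}
	return m
}

// Match returns the offer with the highest q-value for the given Accept
// header value, preferring earlier offers on ties. The boolean is false
// when no offer is acceptable.
func (m *AcceptMatcher) Match(accept string) (string, bool) {
	type mediaRange struct {
		name string
		q    float64
	}
	var ranges []mediaRange
	for _, part := range strings.Split(accept, ",") {
		name, params, err := mime.ParseMediaType(part)
		if err != nil {
			continue // assumption: ignore ranges that do not parse
		}
		q := 1.0
		if v, ok := params["q"]; ok {
			if f, err := strconv.ParseFloat(v, 64); err == nil {
				q = f
			}
		}
		ranges = append(ranges, mediaRange{name, q})
	}
	best, bestQ := "", 0.0
	for _, o := range m.offers {
		q, spec := 0.0, -1
		for _, r := range ranges {
			s := -1
			switch r.name {
			case o.full:
				s = 2 // exact match
			case o.typ + "/*":
				s = 1 // subtype wildcard
			case "*/*":
				s = 0 // full wildcard
			}
			if s > spec {
				q, spec = r.q, s // the most specific range decides q
			}
		}
		if q > bestQ {
			best, bestQ = o.full, q
		}
	}
	return best, bestQ > 0
}

func main() {
	m := NewAcceptMatcher("application/json", "text/html")
	fmt.Println(m.Match("text/*;q=0.9, application/json;q=0.5"))
}
```

The matcher would be built once at startup and shared across requests; only the Accept header is parsed per request.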

flimzy avatar Aug 25 '19 09:08 flimzy

@markusthoemmes I see that in your port of goautoneg, which its README proposes be internalized into the stdlib, you dropped the Negotiate function. Many web apps are also going to need to match the parsed header against the content types they support, and shouldn't just use the results of the parser verbatim. Providing a Matcher like @flimzy suggested would go a long way and might be a good replacement for goautoneg's Negotiate if it's going to be left out.

As an example we use the goautoneg Negotiate function here in a way that I think many web apps would.

leighmcculloch avatar Dec 11 '19 19:12 leighmcculloch