Discrepancies between self-prepared and session-prepared request headers

Open rittneje opened this issue 4 years ago • 9 comments

If we prepare a request normally (the prepare() method), it does not include any additional headers by default, even if the resulting prepared request is sent via a session.

However, if the request is prepared via the session (the prepare_request(Request) method), then it will include a few headers by default: Accept, Accept-Encoding, User-Agent, and Connection.

There are some (arguably buggy) HTTP servers that apparently will reject any request that includes the Accept or Accept-Encoding header with 406 Not Acceptable.

This situation led to major confusion when trying to fix our usage of this library to send all the requests through a single session in order to leverage cookie handling.

This default header logic does not appear to be documented anywhere. I also think it is very confusing that a self-prepared request and a session-prepared request would have different headers. Unfortunately, I assume this cannot be changed easily for fear of breaking backwards compatibility.

rittneje avatar May 27 '21 17:05 rittneje
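
The difference described in the opening post can be reproduced without sending anything over the network. The following is a minimal sketch, assuming a recent requests 2.x release; the URL is a placeholder and is never contacted.

```python
import requests

session = requests.Session()
# Placeholder URL: nothing is sent, the request is only prepared.
req = requests.Request("GET", "https://example.invalid/resource")

# Self-prepared: no session involved, so no default headers are merged in.
self_prepared = req.prepare()

# Session-prepared: the session's headers are merged into the request.
session_prepared = session.prepare_request(req)

print("Request.prepare():         ", sorted(self_prepared.headers))
print("Session.prepare_request(): ", sorted(session_prepared.headers))
```

On a typical install the first list should be empty, while the second should contain Accept, Accept-Encoding, Connection, and User-Agent.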

I'm fairly certain this was at one point covered in our docs. With that in mind, why is this a bug report? How can you expect Request.prepare() to do something with session state when it doesn't have access to the session?

sigmavirus24 avatar May 27 '21 21:05 sigmavirus24

Accept, Accept-Encoding, and User-Agent have nothing to do with session state. Only the Connection: keep-alive header really makes sense to be different since without a session I assume there's no pool.

rittneje avatar May 27 '21 21:05 rittneje

Except that commonly people set these on the session and expect them to be session-level state. I think this is complicated by the fact that Request.prepare() is primarily used by folks who want complete control over their headers and everything else. This is why Session.prepare_request() was introduced - to do the things some people think should just :sparkles: magically :sparkles: happen and might be the "only right" behaviour (in their expectation).

At one point this was documented roughly as:

  • Request.prepare() + Session.send() is complete control and Session.send() shouldn't really be tacking on anything extra
  • Request + Session.prepare_request() + Session.send() has more control but also does the things that you might want Requests to handle (like reasonable default headers and applying session state)
  • Session.request (and friends) is for folks that don't need a more complex way of building requests
  • requests.request (and friends) is for quickstart and other quick usage that doesn't need session state persisted between requests

I can see where you're coming from, but it's not backwards compatible, nor is it how it was designed or documented.

sigmavirus24 avatar May 27 '21 22:05 sigmavirus24

To clarify, what I consider the issue here is not the existence of Session.headers itself, as in the ability to set headers on the session that get auto-applied to all requests. Rather, the issue is the default Session.headers, which (except for Connection) seem completely arbitrary, so it is very confusing why they are the defaults at all.

At the very least I think these default headers should be mentioned in the documentation, since for me at least it was entirely unexpected and undesirable.

rittneje avatar May 27 '21 22:05 rittneje
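
For readers hitting servers that reject the default Accept or Accept-Encoding headers, here is a rough sketch of how the session defaults can be inspected and suppressed at the requests level. It assumes a recent requests 2.x release; the URL is a placeholder that is never contacted, and the variable names are only illustrative.

```python
import requests
from requests.utils import default_headers

# The headers a fresh Session starts with.
defaults = default_headers()
print(dict(defaults))

url = "https://example.invalid/resource"  # placeholder, never contacted

# Option 1: drop every default header for the whole session.
bare_session = requests.Session()
bare_session.headers.clear()
bare_prepared = bare_session.prepare_request(requests.Request("GET", url))

# Option 2: keep the session defaults, but omit selected headers for one
# request by setting them to None at the request level.
session = requests.Session()
req = requests.Request(
    "GET", url, headers={"Accept": None, "Accept-Encoding": None}
)
selective_prepared = session.prepare_request(req)

print(sorted(bare_prepared.headers))
print(sorted(selective_prepared.headers))
```

This only controls what requests itself puts on the prepared request. As the later comments about User-Agent show, a header removed here may still be re-added lower in the stack.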

If it's not already documented, I'd merge a PR with that addition.

sigmavirus24 avatar May 27 '21 22:05 sigmavirus24

Definitely an issue. I can't disable User-Agent in a Session context, even with a prepared request: cancer useragent

notAutomne avatar Jan 06 '22 10:01 notAutomne

You can remove it after the fact:

req = requests.Request(...)
prepared = session.prepare_request(req)
del prepared.headers['User-Agent']
session.send(prepared)

sigmavirus24 avatar Jan 06 '22 12:01 sigmavirus24

> You can remove it after the fact:
>
> req = requests.Request(...)
> prepared = session.prepare_request(req)
> del prepared.headers['User-Agent']
> session.send(prepared)

Same issue:

notAutomne avatar Jan 06 '22 12:01 notAutomne

Ah, right. Unfortunately the API right now is too limited to allow us to provide a clean way around what urllib3 does by default now (where if I remember correctly they want a sentinel value).

sigmavirus24 avatar Jan 06 '22 12:01 sigmavirus24
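
The sentinel mentioned in the last comment appears to be urllib3.util.SKIP_HEADER, which recent urllib3 releases (1.26 or later, as far as I can tell) accept as a header value. The sketch below uses urllib3 directly, not requests, and checks the result against a throwaway local server. The RecordingHandler class and the received list are hypothetical helpers written for this illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3
from urllib3.util import SKIP_HEADER

received = []  # User-Agent value seen by the server for each request


class RecordingHandler(BaseHTTPRequestHandler):
    """Hypothetical helper: records the User-Agent header of each GET."""

    def do_GET(self):
        received.append(self.headers.get("User-Agent"))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example output quiet


# Bind to an ephemeral port on loopback and serve in the background.
server = HTTPServer(("127.0.0.1", 0), RecordingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

http = urllib3.PoolManager()
# No User-Agent supplied: urllib3 fills in its own default.
http.request("GET", url)
# Sentinel supplied: urllib3 is asked to leave the header out entirely.
http.request("GET", url, headers={"User-Agent": SKIP_HEADER})

server.shutdown()
server.server_close()

print(received)
```

The first recorded value should be urllib3's default User-Agent string and the second should be None. Whether the same sentinel can be passed cleanly through a requests Session is exactly the API limitation raised above, so this sketch does not attempt it.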