toolbelt Consider removing headers filtered out by Google Appengine

Google Appengine doesn't allow the sending of some of the headers through the urlfetch service for security reasons: https://cloud.google.com/appengine/docs/python/outbound-requests#request_headers

Sending these headers in a request will cause a WARNING in the logs along the lines of "Stripped prohibited headers from URLFetch request: ['Content-Length']". This is annoying if you (a) are trying to monitor the logs for the health of your application and (b) you are not directly making those requests but rather a third-party library is, so you can't "fix" your code.

My suggested fix would be: offer a flag on monkeypatch(remove_unsupported_headers=False) which transparently removes these headers. It is initially set to false, so during development you see the warnings and you can investigate alternate solutions, but once you decided that removing them is the best approach, you can just enable this flag.

The list of "disallowed headers" is fairly static (in fact I don't think it changed since the urlfetch service was launched) so hardcoding them is a low-risk approach.

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

Sep 12 '16 08:09 cdman

Sending these headers in a request will cause a WARNING in the logs along the lines of "Stripped prohibited headers from URLFetch request: ['Content-Length']".

Can you not control the log level of App Engine log events in your application's logs?

My suggested fix would be: offer a flag on monkeypatch(remove_unsupported_headers=False) which transparently removes these headers. [snip] The list of "disallowed headers" is fairly static (in fact I don't think it changed since the urlfetch service was launched) so hardcoding them is a low-risk approach.

Given that none of us are active in the development of App Engine, I don't consider it a low risk. None of the core developers actually use the App Engine Adapter or monkey-patching facility that we provide. I'm already uncomfortable maintaining this section of requests-toolbelt.

Further, if we were to add this (and I'm disinclined to add it), we wouldn't hide the behaviour behind a flag to monkeypatch; it would always be on.

Sep 12 '16 14:09 sigmavirus24

Can you not control the log level of App Engine log events in your application's logs?

AFAIK, not easily, but you can post-hoc filter the log (for example only see messages above a certain level). However only looking at "CRITICAL and above" messages (to skip the WARNING messages) could potentially mask many other issues.

But even if you can control it (they expose it as part of the standard python logging API, so probably customizing that would work), it just seems as a very heavy hammer to raise the minimum log level for an entire module just to get rid of one log message in the module (again, with the potential of masking other problems).

I'm already uncomfortable maintaining this section of requests-toolbelt.

If I provided a couple of tests that run inside of the Google Appengine Development server (which is pure-python and can be run locally / on Travis - see https://github.com/travis-ci/travis-ci/issues/738#issuecomment-214710901), would that make you feel more comfortable / more inclined to include the change?

we wouldn't hide the behaviour behind a flag to monkeypatch; it would always be on.

My reasoning for having it default off with the option to turn it on is the following: it is good to have these warnings by default, since during development you can investigate the source of it / pester the developers of the third-party library. However, once you decided that there isn't anything you can do about it for the moment, you should be able to mute / work-around the exception so that you can focus on more important issues in production.

Sep 13 '16 06:09 cdman

The issue still persists, and it's not likely it's going anywhere.

I propose another, non-ideal, but safe solution to be put inside _AppEngineConnection.urlopen:

for prohibited_header in PROHIBITED_HEADERS:
    headers.pop(prohibited_header, None)

Where PROHIBITED_HEADERS is a module-level variable initialized with an empty list. That way requests_toolbelt obviously still functions the same out-of-the-box, it's maintenance burden doesn't increase, and users get a reasonable way to filter out headers.

I can create a pull request if this sounds good to you.

Example no-maintenance usage:

from requests_toolbelt.adapters import appengine
from google.appengine.api import urlfetch_stub
appengine.PROHIBITED_HEADERS = urlfetch_stub._UNTRUSTED_REQUEST_HEADERS

(Note for users: that list, as of GAE SDK 238, doesn't precisely match the list on https://cloud.google.com/appengine/docs/standard/python/outbound-requests#request_headers)

EDIT: I forgot that headers is a CaseInsensitiveDict. Fixed the code.

Mar 15 '19 14:03 egilewski

toolbelt toolbelt copied to clipboard

Consider removing headers filtered out by Google Appengine

toolbelt
toolbelt copied to clipboard