requests
requests copied to clipboard
Make sessions safe[r] in multi-process environment
tldr; in multi-process environment (Celery) sessions might lead to request/responses being mixed up.
It is unsafe to use Session in a multi-process environment - if the fork happens after Session initialisation the underlying connection pool will be shared across both processes, leading to potentially dangerous and hard to debug issues.
I'm not sure what should happen - whather a code change is necessary, or a documentation change is enough. Please advise :)
Use case
It is likely to happen if Session is created at module load time, like:
class MyClient(object):
session = requests.Session()
def do_things(self, **params):
self.session.get(**params)
or if a 3rd party client is imported at the module level, where it becomes totally invisible:
my_database_client = database.DatabaseClient(**params)
class MyDatabaseWrapper(object):
def do_things(self, **params):
my_database_client.update(**params)
This is particularly tricky in a Celery, where a previously working function might start causing troubles if invoked from Celery. Celery seems like a common Python use case.
I've seen this pattern in 3 different repos written by 3 different developers - it feels common enough for it to be a problem.
Reason
This is related to https://github.com/shazow/urllib3/issues/850 in urllib3, where it was stated that it's the callers' responsibility to worry about forking - in this case, it's Requests.
Expected Result
Ideally, a new Session with the same parameters would be started by Requests.
If that's too complicated, I'd expect an exception to be thrown if PID change was detected.
At the very least, docs should state the expected behaviour.
Actual Result
The responses are mixed up - one process might receive a response made for a call it didn't make.
Reproduction Steps
import os
import sys
import requests
MAX = 20
s = requests.Session()
for n in range(MAX):
pid = os.fork()
if pid == 0:
try:
# s.mount("http://", requests.adapters.HTTPAdapter()) # uncomment to fix
r = s.get('http://httpstat.us/300?sleep=100')
except Exception:
# ignoring intermittent http errors
pass
sys.exit(0)
else:
try:
r = s.get('http://httpstat.us/200')
except Exception as exc:
# ignoring intermittent http errors
pass
if r.status_code != 200:
print 'pid: {} Call {}/{}. Wrong status code detected {}'.format(
os.getpid(),
n,
MAX,
r.status_code
)
pid: 30996 Call 7/20. Wrong status code detected 300
pid: 30996 Call 9/20. Wrong status code detected 300
pid: 30996 Call 14/20. Wrong status code detected 300
pid: 30996 Call 17/20. Wrong status code detected 300
pid: 30996 Call 19/20. Wrong status code detected 300
System Information
$ python -m requests.help
pawel@pawel-C02V306VHTDH ~/Uber/tokenizer-python $ python -m requests.help
{
"chardet": {
"version": "3.0.4"
},
"cryptography": {
"version": ""
},
"idna": {
"version": "2.6"
},
"implementation": {
"name": "CPython",
"version": "2.7.14"
},
"platform": {
"release": "16.7.0",
"system": "Darwin"
},
"pyOpenSSL": {
"openssl_version": "",
"version": null
},
"requests": {
"version": "2.18.4"
},
"system_ssl": {
"version": "100020cf"
},
"urllib3": {
"version": "1.22"
},
"using_pyopenssl": false
}```
This command is only available on Requests v2.16.4 and greater. Otherwise,
please provide some basic information about your system (Python version,
operating system, &c).