vcrpy
vcrpy copied to clipboard
fails to re-establish a connection after keep-alive timout
vcrpy 2.0.1, requests 2.22.0
We are using requests with sessions to call an API endpoint to trigger a job and then call another endpoint in a loop to get the job status and thus wait until the job is done. The server is an Apache httpd with KeepAliveTimeout 5 and a Rails app.
The code is working fine outside of VCR, but as soon as we enable VCR recording, the "wait" part of the code fails (requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",)) on Python 2.7, requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) on Python 3.7)
Example code:
import time
import urllib3
urllib3.disable_warnings()
import requests
import vcr
BASE = 'https://foreman.example.com'
STATUS = '{}/api/v2/status'.format(BASE)
DELETE = '{}/katello/api/organizations/10/subscriptions/delete_manifest'.format(BASE)
def wait(task):
while task['state'] != 'stopped':
print("sleeping for {}".format(task['id']))
time.sleep(10)
r = s.get("{}/foreman_tasks/api/tasks/{}".format(BASE, task['id']))
task = r.json()
def delete():
print("delete")
r = s.post(DELETE, json={})
print(r.headers)
wait(r.json())
with vcr.use_cassette('katello.yaml'):
s = requests.Session()
s.verify = False
s.auth = ('admin', 'changeme')
delete()
output:
python test_katello_vcr.py
delete
{'status': '202 Accepted', 'x-request-id': 'd117a579-0f05-4e43-b2f9-7ebadba8e07f', 'x-xss-protection': '1; mode=block', 'x-download-options': 'noopen', 'x-content-type-options': 'nosniff', 'x-powered-by': 'Phusion Passenger 4.0.53', 'set-cookie': 'request_method=POST; path=/; secure; HttpOnly; SameSite=Lax, _session_id=20163f242bd29584ae7a8c5674508431; path=/; secure; HttpOnly; SameSite=Lax', 'strict-transport-security': 'max-age=631139040; includeSubdomains', 'foreman_version': '1.22.0', 'keep-alive': 'timeout=5, max=10000', 'server': 'Apache', 'content-security-policy': "default-src 'self'; child-src 'self'; connect-src 'self' ws: wss:; img-src 'self' data: *.gravatar.com; script-src 'unsafe-eval' 'unsafe-inline' 'self'; style-src 'unsafe-inline' 'self'", 'x-runtime': '0.209123', 'connection': 'Keep-Alive', 'x-permitted-cross-domain-policies': 'none', 'cache-control': 'no-cache', 'date': 'Wed, 07 Aug 2019 08:17:49 GMT', 'apipie-checksum': 'f287e71ef7536ad92a99a61830ffabc07a800b26', 'x-frame-options': 'sameorigin', 'content-type': 'application/json; charset=utf-8', 'foreman_api_version': '2'}
sleeping for ac06848f-1be6-462c-92a2-116a0d20a8eb
Traceback (most recent call last):
File "test_katello_vcr.py", line 32, in <module>
delete()
File "test_katello_vcr.py", line 25, in delete
wait(r.json())
File "test_katello_vcr.py", line 18, in wait
r = s.get("{}/foreman_tasks/api/tasks/{}".format(BASE, task['id']))
File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/sessions.py", line 546, in get
return self.request('GET', url, **kwargs)
File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/egolov/Devel/theforeman/foreman-ansible-modules/venv/lib/python2.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))
As you can see, the server answers with 'keep-alive': 'timeout=5, max=10000' to the POST, but we use time.sleep(10) to wait.
Lowering the sleep to 4 (so below the timeout) avoids the problem.
However, it should be the responsibility of the underlying http client to re-open the connection if needed. And requests does so just fine if used outside of VCR. Thus we think this is a bug in VCR.
Thank you for reporting this bug! Much appreciated.
We recently released v2.1.0 with a lot of fixes. Although this doesn't look like one of the fixes included. Could you try with the latest version too so we can rule that out before proceeding?
Would you be open to contributing a pull request to address this? I'd be more than happy to collaborate getting the TravisCI to pass to get it over the line. Then we can put out a fix shortly.
If you haven't got the bandwidth, that's fine too. Is there a toy repo that we could clone to replicate the issue at least?
Hey,
so as you expected, 2.1.0 doesn't fix the issue. And I must admit I don't really have the bandwidth to dive into the details of VCR itself to help you fixing this. However, I sat down and made you a smaller reproducer, so that's something ;)
import time
import requests
import vcr
URL = 'https://www.die-welt.net/'
with vcr.use_cassette('test.yaml', record_mode='all'):
s = requests.Session()
s.verify = False
s.get(URL)
time.sleep(10)
s.get(URL)
If I am not mistaken, you can point URL at any Apache httpd in default config, and it will have KeepAliveTimeout 5 and thus fail if you try a second GET after 10 seconds.
Awesome! I’ll have a crack at replicating it myself with that sample. That’s really helpful thanks.
I can confirm that this is still an issue with vcrpy 4.1.0. It took me a while to figure out that vcrpy is causing the problem and not the remote server, as the error message can be pretty misleading. I'd appreciate if this issue could get fixed.
Hello! I am also one of the unfortunate ones who has encountered this issue. Thankfully I managed to debug and fix it. See PR https://github.com/kevin1024/vcrpy/pull/605.