vcrpy
Insufficient tooling for filtering sensitive requests
Hey, love the tool overall, but I don't think it works for what must be an extremely common use case. I'm testing something that (1) sends an auth request, then (2) uses that auth to make a request.
From the docs, there are before_record_request and before_record_response, but neither of these seems to help. If I completely scrub the request as recommended with:
    def before_record_cb(request):
        if request.path == '/login':
            return None
        return request
Then this will send a live login request every time the test runs. So I need this recorded, but with the returned auth data scrubbed. Then I turn to using before_record_response.
The problem here is I have no way of identifying which request is the auth request since I only receive the response data in that method. AFAICT the only way around this is to infer based on the response data, which seems pretty hacky.
I think this would be solved if the method signature was before_record_response(response, request)? Then something like this would be possible:
    def before_record_response(response, request):
        if request.path == '/login':
            return None
        return response
I am also using an API where I must (1) send an auth request, and (2) use that auth to make a request. Here's how I configured VCR:
    import json

    import vcr

    def _scrub_username_and_password_from_body(request):
        # Replace credentials in a JSON request body before it is recorded.
        if request.body:
            data = json.loads(request.body)
            if 'username' in data:
                data['username'] = "REDACTED_USERNAME"
            if 'password' in data:
                data['password'] = "REDACTED_PASSWORD"
            request.body = json.dumps(data).encode()
        return request

    def _scrub_access_token(response):
        # Redact the token only when the response actually contains one,
        # so other JSON responses are recorded unchanged.
        if response['body']['string']:
            data = json.loads(response['body']['string'])
            if 'access_token' in data:
                data['access_token'] = "REDACTED_ACCESS_TOKEN"
                response['body']['string'] = json.dumps(data).encode()
        return response

    @vcr.use_cassette(  # type: ignore
        before_record_request=_scrub_username_and_password_from_body,
        before_record_response=_scrub_access_token,
        filter_headers=[
            ("Authorization", "REDACTED_AUTHORIZATION"),
        ],
    )
    def test_authenticated_request():  # hypothetical name; the decorated test was not shown
        ...
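For anyone going this route, note that a response callback runs for every recorded response, not only the login one, so it helps if the scrubber tolerates bodies that are empty or not JSON. Here is a minimal sketch of a more defensive version. It assumes the response is the dict shape used above (body bytes under response['body']['string']) and that the token lives under an "access_token" key; the function name is just illustrative.

```python
import json


def scrub_access_token(response):
    """Redact access_token in a recorded response; leave other bodies untouched."""
    body = response.get("body", {}).get("string")
    if not body:
        return response
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON (HTML, binary, ...): record it unchanged.
        return response
    if isinstance(data, dict) and "access_token" in data:
        data["access_token"] = "REDACTED_ACCESS_TOKEN"
        response["body"]["string"] = json.dumps(data).encode()
    return response
```

This is still the "infer from the response data" approach, so it only helps when the secret field has a recognizable name.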
We (Cybersecurity Development at University of Illinois) had the same problem, and our solution is a custom YAML serializer method. This allows us to inspect the request and match based on URL, then modify the response that gets recorded.
    import pytest
    import vcr
    from vcr.serializers import yamlserializer

    @pytest.fixture
    def cassette(request) -> vcr.cassette.Cassette:
        my_vcr = vcr.VCR(...)
        # Register the custom serializer under a name, then select it per cassette.
        my_vcr.register_serializer("cleanyaml", CleanYAMLSerializer)
        with my_vcr.use_cassette(f'{request.function.__name__}.yaml',
                                 serializer="cleanyaml") as tape:
            yield tape

    class CleanYAMLSerializer:
        @staticmethod
        def serialize(cassette: dict):
            # Each interaction carries both the request and its response,
            # so the cleaners can match on the URL and edit the response.
            for interaction in cassette['interactions']:
                clean_token(interaction)
                clean_search(interaction)
                clean_new_ticket(interaction)
            return yamlserializer.serialize(cassette)

        @staticmethod
        def deserialize(cassette: str):
            return yamlserializer.deserialize(cassette)

    def clean_search(interaction: dict):
        uri = f"{URL}/path_we_need_to_sanitize"
        if interaction['request']['uri'] != uri:
            return
        # ... more cleaning
Credit to @ddriddle, @tzturner, @mpitcel, and @zdc217 for the solution.
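To make the serializer approach a bit more concrete, here is a rough sketch of what a helper like clean_token might look like. This is not the code from the post above: LOGIN_URI is a made-up placeholder, the "access_token" key is an assumption about the API, and the interaction layout (interaction['request']['uri'], interaction['response']['body']['string']) is assumed to match the cassette dict passed to serialize.

```python
import json

# Hypothetical endpoint; substitute the real login URL of the API under test.
LOGIN_URI = "https://api.example.com/login"


def clean_token(interaction: dict) -> None:
    """Redact the access token in the response recorded for the login request."""
    # Match on the request, then edit the paired response in place.
    if interaction["request"]["uri"] != LOGIN_URI:
        return
    body = interaction["response"]["body"]
    raw = body.get("string")
    if not raw:
        return
    try:
        data = json.loads(raw)
    except ValueError:
        return  # body is not JSON; leave it alone
    if isinstance(data, dict) and "access_token" in data:
        data["access_token"] = "REDACTED_ACCESS_TOKEN"
        redacted = json.dumps(data)
        # Keep the body the same type it had (bytes or str).
        body["string"] = redacted.encode() if isinstance(raw, bytes) else redacted
```

Because the serializer sees the request and response together, this avoids guessing from the response body alone, which was the original complaint about before_record_response.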