Insufficient tooling for filtering sensitive requests

Open flushentitypacket opened this issue 4 years ago • 2 comments

Hey, love the tool overall, but I don't think it works for what must be an extremely common use case. I'm testing something that (1) sends an auth request, then (2) uses that auth to make a request.

From the docs, there are before_record_request and before_record_response, but neither of these seems to help. If I completely scrub the request as recommended with:

def before_record_cb(request):
    if request.path == '/login':
        return None
    return request

Then the login request is never recorded, so a live login request is sent every time the test runs. So I need this recorded, but with the returned auth data scrubbed. That leads me to before_record_response.

The problem here is I have no way of identifying which request is the auth request since I only receive the response data in that method. AFAICT the only way around this is to infer based on the response data, which seems pretty hacky.

I think this would be solved if the method signature was before_record_response(response, request)? Then something like this would be possible:

def before_record_response(response, request):
    if request.path == '/login':
        return None
    return response

flushentitypacket, May 06 '20 17:05

I am also using an API where I must (1) send an auth request, and (2) use that auth to make a request. Here's how I configured VCR:

import json

import vcr


def _scrub_username_and_password_from_body(request):
    # Redact credentials in a JSON request body before it is recorded.
    if request.body:
        data = json.loads(request.body)
        if 'username' in data:
            data['username'] = "REDACTED_USERNAME"
        if 'password' in data:
            data['password'] = "REDACTED_PASSWORD"
        request.body = json.dumps(data).encode()
    return request


def _scrub_access_token(response):
    # Redact the token only when the response actually carries one, so
    # other JSON responses are recorded unchanged.
    if response['body']['string']:
        data = json.loads(response['body']['string'])
        if 'access_token' in data:
            data['access_token'] = "REDACTED_ACCESS_TOKEN"
            response['body']['string'] = json.dumps(data).encode()
    return response

@vcr.use_cassette(  # type: ignore
    before_record_request=_scrub_username_and_password_from_body,
    before_record_response=_scrub_access_token,
    filter_headers=[
        ("Authorization", "REDACTED_AUTHORIZATION"),
    ],
)

JayBazuzi, Mar 25 '21 14:03

We (Cybersecurity Development at University of Illinois) had the same problem, and our solution is a custom YAML serializer method. This allows us to inspect the request and match based on URL, then modify the response that gets recorded.

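import pytest
import vcr
from vcr.serializers import yamlserializer


@pytest.fixture
def cassette(request) -> vcr.cassette.Cassette:
    my_vcr = vcr.VCR(...)
    # Register the custom serializer under a name, then select it by that name.
    my_vcr.register_serializer("cleanyaml", CleanYAMLSerializer)

    with my_vcr.use_cassette(f'{request.function.__name__}.yaml',
                             serializer="cleanyaml") as tape:
        yield tape

class CleanYAMLSerializer:
    # The class itself is registered (not an instance), so both methods
    # are static and take no self argument.
    @staticmethod
    def serialize(cassette: dict):
        # Each interaction holds both the request and its response, so the
        # cleaners can match on the request URL and edit the response.
        for interaction in cassette['interactions']:
            clean_token(interaction)
            clean_search(interaction)
            clean_new_ticket(interaction)
        return yamlserializer.serialize(cassette)

    @staticmethod
    def deserialize(cassette: str):
        return yamlserializer.deserialize(cassette)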


def clean_search(interaction: dict):
    uri = f"{URL}/path_we_need_to_sanitize"
    if interaction['request']['uri'] != uri:
        return
    # ... more cleaning 

Credit to @ddriddle, @tzturner, @mpitcel, and @zdc217 for the solution.
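
The cleaner functions are only outlined above, so here is a minimal sketch of what something like clean_token might look like. This is an illustration, not the code from the original post: TOKEN_URI is a made-up URL, the access_token field name is borrowed from the earlier comment, and the interaction layout (a request dict with a uri, a response dict with body.string) is assumed to follow the usual cassette format.

```python
import json

# Hypothetical auth endpoint; substitute the URL your tests actually hit.
TOKEN_URI = "https://api.example.com/login"


def clean_token(interaction: dict, uri: str = TOKEN_URI) -> None:
    """Redact the access token in place if this interaction hit the auth endpoint."""
    # Match on the request, which is available here but not in
    # before_record_response.
    if interaction["request"]["uri"] != uri:
        return
    body = interaction["response"]["body"]
    raw = body.get("string")
    if not raw:
        return
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return  # body is not JSON, leave it untouched
    if isinstance(data, dict) and "access_token" in data:
        data["access_token"] = "REDACTED_ACCESS_TOKEN"
        redacted = json.dumps(data)
        # Preserve the original type (bytes or str) of the recorded body.
        body["string"] = redacted.encode() if isinstance(raw, bytes) else redacted
```

Because the function only touches plain dicts, it can be unit tested without recording anything. Interactions for any other URL pass through unchanged.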

edthedev, Dec 05 '22 17:12