requests-cache

Add option to ignore nested parameters in a JSON request body

Open timbmg opened this issue 3 years ago • 4 comments

Is it possible to ignore a nested parameter? For example the data of my requests looks like this:

{
  "data": {
    "key": "value",
    "timestamp": "2022-08-04"
  }
}

In this example, I would like to ignore timestamp but not key when creating the cache key. Yes, I could override the cache key function, but it feels like this should be part of the library. Maybe it could be specified with dot notation, like this: data.timestamp

timbmg avatar Aug 04 '22 15:08 timbmg

Interesting idea. Do you happen to have any cases where the key you want to ignore is more than 1 level deep (like data.key_1.key_2)?

If not, that would make things much easier. I think having a single "root" element in a request body (data in your example) is fairly common. We could possibly add an option to apply ignored_parameters to anything under that root key. That would be a much simpler change to make than supporting arbitrary levels of JSON keys using dot notation.
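
For comparison, here is a rough sketch of what the dot-notation variant could involve, written as a standalone helper rather than as part of requests-cache. The name `drop_dotted_paths` and its signature are hypothetical and only meant for illustration; it assumes every path segment is a plain dict key.

```python
import json
from typing import Any, Iterable


def drop_dotted_paths(body: Any, ignored_paths: Iterable[str]) -> Any:
    """Return a copy of `body` with each dotted path (e.g. 'data.timestamp') removed.

    Hypothetical helper for illustration; not part of requests-cache.
    """
    # Round-trip through JSON as a simple deep copy of JSON-compatible data
    result = json.loads(json.dumps(body))
    for path in ignored_paths:
        *parents, leaf = path.split('.')
        node = result
        for key in parents:
            # Give up on this path if it is missing or passes through a non-dict value
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if isinstance(node, dict):
            node.pop(leaf, None)
    return result


body = {'data': {'key': 'value', 'timestamp': '2022-08-04'}}
filtered = drop_dotted_paths(body, ['data.timestamp'])
```

Even this small version leaves open questions, such as keys that contain a literal dot or paths that pass through lists, which is part of why the single root key option is the simpler change.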

JWCook avatar Aug 07 '22 23:08 JWCook

No, I don't have nested keys that are more than one level deep. So for my use case, a single level would be sufficient.

timbmg avatar Aug 09 '22 14:08 timbmg

@JWCook I could work on a PR for this. As far as I can see, this would mean adding some logic to the filter_sort_* functions of cache_keys.py.

timbmg avatar Aug 12 '22 17:08 timbmg

That would be great! I don't think the filter_sort_* functions need to change, but normalize_json_body and everything up the call chain from there will need to change.

To get you started, the new normalize_json_body() would look something like this:

def normalize_json_body(
    original_body: Union[str, bytes],
    ignored_parameters: ParamList,
    content_root_key: Optional[str] = None,
) -> Union[str, bytes]:
    """Normalize and filter a request body with serialized JSON data"""
    if len(original_body) <= 2 or len(original_body) > MAX_NORM_BODY_SIZE:
        return original_body

    try:
        body = json.loads(decode(original_body))
        if content_root_key and isinstance(body, dict) and content_root_key in body:
            body[content_root_key] = filter_sort_json(body[content_root_key], ignored_parameters)
        else:
            body = filter_sort_json(body, ignored_parameters)
        return json.dumps(body)
    # If it's invalid JSON, then don't mess with it
    except (AttributeError, TypeError, ValueError):
        logger.debug('Invalid JSON body')
        return original_body

Usage would then look like:

session = CachedSession(content_root_key='data')

Does that sound reasonable to you?
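
The root key idea above can be tried out in isolation with a self-contained sketch. Everything here is an assumption for illustration: `filter_sort_json_stub` is a simplified stand-in for the library's `filter_sort_json` (the real helper may behave differently), and `normalize_with_root_key` mirrors only the branching logic of the proposed function, without the size checks, decoding, or error handling.

```python
import json
from typing import Any, Collection, Optional


def filter_sort_json_stub(data: Any, ignored_parameters: Collection[str]) -> Any:
    """Simplified stand-in: drop ignored keys from a dict and sort the rest."""
    if isinstance(data, dict):
        return {k: v for k, v in sorted(data.items()) if k not in ignored_parameters}
    return data


def normalize_with_root_key(
    raw_body: str,
    ignored_parameters: Collection[str],
    content_root_key: Optional[str] = None,
) -> str:
    """Filter ignored parameters under the root key if given, else at the top level."""
    body = json.loads(raw_body)
    if content_root_key and isinstance(body, dict) and content_root_key in body:
        body[content_root_key] = filter_sort_json_stub(body[content_root_key], ignored_parameters)
    else:
        body = filter_sort_json_stub(body, ignored_parameters)
    return json.dumps(body)


first = '{"data": {"key": "value", "timestamp": "2022-08-04"}}'
second = '{"data": {"key": "value", "timestamp": "2022-08-05"}}'

# With the root key set, the two bodies normalize to the same string
with_root = normalize_with_root_key(first, ['timestamp'], 'data')
same_with_root = with_root == normalize_with_root_key(second, ['timestamp'], 'data')

# Without it, only top-level keys are filtered, so the timestamps still differ
same_without_root = (
    normalize_with_root_key(first, ['timestamp'])
    == normalize_with_root_key(second, ['timestamp'])
)
```

In this sketch, two requests that differ only in data.timestamp would end up with the same normalized body when the root key is set, and with different ones when it is not.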

JWCook avatar Aug 15 '22 00:08 JWCook