crawlee-python

feat: Include whitelisted HTTP headers in extended_unique_key computation.

Open AkhilProto opened this issue 4 months ago • 0 comments

Description

This pull request extends the compute_unique_key function in src/crawlee/_utils/requests.py so that whitelisted HTTP headers contribute to the unique key, and adds corresponding unit tests. The main changes are two new parameters (headers and whitelisted_headers), updated key-computation logic, and tests covering the new behavior.

Enhancements to the compute_unique_key function:

  • src/crawlee/_utils/requests.py: Added headers and whitelisted_headers parameters to the compute_unique_key function.
  • src/crawlee/_utils/requests.py: Modified the unique key computation to include the whitelisted headers when they are provided.
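
To illustrate the idea behind these changes, here is a minimal sketch of how whitelisted headers could be folded into a unique key. It is not the code from this pull request: the function name sketch_unique_key, the key layout (method, header digest, URL joined by "|"), and the 8-character SHA-256 digest are all assumptions made for the example, and URL normalization is left out entirely. Only the headers and whitelisted_headers parameter names come from the description above.

```python
import hashlib
from collections.abc import Mapping, Sequence
from typing import Optional


def sketch_unique_key(
    url: str,
    method: str = 'GET',
    headers: Optional[Mapping[str, str]] = None,
    whitelisted_headers: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical stand-in for illustration, not crawlee's compute_unique_key."""
    normalized_method = method.upper()

    if headers and whitelisted_headers:
        # HTTP header names are case-insensitive, so compare them in lowercase.
        allowed = {name.lower() for name in whitelisted_headers}

        # Keep only whitelisted headers and sort them so that the order in
        # which the caller supplied the headers does not change the key.
        selected = sorted(
            (name.lower(), value.strip())
            for name, value in headers.items()
            if name.lower() in allowed
        )

        if selected:
            serialized = '\n'.join(f'{name}:{value}' for name, value in selected)
            # Assumed format: a short hex digest keeps the key compact.
            digest = hashlib.sha256(serialized.encode('utf-8')).hexdigest()[:8]
            return f'{normalized_method}|{digest}|{url}'

    # No whitelisted header present: the key depends on method and URL only.
    return f'{normalized_method}|{url}'


if __name__ == '__main__':
    html_key = sketch_unique_key(
        'https://example.com',
        headers={'Accept': 'text/html', 'User-Agent': 'agent-a'},
        whitelisted_headers=['Accept'],
    )
    json_key = sketch_unique_key(
        'https://example.com',
        headers={'Accept': 'application/json', 'User-Agent': 'agent-a'},
        whitelisted_headers=['Accept'],
    )
    print(html_key != json_key)  # Differing whitelisted header values give different keys.
```

In this sketch, headers outside the whitelist (such as User-Agent above) never affect the key, so two requests that differ only in non-whitelisted headers would still be deduplicated as the same request.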

Issues

  • Closes: #548

Testing

  • tests/unit/_utils/test_compute_unique_key.py: Added pytest unit tests verifying that headers are included in the unique key computation.

Checklist

  • [X] CI passed

AkhilProto • Oct 14 '24 11:10