crawlee-python
feat: Include whitelisted HTTP headers in extended_unique_key computation.
Description
This pull request enhances the `compute_unique_key` function in the `src/crawlee/_utils/requests.py` file to include HTTP headers in the unique key computation and adds corresponding unit tests. The most important changes include adding new parameters for headers and whitelisted headers, modifying the logic to compute the unique key, and adding tests to verify the new functionality.
Enhancements to the `compute_unique_key` function:
- `src/crawlee/_utils/requests.py`: Added `headers` and `whitelisted_headers` parameters to the `compute_unique_key` function to include HTTP headers in the unique key computation.
- `src/crawlee/_utils/requests.py`: Modified the logic to compute the unique key by including whitelisted headers if provided.
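As a rough illustration of the idea (not the code in this PR), the sketch below shows one way whitelisted headers could be folded into a unique key. The function name `sketch_extended_unique_key`, the `|` separator, the lower-casing of header names, and the truncated SHA-256 digest are all assumptions made for the example; only the `headers` and `whitelisted_headers` parameter names come from the description above.

```python
from __future__ import annotations

import hashlib


def sketch_extended_unique_key(
    url: str,
    method: str = 'GET',
    headers: dict[str, str] | None = None,
    whitelisted_headers: list[str] | None = None,
) -> str:
    """Hypothetical stand-in for illustration; not crawlee's compute_unique_key."""
    parts = [method.upper(), url]

    if headers and whitelisted_headers:
        # Header names are case-insensitive, so compare them in lower case.
        allowed = {name.lower() for name in whitelisted_headers}
        # Sort so that the order in which headers were supplied does not matter.
        kept = sorted(
            (name.lower(), value.strip())
            for name, value in headers.items()
            if name.lower() in allowed
        )
        if kept:
            blob = '|'.join(f'{name}:{value}' for name, value in kept)
            # Assumed format: a short digest keeps the key compact.
            parts.append(hashlib.sha256(blob.encode('utf-8')).hexdigest()[:8])

    return '|'.join(parts)


base = sketch_extended_unique_key('https://example.com')
with_accept = sketch_extended_unique_key(
    'https://example.com',
    headers={'Accept': 'text/html', 'User-Agent': 'demo'},
    whitelisted_headers=['accept'],
)
```

In this sketch, `base` and `with_accept` differ because `Accept` is whitelisted, while changing only `User-Agent` would leave the key unchanged.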
Issues
- Closes: #548
Testing
- `tests/unit/_utils/test_compute_unique_key.py`: Added unit tests to verify the new functionality of including headers in the unique key computation using `pytest`.
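The kind of checks such tests might make can be sketched as follows. This is an assumption-laden example, not the contents of the test file: `fake_unique_key` is a hypothetical stand-in defined here so the tests run on their own, and the test names are invented. The functions use bare `assert` statements, so `pytest` would collect them without any extra imports.

```python
from __future__ import annotations


def fake_unique_key(
    url: str,
    headers: dict[str, str] | None = None,
    whitelisted_headers: list[str] | None = None,
) -> tuple:
    # Hypothetical stand-in for the function under test; not crawlee's code.
    allowed = {name.lower() for name in (whitelisted_headers or [])}
    kept = sorted(
        (name.lower(), value)
        for name, value in (headers or {}).items()
        if name.lower() in allowed
    )
    return (url, tuple(kept))


def test_whitelisted_header_changes_key() -> None:
    plain = fake_unique_key('https://example.com')
    keyed = fake_unique_key(
        'https://example.com',
        headers={'Accept': 'application/json'},
        whitelisted_headers=['Accept'],
    )
    assert plain != keyed


def test_non_whitelisted_header_is_ignored() -> None:
    plain = fake_unique_key('https://example.com')
    keyed = fake_unique_key(
        'https://example.com',
        headers={'User-Agent': 'demo'},
        whitelisted_headers=['Accept'],
    )
    assert plain == keyed
```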
Checklist
- [X] CI passed