crawlee-python issues

feat: Include whitelisted HTTP headers in extended_unique_key computation.

### Description This pull request enhances the `compute_unique_key` function in the `src/crawlee/_utils/requests.py` file to include HTTP headers in the unique key computation and adds corresponding unit tests. The most important...

AkhilProto
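As a rough illustration of the idea in the pull request above, the sketch below folds a whitelisted subset of HTTP headers into a request's deduplication key. It is not the code from `src/crawlee/_utils/requests.py`: the function name `sketch_extended_unique_key`, the `HEADER_WHITELIST` contents, and the hashing layout are all assumptions made for this example.

```python
import hashlib
import json

# Hypothetical whitelist; the headers the real PR whitelists are not listed here.
HEADER_WHITELIST = frozenset({"accept", "accept-language", "authorization"})


def sketch_extended_unique_key(url, method="GET", headers=None, payload=b""):
    """Return a hex digest identifying a request, including whitelisted headers.

    Illustrative only; this is not crawlee's `compute_unique_key`.
    """
    headers = headers or {}
    # Keep only whitelisted headers, normalizing name case and value whitespace,
    # and sort them so header order does not change the key.
    kept = sorted(
        (name.lower(), value.strip())
        for name, value in headers.items()
        if name.lower() in HEADER_WHITELIST
    )
    payload_hash = hashlib.sha256(payload).hexdigest()
    material = json.dumps([method.upper(), url, kept, payload_hash])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Under these assumptions, two requests to the same URL that differ only in a non-whitelisted header (for example `User-Agent`) share a key, while a different `Accept` value produces a different key.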

feat: Added get_public_url method to KeyValueStore

3

### Description - Implement get_public_url method in KeyValueStore ### Issues - Closes: #514 ### Testing - Unit tests added ### Checklist - [ ] CI passed

akshay11298

t-tooling

feat: Key-value store context helpers

1

- This adds a `get_key_value_store(id, name)` context helper to `BasicCrawlingContext` - Also, push_data calls are held until the request handler terminates successfully (same as in JS version) - This is...

janbuchar

t-tooling

adhoc

tested

docs: Guide for scaling the crawlers

2

### Description - Guide for scaling the crawlers ### Issues - Closes: #476 ### Testing - TODO ### Checklist - [ ] CI passed

akshay11298

fix!: Added the always-enqueue parameter to bypass deduplication

### Description - Previously, requests created with `Request.from_url` for the same URL received the same `unique_key` and were treated as identical requests, but this parameter, if set to...

shivansh-bhatnagar18

docs: Improved API documentation for BasicCrawler class

6

### **Description** This PR improves the API documentation for the `BasicCrawler` class by providing clear and concise explanations for all arguments and methods. The documentation now adheres to Google style...

belloibrahv

t-tooling

docs: Added [Dataset, Keystore] documentation

2

### Description - TODO ### Issues - Closes: #TODO ### Testing - TODO ### Checklist - [ ] CI passed

saymedgm

docs: added guide for result storages (Dataset, KeyValueStore)

### Description This PR adds documentation on Crawlee's result storage types, specifically the Key-Value Store and Dataset, providing usage examples and file structures for efficient data management. - Closes: #479...

Manish-k723

Create a new guide for result storages (`Dataset`, `KeyValueStore`)

9

- We should create a new documentation guide on how to work with result storages (`Dataset`, `KeyValueStore`). - Inspiration: https://crawlee.dev/docs/guides/result-storage - Check the structure of other guides - [docs/guides](https://github.com/apify/crawlee-python/tree/master/docs/guides), and...

vdusek

documentation

t-tooling

hacktoberfest

docs: Added documentation on how to Avoid getting blocked

4

### Description - Added 5 files, 2 of which aren't currently used; they can be used once crawlee-python completes the puppeteer crawler. - Added additional information apart...

MostlyKIGuess

t-tooling
