crawlee-python issues

Crawlee for Python Hacktoberfest 2024 🧡 🐍

# Crawlee for Python Hacktoberfest 2024 [Starting Oct 1, 2024]

![Hacktober 2024 Crawlee](https://github.com/user-attachments/assets/03e4c145-cf22-4618-b27f-8d434f9dd3f5)

# Prizes 🏆

- 1-2 Accepted Pull Request: Crawlee Exclusive Sticker Sheet.
- 2 or more Accepted...

souravjain540

t-tooling

hacktoberfest

feat: Improved project bootstrapping

4

This adds a unified `crawler` template. The original `playwright` and `beautifulsoup` templates are kept for compatibility with older versions of the CLI. The user is now prompted for package manager...

janbuchar

t-tooling

tested

Optionally include Apify integration in project bootstrapping

For example, with an `--apify` flag. This should add the SDK to the requirements and activate the `Actor` context manager in the main function.

janbuchar

enhancement

t-tooling

Check CI status on `master` branch in release workflow

1

More details in https://github.com/apify/crawlee-python/pull/466#issuecomment-2312331905

janbuchar

enhancement

t-tooling

Sync templates to apify/actor-templates on update

1

janbuchar

t-tooling

infrastructure

Make sure that detection of available memory is consistent with JS Crawlee

- https://github.com/giampaolo/psutil/issues/1011
- https://github.com/apify/crawlee/blob/master/packages/utils/src/internals/memory-info.ts#L53
- JS version has special cases for AWS Lambda and Docker

janbuchar

t-tooling
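As a rough illustration of the memory-detection issue above, the sketch below shows one way the "special cases" could be ordered in Python using only the standard library. It is not the Crawlee implementation: `detect_total_memory`, `parse_cgroup_limit`, and `host_total_memory` are hypothetical names, the cgroup paths are the usual Linux locations, and treating the `AWS_LAMBDA_FUNCTION_MEMORY_SIZE` value as MiB is an assumption.

```python
import os
from pathlib import Path
from typing import Optional

# Usual cgroup locations for the memory limit (v2 first, then v1).
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)


def parse_cgroup_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when no numeric limit is set."""
    value = raw.strip()
    if not value.isdigit():  # cgroup v2 writes "max" when unlimited
        return None
    return int(value)


def host_total_memory() -> Optional[int]:
    """Total physical memory via sysconf, where the platform supports it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def detect_total_memory() -> Optional[int]:
    """Hypothetical helper: Lambda setting, then cgroup limit, then host total."""
    lambda_mb = os.environ.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE")
    if lambda_mb and lambda_mb.isdigit():
        # Assumption: the configured value is interpreted as MiB.
        return int(lambda_mb) * 1024 * 1024
    host = host_total_memory()
    for path in CGROUP_LIMIT_FILES:
        try:
            limit = parse_cgroup_limit(path.read_text())
        except OSError:
            continue  # file missing or unreadable: not in this cgroup layout
        # cgroup v1 reports a huge sentinel when unlimited, so ignore
        # limits larger than the host total.
        if limit is not None and (host is None or limit <= host):
            return limit
    return host
```

Keeping the parsing in a small pure function makes the container case easy to test without running inside Docker.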

`CurlImpersonateHttpClient` warning on Windows

Using the `CurlImpersonateHttpClient` adds this warning message upon program start, which doesn't seem to be fixed if I add in the command that it asks for
```
asyncio.set_event_loop_policy(WindowsSelectorEventLoopPolicy())
await crawler.run(["https://www.mtggoldfish.com/metagame/modern#paper"])...
```

janbuchar

bug

t-tooling
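For the `CurlImpersonateHttpClient` warning issue above, a minimal sketch of where the policy call usually needs to go. An event loop policy only affects loops created after it is set, so the call belongs before `asyncio.run(...)`, not inside a coroutine that is already running. Whether that is the cause in the reported case is an assumption, since the excerpt is truncated. `use_selector_loop_on_windows` and the body of `main` are hypothetical stand-ins for the real crawler code.

```python
import asyncio
import sys


def use_selector_loop_on_windows() -> bool:
    """Switch to the selector event loop policy on Windows; no-op elsewhere."""
    if sys.platform == "win32":
        # The attribute exists only on Windows, hence the platform guard.
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


async def main() -> str:
    # Placeholder for the real crawler run, e.g. `await crawler.run([...])`.
    await asyncio.sleep(0)
    return "done"


if __name__ == "__main__":
    # Set the policy first, then let asyncio.run() create the loop.
    use_selector_loop_on_windows()
    print(asyncio.run(main()))
```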

Create a new guide for crawling features

- We could create a new documentation guide for all crawling-related features we provide.
- The guide should include the following:
  - `enqueue_links` helper function,
  - Crawling limitations and controls:...

vdusek

documentation

t-tooling
