crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...
Since we use [HTTPX](https://pypi.org/project/httpx/), wouldn't it be better to use [curl_cffi](https://pypi.org/project/curl-cffi/) instead, or to offer it as an optional client that takes over when HTTPX gets defeated in the anti-bot war? I am...
When using VS Code, Pylance (v2024.7.1, latest) frequently reports `reportPrivateImportUsage` errors when importing classes. These errors can be fixed by defining a `__all__` list in the module's `__init__.py` file,...
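A minimal sketch of the `__all__` approach mentioned above, assuming a hypothetical package named `demo_pkg` with a private `_impl` module (neither name comes from Crawlee). It builds the package in a temporary directory so the snippet runs on its own, then shows that only the names listed in `__all__` are re-exported by a star import. Type checkers such as Pylance treat `__all__` as the declaration of a package's public names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout, created in a temp directory for illustration only.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()

# Private implementation module holding one public and one internal class.
(pkg / "_impl.py").write_text(
    "class Crawler:\n    pass\n\n\nclass _Helper:\n    pass\n"
)

# The package __init__ re-exports names and declares the public ones in __all__.
(pkg / "__init__.py").write_text(
    "from ._impl import Crawler, _Helper\n"
    "\n"
    "__all__ = ['Crawler']\n"
)

sys.path.insert(0, str(root))
demo_pkg = importlib.import_module("demo_pkg")

# A star import only brings in the names listed in __all__.
namespace: dict = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))

print(demo_pkg.__all__)
print(exported)
```

In a real package the same two lines (the re-export and the `__all__` list) would simply live in the existing `__init__.py`.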
When I try to run `pipx run crawlee create my-crawler`, I get the following error. I have already installed Crawlee and it is visible in `pip list`. ``` ⚠️ crawlee is...
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [rimraf](https://togithub.com/isaacs/rimraf) | [`^5.0.0` -> `^6.0.0`](https://renovatebot.com/diffs/npm/rimraf/5.0.8/6.0.0) | ...
The library is sick! It would be a beautiful add-on if we could add a Selenium crawler that connects to Selenium WebDriver or remote drivers.
Recently, the creators of Ruff (Astral) released a new package installer and resolver called [uv](https://github.com/astral-sh/uv), written in Rust. Perhaps we could integrate it into our CI pipelines, as installing everything...
### Description
Based on PR https://github.com/apify/apify-sdk-python/pull/171, @janbuchar suggested using some run-time type checking for Python, e.g. [typeguard](https://github.com/agronholm/typeguard). It can be applied either using a decorator `@typechecked` for a...
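To illustrate what decorator-based run-time type checking looks like, here is a rough sketch using only the standard library. It is not typeguard and not its implementation: `check_types` is a hypothetical helper that handles only plain class annotations (no generics, unions, or `*args`/`**kwargs`), whereas a dedicated library covers far more cases.

```python
import functools
import inspect
from typing import Any, Callable, get_type_hints


def check_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in for a run-time type checking decorator."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            # Skip *args / **kwargs; this sketch does not validate them.
            if signature.parameters[name].kind in variadic:
                continue
            expected = hints.get(name)
            # Only plain classes are checked; other annotations are ignored.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        result = func(*args, **kwargs)
        expected_return = hints.get("return")
        if isinstance(expected_return, type) and not isinstance(result, expected_return):
            raise TypeError(
                f"return value must be {expected_return.__name__}, "
                f"got {type(result).__name__}"
            )
        return result

    return wrapper


@check_types
def repeat(text: str, times: int) -> str:
    # Example function; the annotations are enforced on every call.
    return text * times
```

Calling `repeat("ab", 2)` passes the checks, while `repeat("ab", "2")` raises a `TypeError` before the function body runs.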
We just launched Crawlee for Python, and as it is in its early stages, we are looking for feedback and your contributions! 👀 To the first 10 developers who give...
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [@typescript-eslint/eslint-plugin](https://typescript-eslint.io/packages/eslint-plugin) ([source](https://togithub.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin)) | [`7.15.0` -> `7.16.0`](https://renovatebot.com/diffs/npm/@typescript-eslint%2feslint-plugin/7.15.0/7.16.0) | ...