Jindřich Bär
### Which package is the feature request for? If unsure which one to select, leave blank None ### Feature While the current KVS implementation can work with Node.js streams (e.g....
There are often reasons to create multiple separate request queues (RQs) in one Crawlee project (e.g., having `CheerioCrawler` for processing most of the pages and a separate keep-alive `PlaywrightCrawler` instance for processing...
When evaluating an asynchronous function in the main world execution mode, the return value is always `{ }`. Example: ```python from camoufox.sync_api import Camoufox with Camoufox( main_world_eval=True ) as camoufox:...
Makes the current `Session` instance the single source of cookies for the current request. Closes #2744
Aligns the `@crawlee/impit-client` implementation with the respective RFC and browsers' behaviour. Closes #2586
Due to the design of Crawlee request handlers, the user-supplied request handler can return before the response stream is consumed. Because of this, we are waiting until the stream is...
With the `HttpClient` abstraction, we now allow users to switch HTTP client implementations in a standardized manner. E.g., the `HttpCrawler` implementation still contains references to `got-scraping`, which forces us to...
The current `SessionPool` implementation generates a large number of `Session` instances before ever reusing one. This limits the use cases for this class and wastes resources. Related discussion (https://github.com/apify/crawlee/pull/3199/files#r2452714909). Closes...
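To make the complaint above concrete, here is a minimal, self-contained sketch of a pool that prefers reuse: it hands out an idle session when one exists and creates a new one only when every existing session is busy. This is not Crawlee's `SessionPool`; `ReuseFirstPool`, `SketchSession`, `acquire`, `release`, and `maxPoolSize` are hypothetical names chosen for illustration only.

```typescript
// Hypothetical sketch, NOT Crawlee's SessionPool API.
// It only illustrates a "reuse before create" policy.
interface SketchSession {
    id: number;
    inUse: boolean;
    usageCount: number;
}

class ReuseFirstPool {
    private readonly sessions: SketchSession[] = [];
    private readonly maxPoolSize: number;

    constructor(maxPoolSize: number) {
        this.maxPoolSize = maxPoolSize;
    }

    // Return an idle session if one exists; otherwise create a new one,
    // as long as the pool has not reached its size limit.
    acquire(): SketchSession {
        const idle = this.sessions.find((s) => !s.inUse);
        if (idle !== undefined) {
            idle.inUse = true;
            idle.usageCount += 1;
            return idle;
        }
        if (this.sessions.length >= this.maxPoolSize) {
            throw new Error('All sessions are busy and the pool is full');
        }
        const created: SketchSession = {
            id: this.sessions.length + 1,
            inUse: true,
            usageCount: 1,
        };
        this.sessions.push(created);
        return created;
    }

    // Mark a session as idle so a later acquire() can reuse it.
    release(session: SketchSession): void {
        session.inUse = false;
    }

    // Number of sessions created so far.
    get size(): number {
        return this.sessions.length;
    }
}
```

Under this policy, a strictly sequential workload (acquire, release, acquire, ...) never grows the pool beyond one session, while concurrent work grows it only up to the number of sessions in use at the same time.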
In a fashion similar to https://github.com/apify/crawlee/issues/3198, we should extract cookie handling from all parts of Crawlee and treat the `Session` instances as the single source of truth for the current...
### Which package is this bug report for? If unsure which one to select, leave blank @crawlee/basic (BasicCrawler) ### Issue description When instantiating multiple crawler instances at once, their `useState`...