scrapy-zyte-api

Zyte API integration for Scrapy

Results 23 scrapy-zyte-api issues

To do: - [ ] Confirm the user-facing API is as agreed with @VMRuiz and @proway2. - [x] Make existing tests pass. - [x] Restore compatibility with Scrapy 2.0.1+. -...

The downloader middleware of scrapy-zyte-api was created to prevent AutoThrottle from affecting requests driven through Zyte API, and instead let Zyte API itself control throttling on the server side, sending...

enhancement
discuss

## Background Retries issued by `zyte_api.aio.retry.RetryFactory` are somewhat hidden. They are logged as DEBUG messages (so they are not seen by default in new projects with LOG_LEVEL: INFO) and, I...
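The visibility problem described above can be reproduced with the standard `logging` module alone. This is a minimal sketch, not the actual `zyte_api` retry code: the logger name `retry_demo` and both log messages are hypothetical stand-ins. It only shows that a DEBUG record is dropped when the effective level is INFO.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def capture(level):
    """Emit one DEBUG and one INFO message and return what got through."""
    # "retry_demo" is a hypothetical logger name used for illustration only.
    logger = logging.getLogger("retry_demo")
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        logger.debug("Retrying request (attempt 2)")  # stand-in for a retry message
        logger.info("Spider opened")
    finally:
        logger.removeHandler(handler)
    return handler.messages


at_info = capture(logging.INFO)    # the retry message is filtered out
at_debug = capture(logging.DEBUG)  # the retry message is visible
```

With the level at INFO only the second message is recorded, which matches the observation that retries go unnoticed unless the log level is lowered.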

Resolves #118, resolves #119, resolves #120. To do: - [x] Test both snippets manually, make sure they work as expected.

I have seen 2 people now having trouble with HTTP cache in combination with scrapy-zyte-api. They set `HTTPCACHE_ENABLED` to `True`, and they get `NotSupported("Response content isn't text")`. I could not...

To do: - [x] Update after https://github.com/scrapy-plugins/scrapy-zyte-api/pull/150 - [x] Solve conflicts. - [x] Complete coverage. Fixes #243.

In the example below ZyteApiProvider makes 2 API requests instead of 1: ```py @handle_urls("example.com") @attrs.define class MyPage(ItemPage[MyItem]): html: BrowserHtml # ... class MySpider(scrapy.Spider): # ... def parse(self, response: DummyResponse, product:...

Even if `httpResponseHeaders` is not `True`, if the actual response data is plain text, we should interpret it as such.
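One way to make such a decision without headers is to inspect the body bytes directly. The sketch below is an assumption for illustration, not how scrapy-zyte-api does or will implement it: the helper names `looks_like_text` and `decode_body` are hypothetical, and the heuristic (no NUL bytes, valid UTF-8) is deliberately simple. It assumes the body arrives Base64-encoded.

```python
import base64


def looks_like_text(body: bytes) -> bool:
    """Heuristic: treat the body as text if it has no NUL bytes and is valid UTF-8."""
    if b"\x00" in body:
        return False
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_body(b64_body: str):
    """Return str for text-like bodies, raw bytes otherwise (hypothetical helper)."""
    raw = base64.b64decode(b64_body)
    return raw.decode("utf-8") if looks_like_text(raw) else raw


text_result = decode_body(base64.b64encode(b"hello, world").decode("ascii"))
binary_result = decode_body(base64.b64encode(b"\x89PNG\r\n\x1a\n\x00").decode("ascii"))
```

A real implementation would likely also need to handle non-UTF-8 text encodings, which this sketch ignores.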

enhancement

When looking at the list of spider jobs in Scrapy Cloud, there's a column showing the spider's `close_reason` message. Some users have reported that it's not clear from this...