scrapy-zyte-api issues

Provide an MVP implementation of a session middleware

3

To do: - [ ] Confirm the user-facing API is as agreed with @VMRuiz and @proway2. - [x] Make existing tests pass. - [x] Restore compatibility with Scrapy 2.0.1+. -...

Gallaecio

Allow disabling AutoThrottle bypassing

4

The downloader middleware of scrapy-zyte-api was created to prevent AutoThrottle from affecting requests driven through Zyte API, and instead let Zyte API itself control throttling on the server side, sending...

Gallaecio

enhancement

discuss

Track retries in the crawler's stats

## Background Retries issued by `zyte_api.aio.retry.RetryFactory` are somewhat hidden. They are logged as DEBUG messages (so they are not seen by default in new projects with LOG_LEVEL: INFO) and, I...

curita
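
A minimal sketch of the idea, in plain Python: count every retry in a stats mapping so that retries stay visible even when DEBUG logging is off. The helper `call_with_retry_stats`, the stat key `retries`, and the use of a `Counter` are illustrative assumptions; this is not the API of `zyte_api.aio.retry.RetryFactory` or of scrapy-zyte-api.

```python
from collections import Counter
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry_stats(
    func: Callable[[], T],
    stats: Counter,
    max_attempts: int = 3,
    stat_key: str = "retries",  # hypothetical stat name
) -> T:
    """Call func, retrying on any exception and counting each retry in stats."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: propagate the last error unchanged.
                raise
            # A retry is about to happen: record it where users can see it.
            stats[stat_key] += 1
    raise AssertionError("unreachable")
```

In a real crawler the counter would more likely be the crawler's stats collector, so that the retry count shows up in the end-of-crawl stats dump.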

Document how to handle action failures

1

Resolves #118, resolves #119, resolves #120. To do: - [x] Test both snippets manually, make sure they work as expected.

Gallaecio

HTTP cache not working in some cases

5

I have seen 2 people now having trouble with HTTP cache in combination with scrapy-zyte-api. They set `HTTPCACHE_ENABLED` to `True`, and they get `NotSupported("Response content isn't text")`. I could not...

Gallaecio

Remove the experimental name space from cookie parameters

2

To do: - [x] Update after https://github.com/scrapy-plugins/scrapy-zyte-api/pull/150 - [x] Solve conflicts. - [x] Complete coverage. - [ ] Discuss the points raised in comments below. Fixes #243.

Gallaecio

ZyteApiProvider could make an unneeded API request

16

In the example below, ZyteApiProvider makes 2 API requests instead of 1: ```py @handle_urls("example.com") @attrs.define class MyPage(ItemPage[MyItem]): html: BrowserHtml # ... class MySpider(scrapy.Spider): # ... def parse(self, response: DummyResponse, product:...

kmike

Use text response objects for text responses without headers

1

Even if `httpResponseHeaders` is not `True`, if the actual response data is plain text, we should interpret it as such.

Gallaecio

enhancement
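
One possible heuristic for "the actual response data is plain text", sketched in plain Python: treat the body as text when it contains no NUL bytes and decodes cleanly. The function name `looks_like_text` and the UTF-8 default are assumptions made for illustration; the issue does not say how the detection should be done.

```python
def looks_like_text(body: bytes, encoding: str = "utf-8") -> bool:
    """Return True if body has no NUL bytes and decodes cleanly as encoding.

    Hypothetical helper: a rough heuristic, not scrapy-zyte-api's logic.
    An empty body counts as text.
    """
    if b"\x00" in body:
        # NUL bytes are a strong hint that the payload is binary.
        return False
    try:
        body.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True
```

A check like this could decide between a text response class and a plain one when no response headers are available to carry a content type.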

add request count to max requests close reason

2

When looking at the list of spider jobs in Scrapy Cloud, there's a column showing the spider's `close_reason` message. Some users have raised that it's not clear from this...

BurnzZ

Fix typing issues for typed Scrapy.

1

wRAR
