Adrián Chaves
After https://github.com/scrapy/scrapy/pull/6269, PyPy gets brotlicffi support. However, it currently runs in all PyPy jobs in CI. It would be best to have a separate CI job for pypy+extra-deps, same as...
To do:
- [ ] Confirm the user-facing API is as agreed with @VMRuiz and @proway2.
- [x] Make existing tests pass.
- [x] Restore compatibility with Scrapy 2.0.1+.
-...
The downloader middleware of scrapy-zyte-api was created to prevent AutoThrottle from affecting requests driven through Zyte API, and instead let Zyte API itself control throttling on the server side, sending...
Resolves #118, resolves #119, resolves #120.
To do:
- [x] Test both snippets manually, make sure they work as expected.
I have seen 2 people now having trouble with HTTP cache in combination with scrapy-zyte-api. They set `HTTPCACHE_ENABLED` to `True`, and they get `NotSupported("Response content isn't text")`. I could not...
To do: - [x] Update after https://github.com/scrapy-plugins/scrapy-zyte-api/pull/150
Even if `httpResponseHeaders` is not `True`, if the actual response data is plain text, we should interpret it as such.
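One way to picture "the actual response data is plain text" is a content-sniffing check on the body bytes. The sketch below is only an illustration, not how scrapy-zyte-api decides this: the helper name `looks_like_text`, the `sample_size` parameter, and the choice of UTF-8 as the only candidate encoding are all assumptions made for the example.

```python
import codecs


def looks_like_text(body: bytes, sample_size: int = 1024) -> bool:
    """Hypothetical heuristic: treat a body as text if its first
    ``sample_size`` bytes contain no NUL byte and decode as UTF-8."""
    sample = body[:sample_size]
    # NUL bytes are a strong hint of binary data.
    if b"\x00" in sample:
        return False
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # When the sample was cut from a longer body, pass final=False so
        # a multi-byte character split at the boundary is not an error.
        decoder.decode(sample, final=len(body) <= sample_size)
    except UnicodeDecodeError:
        return False
    return True
```

A caller could use such a check as a fallback when no `Content-Type` header is available, for example `looks_like_text(b"plain text body")` versus `looks_like_text(b"\x89PNG\r\n\x1a\n\x00")`.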
e.g.
```
request_or_response_url: Reuse[ResponseUrl] | RequestUrl
```
Should use `RequestUrl` unless `ResponseUrl` is also requested separately.
The current docs are inaccurate, as they limit the scope of the parameter to the duplicate filter of the scheduler, while there are additional built-in middlewares that take this parameter...
Alternative to #298. I’m not sure which one is the better approach, though. But supporting this feature was part of one of the original iterations of JMESPath support, and [it...