Adrián Chaves
I think this may need a bit of community discussion. I also worry that it would not be obvious to users of this method how project settings are ignored when...
From your findings, could the header order be the issue? Some antibot software I believe takes that into account.
I don’t think we should enable it by default. But maybe we should document this as one thing to try when getting unexpected responses.
> I tried, but failed, to create a DOWNLOADER_MIDDLEWARES that would use requests.get() to fetch the pages. Has anyone ever done this?

Sounds interesting as a proof of concept....
> I'm pretty sure the difference is in order and/or case of headers.

For those we have https://github.com/scrapy/scrapy/issues/2711 and https://github.com/scrapy/scrapy/issues/2803, so if that’s the case we could probably close this...
Can you provide a [self-contained, minimal example](https://stackoverflow.com/help/minimal-reproducible-example) to reproduce the issue?
This is what a minimal example looks like to me:

```python
from scrapy.http import TextResponse
from scrapy.link import Link
from scrapy.linkextractors import LinkExtractor


def process_value(url):
    return url


link_extractor = LinkExtractor(...
```
Ah, I see. So the problem is that `process_value` is called before `allow` is taken into account. I assume there are scenarios where the current behavior is desired. For example,...
I agree that, if the behavior is not modified, it should be documented.
I see you have now removed the setting; I am not sure why. I think it would be best to keep it. It does complicate the implementation, though. When creating an instance...