Mikhail Korobov
hey @blshkv! There is some history to this topic, see https://github.com/scrapy/scrapy/issues/8 :)
@nramirezuy yes, it affects e.g. the robots.txt middleware or any other middleware that needs to make extra high-priority requests before processing other requests. We (@shirk3y and I) had this issue with...
@nramirezuy hm, yes, it seems you're right about cookies; at least it's worth checking, and that's a good point. But I don't think the question about cookies is relevant here...
I think that's intended, because a crawl doesn't stop on these exceptions. It's the same as with exceptions in request callbacks: they're logged, but the crawl continues.
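For context, the log-and-continue behaviour can be pictured with a minimal sketch (this is an illustration, not Scrapy's actual scraper code): an exception raised by a callback is logged, and the loop moves on to the next response instead of aborting.

```python
import logging

logger = logging.getLogger(__name__)

def run_callbacks(responses, callback):
    """Call `callback` on each response; log failures and keep going.

    Mirrors the idea described above: a failing callback is logged,
    not fatal to the whole crawl.
    """
    results = []
    for response in responses:
        try:
            results.append(callback(response))
        except Exception:
            logger.exception("Callback failed for %r; continuing", response)
    return results
```

The design trade-off is that errors become visible only in the logs, which is why they can be easy to miss.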
I think the implementation is fine. Is it blocked on naming @Gallaecio?
The feature looks good to me, but it needs to be updated against the current master branch.
I think it'd be good not to capitalize header names by default, and to pass them as-is.
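To illustrate why passing names as-is can matter, here is a typical title-casing normalization (a hypothetical helper, not Scrapy's actual code): it mangles header names that don't follow the `Word-Word` convention.

```python
def canonicalize_header(name: str) -> str:
    """Title-case a header name, e.g. "x-custom-id" -> "X-Custom-Id".

    Note the downside: names like "SOAPAction" or "DNT" get mangled
    ("Soapaction", "Dnt"), and some servers match headers
    case-sensitively even though HTTP says they shouldn't.
    """
    return "-".join(part.capitalize() for part in name.split("-"))
```

Passing header names through unchanged avoids this class of problem entirely.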
@redapple yep.
+1 to using huge_tree=True by default, though we need to add a check: parsel must keep working with old lxml versions which don't support this parameter. It may also be good...
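One way to write such a compatibility check is feature detection rather than version pinning. This is a generic sketch (not parsel's actual code, and `accepts_kwarg` is a name I made up): probe whether a constructor accepts a given keyword argument, treating `TypeError` as "unsupported", as with old lxml versions that predate `huge_tree`.

```python
def accepts_kwarg(factory, name, value=True):
    """Return True if calling `factory` with the keyword succeeds.

    A TypeError is taken to mean the keyword is unsupported -- the
    behaviour an old lxml XMLParser would show for huge_tree=True.
    """
    try:
        factory(**{name: value})
    except TypeError:
        return False
    return True
```

With a check like this, the parser could be built with `huge_tree=True` when supported and fall back to the plain constructor otherwise.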
Hey @GeorgeA92! That's a nice analysis, but I think we should clarify some parts here. 1) We're talking about peak memory usage. get_virtual_size does not return the amount of currently allocated...
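To make the peak-vs-current distinction concrete, here is a sketch of reading the process's peak resident set size on POSIX systems via `resource.getrusage` (the same general mechanism, though not necessarily identical to Scrapy's `get_virtual_size`):

```python
import resource
import sys

def peak_rss_bytes():
    """Return the peak (high-water mark) RSS of this process, in bytes.

    ru_maxrss only ever grows: it keeps reporting the peak even after
    memory has been freed, which is exactly why it measures peak usage
    rather than currently allocated memory.
    Units differ by platform: kilobytes on Linux, bytes on macOS.
    """
    size = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform != "darwin":
        size *= 1024
    return size
```

Because it is a high-water mark, this number stays high after a large temporary allocation is released, so it should not be read as the crawl's current footprint.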