Georgiy Zatserklianyi
This PR is ready for review. Tests for per-slot settings are implemented using the mockserver.
I suppose it can be implemented by updating code here: https://github.com/scrapy/scrapy/blob/23537a0f9580bfb28ac5d8b88f37df47e838f463/scrapy/core/downloader/handlers/__init__.py#L70-L75
@Gallaecio > It would be great if a plugin like https://github.com/scrapy-plugins/scrapy-playwright did not have to force you to drive all requests through its download handlers, and instead you could drive...
@Gallaecio @Duckweeds7 I have some concerns about it > or crawls a large number of tasks, it will lead to info.downloaded becoming very large If the total size of all downloaded...
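The unbounded-growth concern above can be illustrated with a size-capped mapping. This is a hypothetical sketch, not Scrapy's actual `info.downloaded` structure; the class name and eviction policy are my own choices for illustration:

```python
from collections import OrderedDict

class BoundedCache(OrderedDict):
    """Dict that evicts its oldest entry once maxsize is exceeded (LRU-style sketch)."""

    def __init__(self, maxsize=10_000):
        super().__init__()
        self.maxsize = maxsize

    def __setitem__(self, key, value):
        if key in self:
            self.move_to_end(key)  # refresh recency on overwrite
        super().__setitem__(key, value)
        if len(self) > self.maxsize:
            self.popitem(last=False)  # drop the least recently inserted entry

cache = BoundedCache(maxsize=3)
for i in range(5):
    cache[i] = i
# only the 3 most recent keys survive
assert list(cache) == [2, 3, 4]
```

A cap like this trades completeness for bounded memory, which is exactly the trade-off being debated: a long crawl could re-download items whose entries were evicted.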
@ejulio, @Gallaecio The `ScrapyAgent._cb_bodydone` method chooses the response class: https://github.com/scrapy/scrapy/blob/e22a8c8c36e34ffaf12ef9e330624df654582605/scrapy/core/downloader/handlers/http11.py#L395-L400 The `responsetypes.from_args` method performs several checks with the following logic: - the default response type is plain `Response`, - if Scrapy identifies something that requires...
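The fall-through priority described above can be sketched as a simplified stand-alone function. This is an illustration of the selection logic only, not Scrapy's actual `responsetypes` implementation; the function name and the exact checks are assumptions:

```python
import mimetypes

def pick_response_class(content_type=None, url=None, body=None):
    """Return a response class name using the first signal that matches."""
    if content_type:
        if "html" in content_type:
            return "HtmlResponse"
        if "text" in content_type:
            return "TextResponse"
        return "Response"
    if url:
        guessed, _ = mimetypes.guess_type(url)
        if guessed:
            return pick_response_class(content_type=guessed)
    if body is not None and body.lstrip().startswith(b"<"):
        return "HtmlResponse"  # crude body sniffing
    return "Response"  # default: plain Response

assert pick_response_class(url="https://example.com/page.html") == "HtmlResponse"
assert pick_response_class(body=b"\x00\x01binary") == "Response"
```

The key property is the ordering: an explicit `Content-Type` header wins over the URL extension, which wins over body sniffing, and plain `Response` is the fallback when nothing matches.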
> Imagine those sites only support low traffic, so you want to limit concurrency to 2 per site. Also imagine that your Splash instance can only handle up to 3...
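Assuming the per-slot settings feature this PR discusses, the scenario above (limit each site to 2, the Splash instance to 3) might be expressed as a settings fragment like the following sketch; the slot names are made up for illustration:

```python
# Hypothetical settings fragment: per-slot concurrency limits.
DOWNLOAD_SLOTS = {
    "splash-instance": {"concurrency": 3},      # the shared Splash service
    "site-a.example": {"concurrency": 2},       # low-traffic site A
    "site-b.example": {"concurrency": 2},       # low-traffic site B
}
```

Without per-slot settings, a single global `CONCURRENT_REQUESTS_PER_DOMAIN` cannot express both limits at once, which is the point of the example in the quote.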
Let's test this script with various settings:

```python
import scrapy
from scrapy.crawler import CrawlerProcess

class BooksToScrapeSpider(scrapy.Spider):
    name = "books"
    start_urls = [f"https://books.toscrape.com/catalogue/page-{i}.html" for i in range(1, 32)]
    custom_settings = {"DOWNLOAD_DELAY": 1}
    ...
```
@Gallaecio Created new test cases for checking selectors with `bytes` input.
@pawelmhm > So just removing this single line gives us 20% improvement in memory usage. I think the real impact of this is much more than 20%. At this moment...
Hello @kmike. Thank you for the feedback. > 1. We're talking about peak memory usage. get_virtual_size is not returning the amount of currently allocated memory; it returns the maximum which the process used...
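The peak-versus-current distinction quoted above can be demonstrated with the stdlib `tracemalloc` module (illustrative only; the `get_virtual_size` helper in Scrapy's tests reads an OS-level high-water mark, which behaves the same way):

```python
import tracemalloc

tracemalloc.start()
blob = [bytearray(1_000_000) for _ in range(50)]  # allocate ~50 MB
del blob  # release it again
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Peak reflects the high-water mark, not what is allocated right now,
# so it stays large even after the allocation has been freed.
assert peak > 50 * 1_000_000 > current
```

This is why a measurement based on peak usage cannot shrink during a run: it can only record the worst moment so far.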