scrapy-webdriver
I get the following stacktrace when I run the `pip install` given in the README.md.

```
>pip install https://github.com/sosign/scrapy-webdriver/archive/master.zip
```

```
Collecting https://github.com/sosign/scrapy-webdriver/archive/master.zip
  Downloading https://github.com/sosign/scrapy-webdriver/archive/master.zip
    | 20kB 2.6MB/s
Complete output...
```
The middleware was emitting requests with `dont_filter=True`, causing multiple uncaught duplicates. `dont_filter` is not needed by itself, but it was protecting the request queue from exhaustion -- the middleware emits one request at...
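To see why `dont_filter=True` can mask duplicates, here is a minimal stand-in for Scrapy's duplicate filter (the real one, `RFPDupeFilter`, fingerprints the whole request, not just the URL — this is only an illustration):

```python
from hashlib import sha1

class TinyDupeFilter:
    """Minimal stand-in for Scrapy's RFPDupeFilter: fingerprints
    each URL and drops repeats unless dont_filter is set."""

    def __init__(self):
        self.seen = set()

    def should_schedule(self, url, dont_filter=False):
        if dont_filter:
            return True          # bypasses the filter entirely
        fp = sha1(url.encode()).hexdigest()
        if fp in self.seen:
            return False         # duplicate: dropped silently
        self.seen.add(fp)
        return True

df = TinyDupeFilter()
print(df.should_schedule("http://example.com/a"))                    # True (first sight)
print(df.should_schedule("http://example.com/a"))                    # False (duplicate caught)
print(df.should_schedule("http://example.com/a", dont_filter=True))  # True (filter bypassed)
```

With `dont_filter=True` every emitted request is scheduled, so any duplicate-producing bug in the middleware goes unnoticed until the queue behavior changes.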
- works in production without raising deprecation warnings
- Request Queue test fails, appears to be related to Scrapy v1 (?)

Test output: https://gist.github.com/fraserharris/df51b43184c8c35620cd
```python
class MySpider(CrawlSpider):
    start_urls = [
        "http://www.example.com",
    ]
    rules = (
        Rule(
            LxmlLinkExtractor(
                allow=[r'\w+\/\d+$', r'\w+\/\d+-p\d+$'],
            ),
            follow=True,
        ),
        Rule(
            LxmlLinkExtractor(
                allow=(r'\d+.html$'),
            ),
            'parse_action',
        ),
    )

    def parse_action(self, response):
        yield WebdriverRequest(response.url, callback=self.parse_item)
    # ...
```
Why are you including distribute 0.6.27 with this package? It makes no sense and is breaking pip and fabric after installing this package. For example:

```
$ fab
Traceback (most...
```
I'm currently seeing that it's stuck on downloading for a long time; could it be that the request timed out, so it won't continue? Are requests currently not concurrent because...
To address Issue #4, where a single exception in a Spider's parse method causes the spider to finish.
If an exception is raised in the parse method of a WebdriverResponse/WebdriverRequest, the whole spider closes/exits and doesn't continue. Steps to reproduce: in any of your parse methods which parse WebdriverResponses, raise an...
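A common workaround for this class of problem is to wrap the parse callback so an exception for one response is logged and skipped rather than propagating and closing the whole spider. The decorator and spider below are hypothetical — a plain-Python sketch of the idea, not this project's fix:

```python
import functools
import logging

def swallow_errors(parse_fn):
    """Hypothetical workaround: catch any exception raised while a
    parse callback's results are consumed, log it, and yield nothing,
    so one bad response cannot stop the crawl."""
    @functools.wraps(parse_fn)
    def wrapper(self, response):
        try:
            yield from parse_fn(self, response)
        except Exception:
            logging.exception("parse failed for %s; skipping", response)
    return wrapper

class FakeSpider:
    """Stand-in spider; response is just a URL string here."""
    @swallow_errors
    def parse(self, response):
        if "bad" in response:
            raise ValueError("boom")
        yield {"url": response}

spider = FakeSpider()
print(list(spider.parse("http://good")))  # [{'url': 'http://good'}]
print(list(spider.parse("http://bad")))   # [] -- exception logged, crawl continues
```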
PhantomJS webdriver does not perform ActionChain when doing a WebdriverActionRequest. See http://stackoverflow.com/questions/16744038/python-bindings-to-selenium-webdriver-actionchain-not-executing-in-phantomjs for a reproducible example.
I saw the request is replaced with `dont_filter=True`; if I remove that, the spider will just stop when it gets to the same URL. I need to use the offsite...
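For reference, the offsite check in question is roughly what Scrapy's `OffsiteMiddleware` performs: a request is dropped unless its host matches, or is a subdomain of, one of the spider's `allowed_domains`. A simplified illustration (not the middleware's actual code):

```python
from urllib.parse import urlparse

def is_offsite(url, allowed_domains):
    """Rough sketch of the OffsiteMiddleware check: offsite unless the
    host equals, or is a subdomain of, an allowed domain."""
    host = urlparse(url).hostname or ""
    return not any(host == d or host.endswith("." + d) for d in allowed_domains)

allowed = ["example.com"]
print(is_offsite("http://www.example.com/page", allowed))  # False -- kept
print(is_offsite("http://other.org/page", allowed))        # True  -- filtered
```

Requests replaced with `dont_filter=True` still pass through this middleware; `dont_filter` only bypasses the duplicate filter, not the offsite check.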