scrapy-splash
Scrapy+Splash for JavaScript integration
As mentioned under #157 in [this](https://github.com/scrapy-plugins/scrapy-splash/pull/157#issuecomment-358129157) comment: this feature must be backward compatible.
A few weeks ago, the Chromium project announced [headless Chromium](https://chromium.googlesource.com/chromium/src/+/lkgr/headless/README.md) as a new, clean way to open websites in a non-UI server context. The announcement had quite an impact in the...
Originally appeared [on StackOverflow](https://stackoverflow.com/questions/44294579/scrapy-splash-missing-scheme-in-request-url-render-html-but-urls-have-sche). When the `SPLASH_URL` setting is missing the `http://` scheme, Scrapy's error is not very helpful: ``` SPLASH_URL = 'localhost:8050' (...) 2017-06-01 14:44:35 [scrapy.core.scraper] ERROR: Error downloading Traceback...
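One way to surface this earlier would be to validate the setting before any request is built. The following is only a minimal sketch: `check_splash_url` is a hypothetical helper, not part of scrapy-splash, and it assumes that only `http` and `https` are acceptable schemes.

```python
from urllib.parse import urlsplit


def check_splash_url(url):
    """Return url unchanged if it has an http(s) scheme, else raise ValueError.

    Hypothetical helper for illustration; scrapy-splash does not ship it.
    """
    # Compare against the allowed schemes instead of testing for an empty
    # scheme: 'localhost:8050' may be parsed with scheme '' or 'localhost'
    # depending on the Python version.
    scheme = urlsplit(url).scheme
    if scheme not in ("http", "https"):
        raise ValueError(
            "SPLASH_URL %r has no http:// or https:// scheme; "
            "did you mean %r?" % (url, "http://" + url)
        )
    return url
```

Calling `check_splash_url('localhost:8050')` would raise a `ValueError` that names the setting and suggests a corrected value, instead of a generic "missing scheme" error raised deep inside the download path.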
I want to work on this issue. https://github.com/scrapy/scrapy/issues/2673
In Selenium, we can do this using `window_handles` and the `switch_to_window` method. Before clicking the link, first store the window handle as `window_before = driver.window_handles[0]`; after clicking the link, store...
We could create a middleware which adds 'splash' meta key to all requests, or to all requests matching some pattern. It could also decode the results to make the whole...
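A rough sketch of what such a middleware might look like follows. The class name `SplashMetaMiddleware`, its constructor parameters, and the default `wait` value are assumptions made for illustration. It relies only on the documented `request.meta['splash']` key with its `endpoint` and `args` entries, and it does not cover decoding the results. It would need to be registered as a downloader middleware with an order value lower than that of scrapy-splash's own `SplashMiddleware`, so that its `process_request` runs first.

```python
import re


class SplashMetaMiddleware:
    """Hypothetical downloader middleware: tag matching requests for Splash."""

    def __init__(self, url_pattern=r".*", splash_args=None):
        # url_pattern and splash_args are illustrative parameters,
        # not existing scrapy-splash settings.
        self.url_re = re.compile(url_pattern)
        self.splash_args = splash_args or {"wait": 0.5}

    def process_request(self, request, spider):
        if self.url_re.search(request.url):
            # setdefault keeps any 'splash' value the spider already set,
            # so existing per-request configuration is not overridden.
            request.meta.setdefault(
                "splash",
                {"endpoint": "render.html", "args": dict(self.splash_args)},
            )
        # Returning None lets Scrapy continue with the remaining middlewares.
        return None
```

Using `setdefault` rather than plain assignment is a deliberate choice here: spiders that already set `meta['splash']` themselves keep their behaviour unchanged.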
It should be possible to run Splash in the same event loop as Scrapy, similar to how it worked in gtk-based scrapyjs.
Corrected a typo
Continuation of #15. Related to #11, although since it works at the spider level, not at a wider, project level, I wouldn’t say it fixes it.
This is about https://github.com/scrapinghub/scrapyjs/issues/14 and https://github.com/scrapinghub/scrapyjs/issues/11. Opening this for discussion; it needs tests and improvements. It would be good to get some feedback on whether it is heading in the right direction.