scrapy-selenium
about proxy support
Good day! Has any solution for proxy support been found after all? Unfortunately, the standard request.meta['proxy'] approach does not work with SeleniumRequest. As far as I can see, zhangtemplar also did not manage to finish this. With respect to you...
Is my understanding correct that there is no proxy support in "scrapy-selenium" and each request will expose the real IP?
yep
As @free01man mentioned, and as the work of @zhangtemplar shows, scrapy-selenium could be a wonderful tool with this feature, especially in combination with scrapy-rotating-proxies:
https://github.com/TeamHG-Memex/scrapy-rotating-proxies
Any progress here
Commenting to keep updated
any updates?
While not a completely satisfactory solution, you can pass the proxy address to the webdriver as an option.
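e.g. in your settings:
from shutil import which  # needed for which('chromedriver') below

DOWNLOADER_MIDDLEWARES = {'scrapy_selenium.SeleniumMiddleware': 800}
SELENIUM_DRIVER_NAME = 'chrome'
SELENIUM_DRIVER_EXECUTABLE_PATH = which('chromedriver')
# Chrome sends all traffic through this single proxy (e.g. a local privoxy)
SELENIUM_DRIVER_ARGUMENTS = ['--headless', '--proxy-server=http://127.0.0.1:8118']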
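If the proxy address comes from somewhere else (an environment variable, another setting), the argument list can also be built in code. This is only a sketch: `build_driver_arguments` is a hypothetical helper, not part of scrapy-selenium, and it assumes a Chrome driver, since `--proxy-server` is a Chrome flag.

```python
from urllib.parse import urlsplit


def build_driver_arguments(proxy_url, headless=True):
    """Return a list usable as SELENIUM_DRIVER_ARGUMENTS (hypothetical helper)."""
    parts = urlsplit(proxy_url)
    # Insist on scheme://host:port so a malformed address fails here
    # instead of silently leaving the browser without a proxy.
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {proxy_url!r}")
    arguments = ["--headless"] if headless else []
    arguments.append(f"--proxy-server={proxy_url}")
    return arguments


# Same value as the hand-written list in the settings above.
SELENIUM_DRIVER_ARGUMENTS = build_driver_arguments("http://127.0.0.1:8118")
```

The proxy is still fixed when the driver starts, so this does not make per-request proxies possible.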
Verify that the proxy is working by making a request to an IP service, e.g. SeleniumRequest(url='http://ifconfig.me/ip'), and check Response.text for the IP address.
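That check can be automated inside the spider callback. The sketch below uses hypothetical names (`reported_ip`, `proxy_hides_ip`, `real_ip`) and assumes an IPv4 address; it searches the text with a regex because the browser may wrap the plain-text reply in HTML. You would need to find `real_ip` yourself, e.g. by requesting the same service once without the proxy.

```python
import ipaddress
import re

# Matches a dotted-quad candidate; ipaddress validates it afterwards.
_IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def reported_ip(response_text):
    """Extract the first IPv4 address found in the response text."""
    match = _IPV4.search(response_text)
    if match is None:
        raise ValueError("no IPv4 address found in response text")
    return ipaddress.ip_address(match.group(0))


def proxy_hides_ip(response_text, real_ip):
    """Return True if the service saw a different address than real_ip."""
    return reported_ip(response_text) != ipaddress.ip_address(real_ip)
```

In a callback this would be called as `proxy_hides_ip(response.text, real_ip)`; a `False` result means the request went out with your own address.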
This workaround works for a single, static proxy only. It may be useful if you are using privoxy with tor or do not wish to rotate the proxy IPs; it won't work with proxy rotation middlewares like scrapy-rotating-proxies.
If you need it to work with another middleware, clone the repo and modify it.
As @free01man mentioned, and as the work of @zhangtemplar shows, scrapy-selenium could be a wonderful tool with this feature, especially in combination with scrapy-rotating-proxies:
https://github.com/TeamHG-Memex/scrapy-rotating-proxies
Hi, I am using scrapy-rotating-proxies at the same time, but I am not sure whether it really hides my real IP. Is there any evidence you can show that it really hides your real IP? Thank you.
What about proxies that require authentication?