selenium-wire
Remote Webdriver with Auth Proxy connects but not using Proxy IP
I'm running a docker image locally, and using selenium-wire to connect to an authenticated proxy. This, of course, works perfectly when I do NOT use a remote WebDriver. I've looked at all the other issues, and I've not been able to solve my problem. This is what I have for my selenium-wire options:
options = {
    'auto_config': False,
    'addr': '0.0.0.0',
    'port': 8087,
    'proxy': {
        'http': 'http://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'https': 'http://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'no_proxy': 'localhost,127.0.0.1'  # excludes
    }
}
I set the port to 8087 just because that's what other "GitHub issues" said to do.
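For readers following along, the options above can be assembled in one place so the upstream proxy URL is only built once. This is a minimal sketch, not selenium-wire's own API: the helper name `build_seleniumwire_options` and the sample values are hypothetical, while the keys (`auto_config`, `addr`, `port`, `proxy`) are the ones used in this post.

```python
def build_seleniumwire_options(username, password, ip_assignment, listen_port):
    """Build the selenium-wire options dict used in this thread (hypothetical helper)."""
    # Credentials and address of the authenticated upstream proxy.
    upstream = f"{username}:{password}@{ip_assignment['ip-address']}:{ip_assignment['port']}"
    return {
        "auto_config": False,      # the browser is pointed at selenium-wire manually
        "addr": "0.0.0.0",         # address selenium-wire listens on
        "port": listen_port,       # port selenium-wire listens on
        "proxy": {
            # Both entries use the http:// scheme, as in the snippet above.
            "http": f"http://{upstream}",
            "https": f"http://{upstream}",
            "no_proxy": "localhost,127.0.0.1",
        },
    }


# Sample values are placeholders, not real credentials or addresses.
options = build_seleniumwire_options(
    "user", "secret", {"ip-address": "203.0.113.10", "port": 3128}, 8087
)
```

Note that `addr` and `port` describe where selenium-wire itself listens, while the `proxy` entries describe the upstream proxy that selenium-wire forwards to.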
And since I'm running docker locally, I also added this to my Firefox options:
options.add_argument(f"--proxy-server=host.docker.internal:8087")
when I create my driver:
driver = webdriver.Remote(
    command_executor=selenium_connection,  # for prod
    # command_executor="http://localhost:4444/wd/hub",  # for local development
    desired_capabilities=DesiredCapabilities.FIREFOX,
    options=options,
    keep_alive=True,
    browser_profile=firefox_profile,
    seleniumwire_options=selenium_options
)
The driver runs successfully and the browser shows up. I get these log messages:
25-May-22 15:30:02 PM UTC | INFO | Using default request storage
25-May-22 15:30:03 PM UTC | INFO | Created proxy listening on 0.0.0.0:8087
25-May-22 15:30:07 PM UTC | INFO | SUCCESS: created firefox using connection
I didn't add the first or second lines of logging info. I believe they come from the selenium-wire driver code. But the browser is not using the proxy I manually added. Instead it's using the "addr" and "port" from the selenium-wire options. When I look in the browser, the proxy isn't set up there either.
@wkeeling Do you think you can help me?
Are you running Selenium Grid in Docker, or is it running on the host machine where the Docker container with the selenium-wire script is running?
Selenium Grid is running remotely on another machine. The script is in a docker image that is built and run locally.
It looks like your grid is not able to reach selenium-wire.
Can you log in to the machine where Selenium Grid is hosted and ping the IP of the docker container running selenium-wire?
So my friend was the one who set up the selenium grid. I just took a look at it. He has it running on Google Cloud Run. The URL is https://account-refresh-selenium-n3vdi73cpa-uc.a.run.app/ It looks like it is running in a docker container. It seems to me that it should be able to ping wherever my selenium-wire is running. Since selenium grid is running in a container on Cloud Run, I'm not sure how to manually run pings.
I put my docker image running selenium-wire in Google Compute Engine. (I mentioned this in another git issue) It has the external IP of: 35.206.XX.XXX
Now it's not even trying to use an authenticated proxy. I look at the network settings in Firefox (in the selenium grid) and it has "Use system proxy settings" checked instead of "Manual proxy configuration" (and it doesn't fill in my authenticated proxy information).
selenium-wire proxy settings
selenium_options = {
    'auto_config': False,
    'proxy': {
        'http': 'http://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'https': 'https://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'no_proxy': 'localhost,127.0.0.1'  # excludes
    },
    'addr': '35.206.XX.XXX',  # external IP address (The compute engine is: 35.206.XX.XXX)
    'port': 4444,
}
Q1. Is addr supposed to be where selenium-wire is running?
Firefox options
options.add_argument("--proxy-server={}".format('35.206.XX.XXX:4444'))
Q2. This (above) is supposed to tell the selenium grid machine where selenium-wire is running, right?
selenium_connection = RemoteConnectionV2( remote_driver_host, keep_alive=False )
selenium_connection.set_remote_connection_authentication_headers()
driver = webdriver.Remote(
    command_executor=selenium_connection,  # for prod
    desired_capabilities=DesiredCapabilities.FIREFOX,
    options=options,
    keep_alive=True,
    browser_profile=firefox_profile,
    seleniumwire_options=selenium_options
    # keep_alive=False,
)
It's absolutely baffling why I can't get this to work. If you can help me figure this out via Zoom, I'd appreciate that so much!
Also when I ping the compute engine from my machine, it pings successfully!
Hello @mirisr, the addr option is correct.
And you did everything correctly in terms of sending correct parameters.
Can you access selenium-wire from anywhere? Or is there a security group controlling which hosts can access it?
Does Docker map the port to the host machine, i.e. did you pass -p 8087:8087 to the docker run command for the selenium-wire container?
That may be my issue. I just assumed if I could ping the ip address that it would be enough. I will respond once I can get the port number mapped on Google's Compute Engine. If it's possible.
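A successful ping only shows that the host answers ICMP; it says nothing about whether a particular TCP port is open. A minimal sketch of a port-level check using only the standard library follows. The function name `port_is_reachable` is hypothetical, and it would need to be run from the machine that has to reach selenium-wire (here, the grid side).

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection performs the full TCP handshake, unlike ping (ICMP).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `port_is_reachable(vm_ip, 8087)` returning False while ping succeeds would point at a firewall rule or an unpublished container port rather than at selenium-wire itself.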
@sanjeevtrz
According to this: Publishing container ports
"Container ports have a one-to-one mapping to the host VM ports. For example, a container port 80 maps to the host VM port 80. Compute Engine does not support the port publishing (-p) flag, and you do not have to specify it for the mapping to work."
"To publish a container's ports, configure firewall rules to enable access to the host VM instance's ports. The corresponding ports of the container are accessible automatically, according to the firewall rules."
So I created a new VM instance that allows for http/https traffic and created a firewall rule that now allows inward traffic through port 8080.
vm_ip = os.environ.get("VM_IP")
vm_port = int(os.environ.get("VM_PORT"))
# selenium-wire proxy settings
selenium_options = {
    'auto_config': False,
    'proxy': {
        'http': 'http://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'https': 'https://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'no_proxy': 'localhost,127.0.0.1'  # excludes
    },
    'addr': vm_ip,
    'port': vm_port
}
firefox_options.add_argument(f"--proxy-server={vm_ip}:{vm_port}")
driver = webdriver.Remote(
    command_executor=selenium_connection,
    desired_capabilities=DesiredCapabilities.FIREFOX,
    options=firefox_options,
    keep_alive=True,
    browser_profile=firefox_profile,
    seleniumwire_options=selenium_options
)
According to my logs, the expected options are being passed:
Selenium Options: {'auto_config': False, 'proxy': {'http': 'http://<myusername:password>@<proxyip>:4444', 'https': 'https://<myusername:password>@<proxyip>:4444', 'no_proxy': 'localhost,127.0.0.1'}, 'addr': '35.20X.XXX.XX', 'port': 8080}
But then I get this:
Failed to initiate Firefox browser: Error starting proxy server: gaierror(-9, 'Address family for hostname not supported')
So it's still not working.
If I change 'addr' to 0.0.0.0
selenium_options = {
    'auto_config': False,
    'proxy': {
        'http': 'http://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'https': 'https://'+username+':'+password+'@'+ip_assignment["ip-address"]+':'+str(ip_assignment["port"]),
        'no_proxy': 'localhost,127.0.0.1'  # excludes
    },
    'addr': '0.0.0.0',
    'port': vm_port
}
and still have the Firefox options as
firefox_options.add_argument(f"--proxy-server={vm_ip}:{vm_port}")
The logs say:
Showing Firefox options: --proxy-server=35.20X.XXX.XX:8080
Created proxy listening on 0.0.0.0:8080
@wkeeling if this is right, which I'm currently inclined to believe it is, why isn't my authenticated proxy being used? The driver runs successfully, but it does not use my authenticated proxy.
@wkeeling Do you know why I'm getting 'Address family for hostname not supported' when my external IP address is open to the network (I tested it using an online ping that pings from different places) and it's IPv4?
The external ip address of my vm-instance running selenium wire is in the correct format: 35.20X.XXX.XX
@mirisr are you seeing a traceback with that error message?
@wkeeling I don’t have it with me right now, but off the top of my head, I traced it back to server.py in selenium-wire/third party/server I believe. In the init where it grabs addr in options.
@wkeeling
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/thirdparty/mitmproxy/server/server.py", line 41, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/thirdparty/mitmproxy/net/tcp.py", line 624, in __init__
    self.socket.bind(self.address)
socket.gaierror: [Errno -9] Address family for hostname not supported

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/libs/main/session_refresher.py", line 45, in get_new_sessions
    amazon_session = AmazonSession(messaging_account, country)
  File "/usr/local/bin/libs/auth/amazon_session_v2.py", line 86, in __init__
    self.driver = BrowserBot().setup_firefox(self._email)
  File "/usr/local/bin/libs/browser/browser_bot.py", line 78, in setup_firefox
    driver = webdriver.Remote(
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/webdriver.py", line 272, in __init__
    config = self._setup_backend(seleniumwire_options)
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/webdriver.py", line 40, in _setup_backend
    self.backend = backend.create(
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/backend.py", line 24, in create
    backend = MitmProxy(addr, port, options)
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/server.py", line 61, in __init__
    self.master.server = ProxyServer(ProxyConfig(mitmproxy_opts))
  File "/usr/local/lib/python3.10/site-packages/seleniumwire/thirdparty/mitmproxy/server/server.py", line 49, in __init__
    raise exceptions.ServerException(
seleniumwire.thirdparty.mitmproxy.exceptions.ServerException: Error starting proxy server: gaierror(-9, 'Address family for hostname not supported')
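The traceback ends in `self.socket.bind(self.address)`, so the failure happens while selenium-wire tries to listen on `addr`, before any browser traffic is involved. A listening socket can generally only bind to an address that belongs to a local interface (or to the wildcard `0.0.0.0`). One plausible reading, stated here as an assumption, is that the VM's external IP is mapped to it by the cloud provider and is not configured on an interface inside the VM or container. The sketch below, with the hypothetical helper `can_bind`, checks whether an address is bindable locally; the exact exception selenium-wire raises may differ from what a plain IPv4 socket reports.

```python
import socket


def can_bind(addr, port=0):
    """Try to bind a TCP socket to addr; return (ok, error). Port 0 picks any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((addr, port))
        return True, None
    except OSError as exc:
        # socket.gaierror is a subclass of OSError, so it is caught here too.
        return False, exc
    finally:
        sock.close()
```

If `can_bind(vm_ip)` fails inside the container while `can_bind("0.0.0.0")` succeeds, that would be consistent with the log line "Created proxy listening on 0.0.0.0:8080": listen on the wildcard address and give the browser the externally reachable IP separately.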
I work in a similar setup. I never pass addr because it would take localhost. You need only port.
@sanjeevtrz
"I work in similar set up. I never pass addr because it would take localhost. You need only port"
isn't that the same as this change I made earlier: https://github.com/wkeeling/selenium-wire/issues/550#issuecomment-1143679786
No, 0.0.0.0 works for Mac only; for Linux use '127.0.0.1'. So it is better to leave it at the default, as this would be 'localhost'.
"No 0.0.0.0 works for Mac only for Linux '127.0.0.1' So it better to leave it to be default as this would be 'localhost'"
Interesting then, that it still runs. I'll try it out.
"I work in similar set up. I never pass addr because it would take localhost. You need only port"
@sanjeevtrz That didn't work. It still runs, but not using the authenticated proxy I provided in "proxy" args for selenium-wire options.
When I look at the Firefox network settings, it doesn't even have the radio button selected for "manual proxy configuration". Instead it has selected "Use system proxy settings". So something is still not working.
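The two radio buttons mentioned above correspond to the Firefox preference `network.proxy.type`: 5 is "Use system proxy settings" (the default) and 1 is "Manual proxy configuration". As a rough sketch, the manual settings can be expressed as the following preferences. The helper name `manual_proxy_prefs` is hypothetical, and whether setting these on the profile is sufficient with a remote grid and selenium-wire is an assumption that would need testing.

```python
def manual_proxy_prefs(host, port):
    """Firefox preferences for a manual HTTP/HTTPS proxy at host:port (hypothetical helper)."""
    return {
        "network.proxy.type": 1,          # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,        # proxy used for HTTPS traffic
        "network.proxy.ssl_port": port,
    }


# Example with a placeholder address; in this thread it would be vm_ip and vm_port.
prefs = manual_proxy_prefs("203.0.113.20", 8080)
# Each entry could then be applied with firefox_profile.set_preference(name, value).
```

If the browser still shows "Use system proxy settings" after the session starts, the preferences did not reach the remote Firefox instance.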
I use Chrome. However, if you want to use a proxy with Firefox, try installing a plugin and entering the proxy there. You can automate installing plugins and using proxies with authentication.
https://www.lambdatest.com/blog/adding-firefox-extensions-with-selenium-in-python/
I think you don't need selenium-wire for the above approach.
@mirisr apologies for the delayed reply, but are you still having issues with this? I notice that you're using
firefox_options.add_argument(f"--proxy-server={vm_ip}:{vm_port}")
to set the proxy for Firefox, but I believe --proxy-server is a Chrome option, so Firefox will ignore it.
yes, never got it to work. Then how would I set it up for Firefox?
Try passing a Proxy object containing the config:
proxy = webdriver.Proxy()
proxy.http_proxy = f'{vm_ip}:{vm_port}'
proxy.ssl_proxy = f'{vm_ip}:{vm_port}'
firefox_options = webdriver.FirefoxOptions()
firefox_options.proxy = proxy
driver = webdriver.Remote(
    command_executor=selenium_connection,
    desired_capabilities=DesiredCapabilities.FIREFOX,
    options=firefox_options,
    keep_alive=True,
    browser_profile=firefox_profile,
    seleniumwire_options=selenium_options,
)
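For context, the Proxy object above ends up as a `proxy` entry in the WebDriver capabilities sent to the grid. The sketch below builds that entry by hand using the field names from the W3C WebDriver specification (`proxyType`, `httpProxy`, `sslProxy`, `noProxy`). The helper name `manual_proxy_capability` is hypothetical, and the exact dictionary Selenium produces may differ in detail between versions.

```python
def manual_proxy_capability(host, port, no_proxy=()):
    """Build a W3C-style manual proxy capability pointing the browser at host:port."""
    address = f"{host}:{port}"
    proxy = {
        "proxyType": "manual",
        "httpProxy": address,   # proxy for http:// requests
        "sslProxy": address,    # proxy for https:// requests
    }
    if no_proxy:
        # The W3C spec expects a list of hosts that bypass the proxy.
        proxy["noProxy"] = list(no_proxy)
    return {"proxy": proxy}


# Placeholder address standing in for vm_ip and vm_port from this thread.
capability = manual_proxy_capability("203.0.113.20", 8080, no_proxy=["localhost", "127.0.0.1"])
```

Inspecting the capabilities the remote session actually reports (for example via `driver.capabilities`) and comparing them with this shape is one way to confirm that the proxy setting reached Firefox.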
Botasaurus Framework supports SSL with authenticated proxies such as http://username:password@proxy-provider-domain:port.
Installation
pip install botasaurus
Example
from botasaurus import *

@browser(proxy="http://username:password@proxy-provider-domain:port")  # TODO: Replace with your own proxy
def visit_ipinfo(driver: AntiDetectDriver, data):
    driver.get("https://ipinfo.io/")
    driver.prompt()

visit_ipinfo()
You can learn about Botasaurus Here.