htmlunit-driver
How to start selenium-server with htmlunit?
Hi, I want to use the htmlunit driver for web crawling. Right now I'm using chromedriver via Selenium, which takes a lot of CPU. Since my scripts are in Python and I don't know any Java, I figured after some research that I have to run it through a selenium-server remote connection. I've spent a lot of time trying to figure this out, but nothing I tried worked. Could someone please help?
I can use htmlunit-driver with OLD selenium-server.
java -cp selenium-server-standalone-3.141.59.jar:htmlunit-driver-2.64.0-jar-with-dependencies.jar org.openqa.grid.selenium.GridLauncherV3
(Note: "org.openqa.grid.selenium.GridLauncherV3" is the class defined as "Main-Class" in META-INF/MANIFEST.MF.)
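If you want to check the Main-Class yourself without unpacking the jar, something like the following should work. It is only a minimal sketch using Python's standard zipfile module; the function name read_main_class is made up for illustration and is not part of Selenium or htmlunit-driver.

```python
import zipfile


def read_main_class(jar_path):
    """Return the Main-Class entry of a jar's manifest, or None if absent.

    Hypothetical helper for illustration; not part of any Selenium API.
    """
    with zipfile.ZipFile(jar_path) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")

    # Manifest lines longer than 72 bytes wrap onto continuation lines
    # that begin with a single space, so join those back together first.
    logical_lines = []
    for line in manifest.splitlines():
        if line.startswith(" ") and logical_lines:
            logical_lines[-1] += line[1:]
        else:
            logical_lines.append(line)

    # Look for the "Main-Class: <value>" header (names are case-insensitive).
    for line in logical_lines:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None
```

Calling read_main_class("selenium-server-standalone-3.141.59.jar") would then return the class name to put at the end of the java -cp command.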
Sample test written in Python:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Address of the selenium-server started above
selenium_hub_url = "http://localhost:4444/wd/hub"

# Request HtmlUnit with JavaScript enabled
htmlunit_capabilities = DesiredCapabilities.HTMLUNITWITHJS.copy()

driver = webdriver.Remote(
    command_executor=selenium_hub_url,
    desired_capabilities=htmlunit_capabilities,
)

driver.get("https://example.com")
print("Title: " + driver.title)
assert "Example Domain" in driver.title

driver.quit()
pip install selenium==3.141.0
python test.py
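One thing to watch for: if the test starts before the Java server has finished booting, webdriver.Remote fails with a connection error. Below is a rough sketch of a readiness check that uses only the standard library. It assumes the server answers at http://localhost:4444/wd/hub/status with a JSON body whose "value" object has a "ready" flag, so please verify that against your server version. The names is_ready and wait_for_server are made up for this example.

```python
import http.client
import json
import time
import urllib.request


def is_ready(status_payload):
    """Return True if a parsed status response reports the server as ready.

    Assumes the shape {"value": {"ready": true, ...}}; adjust if your
    server version responds differently.
    """
    if not isinstance(status_payload, dict):
        return False
    value = status_payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def wait_for_server(status_url, timeout=30.0, interval=0.5):
    """Poll status_url until the server reports ready or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(status_url, timeout=5) as response:
                if is_ready(json.load(response)):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            # Connection refused, malformed HTTP, or invalid JSON:
            # treat all of these as "not ready yet" and retry.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In test.py you could call wait_for_server("http://localhost:4444/wd/hub/status") before creating the driver and bail out if it returns False.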
Unfortunately, I can't use htmlunit-driver with selenium-4.x.y.
java -jar selenium-server-4.4.0.jar --ext htmlunit-driver-3.64.0-jar-with-dependencies.jar standalone -I htmlunit
or
java -cp selenium-server-4.4.0.jar:htmlunit-driver-3.64.0-jar-with-dependencies.jar org.openqa.selenium.grid.Bootstrap standalone -I htmlunit
02:01:02.290 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
02:01:02.298 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing
02:01:03.302 INFO [NodeOptions.getSessionFactories] - Detected 8 available processors
02:01:03.362 INFO [NodeOptions.discoverDrivers] - Discovered 3 driver(s)
02:01:03.391 WARN [NodeOptions.lambda$addSpecificDrivers$20] - Could not find htmlunit driver on PATH.
java.lang.reflect.InvocationTargetException
(...snip...)
Caused by: org.openqa.selenium.grid.config.ConfigException: No drivers were found for [htmlunit]
    at org.openqa.selenium.grid.node.config.NodeOptions.addSpecificDrivers(NodeOptions.java:477)
    at org.openqa.selenium.grid.node.config.NodeOptions.getSessionFactories(NodeOptions.java:211)
    at org.openqa.selenium.grid.node.local.LocalNodeFactory.create(LocalNodeFactory.java:79)
    ... 22 more
I think the issue is that there's no implementation of WebDriverInfo to provide the information that NodeOptions needs to match htmlunit.