htmlunit-driver icon indicating copy to clipboard operation
htmlunit-driver copied to clipboard

Add ability to run as Grid node

Open sbabcoc opened this issue 1 year ago • 5 comments
trafficstars

sbabcoc avatar Feb 07 '24 08:02 sbabcoc

Hi @sbabcoc,

many thanks for the contribution. So far i have not done anything with the grid version of selenium but i think this is a rally great addition to the current impl.

I'm a bit busy during the next days but will have a closer look at this at the end of the next week.

But i have a first question - do you have an idea how to test this as part of our CI?

rbri avatar Feb 11 '24 09:02 rbri

Hi, @rbri, Many thanks for your tireless efforts to provide this useful headless browser. It's very complex, undoubtedly requiring a lot of effort to maintain and enhance.

The test for this would be to ensure that HtmlUnitDriver can be instantiated as a RemoteWebDriver instance. The behavior of a working implementation should be indistinguishable from a direct instance of HtmlUnitDriver. This was possible in Selenium 3, because the Grid Node implementation provided special-case handling for HtnlUnitDriver through an in-memory wrapper. This special-case support was removed in Selenium 4.

For now, it's probably best to ignore this PR. It's nowhere near ready to be merged. I've been poking at this for a while and keep falling into the gaping holes in my knowledge about how Selenium driver executable are assembled. What I'm imagining is needed is a Java executable that wraps the HtmlUnitDriver class in a LocalNode container that routes requests to the correct WebDriver functions and sends back the responses.

I sent you an email the other day requesting suggestions for what should be included in the HtmlUnitOptions class that will be needed to complete this wrapper. It seems that the WebClientOptions class encapsulates everything, so perhaps we can create an extension of AbstractDriverOptions that builds one of these.

sbabcoc avatar Feb 11 '24 19:02 sbabcoc

@rbri I have this PR to a point where all of the HtmlUnit options can be set via a custom Capabilities class (HtmlUnitDriverOptions). This is essentially a Capabilities wrapper for the WebClientOptions class, with an added BrowserVersion property. This could potentially open up scenarios that weren't previously accessible. I still need to create a launcher to enable standalone execution.

sbabcoc avatar Mar 11 '24 04:03 sbabcoc

To make the capabilities completely portable, I think I'll need to define a serialization mechanism for the BrowserVersion class and the types it uses.

sbabcoc avatar Mar 11 '24 06:03 sbabcoc

@rbri I've added encode/decode implementation for the data types used by HtmlUnit that aren't supported by Selenium's JSON implementation. I think the HtmlUnitDriver capabilities are fully implemented now, and the root constructor is

HtmlUnitDriver(Capabilities)

sbabcoc avatar Mar 18 '24 07:03 sbabcoc

@sbabcoc for private reasons i was not able to really work on HtmlUnit ff. Sorry for that

But i found some time to have a look at your work during the last days. Thanks a lot for all that. I think we can go forward a bit here

  • move the HtmlUnitOptions class out of the way to have a new HtmlUnitOptions class like all the other drivers (done)
  • switch step by step to use the new Options API instead of the old Capabilities
    • i like to do that step by step based on your work
  • make a plan for the next steps

Hopefully that sounds reasonable for you. And hopefully it is also ok to add the usual copyright header to all the files i will create based on your work. Will add you as author and as contributor. OK?

Please cross your finger that i find some time to work on this during the next week.

rbri avatar May 12 '24 11:05 rbri

and one more point - do we really need the com.nordstrom.tools dependency? I really like to have a less dependencies as possible. Maybe we can copy some code (i guess you are the author of the lib also)

rbri avatar May 12 '24 11:05 rbri

Thanks for looking this stuff over! This is your project, so you determine style, structure, and strategy. I'll add the standard copyright notices to all of the new files (except for the ServiceLoader declaration files). Regarding the com.nordstrom.tools dependency, I'm using the JarUtils class from this project to assemble the class path needed to run the remote HtmlUnitDriver from command line. I've looked for other options but haven't found any. If there's another strategy that achieves the same objective without an additional dependency, that would be great! I haven't finished implementing all of the W3C WebDriver routes yet, so it's not functionally complete. You can see what's left to do in HtmlUnitDriverServer. This also doesn't include exposing the additional HtmlUnit-specific features like setAutoProxy. Let me know how you'd like to proceed from here. Thanks!

sbabcoc avatar May 12 '24 15:05 sbabcoc

I resolved the merge conflicts.

sbabcoc avatar May 12 '24 15:05 sbabcoc

The structure and storage of HtmlUnitWebElement objects presents a challenge for remote operation. Element IDs are unique per driver, in that they're sequentially assigned integers. However, there's currently no mechanism to associate these IDs to the objects they represent. My current solution is to add a map to ElementsMap that provides this association. I'll see how that works out.

sbabcoc avatar May 12 '24 19:05 sbabcoc

@rbri I've migrated the remote-specific functionality to a new project: https://github.com/sbabcoc/htmlunit-remote All of the revisions in this PR are now devoted to providing complete configurability via the HtmlUnitDriverOptions (Capabilities) class.

sbabcoc avatar Jun 05 '24 23:06 sbabcoc

@rbri I think this is finally ready for review. Since I migrated the remote-specific implementation to a new project. this PR is much less daunting. The API for new HtmlUnitDriverOptions object is full featured, but it could probably use more documentation. Feedback is always welcomed!

sbabcoc avatar Jun 06 '24 06:06 sbabcoc

@sbabcoc great, will be on vacation for one week and then have a look, Thanks a lot sound (and looks) promising

rbri avatar Jun 06 '24 06:06 rbri

By the way... The remote wrapper that I've put together uses a NettyServer instance to serve test pages, which is a lot lighter weight than the Jetty servlets used by the HtmlUnitDriver unit tests. It's a lot faster to launch, and I only need to launch it once to serve pages for all of the tests in the unit test suite,

The wrapper itself is now working in Grid 4. The driver service, the server that manages driver sessions, and the driver sessions themselves all run within the Node process.

sbabcoc avatar Jun 14 '24 00:06 sbabcoc

@sbabcoc i thinks we can go with this - will do some cleanup with next commits....

rbri avatar Jun 21 '24 06:06 rbri

@sbabcoc THANKS a lot for this huge contribution. Have done the code cleanup and some minor things. Please start a new PR for your next adjustments.

Will publish a snapshot build and try to switch to the options at least in HtmlUnit. Let's see how it works. If everything looks fine i will try to make a release soon - ok for you?

rbri avatar Jun 21 '24 07:06 rbri

Thanks! The comprehensive capabilities class (HtmlUnitDriverOptions) is required for my 'htmlunit-remote' project to work. I hope you won't need to do too much clean up work.

sbabcoc avatar Jun 21 '24 07:06 sbabcoc

maybe we can place a link to your remote driver project in the readme - any suggestion where you like to have this?

rbri avatar Jun 21 '24 07:06 rbri

Titus Fortner suggested hosting this in the Selenium Community repo (https://github.com/selenuimhq-community). Once you publish HtmlUnitDriver 4.22.0, we'll coordinate the publication of HtmlUnit Remote. I'm currently using the 'com.nordstrom.ui-tools' group ID, but this will also change.

sbabcoc avatar Jun 24 '24 06:06 sbabcoc

great, the original plan was to do a monster release with htmlunit and the driver this weekend. But then i found one more bug. So i spend the weekend on this. Hopefully i'm now ready, will try to make the htmlunit release during the next days.

rbri avatar Jun 24 '24 06:06 rbri

There's an issue with WebElement.isDisplayed() when running remotely, because RemoteWebElement handles this via JavaScript, and the script it uses doesn't work with Rhino. I don't have the details on the precise cause of this discrepancy, but where should I log the issue?

sbabcoc avatar Jun 24 '24 19:06 sbabcoc

I think i saw this so.e time ago. Please report to the htmlunit driver. Will have a closer look.

rbri avatar Jun 24 '24 21:06 rbri

Issue #135 seems to be the one

rbri avatar Jun 24 '24 21:06 rbri

I just opened #149 about the isDisplayed issue.

sbabcoc avatar Jun 24 '24 22:06 sbabcoc