selenium icon indicating copy to clipboard operation
selenium copied to clipboard

[java] POC: Porting WebDriver classic command to BiDi

Open pujagani opened this issue 1 year ago • 6 comments

Thanks for contributing to Selenium! A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines. Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

This POC demonstrates the changes in porting the WebDriver Classic command to use BiDi under the hood. The changes port GET and PRINT command.

Motivation and Context

The next big step for Selenium towards Selenium 5 is porting WebDriver Classic commands to use WebDriver BiDi protocol. This is the first step towards it. The changes are not trivial, a lot of pieces need to be considered. Instead of moving the mountain, trying to make progress towards it in decent-sized chunks of work.

Reason for choosing GET command: demonstrate the need for requiring driver instance since capabilities like pageLoadStrategy are required

Reason for choosing the PRINT command: Required additional step to serialize parameters into PrintOptions. That piece of code can be moved to PrintOptions but the build kept breaking due to cycling dependency so didn't dig deep into it.

Includes the items that need to be considered/included: TODO:

  • Handling commands if the browser supports BiDi but not the command
  • Error handling - correct mapping to error codes and throwing respective exceptions to avoid any breaking changes to the user
  • The current approach should work for most commands but atoms will need special handling (I have an idea of what is needed and how to do this)
  • Actions, JavascriptCommandExecutor etc - they all pass the commands through the RemoteWebDriver, so the current approach will work seamlessly as long we add the commands to BiDiCommandExecutor
  • Add a mapper to BiDi LocalValue to map argument based on type to LocalValue instance (I will be implementing this next)

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [X] New feature (non-breaking change which adds functionality)
  • [X] Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • [X] I have read the contributing document.
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [ ] I have added tests to cover my changes.
  • [ ] All new and existing tests passed.

pujagani avatar Nov 23 '23 07:11 pujagani

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 58.07%. Comparing base (57f8398) to head (cfeccd0). Report is 571 commits behind head on trunk.

:exclamation: Current head cfeccd0 differs from pull request most recent head 22bc8a6

Please upload reports for the commit 22bc8a6 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##            trunk   #13190      +/-   ##
==========================================
- Coverage   58.48%   58.07%   -0.42%     
==========================================
  Files          86       88       +2     
  Lines        5270     5340      +70     
  Branches      220      224       +4     
==========================================
+ Hits         3082     3101      +19     
- Misses       1968     2015      +47     
- Partials      220      224       +4     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Nov 23 '23 08:11 codecov-commenter

This is great. I love seeing all that extra code removed from the driver classes.

What is the separation of concerns between BiDiDelegator and BiDiCommandExecutor? Can't BiDiCommandExecutor just extend HttpCommandExecutor directly? It looks like it is essentially only storing the list of commands?

Oh, this is when the user passes an existing HttpCommandExecutor in to the RemoteWebDriver constructor... Would it make sense not to use BiDi at all for that, though? Do we want to override a user's custom command executor even if we could? At least in the short term? Then toggle which Executor to use in createExecutor()? Or does TracedCommandExecutor make that approach more complicated than it is worth?

titusfortner avatar Nov 25 '23 18:11 titusfortner

BiDiDelegator knows where to delegate the commands and that's it. BiDiExecutor will have reference to methods/class that handles converting the Classic command to BiDiCommand and also maintain browsing context to identify where to run the command since almost all BiDi commands need the browsing context id to be sent to run commands. For atoms, it will have additional methods to convert arguments to how BiDi expects it to run scripts.

I think we don't want to override the user's custom executor. I have considered that case. So we do a check that if the user passes an executor that is an instance of BiDiDelegator then we set the BiDiCommandExecutor for it. Otherwise, the executor passed in will be used as it is. So essentially, the default behaviour for Selenium would be to use BiDi unless the user wants to pass their own CommandExecutor then we run with that. I think we need to allow users to be able to do that since there could be a possibility that web sockets are not supported by their company proxy etc.

pujagani avatar Nov 27 '23 06:11 pujagani

Details or gaps found due to failing tests:

  1. AlertsTest - fails because the page has alert when the page loads. BiDi doesn't finish loading the page completely due to the alert and hence times out when page load strategy is normal. However, WebDriver Classic, doesn't get blocked due to the alert and moves on.
  2. Page loading tests - Any tests that make use of page load timeouts fail because BiDi does not take into consideration the page load timeout values set. I think action item here is creating a github issue for the same in the w3c BiDi repo.

pujagani avatar Nov 29 '23 14:11 pujagani

@pujagani is there a long therm plan to support classic w3c commands or is the plan to shift to BiDi at a certain time?

I am currently playing with BiDi and noticed it is relying on a stable tcp-ip connection, what seems to be not the case in the company i am working at. Looks like our datacenters sometimes have short connection outtakes or at least some connections are dropped (might be the VPN, who knows).

We did not notice them till now, as HTTP clients do automatically reconnect when a connection is stale. Is there a way to reconnect the Websocket connection when it breaks? This is propably no easy task especially when the driver is sending a event to a broken connection.

joerg1985 avatar Feb 07 '24 13:02 joerg1985

Thank you @joerg1985 for sharing this. The plan is to move to BiDi under the hood Selenium 5 onwards (whenever it happens). I don't know if there is a way to reconnect when the WebSocket connection breaks. We might need to track this as a separate issue to ensure this move to BiDi is resilient to such network changes.

pujagani avatar Feb 12 '24 04:02 pujagani