clients icon indicating copy to clipboard operation
clients copied to clipboard

[PS-1918] Make autofill doc-scanner traverse into ShadowRoot

Open RafaelKr opened this issue 2 years ago • 9 comments

This PR is another take on fixing autofill inside ShadowRoot (#713). Thanks to @chadm-sq there was a previous attempt in #2283 but this just had a recursive usage of querySelectAll. The problem with that implementation is that querySelector and querySelectorAll methods don't traverse into elements with closed ShadowRoot.

Firefox 63 introduced the privileged accessor Element.openOrClosedShadowRoot (MDN documentation) to allow webextensions to access shadow roots regardless of their open/closed state.

Other APIs such as querySelectorAll are still affected by shadow dom isolation, so to find shadow roots an extension will have to walk the entire tree via NodeIterator or TreeWalker to check for shadow roots. Since shadow roots may be attached at later point extensions may also have to wait for page modifications via mutation observers before attempting to access shadow DOM, so overall this approach is less ergonomic than using querySelector, but at least it's possible.

– https://stackoverflow.com/a/60460821

  • mutation observers: IMO autofill and form submission logic should be combined in the long run. Please see https://github.com/bitwarden/clients/pull/2066#issuecomment-1326558758 and the following comment for more info.
  • NodeIterator vs TreeWalker: In https://stackoverflow.com/a/64551276 the author is comparing querySelectorAll('*'), NodeIterator and TreeWalker. The last one is the fastest by far (although I didn't test this myself and the test conditions are a little bit different. But as we can't use querySelectorAll we're left with TreeWalker anyways).

So to also support closed ShadowRoots we need a TreeWalker in combination with openOrClosedShadowRoot. Note that openOrClosedShadowRoot is only available to WebExtensions. This PR is my take on it.

This change is fully backward compatible. If openOrClosedShadowRoot is not available it will always return undefined and we will never traverse into ShadowRoots just as the behavior was before this change.

Original description copied from PR #2283 from @chadm-sq:

Shadow-roots are a recent invention that makes a web page an environment that can support independently-operating components. They give separate namespaces that the well-known element-query functions work upon, and for which separate searches and answers are given. It's becoming normal for a web page to create UI structures that are maintained by separate code, so (eg) a search form can not interfere with a log-in form.

Since searches do not pass from one (document or shadow) root to other (doc or sh) root, the form username-and-password search functions need to intentionally descend into each new root as it finds it.

Closes: https://github.com/bitwarden/clients/issues/713

Type of change

- [x] Bug fix
- [ ] New feature development
- [ ] Tech debt (refactoring, code cleanup, dependency upgrades, etc)
- [ ] Build/deploy pipeline (DevOps)
- [ ] Other

Objective

Autofill finds relevant elements within open and closed ShadowRoots.

Code changes

apps/browser/src/content/autofill.js:

  • Query elements by using a TreeWalker instead of document.querySelector[All]. The reason for this is that querySelectorAll doesn't traverse into elements with ShadowRoot.
  • Recursively traverse into elements with openOrClosedShadowRoot inside TreeWalker loop.
  • Use new query logic everywhere inside autofill.js. This also means we need to use filter functions to find elements with specific nodeNames and/or attributes instead of CSS selector strings.
  • Add two new instanceof Element checks to prevent Failed to execute 'getComputedStyle' on 'Window': parameter 1 is not of type 'Element'." errors.

This change to openOrClosedShadowRoot is fully backward compatible. If openOrClosedShadowRoot is not available it will always return undefined and we will never traverse into ShadowRoots just as the behavior was before this change. Also all other changes should be fully backward compatible.

Testing requirements

  • Normal log-ins still auto-fill.
  • Pages with shadow roots now auto-fill. E.g. login.clear.com.br

Before you submit

  • Please add unit tests where it makes sense to do so (encouraged but not required) I did NOT add any unit tests yet. I leave this to the Bitwarden team. I also DID NOT test my changes from CSS selectors to the filter functions, but I crafted them carefully and autofill works. Would be happy if someone also reviews these carefully again.

  • ~~If this change requires a documentation update - notify the documentation team~~

  • ~~If this change has particular deployment requirements - notify the DevOps team~~

RafaelKr avatar Nov 25 '22 15:11 RafaelKr

Thank you for your contribution! We've added this to our internal Community PR board for review. ID: PS-1918

bitwarden-bot avatar Nov 25 '22 15:11 bitwarden-bot

Hey @eliykat, thanks for reviewing this so quickly!

My main question is about this not-very-fun warning from MDN:

Non-standard: This feature is non-standard and is not on a standards track. Do not use it on production sites facing the Web: it will not work for every user. There may also be large incompatibilities between implementations and the behavior may change in the future.

Are there any standard/less experimental ways of traversing a closed shadowroot, that you know of? (I'll also raise this with the team for discussion.)

Thanks for pointing this out, this made me think about it again and I noticed we can just use return element.openOrClosedShadowRoot || element.shadowRoot;, so we fallback to element.shadowRoot for every browser which doesn't support element.openOrClosedShadowRoot.

So we now have the two ways chrome.dom.openOrClosedShadowRoot (Chrome) and element.openOrClosedShadowRoot (Firefox) and every other browser uses element.shadowRoot which just finds open ShadowRoots, not closed ones.

RafaelKr avatar Nov 28 '22 08:11 RafaelKr

@eliykat I have no Safari, could you please test if it works for open ShadowRoot there?

RafaelKr avatar Nov 28 '22 08:11 RafaelKr

Thanks for your review @chadm-sq! I addressed all of your feedback.

RafaelKr avatar Nov 29 '22 13:11 RafaelKr

I'll test this on Safari (as requested), plus a few other major browsers. I'll take another, closer look at the diff, and I've also asked for a second set of eyes from the team (given that autofill.js is fairly arcane and not covered by automated testing).

eliykat avatar Nov 30 '22 06:11 eliykat

Thanks @eliykat! I just did 2 new changes:

  1. I unnested an if-statement by using continue in bitwarden/clients@6f80de7 (#4119)
  2. I missed one occurence where I didn't handle attribute values correctly. I just fixed this and pushed this as a fixup to ea1c84b https://github.com/bitwarden/clients/blob/ea1c84b623881dd775c9c9b41bc27b05a20a343b/apps/browser/src/content/autofill.js#L694-L695

RafaelKr avatar Nov 30 '22 06:11 RafaelKr

I just tweaked the commit messages for documentation purposes a little bit. From my side this is now ready to merge.

Edit: Also just triggered a rebase on master.

RafaelKr avatar Dec 01 '22 07:12 RafaelKr

I've given this a quick test on Safari (using my Codepen, plus quick regression test on Facebook, Google and Amazon login pages). As expected, it fills open ShadowDOMs but not closed ShadowDOMs (given that Safari does not have those experimental APIs), and no regressions on those sites. QA will give it a more rigorous going over, but for now that looks good to me.

I also did a rough fairly non-scientific benchmark: I placed console.time/timeEnd statements around the calls to collect and fill in the message listener at the bottom of autofill.js. I took the average and the median times for all calls to each function for each of the above sites. Variance between master and this branch was +/- 2.2ms at the most, and I expect there would be a large variance between repeated tests anyway. As I said, non-scientific. But I wanted some idea of how these changes affected the overall speed of collect & fill operations, and in that respect I think we're good.

eliykat avatar Dec 12 '22 01:12 eliykat

Thanks for your time, benchmarks and feedback, I appreciate it.

I addressed all of your feedback now, some feedback with some questions.

I also untangled the conditions for the urlNotSecure function. One thing I did is checking for document.location.protocol === 'https:' instead of document.location.protocol !== 'http:'. I'm not 100% sure if this is fine or if we better should keep it the other way.

RafaelKr avatar Dec 14 '22 13:12 RafaelKr

Hey @djsmith85, thanks for reviewing! ❤️ I just updated the urlNotSecure (hopefully the way you expected!). It's now the same as on master, but using queryDocAll instead of document.querySelectorAll.

I didn't test things again, as I unfortunately currently have less time available. But if you test this on a http page for which you saved https credentials and it shows the confirm dialog I think its finally ready to ship! 😊

RafaelKr avatar Jan 16 '23 15:01 RafaelKr

I just addressed the CodeScene analysis violations. Now we don't have violations left, only a warning.

RafaelKr avatar Jan 16 '23 15:01 RafaelKr

CI branch has been updated, I'll get a final review + quick test in on this next week.

eliykat avatar Jan 19 '23 21:01 eliykat

Looks great, thanks again @RafaelKr! I'll pass it on to our QA team for testing.

eliykat avatar Jan 24 '23 02:01 eliykat

@eliykat GitHub still shows you requested changes. When clicking on the link it shows nothing. Seems to be a bug? Does this matter for your QA team? I could re-request a review.

RafaelKr avatar Feb 06 '23 15:02 RafaelKr

Was a pleasure to iterate through this with you guys! Thanks for merging.

RafaelKr avatar Feb 20 '23 08:02 RafaelKr

Hello @RafaelKr. I am an Engineering Manager at Bitwarden. I first wanted to thank you so much for this contribution. Unfortunately, our team had to revert these changes out of the browser and release 2023.3.1 version due to performance issues that several of our users saw. These appeared to be isolated to Firefox and Edge and were on sites that had a lot of input fields (e.g. an admin panel maintaining lists of entities), and the slowdown appeared to be in the TreeWalker implementation.

I wanted to make you aware, and also see if you would be willing to help work through these issues and hopefully come up with a solution that addresses ShadowRoot without these particular performance issues in an edge case. We don't want to disregard all of the work you have done and want to see a solution that can achieve both.

trmartin4 avatar Apr 05 '23 22:04 trmartin4

Hello @RafaelKr. I am an Engineering Manager at Bitwarden. I first wanted to thank you so much for this contribution. Unfortunately, our team had to revert these changes out of the browser and release 2023.3.1 version due to performance issues that several of our users saw. These appeared to be isolated to Firefox and Edge and were on sites that had a lot of input fields (e.g. an admin panel maintaining lists of entities), and the slowdown appeared to be in the TreeWalker implementation.

I wanted to make you aware, and also see if you would be willing to help work through these issues and hopefully come up with a solution that addresses ShadowRoot without these particular performance issues in an edge case. We don't want to disregard all of the work you have done and want to see a solution that can achieve both.

Hello @trmartin4. Thank you very much for reaching out. First, I'm sorry for the inconvenience I may have caused for some of the Bitwarden users and the Bitwarden team members involved in working out a patch release. I already saw the revert today but didn't have time to look further into it.

I'm willing to have a look at this the coming days and see what I can do.

According to many performance tests I checked out before writing the first PoC for this PR TreeWalker itself is very performant. Even more performant than document.querySelectorAll which was used previously. So I assume it's not TreeWalker itself but some other part of the code I changed. I'm confident I'll be able to pinpoint bottlenecks and find a solution for these parts.

Quickly skimming through some related issues I saw you were able to confirm the performance regression internally. Could you maybe provide a link to the specific page(s) where you were able to reproduce the issue?

If I can pinpoint the bottleneck(s) I'll try to write an own reproduction page which can later be used to test for performance regressions.

I'll have a deeper look into this as soon as I find time for it. Probably I can provide a fix by the end of the month, so maybe we can try again with 2023.4 or 2023.5

Best regards

Edit: I could reproduce it on the GitHub PR page https://github.com/encointer/encointer-wallet-flutter/pull/1101 posted in https://github.com/bitwarden/clients/issues/5107#issuecomment-1493414292

RafaelKr avatar Apr 05 '23 23:04 RafaelKr

I have Bitwarden Version 2023.4.0 on Chrome Version 113.0.5672.126. I can't manage to make Bitwarden work with inputs in shadow dom. Other password manager work. If I use the copy custom field name I get: "Unable to identify a valid form element. Try inspecting the HTML instead."

image image

VeloAddict avatar May 30 '23 11:05 VeloAddict

@VeloAddict this PR was reverted due to a performance regression. Unfortunately I didn't have time to debug and resolve the issue yet.

RafaelKr avatar May 30 '23 11:05 RafaelKr

@RafaelKr

Hello, I'm a Front End Engineer with Bitwarden. Thank you for providing this contribution, even if it had to be reverted due to the experienced performance issues. Todd asked our team to investigate possible bottlenecks with the implementation, and I believe that we have found a path forward that leverages part of your implementation to deeply query all possible ShadowDOM elements.

What we found was that using your TreeWalker implementation for querying all elements relating to found fillable elements was proving to be fairly expensive. (An example of this would be when the implementation attempts to find label text for an element using the method getLabelTag at line 287 of the implementation.)

We found that if we instead handle query logic relating to individual found elements a bit more conservatively, and only use the TreeWalker API to initially query all possible elements found within the DOM/ShadowDOM, performance improves dramatically.

We have a larger scale refactor that is in the works for autofill.ts which modernizes and cleans up the implementation. Within this refactor, we will be applying a fix heavily inspired by your work that will address the issues Bitwarden's extension has with autofilling ShadowDOM elements.

I'll make sure to link this ticket to the PR that addresses that problem when it is ready for review.

cagonzalezcs avatar Jun 12 '23 19:06 cagonzalezcs

@cagonzalezcs

Hello Cesar, thank you very much for your feedback, I really appreciate it. Also I love this moving forward, I had a little sneak peak at your pm-2130-shadow-dom-experiment branch and it's very good to see this rebuilt from the ground up. The current implementation of course has by far too much queries all walking through the full DOM and running it just initially to collect all relevant elements is a much better solution.

I don't know how much of all the discussion you followed, but I would like to give you a heads up to my comments https://github.com/bitwarden/clients/pull/2066#issuecomment-1326558758 and https://github.com/bitwarden/clients/pull/2066#issuecomment-1327312681. There I already suggested collecting the elements just once, but more importantly I suggested also using it not only for autofill but move the collecting logic into an own service and use it also for the saving/updating logic.

Also I had the idea to experiment with MutationObserver in this service, which could make it possible to really run the full DOM traversal only on initial page load and then only listen to changes from then on.

RafaelKr avatar Jun 13 '23 08:06 RafaelKr

@RafaelKr

Thanks for providing that context, it'll be helpful as our team continues to work through improving the autofill behavior. I like your idea for implementing some kind of watcher logic using MutationObservers to anticipate changes in the DOM structure. I had some thoughts around that as well, but held off on any deep investigation of how we'd approach implementing and leveraging behavior surrounding that API.

Right now, the efforts we're working through involve refactoring the existing behavior of the autofill content script in a near parity manner. We're approaching it that way to ensure we introduce as little as risk as possible for breaking the behavior overall, while setting ourselves up to more easily add features and bug fixes down the line. While it is still in progress, that work can be previewed within PR #5453

We've got a couple of tasks that we're working to complete in order to ensure feature parity of this refactor with the legacy autofill content script, but once this is merged in, the fix for ShadowDOM elements and other improvements are going to follow.

cagonzalezcs avatar Jun 13 '23 19:06 cagonzalezcs

is this still reverted? is there an open ticket about autofill not working with the shadowDOM?

martinmckenna avatar Jun 04 '24 15:06 martinmckenna

This PR is reverted, but was used in re-working the ShadowDOM support with the PR linked above your comment.

If a specific website isn't working, then likely its a different issue.

cagonzalezcs avatar Jun 04 '24 17:06 cagonzalezcs

hmm yeah my shadow root email field is being auto-filled, but not the password. The input has a <label> of "password" and type="password" so not sure if I'm missing something else.

The login.clear.com.br in OP seems to work though and the only difference I see is the name="password" property

martinmckenna avatar Jun 04 '24 18:06 martinmckenna

A bug was introduced where ShadowDOM elements that contain any child nodes are ignored. This has been resolved in a recently merged PR - https://github.com/bitwarden/clients/pull/9063

I'm willing to bet that's the issue you're seeing with the element.

That work should be released with an upcoming release of the extension.

cagonzalezcs avatar Jun 04 '24 21:06 cagonzalezcs

A bug was introduced where ShadowDOM elements that contain any child nodes are ignored. This has been resolved in a recently merged PR - #9063

I'm willing to bet that's the issue you're seeing with the element.

That work should be released with an upcoming release of the extension.

When will the Firefox MV3 extension with that fix be released?

larena1 avatar Jun 04 '24 21:06 larena1

That fix doesn't relate to MV3 specifically, and any updates to the extension moving forward for the moment will be cross-version compatible with MV2 on FF/Safari and MV3 on Chrome.

That said, the next update should begin rolling out next week if nothing causes a delay.

cagonzalezcs avatar Jun 04 '24 22:06 cagonzalezcs

Thanks, is there any particular reason why MV3 is only tested with Chrome and not FF? Been waiting years for proper incognito support.

larena1 avatar Jun 04 '24 22:06 larena1

We worked to solidify MV3 in Chrome first due to the hard deadline that Google set for sunsetting MV2. The process for that implementation was approached in a backwards compatible manner for a variety of reasons. The largest reason for this is due to differences in how Chrome's background service worker and FF/Safari's non-persistent background events page function.

That said, it is our intent to move forward with support of MV3 on FF/Safari in the near future, though our current focus is stability of MV3 in Chrome.

Regarding incognito support in Firefox, that has been addressed thanks in part to the work done for MV3. Within the next update, you should see better functionality when using the extension between an incognito context and the main browsing context in FF (similar to how Chrome handles the behavior).

cagonzalezcs avatar Jun 04 '24 22:06 cagonzalezcs