Pointer Actions should specify which viewport to reference
This came up from a PR we have for Pointer Actions. The only specific viewport reference in the spec is where we mention the browser's viewport for the move out of bounds error. Other than that, we just use the term "viewport" without saying which one is meant.
In our case, the question was whether pointer actions executed inside an iframe should use the browser viewport or the iframe's viewport. No web platform tests clarify which to use, and the spec doesn't either. In testing, Firefox appeared to use the iframe's viewport.
We should consider using relative terms such as "current frame's viewport" and add tests to validate that behavior for Actions.
@AutomatedTester This is a current interop spot for us. Firefox seems to reference the currently selected viewport, but this has some failings. For instance, if you switch to an iframe that isn't in the browser's viewport, but the element in that iframe is within the iframe's viewport, Firefox will think "it's in view", try to click, and then fail because the element is not within the browser's viewport. Should there be some wording around scrolling every parent frame into view first?
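To make the failure case above concrete, here is a minimal sketch that checks one point against both candidate viewports. Every name in it (`Size`, `Rect`, `Pt`, `contains`, `checkViewports`) and all the numbers are hypothetical and chosen only for illustration; none of this comes from the spec or from any driver. It assumes a single level of framing and ignores iframe borders and scrolling.

```typescript
// Hypothetical types and helpers for illustration only; not part of WebDriver.
interface Size { width: number; height: number; }
interface Rect extends Size { x: number; y: number; }
interface Pt { x: number; y: number; }

// True when the point lies inside a viewport of the given size whose origin is (0, 0).
function contains(viewport: Size, p: Pt): boolean {
  return p.x >= 0 && p.y >= 0 && p.x < viewport.width && p.y < viewport.height;
}

// frameRect is the iframe's box expressed in top-level viewport coordinates.
// pointInFrame is expressed in the iframe's own viewport coordinates.
function checkViewports(pointInFrame: Pt, frameRect: Rect, browserViewport: Size) {
  const pointInBrowser: Pt = {
    x: pointInFrame.x + frameRect.x,
    y: pointInFrame.y + frameRect.y,
  };
  return {
    inFrameViewport: contains(frameRect, pointInFrame),
    inBrowserViewport: contains(browserViewport, pointInBrowser),
  };
}

// Assumed layout: the iframe sits 2000px down the page, below an 800px-tall browser viewport.
const result = checkViewports(
  { x: 64, y: 18 },
  { x: 0, y: 2000, width: 300, height: 200 },
  { width: 1280, height: 800 },
);
console.log(result);
```

With these assumed numbers the point is inside the iframe's viewport but outside the browser's viewport, which is the situation in which a check against only the selected frame would report "in view" while the synthesized click misses.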
This seems like a bug in the Firefox implementation. Actions has always been designed to work as though it were "above the glass", so the frame would need to be scrolled into view before the other things worked. This would need the scrolling primitive we have talked about in the past to make it work, as actions should not scroll by themselves.
This is also a problem with other commands that require positional information, such as Element Click. I filed https://github.com/w3c/webdriver/issues/1141, but it has not been addressed yet.
The way I read @InstyleVII’s example, the Firefox implementation is correct per the current spec definition, because the spec does not take into account that the current browsing context (the <iframe>) is outside the viewport of the top-level browsing context.
@andreastt So the expectation is that the pointer input should be relative to the current viewport and not the root instance?
We set aside the issue of inter-frame actions (such as dragging an item from one iframe into another), but it's desirable to be able to do this, and (I think) was one of the reasons we wanted elements to be within the viewport when Actions interact with them. I agree with @AutomatedTester's take on this.
Not sure I agree with your assessment @shs96c. @InstyleVII’s example is when an element you interact with inside an <iframe> is in the viewport, but the frame itself isn’t.
This will naturally cause Firefox’s interaction steps to fail because it passes an (x, y) coordinate to the event queue for synthesising a click. I agree with @InstyleVII’s suggestion that we first ought to scroll the frame into view, but there are no provisions for doing this in the spec.
I think the key issue is whether Actions are supposed to be relative to the currently selected frame/viewport or not. If they are, then I'd argue Firefox is correct, but it should still fail because the element itself wasn't scrolled into view (which brings up the follow-on discussion of whether we should scroll it into view).
If instead Actions should always be relative to the root viewport of the page and not take frame context into account, then Firefox is incorrect. This is currently how Edge works, simply because it was quicker to implement. If we think this is wrong, we can update to match Firefox, but the spec should first be clarified to say either "currently selected viewport" or "root viewport" when processing actions.
I would like to revive this conversation, as I stumbled upon this in a WebdriverIO issue. Given we have an application like this:

Calling performActions with the following parameters:
[
  {
    "type": "pointer",
    "id": "finger1",
    "parameters": {
      "pointerType": "mouse"
    },
    "actions": [
      {
        "type": "pointerMove",
        "duration": 0,
        "x": 64,
        "y": 18
      }
    ]
  }
]
where 64px/18px are the x/y coordinates of the center of the button within the iframe, the pointer actually hits the button in Firefox (that is, a mouse move event listener on the button is triggered) but not in Chrome. In Chrome I need to add the position of the iframe relative to the root document, e.g.:
- "x": 64,
+ "x": 64 + 97,
- "y": 18
+ "y": 18 + 67
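The adjustment in the diff above can be sketched as a small helper that sums the offsets of the ancestor iframes and builds the same payload. This is only an illustration: `FramePoint`, `toRootViewport`, and `pointerMoveSequence` are hypothetical names, not part of WebDriver or WebdriverIO, and the sketch assumes the iframe offset (97/67 here, taken from the example) is already known and that borders and scrolling can be ignored.

```typescript
// Hypothetical helpers for illustration only; not a WebDriver or WebdriverIO API.
interface FramePoint { x: number; y: number; }

// frameOffsets holds the top-left corner of each ancestor iframe relative to its
// parent's viewport, ordered from outermost to innermost.
function toRootViewport(pointInFrame: FramePoint, frameOffsets: FramePoint[]): FramePoint {
  return frameOffsets.reduce(
    (acc, offset) => ({ x: acc.x + offset.x, y: acc.y + offset.y }),
    { x: pointInFrame.x, y: pointInFrame.y },
  );
}

// Builds the same action sequence as in the example, with the given target point.
function pointerMoveSequence(target: FramePoint) {
  return [
    {
      type: "pointer",
      id: "finger1",
      parameters: { pointerType: "mouse" },
      actions: [{ type: "pointerMove", duration: 0, x: target.x, y: target.y }],
    },
  ];
}

const buttonCenter: FramePoint = { x: 64, y: 18 }; // center of the button inside the iframe
const iframeOffset: FramePoint = { x: 97, y: 67 }; // iframe position in the root document
const rootPoint = toRootViewport(buttonCenter, [iframeOffset]);
console.log(JSON.stringify(pointerMoveSequence(rootPoint), null, 2));
```

Under the frame-relative interpretation (the Firefox behavior described above) the unadjusted `buttonCenter` would be sent; under the root-relative interpretation (the Chrome behavior described above) the summed `rootPoint` is what reaches the button.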
@AutomatedTester @shs96c I suggest adding wording to the spec saying that actions are always relative to the document coordinates. If you agree, I could take a stab at this.
CC'ing @k7z45 to let the chromedriver team know about this different behavior.