[beforefocus/focusNext()] Present & discuss
Meta issue to discuss https://github.com/openui/open-ui/pull/1259 by @mshoho
Explainer is here: https://open-ui.org/components/beforefocus.explainer/
I would be happy to join the meeting to present, give more context, and answer questions. Could someone help me figure out how to join, preferably a day in advance so that I can come prepared?
@mshoho this is the weekly event we have: https://discord.com/events/714891843556606052/1407361268662407269/1408148648755200000. It takes place on our Discord server; if you would like to become a member, here is the invite: https://discord.gg/9pPpZ4jf.
We meet (almost) every Thursday at 8pm CET. This topic is on the agenda for this week (so in roughly 1 hour 30 minutes), but if you cannot attend we can move it to next week.
Some concerns that probably need security checking:
- Allowing focus to shift into an iframe or to the address bar would be a new capability.
- Call focusNext repeatedly, watch for focus events or document.activeElement changes. Suddenly, you “see” that the UA added a secret input between two of your fields → info leak.
- Introspection into the UA’s navigation algorithm, which can vary across browsers/OSes/extensions, giving away a high-entropy fingerprint.
> Some concerns that probably need security checking:
> - Allowing focus to shift into an iframe or to the address bar would be a new capability.
Not really. The current default action for a Tab press does go into iframes and the address bar.
> - Call focusNext repeatedly, watch for focus events or document.activeElement changes. Suddenly, you “see” that the UA added a secret input between two of your fields → info leak.
Not sure I am fully following the scenario. Could you elaborate? One of the things I am thinking about is that focusNext() could be callable only within the boundaries of a trusted key press event (i.e. when there is real keyboard usage involved), not randomly in the background. Because the key point of this proposal is to be able to address a variety of keyboard navigation scenarios, it makes sense to couple it with keyboard usage.
> - Introspection into the UA’s navigation algorithm, which can vary across browsers/OSes/extensions, giving away a high-entropy fingerprint.
I am not sure it is that much of a concern. In most cases I can deduce the focus order just by looking at the DOM. Just querySelectorAll('focusable element selector') would more or less give it away. Do you have particular examples?
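As a rough illustration of the querySelectorAll point above, the following sketch approximates sequential focus order from DOM order and tabindex alone. The function name `approximateTabOrder` and the selector in the usage comment are assumptions made for this example. The sketch deliberately ignores disabled or hidden elements, shadow DOM, iframes, and OS or extension settings, so it is only an approximation of what the UA actually does.

```javascript
// Approximate sequential focus order for a list of candidates given in DOM order.
// Each candidate only needs a numeric `tabIndex`, as DOM elements have.
function approximateTabOrder(candidates) {
  // Negative tabindex values are focusable by script but skipped by Tab.
  const tabbable = Array.from(candidates).filter((el) => el.tabIndex >= 0);
  const positive = tabbable.filter((el) => el.tabIndex > 0);
  const zero = tabbable.filter((el) => el.tabIndex === 0);
  // Array.prototype.sort is stable, so equal tabIndex values keep DOM order.
  positive.sort((a, b) => a.tabIndex - b.tabIndex);
  // Positive values come first in ascending order, then tabindex=0 in DOM order.
  return [...positive, ...zero];
}

// Browser usage (hypothetical, deliberately incomplete selector):
// approximateTabOrder(
//   document.querySelectorAll('a[href], button, input, select, textarea, [tabindex]')
// );
```

Anything the UA adds or skips beyond this approximation is what the fingerprinting and info-leak concerns are about.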
The Open UI Community Group just discussed [beforefocus/focusNext()] Present & discuss.
The full IRC log of that discussion
<Penny> Key question: has the group seen the proposal that Marat merged? Marat has been working on keyboard navigation at Microsoft for some time, on a project called Tabster, in a Microsoft repo on GitHub.
<Penny> Tabster is a large group of workarounds, but the problem is that not everything can be worked around.
<Penny> They mentioned that this framework is a collection of workarounds, and not everything can be addressed with workarounds. Marat also noted that Tabster provides high-level keyboard navigation abstractions like "grouper" and "mover."
<Penny> Continuously trying to create a high-level abstraction that fits all cases might be a large ongoing endeavor, and we should focus on low-level changes that browsers already implement, such as the beforefocus event and the focusNext method.
<masonf> q?
<Penny> There are security concerns related to these methods, and we could mitigate them by tying the methods to real keyboard presses or screen reader actions.
<Penny> Marat has come up with something realistic that doesn't break backward compatibility and is relatively simple to implement.
<masonf> q+
<Penny> mshoho intends to prototype in September
<Penny> And can present live use cases after this
<keithamus> q+
<Penny> (also s/Marcus/Marat/g, sorry Marat!)
<Penny> masonf: thinks the proposal is a cool idea and has heard similar requests before; acknowledges the difficulty of implementing sequential focus logic in user space due to its complexity and variability across browsers.
<Penny> Would calling "focusNext" require user activation, as this would likely address several security concerns?
<Penny> Marat: focusNext should reuse the browser's default routine for tab presses and it seems that allowing the method to be called only with a real key press would address security concerns, as their goal is to influence focus movement without altering the DOM.
<Leo> q+
<Penny> masonf: user activation is a browser state indicating an affirmative user action, like a keyboard press or mouse click. The other issue comment concerns the beforefocus event rather than the focusNext function, as there are two parts to the proposal.
<Penny> masonf: concerns about adding synchronous events during focus changes and the cancelability of the event, particularly what happens to focus if the event is cancelled.
<masonf> ack mason
<Penny> Marat: it's acceptable for focus to move to the body if the event is cancelled, aligning with the standard browser behavior for losing focus. If the script fails to handle it properly, it's the script's fault.
<sarah1> q+
<masonf> ack keith
<Penny> Marat: while such functionality gives developers power and responsibility, it offers control and flexibility that could simplify higher-level abstractions and potentially lead to more native and robust solutions, reducing reliance on workarounds
<Penny> keithamus: generally thinks this is a good idea, but has the same concerns.
<Penny> keithamus : if focusNext is called repeatedly, especially in conjunction with key press event listeners, could it potentially move focus into the browser's UI? focusNext should only operate within the document and not traverse into the browser's UI.
<Penny> Marat: all events should remain within the page.
<Penny> Marat: If focusNext is called and focus moves to the address bar, nothing further should happen from within the page, except for a focusout event.
<Penny> Marat: if focus moves into the page from the address bar, a beforefocus event should occur, similar to existing focus and blur events.
<Penny> Marat: if focus is outside the page (in the address bar or devtools), calling focusNext should have no effect, to prevent unexpected behavior when the page is not active.
<Penny> keithamus: is the "trapped" Boolean useful for focusNext? It could circumvent the default behavior against focus trapping, which is an anti-pattern; there was a decision for modal dialogues not to trap focus within the page.
<Penny> Marat: there is a circular-list case where the "trapped" Boolean is needed to prevent focusNext from moving focus to the address bar after the last item. There is also the difficulty of handling cross-origin iframes when determining whether focus should enter or skip the iframe before a key press.
<Penny> keithamus: What is the motivating use case for preventing default on the beforefocus event, and why would one want to cancel focus changes when they can already be observed and redirected?
<Penny> Marat: for complex, virtualized lists, especially those with third-party components or iframes, handling focus by altering the DOM isn't enough. It is difficult to manage focus movement programmatically in such scenarios, which makes the proposed low-level APIs crucial for providing the necessary control.
<masonf> q?
<Penny> Marat: current workarounds, like "bumper inputs," are hacky, not performant, and cause issues with accessibility tools like screen readers and automated testing.
<masonf> ack leo
<Leo> https://github.com/openui/open-ui/pull/1260#issuecomment-3211641378
<Penny> keithamus: there are complexities on the user side; agrees that the proposal is a useful low-level implementation.
<Penny> keithamus: a middle-ground solution, possibly like "focus group," might also be needed, as the proposed low-level APIs would still require significant heavy lifting from developers.
<Penny> keithamus: these solutions could be used together to address the problem.
<Penny> Leo : there is the "focus group" proposal being championed by the Edge team. Both proposals address different scopes of the problem and can advance in parallel. Recently presented the proposal at the ARIA working group and will update it based on feedback.
<masonf> ack sarah
<Penny> Marat: there is another unsolvable problem, iframes stealing focus from the main application, which the beforefocus event could address.
<Penny> sarah1 : Tabster programmatic tab control has led to focus breaking for entire applications. There is an example of this happening in Edge, where focus became trapped and impossible to move with the keyboard.
<Penny> Marat: agree on that point
<keithamus> q+
<Penny> sarah1 : believes that for use cases like managing focus in arrow regions, Focus group would be a better solution. Regarding iframe focus management, issues like an iframe stealing focus should be resolved with the iframe owner. Unsure if the surrounding app should have control over an embedded iframe's autofocus behavior.
<sarah1> ack me
<Penny> Marat: it's not always possible to communicate with iframe owners, especially in the context of platform apps with third-party developers.
<Penny> sarah1: for preventing untrusted iframes from stealing focus, the focusout event could be used in the surrounding application to check the related target and potentially prevent the focus shift.
<Penny> sarah1: proposed that if the use case is focused on iframes, it might be better to look at more targeted solutions for managing focus into and out of iframes rather than a generic beforefocus event.
<masonf> ack keithamus
<sarah1> q+
<Penny> Marat: Even native concepts like the modal dialogue are not flexible enough for complex scenarios, such as when a context menu needs to extend beyond the dialogue's boundaries while maintaining the same modality. If a simple concept like a modal dialogue cannot be made sufficiently flexible, it would be even more challenging to create high-level abstractions like "focus group" that would work for everyone. Lower-level solutions provide developers with more control and flexibility. Having something high-level that works for everyone will not be possible. Acknowledges that a new low-level solution increases the ways things can be broken, but there are already a lot of ways to break the application.
<masonf> ack sarah
<Penny> keithamus: could consider extending sandboxing attributes for iframes to prevent autofocus as a more targeted solution to the iframe focus issue, rather than a generic beforefocus event.
<Penny> sarah1: looked at the practical usage of a low-level Tabster hook and found that 75% of its usage across teams negatively impacted accessibility.
<Penny> sarah1 : this is being used to create arrow groups, which is a common pattern that this hook was designed to address.
<keithamus> q?
<masonf> q+
<Penny> Marat: high-level abstractions are still necessary because accessibility experts often disagree on solutions, leading to multiple, non-ideal outcomes.
<Penny> Marat: need a realistic, low-level solution that can be implemented quickly, not the decade-long development of the modal dialogue. Goal is to make his framework obsolete by having browsers natively support these features, but there's a need for solutions now.
<masonf> ack mason
<sarah1> q+
<Penny> masonf: web standards are never a quick solution, and we shouldn't assume a low-level API is easier to ship than a high-level one; there is a strong bias against adding footguns, and that is a risk with this proposal.
<masonf> ack sarah
<Penny> Marat: the "footgun" aspect of the current solutions is a result of workarounds necessitated by the lack of native browser support.
<Penny> Marat: many of the "footgun" issues would be resolved if they didn't have to rely on these workarounds and hacks.
<Penny> masonf: it's possible to create a bad experience with existing APIs; a "footgun" is when the platform itself makes it easy to do so with just a few lines of code.
<Penny> Marat: already a wide variety of footguns with existing focus events
<Penny> masonf : likely true
<Penny> sarah1: believes focusNext would be useful, but she is worried about the beforefocus event. Appreciates that Tabster provides a fertile ground to observe how developers use these tools and hopes to create higher-level utilities that avoid common mistakes.
<Penny> Marat can start preparing a proposal for the high-level concepts in Tabster for discussion, but anticipates this would take a long time.
<Penny> scribe-
<bkardell> sorry I missed the meeting today
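The circular-list case raised in the log can be modeled in a few lines. This is only a sketch, assuming that a "trapped" flag means "wrap around instead of leaving the list"; `nextFocusIndex` and its option names are hypothetical and are not taken from the explainer.

```javascript
// Toy model of moving focus inside a list of `count` items.
// Returns the index to focus next, or null when focus should leave the list
// and the UA decides where it goes (next page element, address bar, ...).
function nextFocusIndex(current, count, { backwards = false, trapped = false } = {}) {
  if (count <= 0) return null;
  const next = current + (backwards ? -1 : 1);
  // Still inside the list: nothing special to do.
  if (next >= 0 && next < count) return next;
  // Past either end: an untrapped list lets focus leave.
  if (!trapped) return null;
  // Assumed "trapped" behavior: wrap to the opposite end.
  return backwards ? count - 1 : 0;
}
```

With three items, moving forward from the last one yields `null` by default and `0` when `trapped` is set, which is the difference between letting focus reach the address bar and keeping it in the list.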
@mshoho
> Not really. The current default action for a Tab press does go into iframes and the address bar.
Sorry, I meant for a JavaScript API. What's the use case for programmatically sending focus into an iframe or the address bar?
> Not sure I am fully following the scenario. Could you elaborate?
I think the concern is related to an extension like a password manager popping up UI in response to a user action and a call to focusNext potentially observing and aborting that.
> One of the things I am thinking about is that focusNext() could be callable only within the boundaries of a trusted key press event
That would likely address some of the security concerns, but it also seems possibly too limiting. Consider the virtual list example: (1) click the "Edit next item" button, (2) the list wants to asynchronously adjust the DOM to show the next editable item and then call focusNext. Wouldn't the required asynchrony lose the trusted state needed to call focusNext?
> In most cases I can deduce the focus order just by looking at the DOM.
I know that Macs have settings around what elements are focusable, and it seems like with this API, that setting could be discovered. I really have no clue if that's a practical problem, but it seemed worth considering.
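To make the asynchrony question concrete, here is a toy model of two possible gates for a hypothetical focusNext(): one that only allows calls made synchronously inside a trusted event handler, and one based on transient user activation, which stays valid for a short window after the interaction. The function `gateAllows`, the gate names, and the 5000 ms window are assumptions for illustration; the real activation duration is left to the user agent.

```javascript
// Assumed window for transient activation; real user agents choose their own value.
const ASSUMED_ACTIVATION_WINDOW_MS = 5000;

// Decide whether a hypothetical, gated focusNext() call would be allowed.
// `state.insideTrustedHandler`: true only while a trusted event handler is running.
// `state.msSinceInteraction`: time elapsed since the last real user interaction.
function gateAllows(gate, state) {
  switch (gate) {
    case 'trusted-event':
      return state.insideTrustedHandler;
    case 'transient-activation':
      return state.msSinceInteraction < ASSUMED_ACTIVATION_WINDOW_MS;
    default:
      throw new Error(`Unknown gate: ${gate}`);
  }
}

// Virtual list scenario: the click handler has already returned,
// and the asynchronous render took about 50 ms.
const afterAsyncRender = { insideTrustedHandler: false, msSinceInteraction: 50 };
```

Under these assumptions, the click-then-render flow fails the strict in-handler gate but passes the activation-based one, which suggests user activation may be the less limiting of the two options.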