[Proposal] Invoker Buttons - allowing popover/dialog and more to be invoked without JS
Following on from https://github.com/whatwg/html/issues/3567 and https://github.com/whatwg/html/pull/9456 where we tried to specify a way to open dialogs without JavaScript, @nt1m and @smaug---- raised concerns that the attribute was not extensible.
I've taken the feedback, and instead I'm proposing a new set of attributes that allow for opening dialogs and popovers, and also allow for extensibility for other interactions. I'll quote the summary:
Adding invokertarget and invokeraction attributes to <button> and <input type="button"> / <input type="reset"> elements would allow authors to assign behaviour to buttons in a more accessible and declarative way, while reducing bugs and simplifying the amount of JavaScript pages are required to ship for interactivity. Buttons with invokertarget will - when clicked, touched, or enacted via keypress - dispatch an InvokeEvent on the element referenced by invokertarget, with some default behaviours.
In addition, adding an interesttarget attribute to <button>, <a>, <input> elements would allow disclosure of high fidelity tooltips in a more accessible and declarative way. Elements with interesttarget will - when hovered, long pressed, or focussed - dispatch an InterestEvent on the element referenced by interesttarget, with some default behaviours.
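To make the dispatch described in the summary concrete, here is a minimal sketch using the standard Event and EventTarget classes. The shape of InvokeEvent (an action string plus a reference to the invoker), its cancelability, the empty-string fallback for a missing invokeraction, and the activateInvoker helper are all assumptions made for illustration, not something the explainer text quoted here defines.

```javascript
// Sketch only: InvokeEvent's shape (an `action` string plus the `invoker`)
// and its cancelability are assumptions, as is the activateInvoker helper.
class InvokeEvent extends Event {
  constructor(action, invoker) {
    super("invoke", { cancelable: true });
    this.action = action;
    this.invoker = invoker;
  }
}

// Roughly what a browser might do when an invoker button is activated:
// dispatch the event on the target, then run the default behaviour
// unless a listener cancelled it.
function activateInvoker(invoker, target, defaultBehaviour) {
  const event = new InvokeEvent(invoker.invokeraction ?? "", invoker);
  const proceed = target.dispatchEvent(event);
  if (proceed) defaultBehaviour(event);
  return proceed;
}
```

In this model a page script could listen for "invoke" on the target and call preventDefault() to suppress the default behaviour.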
I'm soliciting feedback on this, and if we think this is more tenable than https://github.com/whatwg/html/pull/9456 I'm happy to go forward with specs/implementations.
Nice proposal and explainer! I'm generally supportive of the direction this is going, and I like that you've really expanded the capabilities to include things other than <dialog> and popovers. I also see you've added an interesttarget mechanism via events to be able to handle both the "activate" and "hover" use cases, which is great.
A few small comments related to parts of the proposal:
In the style of popovertarget, this document proposes we add invokertarget, and invokeraction as available attributes to <button>, <input type="button"> and <input type="reset"> elements,
I think you'd want invokeraction on just the set of things we currently allow, which are those "buttons" you listed, but only when they don't participate in a form in some way (submitting or resetting). Mostly, I think that's what you meant, I just wanted to be clear.
as well as an interesttarget attribute to <button>, <a>, <input type="button"> and <input type="success"> elements.
Here, I think you likely can add back <input type=reset> and buttons that do participate in form submission, since you'd like to use this feature to provide context for those actions before you commit to them, and there's seemingly no conflict between this new "interest" feature and actually activating the elements. (Side note, I don't think <input type=success> is a thing.)
The invokertarget value should be an IDREF pointing to an element within the document. .invokerTargetElement also exists on the element to imperatively assign a node to be the invoker target, allowing for cross-root invokers.
I don't think this allows cross-root invokers, except in some cases. Note the complex conditions here.
If an element also has a popovertarget attribute then invokertarget must be ignored. interesttarget can exist on the element at the same time as popovertarget.
Thanks for considering what happens with both attributes present. I think it should likely go the other way though - if you have both, respect the new invokertarget and ignore popovertarget.
Loses/Lost Interest: The action of Loses Interest refers to the user "moving away" from an element...
It should be made clear that this applies to the pair of the invoker element and the target element. For example, a button that opens a popover - to lose interest while using a mouse, you'd have to de-hover both the button and the popover.
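A rough model of that pairing, for mouse users only: interest is held while the pointer is over either the invoker or the target, and is lost only once it has left both. The InterestTracker class and the "invoker" / "target" labels are hypothetical names for this sketch.

```javascript
// Hypothetical model: "interest" is held while the pointer is over either
// the invoker or its target, and is lost only when it has left both.
class InterestTracker {
  constructor() {
    this.hovered = new Set();
  }

  pointerEnter(part) {
    this.hovered.add(part);
  }

  // Returns true when this leave causes interest to be lost.
  pointerLeave(part) {
    this.hovered.delete(part);
    return this.hovered.size === 0;
  }
}
```

This sketch ignores any delay: moving the pointer across a gap between the button and the popover would lose interest immediately, which is presumably where a configurable delay would come in.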
Overall, awesome proposal! I think it'd be a good idea to chat about this at an OpenUI meeting soon. I don't think I can Agenda+ it with any meaningful label, but would you mind if I get it on the next meeting's agenda?
Also side note: see https://github.com/w3c/csswg-drafts/issues/9236 for a very related CSS proposal to control the delays associated with gaining and losing "interest".
would this cover a button that might need to have an associated tooltip, but also opens a popover, popover dialog, or modal dialog?
How could the platform support having other elements (e.g. Custom Elements) to be invokers as well?
@mfreed7
I think you'd want invokeraction on just the set of things we currently allow, which are those "buttons" you listed, but only when they don't participate in a form in some way (submitting or resetting). Mostly, I think that's what you meant, I just wanted to be clear.
I'd like us to consider that <input type="reset"> could participate in a dialog form which would allow it to reset the form and close the dialog. I've updated the readme to reflect your comments though.
Here, I think you likely can add back <input type=reset> and buttons that do participate in form submission, since you'd like to use this feature to provide context for those actions before you commit to them, and there's seemingly no conflict between this new "interest" feature and actually activating the elements. (Side note, I don't think <input type=success> is a thing.)
I have no idea how input type=success came about. I guess ChatGPT isn't the only thing that can hallucinate prose 😆.
I don't think this allows cross-root invokers, except in some cases. Note the complex conditions here.
Thanks, added.
Thanks for considering what happens with both attributes present. I think it should likely go the other way though - if you have both, respect the new invokertarget and ignore popovertarget.
Yes that's a much better idea! Added.
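The precedence agreed here could be sketched as a small lookup. The attrs argument is a plain object of attribute names to values standing in for a real element, and resolveInvokerTarget is a hypothetical helper name.

```javascript
// attrs is a plain object of attribute name -> value, standing in for an element.
// When both attributes are present, invokertarget wins and popovertarget is ignored.
function resolveInvokerTarget(attrs) {
  if (Object.hasOwn(attrs, "invokertarget")) {
    return { attribute: "invokertarget", id: attrs.invokertarget };
  }
  if (Object.hasOwn(attrs, "popovertarget")) {
    return { attribute: "popovertarget", id: attrs.popovertarget };
  }
  return null;
}
```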
It should be made clear that this applies to the pair of the invoker element and the target element. For example, a button that opens a popover - to lose interest while using a mouse, you'd have to de-hover both the button and the popover.
Great point! Added
Overall, awesome proposal! I think it'd be a good idea to chat about this at an OpenUI meeting soon. I don't think I can Agenda+ it with any meaningful label, but would you mind if I get it on the next meeting's agenda?
Please, this would be great to discuss further.
@scottaohara
would this cover a button that might need to have an associated tooltip, but also opens a popover, popover dialog, or modal dialog?
Yes I believe so:
<button interesttarget="my-tooltip" invokertarget="my-dialog">Tooltip on hover/focus, click to open dialog</button>
@Westbrook
How could the platform support having other elements (e.g. Custom Elements) to be invokers as well?
It's a good question, and one that seems to largely revolve around this big open question of "how can a custom element be a button". We could add something to .elementInternals() but that feels like it somewhat defeats the point of this being declarative. One solution without adding any new mechanics is to create an element that delegates focus to the button in its shadowroot, but that still requires imperative assignment if it wants to target a cross-root target.
This is all to say I don't have a good answer for this, and it might need addressing at a larger scope than this proposal.
@keithamus so that helps clarify a bit, but i'm still not sure what type of dialog is going to be invoked from that. a non-modal, a non-modal popover (in the top layer) or a modal dialog. I've looked at the InvokeEvent table from the explainer, and i'm seeing an expectation that something is a popover, or a modal dialog, but not a non-modal dialog (popover or not).
is the intent for interest largely for tooltips? or is there the possibility that other types of content could be shown/hidden if associated in that way? not for or against whatever answer is given, just want to understand the potential UX ramifications of a dialog or large block of content showing up on someone simply trying to tab through an interface. Understanding this would also result in any accessibility properties that would need to be exposed for the interesttarget attribute
Two other bits:
- i assume this proposal would potentially overlap/negate the need for https://github.com/openui/open-ui/issues/700 (if so, great!)
- i'd like to propose some updates to the accessibility section (specifically the implicit ARIA attributes). I assume it's ok to make a PR against your explainer, or would it make sense to talk that section out? Maybe it's better to talk it out at some point, rather than assume i understand the whys behind some of those choices?
so that helps clarify a bit, but i'm still not sure what type of dialog is going to be invoked from that.
I've ignored the concept of non-modal-non-popover dialogs, and so an invokertarget pointing to a <dialog> (with no popover) will call showModal(), and <dialog popover> will call showPopover(). I'm making the assumption that we'll one day resolve https://github.com/whatwg/html/issues/9376 and <dialog>.show() will be deprecated.
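As a sketch of that mapping: the target description object ({ tagName, hasPopover }) and the defaultInvokeMethod name are made up for illustration, and only the two cases stated above are covered.

```javascript
// target is a hypothetical description: { tagName: string, hasPopover: boolean }.
// Returns the name of the method the default behaviour would call, or null.
function defaultInvokeMethod(target) {
  if (target.hasPopover) return "showPopover";
  if (target.tagName.toLowerCase() === "dialog") return "showModal";
  return null;
}
```

Note that show() is never chosen here, which reflects the decision to ignore non-modal, non-popover dialogs.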
is the intent for interest largely for tooltips? or is there the possibility that other types of content could be shown/hidden if associated in that way?
Largely for tooltips. I can't think of a compelling use case that would not be disruptive to folks for anything else, but I'm hoping if there are some, someone will come forward with them.
i assume this proposal would potentially overlap/negate the need for openui/open-ui#700 ("Consider toggle (expand/collapse) button or attribute") (if so, great!)
I think so! It seems like spiritually these issues align. I want this proposal to effectively capture/replicate/explain any built-in interactive element, including <details>. I've not done extensive research into what else conceptually uses aria-expanded but if I've missed any that you think warrant adding please do submit a PR! The table in the proposal is an attempt at an exhaustive list but was mostly drawn up from memory so I'm sure I've missed stuff.
i'd like to propose some updates to the accessibility section (specifically the implicit ARIA attributes). I assume it's ok to make a PR against your explainer, or would it make sense to talk that section out? Maybe it's better to talk it out at some point, rather than assume i understand the whys behind some of those choices?
Absolutely please do! It's safe to assume that the whys behind my choices come from a place of ignorance and doing what I think is right without any hard research.
thanks for all that, @keithamus
your response about the non-modal dialog makes sense now with that context - as it didn't initially click to me before that <* popover> the asterisks would handle the dialog popover use case. I had just read that table as "popovers" and "dialogs" as separate things. error on my part. Re: the linked issue, I've been thinking a lot about what Domenic mentioned in that thread - the use case for a non-modal dialog needing to not render in the top layer... and I can't shake the idea that there's something to that.
I think so! It seems like spiritually these issues align
that was my read as well, which is great. I'll definitely come back and kick the tires on this some more, especially re: the next bit about updating some of the accessibility bits. Which truly thanks for even considering that stuff. It's mostly nits / suggestions to either match reality, or use this as an opportunity to force support / update the spec for a certain attribute.
i'll put this on my todo to do another review / make a PR. Thanks!
is the intent for interest largely for tooltips? or is there the possibility that other types of content could be shown/hidden if associated in that way?
Largely for tooltips. I can't think of a compelling use case that would not be disruptive to folks for anything else, but I'm hoping if there are some, someone will come forward with them.
One such use case is a nested menu. (Try Google Docs for this exact behavior.) Click to open a menu (e.g. File), then hover over one of the items with a sub-menu. The sub-menu shows up after a slight delay. Note that their keyboard behavior is different - focusing on the item does nothing, and you have to hit the right-arrow to open the sub-menu.
Note that their keyboard behavior is different - focusing on the item does nothing, and you have to hit the right-arrow to open the sub-menu.
I suppose if we go with interesttarget that means hover, focus, or long-press, and someone wanted the above behavior (only trigger on focus, because they've handled the other modes themselves) then they'd need to do something like preventDefault() on the appropriate events? That's a bit of an issue now because focus is not cancelable. Is there another suggestion about how to give interesttarget more configurability?
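The cancelability problem can be seen with plain Event objects: calling preventDefault() on an event that was not created as cancelable has no effect. In the sketch below the "focus" event is constructed by hand to mirror the non-cancelable nature of real focus events, and the cancelable "interest" event is purely hypothetical.

```javascript
// Attempts to cancel an event and reports whether it worked.
function tryCancel(event) {
  event.preventDefault();
  return event.defaultPrevented;
}

// `cancelable` defaults to false, matching real focus events.
const focusLike = new Event("focus");
// Hypothetical: an interest-style event that was made cancelable.
const interestLike = new Event("interest", { cancelable: true });

console.log(tryCancel(focusLike)); // false
console.log(tryCancel(interestLike)); // true
```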
thanks @mfreed7 - yes that's a good example, and a good reason why hint would have been a weird name for this type of popover. but yeh, per that google file menu example, it seems to me that hover is more of an 'enhancement' to that menuitem, as really someone should be able to tap/press on that using a touch device to open, since 'hover' and 'focus' may not even be things one can do if using a touch device, for example.
This was discussed at OpenUI today: https://www.w3.org/2023/08/31-openui-minutes.html#:~:text=openui/open%2Dui.-,Invoker%20Buttons%20%2D%20allowing%20popover/dialog%20and%20more%20to%20be%20invoked%20without%20JS,-masonf%3A%20this
So as mentioned above, we discussed this today at OpenUI, and the general tone was that this new proposal is generally a good direction to try to go. It has the flexibility needed to support many different element types, and it's expressive enough to work for many common use cases.
One point was made that the interesttarget part of the proposal still has some open questions, e.g. the particulars of the long-press behavior. It might be worthwhile splitting the proposal into two pieces, one for invokertarget which seems implementable and standardizable today, and one for interesttarget which needs more research on a few things.
What do folks think about this direction? @annevk @ntim @emilio
General idea seems like the right way to go, though there's probably some bikeshedding to be done around naming and smaller details.
General idea seems like the right way to go, though there's probably some bikeshedding to be done around naming and smaller details.
Great! Do you think it's close enough that someone (@keithamus ??) should start drafting a spec PR? We're happy to prototype for Chromium. I do think we should just tackle invokertarget and not (yet) interesttarget to simplify things at first. Alternatively, two PRs could be drafted, one for each, and we could debate the issues around interesttarget via the PR?
My points of bikeshedding:
- invokeraction=opendialog should just be invokeraction=showModal. As much as possible, the action names should match up with their JS counterparts.
- Similarly, I'd prefer invokeraction=close for closing dialogs, invokeraction=open and close for opening and closing dialogs, invokeraction=pause and play for videos. The action doesn't need to include the context (e.g. play instead of playVideo), and it's ok for multiple types of targets to support the same action name, like open.
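One consequence of matching action names to their JS counterparts is that the default behaviour can be modelled as a guarded method call. A minimal sketch, assuming a hypothetical allowlist (SUPPORTED_ACTIONS) and helper (runAction); the targets in the usage below are plain objects, not real elements.

```javascript
// Hypothetical allowlist; the names mirror existing method names
// (dialog.showModal/close, media.play/pause).
const SUPPORTED_ACTIONS = new Set(["showModal", "close", "play", "pause"]);

// Calls the method named by `action` on `target` if it is allowlisted
// and the target actually has it. Returns whether anything was called.
function runAction(target, action) {
  if (!SUPPORTED_ACTIONS.has(action)) return false;
  if (typeof target[action] !== "function") return false;
  target[action]();
  return true;
}
```

Because the action carries no context, the same name can apply to different kinds of target: runAction(dialog, "close") and runAction(video, "play") go through the same path.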
Happy to work on specs and implementations.
I think your bike shedding points make sense and I agree with them; I was just being conservative in the explainer 😜
I'm not sure if these have already been discussed or not, but these feature requests for popovertarget should also be considered for this new attribute:
- https://github.com/whatwg/html/issues/9110
- https://github.com/whatwg/html/issues/9109
Adding invokertarget and invokeraction attributes to
Bruce Lawson was trying popover out and noticed that because it's only on "button"s it doesn't work with <map> and <area>. Is this something we want to consider?
https://x.com/brucel/status/1702297666472857776?s=20 for the use case
<area> is equivalent to <a>. I think we generally want <area>/<a> to cause navigations, not to open popovers.
Hmm, so is there an alternative that's possible atm or would it be a matter of positioning buttons manually?
At the very least it's worth specifying interesttarget would work for <area>.
Hmm, so is there an alternative that's possible atm or would it be a matter of positioning buttons manually?
Buttons are generally the correct (e.g., most accessible) way of causing non-navigation actions (like the use case in the linked post) to trigger.
At the very least it's worth specifying
interesttargetwould work for<area>.
If it's specced to work on <a>, then sure.
I actually agree with the concerns raised by @GijsK regarding CSP at: https://groups.google.com/a/mozilla.org/g/dev-platform/c/iNeYYiRjMaQ/m/VQUgDsTNBAAJ
Some actions I would consider gating depending on scripting CSP are:
- media control playback: mainly because playback can be considered as security sensitive (exploits with media codecs/formats), so it would be good to honor CSP here.
- fullscreen: If we give invokers the same level as script here, it does make script free mode more prone to spoofing/phishing attacks. Fwiw, fullscreen does not require user gesture, only transient activation, so it is reasonable to expect CSP to block fullscreen from working.
- showPicker might be a candidate too, simply because they display popups outside of a browser window (though you could argue you can just overlay a giant invisible select)
Alternatively, I don't think it's unreasonable to drop declarative actions for those, and ask authors to define their custom ones using the JS invoke event.
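For reference, the "define their custom ones" route might look roughly like this from the author's side. It assumes the invoke event exposes the requested action as event.action; registerCustomActions and the action names an author would pass in are hypothetical.

```javascript
// handlers: Map of custom action name -> function(event).
// Assumes the event carries the requested action as `event.action`.
function registerCustomActions(target, handlers) {
  target.addEventListener("invoke", (event) => {
    const handler = handlers.get(event.action);
    if (handler) handler(event);
  });
}
```

Since this runs as script, it would fall under whatever script restrictions the page's CSP already applies.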
FWIW, CSP was mentioned also during the discussions at TPAC.
Fwiw showPicker only works in same origin iframes. That should help alleviate some security concerns? Given you already don't need any JS to trigger them (you can just use and style the inputs/select) idk that showPicker really warrants anything extra?
Fullscreen is the most 'dangerous' one, and we've thought of a few possible mitigations for it. For one, it follows feature policy and requires a user gesture (by virtue of how invokers work). We're actually going to be discussing this later in OpenUI to consider any extra mitigations required (potentially we require JavaScript to be enabled or limit to video elements etc).
Media playback is also already possible without JS using native video elements so I'm not quite sure why invokers would be novel in this regard?
Alternatively, I don't think it's unreasonable to drop declarative actions for those, and ask authors to define their custom ones using the JS invoke event.
This would lose the native accessibility mapping that invokers can provide, so there is a trade-off to be made.
Fwiw showPicker only works in same origin iframes. That should help alleviate some security concerns? Given you already don't need any JS to trigger them (you can just use and style the inputs/select) idk that showPicker really warrants anything extra?
I don't quite follow the claim about "same origin iframes". In general, x-origin DOM access isn't allowed, so none of the things here would be possible x-origin without some form of exposure/exploit/cooperation of the target site. Right? Sorry if I'm missing something obvious.
To be very explicit, the reasoning in my m.d.platform post was: "Suppose an attacker finds an injection vulnerability in a website using CSP to prevent inline script, does this new feature give the attacker new capabilities, and if so, what, and is that a problem?".
Even if we decide this doesn't warrant falling under script-src CSP constraints or similar, it would be a good idea to have the text/proposal call out that this is happening (and why it's not severe enough to warrant doing something about explicitly). This would hopefully avoid a future where next week/month/year someone says "I want to be able to use 'Invoke' to activate existing <button> controls", where an attacker could e.g. mislead victims into taking unintentional actions that affect more than just their local copy of the webpage (which IMO would be more serious than some of the existing risk that this proposal as-is is adding, which is less likely to cause e.g. form submission or similar things that affect the outside world).
Given you already don't need any JS to trigger them (you can just use and style the inputs/select)
Hm, well, you can't change the button part of <input type=file> much (can't use custom text). But with the invoker, you could use custom text + styling much more easily to convince/coerce people into clicking the button (and subsequently accepting the picker). So I'm not sure I agree with this.
(Also, we're assuming that affecting styling of the existing element is just as easy for an attacker as inserting a new element and styling that, which I think also isn't necessarily true.)
I'm not a CSP expert, but how bad would it be to have the default be that if for script-src the 'unsafe-inline' keyword was not present, the invoke related attributes would be ignored, and adding a separate CSP directive (invoke-src) or keyword for script-src (allow-invoke or whatever) to holepunch for these new things? That would mean existing sites don't have any decrease in security unless they opted in to the feature, and (IMHO) wouldn't add massive burdens to adoption? (No CSP set would still allow things by default, so it'd likely be a no-op in terms of initial experimentation etc.)
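To spell out the rule being floated, here is a deliberately simplified sketch. The 'allow-invoke' keyword is the hypothetical name from the paragraph above, parseCsp and invokeAttributesHonoured are made-up helpers, and the model ignores real CSP details such as default-src fallback and the way nonces or hashes interact with 'unsafe-inline'.

```javascript
// Splits a policy string into a Map of directive name -> list of values.
// As in CSP, the first occurrence of a directive wins.
function parseCsp(policy) {
  const directives = new Map();
  for (const part of policy.split(";")) {
    const [name, ...values] = part.trim().split(/\s+/);
    const key = name.toLowerCase();
    if (key && !directives.has(key)) directives.set(key, values);
  }
  return directives;
}

// Hypothetical rule: invoke attributes are honoured when there is no CSP,
// no script-src, or script-src carries 'unsafe-inline' or 'allow-invoke'.
function invokeAttributesHonoured(policy) {
  if (policy == null) return true;
  const scriptSrc = parseCsp(policy).get("script-src");
  if (scriptSrc === undefined) return true;
  return scriptSrc.includes("'unsafe-inline'") || scriptSrc.includes("'allow-invoke'");
}
```

Under this model a site with script-src 'self' would see the attributes ignored until it opted in, while a site with no CSP would see no change.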
I really don't know, maybe this is overkill - but either way I think it should be an explicit decision. :-)
FWIW, CSP was mentioned also during the discussions at TPAC.
I don't do enough standards work so I have a dumb question: are there notes where I can read the previous discussion so I'm not going over the same ground again?
I don't quite follow the claim about "same origin iframes". In general, x-origin DOM access isn't allowed, so none of the things here would be possible x-origin without some form of exposure/exploit/cooperation of the target site. Right? Sorry if I'm missing something obvious.
I meant if you have a document loaded in a cross origin iframe that has both an input and a button with an invoker pointing to that input, it won't work. Much like the current showPicker JS function won't (invokers don't have the color/file input special casing either).
Media playback is also already possible without JS using native video elements so I'm not quite sure why invokers would be novel in this regard?
A native video element being able to invoke those things is not the same as a random button element being able to invoke this.
To keep this issue updated: the security concerns were discussed within OpenUI. See the transcript here: https://github.com/openui/open-ui/issues/904
The TLDR is we don't consider invokers to introduce anything novel that warrants extra precautions beyond that which is already provided by the platform (e.g. fullscreen following permissions policy and showPicker not working within cross-origin iframes).
@keithamus feel free to expand on this if there's anything specific you wanted to call out.
I've raised https://github.com/openui/open-ui/pull/942 which hopefully can be a place where we can discuss some of the security implications and OpenUI's considerations here.
Assessing security using capabilities isn't sufficient fwiw, UX is also a large part of security. "Do users expect random buttons on the page to be able to do this?" is a good question to ask.