How to represent screen-reader specific modifier keys with `interaction.pressKeys`

Open zcorpan opened this issue 3 years ago • 6 comments

From #26

Include a special modifier for the screen-reader specific modifier keys.

This is not done yet. How do we want to support this? Should there be a special string for each vendor in place of the raw key string, e.g. "nvda", "macOS VoiceOver", etc.?

{
  "method": "interaction.pressKeys",
  "params": {
    "keys": ["nvda", "a"]
  }
}

Or a boolean property in InteractionPressKeysParameters to indicate that the screen reader specific modifier keys should be pressed?

{
  "method": "interaction.pressKeys",
  "params": {
    "keys": ["a"],
    "vendorModifier": true
  }
}
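
For comparison, here is a rough TypeScript sketch of the two shapes above. It is only an illustration: the interface names, `VENDOR_KEYS`, and the `normalize*` helpers are hypothetical and not part of any draft. Only `keys`, `vendorModifier`, and the strings "nvda" and "macOS VoiceOver" come from the examples in this comment. The sketch suggests that both shapes can carry the same information, so the choice is mostly about ergonomics.

```typescript
// Option A: a vendor string appears in `keys` in place of a raw key string.
interface PressKeysWithVendorKey {
  keys: string[]; // e.g. ["nvda", "a"]
}

// Option B: a boolean flag asks for the screen reader's modifier to be pressed.
interface PressKeysWithFlag {
  keys: string[]; // e.g. ["a"]
  vendorModifier?: boolean;
}

// Assumed set of vendor strings, taken from the examples in this issue.
const VENDOR_KEYS: ReadonlyArray<string> = ["nvda", "macOS VoiceOver"];

// Hypothetical common form that either option could be reduced to.
interface NormalizedPress {
  usesVendorModifier: boolean;
  keys: string[];
}

function isVendorKey(key: string): boolean {
  return VENDOR_KEYS.indexOf(key) !== -1;
}

function normalizeOptionA(params: PressKeysWithVendorKey): NormalizedPress {
  return {
    usesVendorModifier: params.keys.some(isVendorKey),
    // Drop the vendor strings so only raw keys remain.
    keys: params.keys.filter((key) => !isVendorKey(key)),
  };
}

function normalizeOptionB(params: PressKeysWithFlag): NormalizedPress {
  return {
    usesVendorModifier: params.vendorModifier === true,
    keys: params.keys.slice(),
  };
}
```

One visible difference: Option A can express where in the sequence the modifier sits, while Option B as sketched here cannot.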

Originally posted by @zcorpan in https://github.com/w3c/aria-at-automation/issues/26#issuecomment-1286628283

zcorpan, Nov 07 '22 20:11

@jscholes This feature originates from our discussion on 2022-07-07, but I can't remember the motivating use case. Using "interaction.pressKeys" requires writing vendor-specific instructions, and an abstraction for a vendor's modifier key doesn't seem to change that. Could you say a bit about what this feature would enable?

jugglinmike, Nov 22 '22 23:11

I think an important aspect is that the special modifier key is configurable for some screen readers.

zcorpan, Nov 23 '22 11:11

Good point! That means the "meta" key would be contextual not just to the screen reader under test but also to its internal state.

To put this in terms of use cases: is this feature about letting folks control screen readers where the configuration is unknown?

jugglinmike, Nov 23 '22 21:11

@jugglinmike Not sure I fully understand the ask/context here, but I'll give it a go.

  • Most (all?) screen readers support a modifier key/set of keys, to carry out functions specific to that screen reader. Some, like VoiceOver, make very heavy use of these modifiers in most commands.
  • ARIA-AT tests use these keys where appropriate. For example, almost all commands targeting VoiceOver will include the VO modifiers, whereas only some commands targeting JAWS and NVDA require a modifier.
  • The VoiceOver modifiers, by default, are Control+Option. These are standard, system-wide modifier keys, possible to represent in most systems that facilitate programmatic keyboard simulation (e.g. they are registered as modifiers in HID standards).
  • JAWS and NVDA do not use system modifiers. They use Insert, Caps Lock, and/or Numpad Insert as a single modifier key, all of which are usually considered standard keys in keyboard simulation software. E.g. if you were sending a bitfield of modifiers, Insert wouldn't be part of it.
  • It is also possible, on macOS, to configure VoiceOver to use Caps Lock as a modifier, for users who find a single key easier/more convenient to hold down. I'm not sure if this is the default, or has to be turned on; it's been a while since I set up my Mac.
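
To illustrate the bitfield point in the list above, here is a small sketch of a USB HID boot keyboard report, where byte 0 is the modifier bitfield and bytes 2 to 7 hold ordinary key usage IDs. The bit values and usage IDs are the standard HID ones; the key names and the `buildReport` helper are made up for this example. Option on macOS corresponds to Alt in HID terms.

```typescript
// Bits in the modifier byte of a HID boot keyboard report.
const HID_MODIFIER_BITS: Record<string, number> = {
  LeftControl: 0x01,
  LeftShift: 0x02,
  LeftAlt: 0x04, // Option on macOS
  LeftGui: 0x08,
};

// Usage IDs for ordinary (non-modifier) keys.
const HID_USAGE_IDS: Record<string, number> = {
  A: 0x04,
  CapsLock: 0x39,
  Insert: 0x49,
  KeypadInsert: 0x62, // "Keypad 0 and Insert"
};

function has(table: Record<string, number>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

// Builds the 8-byte report: [modifiers, reserved, key1..key6].
function buildReport(keys: string[]): number[] {
  let modifiers = 0;
  const pressed: number[] = [];
  for (const key of keys) {
    if (has(HID_MODIFIER_BITS, key)) {
      modifiers |= HID_MODIFIER_BITS[key] as number;
    } else if (has(HID_USAGE_IDS, key)) {
      if (pressed.length === 6) {
        throw new Error("A boot report holds at most six ordinary keys");
      }
      pressed.push(HID_USAGE_IDS[key] as number);
    } else {
      throw new Error("Unknown key: " + key);
    }
  }
  while (pressed.length < 6) {
    pressed.push(0); // unused key slots are zero
  }
  return [modifiers, 0].concat(pressed);
}
```

With this, `buildReport(["LeftControl", "LeftAlt", "A"])` sets bits in the modifier byte, whereas `buildReport(["Insert", "A"])` leaves the modifier byte at zero and places Insert in a regular key slot next to A.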

With all of this context in mind, VoiceOver may already be well covered by any standard allowing a modifiers field to be included, if it includes Control and Option in its definition. But on Windows, this would not be the case, and hence some "special" handling seems required. Maybe we could just abstract the details away to a single screen reader/meta key (although "meta" has connotations), and behind the scenes, each implementation can respond appropriately?
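
As a very rough sketch of what "each implementation can respond appropriately" might look like: the token name `screenReaderModifier`, the key name strings, and the `resolveKeys` helper below are all hypothetical. The default table is an assumption too. Control+Option for VoiceOver is from the list above; using Insert alone for JAWS and NVDA is a simplification, since either may be set up to use Caps Lock or Numpad Insert.

```typescript
type ScreenReader = "nvda" | "jaws" | "voiceover";

// Hypothetical abstract token that a test author would write.
const SCREEN_READER_KEY = "screenReaderModifier";

// Assumed defaults; real installations may be configured differently.
const DEFAULT_MODIFIERS: Record<ScreenReader, string[]> = {
  nvda: ["Insert"],
  jaws: ["Insert"],
  voiceover: ["Control", "Option"],
};

// Replaces the abstract token with the concrete key(s) for the screen
// reader under test. `configuredModifier` stands in for whatever the
// implementation knows about the user's actual settings.
function resolveKeys(
  keys: string[],
  screenReader: ScreenReader,
  configuredModifier?: string[],
): string[] {
  const modifier =
    configuredModifier !== undefined
      ? configuredModifier
      : DEFAULT_MODIFIERS[screenReader];
  const resolved: string[] = [];
  for (const key of keys) {
    if (key === SCREEN_READER_KEY) {
      for (const concrete of modifier) {
        resolved.push(concrete);
      }
    } else {
      resolved.push(key);
    }
  }
  return resolved;
}
```

For example, `resolveKeys([SCREEN_READER_KEY, "a"], "nvda", ["CapsLock"])` would cover an NVDA setup where Caps Lock is the modifier, without the test author needing to know that.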

jscholes, Nov 28 '22 22:11

Thanks, @jscholes! It seems like the term "modifier" may have a couple meanings. Let me see if I've got this right:

  • Keyboard simulation systems make a technical distinction between "standard" keys and "modifier" keys. The latter cannot be pressed in isolation (or even in a specific sequence); they can only be enabled for the entirety of a sequence of "standard" key presses.
  • Screen readers also use the term "modifier," but with a different meaning. For screen readers, "modifier" keys are those that signal the beginning of a keyboard command.

Is that accurate?

jugglinmike, Nov 29 '22 00:11

Today, the folks at the 2022-12-05 Community Group meeting discussed this.

First, we acknowledged that some system APIs support simulating the pressing of "system modifier" keys (e.g. Control and Shift) declaratively as a refinement to a sequence of additional keys to be pressed. For instance, they support instructions like: "press the X key, Y key, and Z key in that order, and ensure Control is depressed for the entirety of the sequence."

We were not convinced that such a convention is necessary for this proposal because the same sequence can be modeled with an unrefined series of keys-to-be-pressed, and that's already possible with the API which @zcorpan has drafted (see gh-26). The earlier example can be expressed in these terms: "press the Control key, X key, Y key, and Z key in that order."
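
A minimal sketch of that equivalence, assuming the drafted `interaction.pressKeys` accepts a flat, ordered list of key strings as in the examples earlier in this issue. The `ModifiedSequence` type and both helper functions are hypothetical and only show how the declarative form could be rewritten into the flat one.

```typescript
// Declarative form: modifiers held for the entirety of a key sequence.
interface ModifiedSequence {
  heldModifiers: string[]; // e.g. ["Control"]
  keys: string[]; // e.g. ["X", "Y", "Z"]
}

// Flat form: the modifiers simply come first in the ordered list.
function flatten(sequence: ModifiedSequence): string[] {
  return sequence.heldModifiers.concat(sequence.keys);
}

// Wraps the flat list in the command shape used in this issue's examples.
function toPressKeysCommand(sequence: ModifiedSequence) {
  return {
    method: "interaction.pressKeys",
    params: { keys: flatten(sequence) },
  };
}
```

Here `toPressKeysCommand({ heldModifiers: ["Control"], keys: ["X", "Y", "Z"] })` yields a command whose `keys` are Control, X, Y, Z in that order, matching the second phrasing above.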

We didn't come to any formal conclusions, but we do feel more confident about waiting for implementation experience before designing any API around system modifier keys.

jugglinmike, Dec 06 '22 00:12