keyman spec: multitap and flick for touch keyboards 🐵

This is part of a broader specification for Caps Lock support on touch devices. Development of this touches multiple components; each component will be implemented in a separate PR, referencing this issue.

This feature has emerged from the caps layer feature and is now treated separately for implementation.

Related features

#3620: spec: Caps Lock layer for touch layouts
#3621: spec: Start of text/sentence selects shift layer
#3720: spec: Suggestions respect capitalization
#5790: feat(android): Add arrow keys to the keyboard

Introduction

A caps layer, if present, will be accessible by double-tapping the Shift key. We won't add a longpress to Shift because in future we might want to use this for a multitouch approach (e.g. hold Shift, press X with another finger to give a capital X and return to default). Also, double-tap is the standard Caps Lock UX.

The double-tap will be defined by a new attribute for keys, called multiTap. This will be an array similar to the subkeys array. Although the key cap won't be initially meaningful, we will store it in case we want to use it for visual feedback in a future version (e.g. a brief popup).

(See also LDML multitap property.)
We will ignore multitap on the globe button, just as we ignore longpress.

C5029.1 File Formats: .keyman-touch-layout changes

Changes to the file format and the JSON schema are required.

Required: Add a multiTap property to the key object. This will be an array in the same format as the existing sk longpress property.

We will not require specific key codes for multitap. A recommended pattern would be T_2X_A, T_3X_A for a double tap and a triple tap on an 'A' key.

Optional: Add a flick object to the key object. This will be an object with the cardinal directions n, e, s, w and the intercardinal directions ne, se, sw, nw as its properties. Each of those properties will be a key in the same format as the keys in the sk array. (Note: LDML has a flick property, so we may need to implement this in order to complete the LDML implementation.)
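As a rough illustration of the two additions described above, the following TypeScript sketch models a key object with the proposed multiTap array and flick object. Only sk, multiTap, flick, the eight direction names, and the T_2X_A / T_3X_A naming pattern come from this spec; the interface names, the helper keyForTap, the other key ids, and the assumption that subkeys carry id, text, and nextlayer fields are illustrative and not the final schema.

```typescript
// Hypothetical typings for the proposed additions (not the final schema).
type FlickDirection = "n" | "ne" | "e" | "se" | "s" | "sw" | "w" | "nw";

interface SubKey {
  id: string;         // key code, e.g. "T_2X_A"
  text?: string;      // key cap, stored even if not yet displayed
  nextlayer?: string; // assumed: optional layer switch, as for longpress keys
}

interface TouchKey extends SubKey {
  sk?: SubKey[];                                   // existing longpress keys
  multiTap?: SubKey[];                             // proposed: 2nd tap, 3rd tap, ...
  flick?: Partial<Record<FlickDirection, SubKey>>; // proposed: one key per direction
}

// Example key using the recommended (not required) T_2X_ / T_3X_ id pattern.
const keyA: TouchKey = {
  id: "K_A",
  text: "a",
  multiTap: [
    { id: "T_2X_A", text: "A" },
    { id: "T_3X_A", text: "á" },
  ],
  flick: { n: { id: "T_FLICK_N_A", text: "â" } }, // hypothetical id
};

// Shift key whose double tap switches to the caps layer (hypothetical id).
const keyShift: TouchKey = {
  id: "K_SHIFT",
  nextlayer: "shift",
  multiTap: [{ id: "T_2X_SHIFT", nextlayer: "caps" }],
};

// Returns the key to use for the given tap count; tap 1 is the base key.
function keyForTap(key: TouchKey, tapCount: number): SubKey | undefined {
  if (tapCount === 1) return key;
  return key.multiTap?.[tapCount - 2];
}
```

Because multiTap and flick are plain optional properties, an engine that does not know about them can skip them, which is the graceful degradation this spec relies on.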

Backward compatibility

Touch layout files that contain these features will be backward compatible with earlier versions of Keyman, with graceful degradation of functionality.

The multiTap property will be ignored by older versions of KeymanWeb.

C5029.2 Changes to Developer IDE

Required: The touch layout editor needs to surface the multiTap property as a new section similar to the longpress interface.
Required: The touch layout editor needs to visually flag the presence of multiTap properties in its keyboard view. (For longpress this is currently a slash on top right of the key; something similar but differentiated is needed for multitap).
Optional: The touch layout editor needs to surface the flick property again as a new section similar to the longpress interface.
Optional: The touch layout editor needs to visually flag the presence of flick properties in its keyboard view.

C5029.3 KeymanWeb enhancements

Required: KeymanWeb needs to support the multiTap property.

Caps Lock Interaction: double-tapping Shift will switch to the caps layer.

Question: What do we do when the first tap in a multitap changes the layer? How do we link the second tap to the first one?
- Could we ‘cache’ the multitap key?
  - Implementation, loosely: if touch is within the multitap time window, and location is within the touch-bounds of the multitap source key:
  - Automatically use the source key & continue the multitap
  - Reset the time window’s start, allowing it to be further continued.

The time between the two taps should be less than 300ms to be treated as a double tap. If the time between taps is longer than that, the second tap should trigger a normal key interaction. Any other interaction after the first tap will cancel the double-tap state.

Subsequent taps in a multi-tap sequence will never trigger a longpress or a flick.
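The caching idea and the 300ms window above could be sketched roughly as follows. This is a minimal model under stated assumptions, not the KeymanWeb implementation: MultitapTracker, Tap, Bounds, and all member names are hypothetical, and timestamps are passed in explicitly so the logic can be exercised without a real clock.

```typescript
// Sketch only: all names here are hypothetical.
interface Bounds { left: number; top: number; right: number; bottom: number; }

interface Tap {
  keyId: string;   // key currently under the touch (may differ after a layer change)
  bounds: Bounds;  // touch bounds of that key
  x: number;
  y: number;
  timeMs: number;
}

function contains(b: Bounds, x: number, y: number): boolean {
  return x >= b.left && x <= b.right && y >= b.top && y <= b.bottom;
}

class MultitapTracker {
  private source: { keyId: string; bounds: Bounds } | null = null;
  private lastTapMs = 0;
  private count = 0;
  private windowMs: number;

  constructor(windowMs: number = 300) { // 300ms is the threshold given in the spec
    this.windowMs = windowMs;
  }

  // Registers a tap; returns the key to treat as tapped and its position
  // in the sequence (1 = ordinary tap, 2 = double tap, ...).
  tap(t: Tap): { keyId: string; tapCount: number } {
    const s = this.source;
    if (s !== null &&
        t.timeMs - this.lastTapMs < this.windowMs &&
        contains(s.bounds, t.x, t.y)) {
      // Within the window and over the cached source key: continue the
      // sequence with the source key, and restart the window.
      this.count += 1;
      this.lastTapMs = t.timeMs;
      return { keyId: s.keyId, tapCount: this.count };
    }
    // Otherwise this tap starts a new sequence.
    this.source = { keyId: t.keyId, bounds: t.bounds };
    this.count = 1;
    this.lastTapMs = t.timeMs;
    return { keyId: t.keyId, tapCount: 1 };
  }

  // Any other interaction after a tap cancels the pending sequence.
  cancel(): void {
    this.source = null;
    this.count = 0;
  }
}
```

Caching the source key means the second tap is attributed to Shift even when the first tap has already switched layers and a different key now sits under the finger.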

Required: KeymanWeb will need to use the transcriptions feature to roll back previous outputs in the multitap sequence. Layer changes will not be rolled back, just output.
Optional: We will need an API in web to change the default double-tap time, accessible from Android, iOS and web apps.
Optional: KeymanWeb needs to support the flick property and gesture.
- If a keyboard defines flicks, then for that keyboard we may need to disable the currently supported model of sliding across keys to choose a different key.
- Flicks may be shown in the OSK as a small character on the appropriate edge or corner of the key?
- May need an API to enable/disable flicks per user preference.

C5029.4 Android and iOS changes

Optional: Multitap speed may need to be user-configurable
- Similar pains to Windows double-clicking, which is a well-known accessibility issue.
- Note existing issue on longpress speed. #877, #878
- But let's see if we can use system accessibility values for this.
  - [DW] Android only provides a longpress value, not any sort of double-tap.
- Q: Do browsers provide accessibility values?
Optional: Add a setting to disable flicks for accessibility reasons.

Related Issues and discussions

#246
https://community.software.sil.org/t/how-to-return-from-shift-automatically-and-how-to-caps-lock-by-double-tapping/3469/3
https://community.software.sil.org/t/problem-with-caps-lock-output/3718
Original design document
RFC
15.0 planning document

May 04 '21 22:05 mcdurdin

Most of the KeymanWeb-side infrastructure for multi-tap is implemented in #5989, as a generalized solution with a specialized case for Shift -> Caps in 15.0.

Developer-side infrastructure is needed, plus a more detailed discussion of how to handle multitap and rollback (while we could use the transcriptions feature, it is not entirely clear to me that this is the most appropriate way to solve this):

Required: KeymanWeb will need to use the transcriptions feature to roll back previous outputs in the multitap sequence. Layer changes will not be rolled back, just output.

Dec 01 '21 00:12 mcdurdin

Is it possible to enable double tap gesture for all keys in version 15? There is no need for a developer GUI; we can edit the source in textpad, as long as Keyman can compile it. The developer GUI can wait. Double tap will make a huge difference: I can finally release the keyboard.

Feb 08 '22 20:02 MayuraVerma

Is it possible to enable double tap gesture for all keys in version 15?

Unfortunately not.

There is no need for a developer GUI; we can edit the source in textpad, as long as Keyman can compile it.

The developer GUI is probably the easiest bit. The file formats are not too bad either. The hard part is the interaction with existing rules in the keyboard. You have to do one action, and then undo it again, before applying the double-tap action -- for example, if you press a, then the keyboard emits "a", but then the double-tap gives "A", you have to delete the "a" and replace it with the "A". That's the easy version. What if the keyboard actually changes the context for the a? Then you have to remember what it had done, undo it, and bring back the original! That also means that the double-tap has to work with the context of the previous keystroke.
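To make the undo problem concrete, here is a toy TypeScript model of it. Everything in it is an assumption for illustration: the KeyRule type, the MultitapOutput class, and the toy rule (a after "^" becomes "â") are invented here, and a plain context snapshot stands in for whatever mechanism Keyman would actually use.

```typescript
// Toy model: a keyboard is a function from (context, key) to new context.
type KeyRule = (context: string, keyId: string) => string;

// Hypothetical rules: K_A after "^" rewrites the context to "â", otherwise
// it appends "a"; the double-tap key T_2X_A always appends "A".
const toyRules: KeyRule = (context, keyId) => {
  if (keyId === "K_A") {
    return context.endsWith("^") ? context.slice(0, -1) + "â" : context + "a";
  }
  if (keyId === "T_2X_A") return context + "A";
  return context;
};

class MultitapOutput {
  private rules: KeyRule;
  private context: string;
  private snapshot: string | null = null; // context before the sequence began

  constructor(rules: KeyRule, initial: string = "") {
    this.rules = rules;
    this.context = initial;
  }

  // First tap: remember the prior context, then emit output immediately.
  firstTap(keyId: string): string {
    this.snapshot = this.context;
    this.context = this.rules(this.context, keyId);
    return this.context;
  }

  // Later tap: discard the earlier tap's effect and apply the multitap key
  // against the context as it was before the sequence started.
  laterTap(keyId: string): string {
    if (this.snapshot === null) throw new Error("no multitap sequence in progress");
    this.context = this.rules(this.snapshot, keyId);
    return this.context;
  }
}

const out = new MultitapOutput(toyRules, "x^");
console.log(out.firstTap("K_A"));    // first tap consumes the "^"
console.log(out.laterTap("T_2X_A")); // double tap restores "^" before applying
```

Simply deleting one character after the first tap would leave "x" followed by "A" here; restoring the earlier context is what brings the "^" back.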

So this work can be done in Keyman or in the keyboard. I think that putting that work into the keyboard would make it very difficult to program. Now internally, Keyman has the model for handling these transforms -- it uses this for predictive text -- but there are a lot of details to work through.

For Shift, there was less of a problem, because the Shift key does not emit any characters. That's why we were able to get Shift / Caps into Keyman 15.

But this is still a priority for us -- double-tap, swipe down and potentially other gestures.

Feb 12 '22 23:02 mcdurdin

Actually, double tap doesn't need to be done in two steps.

Single tap -> a, double tap -> A

In the keyboard layout source definition, we need multiple layers: base, shift, NCAPS, doubleTap, pressHold, etc. Keyman Engine should take the output from here; predictive text should engage after the text is output.

For example, the predictive text engine doesn't need to know that aa is A. We could have a keyboard with a in the base layer on key_A, below it 1 in the shift layer on shift_key_A, and the double tap double_key_A could be programmed to output A directly.

Double tap should not affect the predictive text until the gesture is complete and the text is released.

Please do not implement double tap as a, delete, A; predictive text will get slow and the text will change frequently while typing.

I have had a few people who use Android test Keyman; they all have one major piece of feedback: it's a basic keyboard without gestures.

Similar feedback for the Windows version, which lacks predictive text.

In the keyboard source we need an additional layer; the engine needs to recognize these layers, use the gesture definition to pick up input in the GUI, and output the corresponding text in that layer (double). Predictive text then updates with the text output.

FYI: double tap and hold-and-release on Android are the most requested.

Windows: predictive text

If I can help reduce the workload, please let me know how I can contribute.

I am working on closing iOS, color and icon to match stock keyboard. I will send a pull request soon

Feb 13 '22 00:02 MayuraVerma

Actually, double tap doesn't need to be done in two steps.

Single tap -> a, double tap -> A

In the keyboard layout source definition, we need multiple layers: base, shift, NCAPS, doubleTap, pressHold, etc. Keyman Engine should take the output from here; predictive text should engage after the text is output.

For example, the predictive text engine doesn't need to know that aa is A. We could have a keyboard with a in the base layer on key_A, below it 1 in the shift layer on shift_key_A, and the double tap double_key_A could be programmed to output A directly.

Double tap should not affect the predictive text until the gesture is complete and the text is released.

Yes, certainly we can delay on the predictive text; that's not a problem. But we can't really delay on character output into the document.

The issue is the double-tap timing threshold. If the threshold is say, 125msec (any slower is two keystrokes), then this may be workable -- the delay in output becoming visible may not be very noticeable. However, a realistic double-tap threshold is probably significantly higher, say 250msec or even 500msec (and may be user adjustable for accessibility), and if we don't output anything for that long, the keyboard will feel very "laggy". So I don't think this will work.

Feb 13 '22 00:02 mcdurdin

FYI: double tap and hold-and-release on Android are the most requested.

Windows: predictive text

Thank you for the feedback. That helps in our planning.

If I can help reduce the workload, please let me know how I can contribute.

I am working on closing iOS, color and icon to match stock keyboard. I will send a pull request soon

Thank you -- I look forward to seeing it!

Feb 13 '22 00:02 mcdurdin

Per https://github.com/keymanapp/keyman/pull/6138#issuecomment-1045485483:

With longpress, we are seeing the popup keys. Is it possible for longpress to output shift layer as output instead of popup keys?

It makes sense to make the default longpress selectable by the keyboard developer, as part of the whole gesture feature update for 16.0.

Feb 19 '22 01:02 mcdurdin

Need to ensure we support #5511 and #1115 - flick gesture on spacebar - and warn devs not to use L/R flick on spacebar in Keyman Developer.

Note: @MayuraVerma requests: "Shortcut to emoji keyboard in Android. It’s simple built in routine to map. Or gesture swipe to space bar to emoji keyboard".

Feb 21 '22 20:02 mcdurdin

See also #5790 - perhaps implementing with two finger slide on spacebar area?

Mar 04 '22 03:03 mcdurdin

I'm waiting for the implementation of the swipable keys on mobile. It's the final thing keeping me from fully porting my Multiling O Keyboard layouts to Keyman.

Mar 19 '22 14:03 JapanYoshi

I'm waiting for the implementation of the swipable keys on mobile.

:grin: This is scheduled for 16.0, due for release later this year -- all going well!

Mar 21 '22 06:03 mcdurdin

Per https://community.software.sil.org/t/flick-layout-mobile-swipe-from-key-for-different-letter/6036

Requirements:

Ability to define up to 8 flick directions for each key
- Warn the creator if swiping left from the leftmost key, swiping right from the rightmost key, or swiping down from the bottom row of keys
Ability to show all letter previews, instead of hiding the extra letters in a dot, as is done with long press
Detect the closest defined flick direction
- e.g. if up-left, down-left, and down are defined, flicking exactly right should not input any letters, but any trajectory between right and down should input the letter defined in down
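One way to read the "closest defined flick direction" requirement is as a nearest-angle search with a cutoff. The sketch below assumes screen coordinates (y grows downward) and a cutoff of strictly less than 90 degrees; that cutoff is inferred from the example above rather than stated in the requirement, and closestFlick, DIRECTION_ANGLES, and maxDeviationDeg are hypothetical names.

```typescript
type FlickDirection = "n" | "ne" | "e" | "se" | "s" | "sw" | "w" | "nw";

// Angles in degrees, measured clockwise on screen from east (y grows downward).
const DIRECTION_ANGLES: Record<FlickDirection, number> = {
  e: 0, se: 45, s: 90, sw: 135, w: 180, nw: 225, n: 270, ne: 315,
};

// Returns the defined direction closest to the flick vector (dx, dy), or
// null when no defined direction lies strictly within maxDeviationDeg.
// The 90-degree default is an assumption inferred from the example above.
function closestFlick(
  dx: number,
  dy: number,
  defined: FlickDirection[],
  maxDeviationDeg: number = 90,
): FlickDirection | null {
  const angle = (Math.atan2(dy, dx) * 180 / Math.PI + 360) % 360;
  let best: FlickDirection | null = null;
  let bestDelta = maxDeviationDeg;
  for (const dir of defined) {
    const raw = Math.abs(angle - DIRECTION_ANGLES[dir]);
    const delta = Math.min(raw, 360 - raw); // shortest way around the circle
    if (delta < bestDelta) {
      best = dir;
      bestDelta = delta;
    }
  }
  return best;
}

// The example from the requirements: up-left, down-left, and down are defined.
const definedFlicks: FlickDirection[] = ["nw", "sw", "s"];
console.log(closestFlick(10, 0, definedFlicks));  // flick exactly right
console.log(closestFlick(10, 10, definedFlicks)); // flick between right and down
```

With these three directions defined, a flick exactly to the right is 90 degrees away from down and matches nothing, while any trajectory between right and down resolves to down.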

Mar 21 '22 06:03 mcdurdin

Does this include swipe to type feature?

Apr 09 '22 09:04 MayuraVerma

Does this include swipe to type feature?

Nope, I think this is about flick input like Japanese flick input, as opposed to touch-trail typing like Swype.

Apr 09 '22 09:04 JapanYoshi

@JapanYoshi is correct -- swipe or touch-trail typing requires good predictive text dictionaries to function well, which most of the languages we support just don't have, so it's not currently high on our priority list.

Apr 12 '22 21:04 mcdurdin

Can we test the multitap gesture in a keyboard? If so, please point to the keyboard syntax.

Oct 07 '22 12:10 MayuraVerma

Can we test the multitap gesture in a keyboard? If so, please point to the keyboard syntax.

The new gesture engine (#7324) is not ready at this time, so unfortunately, testing such gestures is not possible yet. It's taking more time than we anticipated and may not release as part of 16.0.

Oct 10 '22 01:10 jahorton

So, we had a minor design discussion today. One of the important details is that we fleshed out some ideas for flick-hints; I'd like to record that now.

As flicks involve the touchpoint moving off the base key during the flick, we can use the key cap's space itself to show a visualization of the best-matching flick.
- While our phone form-factor layouts receive a key preview, it (along with longpresses) has the issue of being constrained when in the top row and masked by the finger. Thus, the key preview isn't optimal here.
We can use CSS-based transition tricks to 'slide' the hint's keycap text into view, using strategies similar to those employed for collapsible predictive-text suggestions in #7934.
- For phones, one-to-one motion with the touchpoint is likely ideal.
- For tablets... we didn't discuss, but I think the hint should either move more quickly or it should fade in partway through the key. Tablet keys are far larger than phone keys, after all.

Jul 03 '23 08:07 jahorton

@mcdurdin is there a plan to implement "long press and release" to invoke the shift layer?

Or "long press and release" to invoke the default key in the popup keys?

Aug 04 '23 11:08 MayuraVerma

We'll have to see if it lands in this version, but we have plans to do what I've termed a "modifierpress" (or "modipress" for short) - longpress a modifier key to swap to the layer, then swap back when the key is released. I think that corresponds to your first question pretty well.

He and I had a discussion on the second point just last week; see #9416, which was also added to the checklist at the top of this issue.

Aug 07 '23 00:08 jahorton

If at least the "press and release" gesture can have a default key, which can be simply the first key in the pop up keys, then that is plenty.

Aug 07 '23 01:08 MayuraVerma
