spec(core): LDML KeyboardProcessor
Introduction
This is a holding issue for the Keyman Core work required to support LDML keyboards. We'll be filling this in as we complete planning and design. Much of this content should be moved to separate issues as we expand the details and establish component boundaries.
Objective: build a keystroke processor that fits the existing Keyman Core design and supports LDML keyboards, including:
- load/save of LDML data
- keyboard metadata API (particularly to support OSK) (this may not be needed?)
- stateless keystroke transform
Intent is to build this in C++. It will need to cross-compile to native Windows, macOS, Linux and WASM. The module will need to be standalone, with no runtime dependencies (static linking of libraries is probably okay).
Related Features
LDML implementation:
- #7042
- #7043
Groundwork:
- #5011
- #5012
- #5013
- ~~#5069~~
- Draft spec document
General Library Properties
- no i/o, minimal/no dependencies. Why?
  - So that the WASM compiler/linker does not have to chase down file IO and deps
  - So that the lib is provably secure and portable
  - So that the lib can be tested in complete isolation from any deps or environmental concerns (outside data etc)
- no_std? - to consider whether this is appropriate or not.
- API boundary will be UTF-16 (std::u16string).
- Internal string form will also be std::u16string.
- Physical limits of input context and output transforms. 64?, 256 chars?
- Keyboard data storage: a binary (black-box, not necessarily optimized) format that the KMXPlus compiler will produce given input XML.
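To make the UTF-16 boundary and the open question about context limits concrete, here is a minimal sketch of trimming a context buffer to a fixed number of code units without cutting a surrogate pair in half. The name `clamp_context` and the value 64 are assumptions for illustration only; the actual limit is still undecided above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical limit: the notes above leave the real value open (64? 256?).
constexpr std::size_t kMaxContextUnits = 64;

inline bool is_low_surrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Keep at most `max_units` trailing UTF-16 code units of `context`.
// If the cut would land on the second half of a surrogate pair, that
// unit is dropped too, so the result never starts mid-character.
std::u16string clamp_context(const std::u16string& context,
                             std::size_t max_units = kMaxContextUnits) {
  if (context.size() <= max_units) return context;
  std::size_t start = context.size() - max_units;
  if (is_low_surrogate(context[start])) ++start;
  return context.substr(start);
}
```

Trimming from the front keeps the text nearest the caret, which is the part a keystroke transform would need to see.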
C5015.1: Infrastructure
~~Moved to #5069.~~
- Define LDMLKeyboardProcessor folder, build scripts, basic files
- Add template unit tests
C5015.2: KMX+ Binary Loader
- File Format: #7043
- Loads from the BLOB
- no i/o requirements so kbd processor can depend on this
- Provides metadata access (see §2.1, "Library for LDML access")
- used by:
  - LDML Keyboard Processor Library
  - unit test
  - clients (e.g. "list of keyboards" or "filtering keyboards", "get osk data")
- API:
  - ...
  - Metadata
  - Keyboard definition, transforms
  - OSK layout
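As an illustration of "loads from the BLOB" with no i/o, the sketch below reads fields from a byte buffer that the caller already holds in memory, with bounds checks on every access. The `BlobView` class and the toy header (a 4-byte tag `TOY1` followed by a little-endian count) are invented for this example; they are not the KMX+ layout, which is defined in #7043.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read-only view over a keyboard blob already in memory. No file I/O:
// the caller is responsible for obtaining the bytes.
class BlobView {
 public:
  BlobView(const std::uint8_t* data, std::size_t size)
      : data_(data), size_(size) {}

  // Read a little-endian uint32 at `offset`; false if out of bounds.
  bool read_u32(std::size_t offset, std::uint32_t& out) const {
    if (offset > size_ || size_ - offset < 4) return false;
    out = static_cast<std::uint32_t>(data_[offset]) |
          (static_cast<std::uint32_t>(data_[offset + 1]) << 8) |
          (static_cast<std::uint32_t>(data_[offset + 2]) << 16) |
          (static_cast<std::uint32_t>(data_[offset + 3]) << 24);
    return true;
  }

  // Compare 4 bytes at `offset` against a 4-character tag.
  bool has_tag(std::size_t offset, const char* tag4) const {
    if (offset > size_ || size_ - offset < 4) return false;
    return std::memcmp(data_ + offset, tag4, 4) == 0;
  }

 private:
  const std::uint8_t* data_;
  std::size_t size_;
};

// Hypothetical toy header, NOT the real format: tag "TOY1" + uint32 count.
bool load_toy_header(const BlobView& blob, std::uint32_t& key_count) {
  return blob.has_tag(0, "TOY1") && blob.read_u32(4, key_count);
}
```

Because the loader only touches a pointer and a length, the same code can be fed from a file, a test fixture, or a buffer handed across the WASM boundary.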
C5015.3: LDML Keyboard Processor Library
- no i/o
- no state besides context
- Depends on:
  - LDML Datablob Library
- API:
  - `constructor` function:
    - LDML keyboard datablob
    - Platform immutable properties (e.g. OS, etc)?
  - `processEvent` function
    - Inputs:
      - context state
        - before text buffer - from app (string of Unicode characters; unspecified normalisation form; valid UTF-x)
        - transitory state - e.g. deadkeys - from previous processor run
          - may be empty/null if 1st run of processing engine
          - string of Unicode characters or index into state table?
        - "user settings" (if added to LDML)
      - incoming keystroke
        - key code - virtual key code (Windows?)
          - For hardware, the vkey is already resolved
        - modifiers - shift, ctrl, option, etc
        - toggle state keys - caps, num, etc
      - flags
        - `touch` or hardware (`!touch`)
    - Outputs:
      - Transform: Delete x Unicode codepoints before caret, insert string
        - Not supported:
          - delete x codepoints after
          - Caret repositioning
      - Transitory state for the next input event
      - Next OSK layer (5.14)
      - Changes to 'toggle' modifiers
      - Fail/error notification (in Keyman: "beep"; 5.18 ["error"])
    - Logic:
      - if `!touch`
        - lookup and remap vkey in `vkey` table
      - NOT called for switch keys, that is handled by the caller
      - Lookup (vkey, mod) in the `keys` section
        - Yields a UTF-32 codepoint or UTF-16LE str
      - if backspace, process as backspace and stop.
        - TODO/Q: Or is this handled by the layer?
      - `push_character` to `context` and to `actions`
      - TODO-LDML: Transform Mapping Here
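The logic outline above could be sketched roughly as follows. Every type and name here (`Keyboard`, `Result`, `EventFlags`, `process_event`) is a placeholder for illustration, not the Keyman Core API. The sketch also makes three assumptions the outline leaves open: backspace is detected from the remapped vkey before the `keys` lookup, a backspace always requests one deleted code point, and a missing `keys` entry is reported as an error. Transform mapping, transitory state, and layer changes are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Placeholder names throughout; none of this is the real API.
enum EventFlags : std::uint32_t { kHardware = 0, kTouch = 1 };
constexpr std::uint16_t kVkBackspace = 0x08;  // Windows VK_BACK

using KeyId = std::pair<std::uint16_t, std::uint16_t>;  // (vkey, modifiers)

struct Keyboard {
  std::map<std::uint16_t, std::uint16_t> vkey;  // hardware vkey remap table
  std::map<KeyId, std::u16string> keys;         // (vkey, mod) -> output
};

struct Result {
  std::size_t delete_before = 0;  // code points to delete before the caret
  std::u16string insert;          // text to insert
  bool error = false;             // "beep"
};

// Remove one code point (1 or 2 UTF-16 units) from the end of `s`.
void pop_code_point(std::u16string& s) {
  if (s.empty()) return;
  std::size_t n = 1;
  if (s.size() >= 2 && s.back() >= 0xDC00 && s.back() <= 0xDFFF &&
      s[s.size() - 2] >= 0xD800 && s[s.size() - 2] <= 0xDBFF) {
    n = 2;
  }
  s.erase(s.size() - n);
}

Result process_event(const Keyboard& kbd, std::u16string& context,
                     std::uint16_t vk, std::uint16_t modifiers,
                     std::uint32_t flags) {
  Result r;
  // Hardware keys only: remap through the vkey table first.
  if (!(flags & kTouch)) {
    auto remap = kbd.vkey.find(vk);
    if (remap != kbd.vkey.end()) vk = remap->second;
  }
  // Backspace: delete one code point and stop (assumed ordering).
  if (vk == kVkBackspace) {
    pop_code_point(context);
    r.delete_before = 1;
    return r;
  }
  // Look up (vkey, mod) in the keys section.
  auto key = kbd.keys.find(KeyId(vk, modifiers));
  if (key == kbd.keys.end()) {
    r.error = true;  // assumption: an unmapped key is an error
    return r;
  }
  // push_character: append to the context and to the output actions.
  context += key->second;
  r.insert = key->second;
  return r;
}
```

The only state carried between calls is the `context` string, which matches the "no state besides context" constraint.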
C5015.4: Test Framework
- Work from kmxkbd unit test model in Keyman Core
- A data-driven test harness
- Test harness should have no i/o
- C++ and Typescript test runners
- Single data source should produce identical results on all platforms
- Allows us to verify that the interfaces are not causing trouble without additional unit tests
- Java runner? (for CLDR CI)
- Interactive tests
- Web based?
- GUI based?
- Command line driven tooling for manual tests
C5015.5: Keyboard Delivery
Requirements:
- The minimum version of Keyman that can load these .kmx files will be _____.
- A .kvk file will be generated by the compiler from the LDML source file for use by desktop platforms.
- On web, the .js will embed a binary base64 blob of the .kmx, alongside the touch and kvk data and necessary metadata. The .kmx blob will be delivered to the Keyman Core WASM module.
Questions:
- Question: file naming conventions
- Question: limit one LDML file per kmp package?
@mcdurdin updated §C5015.3 above w/ pseudocode for keyboard processor. I know we said we don't need a `vkey` table originally because they will be mapped by the compiler, but for hardware keys (I could be wrong) I don't see how hardware remaps would otherwise happen. So I'd propose this change:
- compiler DOES apply vkeyMap mapping when setting up the touch layout
- so `process_event (…, vk, …, flags=TOUCH)` does not need to apply vkey mapping, but just looks up in `keys`
However, for hardware:
- `process_event(…, vk, …, flags = 0)` recognizes that it's not a touch call, therefore looks up in `vkey` to apply mapping BEFORE proceeding.
@srl295 I think you are right. For hardware maps, we have a compiled layoutMap which assigns each position on the keyboard to a key from the `keys` bag. Keyman Core is going to pass a "US" vkey to the LDML KeyboardProcessor. So it's up to the LDML KeyboardProcessor to transform that using `vkeyMap`.
This is now largely done. Tracking remaining issues separately.