Keyboard key combinations for shortcuts

Open WhisperingChaos opened this issue 5 years ago • 11 comments

Do keyboard events allow the capture of key combinations, such as "CTRL+R"?

Simultaneously pressing this combination generates a "kb/down/ctrl" followed by a kb/up/ctrl when the CTRL key is released. However, no kb/down/ nor kb/up/ events are produced while pressing "R" until after the CTRL key is released.

BTW, thank you for a wonderfully minimal, concurrent GUI framework! My graphical requirements matched its minimalism, allowing me to encode a prototype in just a couple of hours and avoid the time sink of just understanding a Qt implementation written in Go.

Noticed your roadmap and was wondering if the distributed nature of concurrency requires disseminating the "intelligence" of drawing widgets and their interaction among the cooperating goroutines, as opposed to the notion of a "controller" in a hierarchical GUI where the "smarts" are concentrated in it, such that it dictates the rendering/behavior of its subordinate widgets?

WhisperingChaos avatar May 29 '19 20:05 WhisperingChaos

Hi! I'm very glad you found faiface/gui useful!

And you're right, this is currently a missing feature that I somehow forgot about... I can get to implement it tomorrow, or, if you feel like implementing it yourself (it shouldn't be hard), we can cooperate on a pull request. Which one do you choose?

And regarding your question. It does not require disseminating the intelligence, but it definitely enables it. Which is great because that makes it possible to express many things much more beautifully than otherwise. However, nothing's preventing you from creating a "master goroutine" that controls some (or even all) of the elements.

faiface avatar May 29 '19 20:05 faiface

Sure, let's cooperate on a pull request.

This is how I would approach implementing this feature:

  • Update the existing keyboard code to independently generate subsequent keyboard events. This permits a layer above to observe elemental keyboard requests and encode its own keyboard processing.

  • Create a stateful function (closure) to filter keyboard events traveling through the event stream and convert key combinations into "single" key events. This stateful function would encode the following semantics:

    • A key combination is essentially a "single" key that has "depth".
    • Each key that's captured in sequence defines another level.
    • When any key within this sequence is released, it completes an instance of the composite "single" key. Additionally, a released key removes its level and all derivative/subsequent ones. Therefore, to include any keys that were once part of a subsequent level, each must be released (if it hasn't been) and then pressed.
    • When the last key in the sequence is depressed long enough to repeat, each repeat generates an instance of the composite key.
    • When any key, even a typical character, like "f" is "captured" by this function, all its events are removed from the event stream.

    The above would generate the same "single" keyboard events for key combinations: KbDown, KbUp, and KbRepeat. However, their generated strings would reflect the key layering. Ex: CTRL-ALT-DEL:

    • on Press: \kb\down\ctrl\alt\delete
    • on Release: \kb\up\ctrl\alt\delete
    • on Repeat: \kb\repeat\ctrl\alt\delete

    Ex: ALT-f:

    • on Press: \kb\down\alt\102
    • on Release: \kb\up\alt\102
    • on Repeat: \kb\repeat\alt\102
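The semantics listed above can be sketched as a closure that keeps the held keys as "levels". This is only a rough illustration, not code from the library: `RawEvent` and `NewComboFilter` are hypothetical names, keys are identified by name rather than by key code, and the output uses forward slashes as in the "kb/down/ctrl" form quoted at the top of this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// RawEvent is a hypothetical elemental keyboard event. Kind is "down",
// "up", or "repeat"; Key names the physical key.
type RawEvent struct {
	Kind string
	Key  string
}

// NewComboFilter returns a stateful closure that converts elemental key
// events into composite event strings such as "kb/down/ctrl/alt/delete".
// It returns "" when an event produces no composite event.
func NewComboFilter() func(RawEvent) string {
	var levels []string // keys currently held, in press order

	indexOf := func(key string) int {
		for i, k := range levels {
			if k == key {
				return i
			}
		}
		return -1
	}

	return func(ev RawEvent) string {
		switch ev.Kind {
		case "down":
			if indexOf(ev.Key) >= 0 {
				return "" // already held; ignore the duplicate
			}
			levels = append(levels, ev.Key) // each new key adds a level
			return "kb/down/" + strings.Join(levels, "/")
		case "repeat":
			// Only the last key in the sequence repeats the composite key.
			if n := len(levels); n > 0 && levels[n-1] == ev.Key {
				return "kb/repeat/" + strings.Join(levels, "/")
			}
		case "up":
			if i := indexOf(ev.Key); i >= 0 {
				out := "kb/up/" + strings.Join(levels, "/")
				levels = levels[:i] // drop this level and all later ones
				return out
			}
		}
		return ""
	}
}

func main() {
	filter := NewComboFilter()
	for _, ev := range []RawEvent{
		{"down", "ctrl"}, {"down", "alt"}, {"down", "delete"},
		{"repeat", "delete"}, {"up", "delete"},
	} {
		fmt.Println(filter(ev))
	}
}
```

Because releasing a key removes its level and every later one, a key that belonged to a later level has to be pressed again before it can rejoin a combination, which matches the third bullet of the semantics.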

The following mechanisms can be used to "insert" the stateful function into the event stream:

  • Call the stateful function within each goroutine that processes the event stream. This approach reflects the notion of distributed intelligence, as each event handling goroutine is responsible for implementing its key combinations.
  • Create a new master Env (environment) which essentially intercepts and converts keyboard events within the event stream before forwarding them to one or more muxed, subordinate virtual Envs. This concentrates keyboard handling intelligence in the master Env, thereby eliminating the stateful function call within each subordinate virtual environment.
  • Extend Win to permit the optional application of the stateful function to its event stream. There is a level of cohesion to this suggestion, as Win is responsible for processing keyboard requests from OpenGL. Again, this approach concentrates keyboard intelligence in a master environment.

All three approaches are valid and can be implemented. However, caution must be observed when mixing them. Also, centralizing "intelligence" creates a dependency that's not readily appreciated when reviewing the code of a subordinate goroutine.

Concerning the impact of the approaches:

  • Option 1 doesn't require changes to the code base in order to call the stateful function.
  • Option 2 might need changes to a virtual environment that's considered a root.
  • Option 3 definitely requires extending the Win interface by adding a new "option function" during its creation (only creation?). Supporting this new option will require a change to Win's implementation.

As you indicated in your reply above, gui doesn't necessarily dictate the aggregation/distribution of "intelligence" within its drawing widgets. However, choice encourages an argument that favors aggregation, as those unfamiliar with truly distributed thought argue their position requires less coding by removing redundancy. Unfortunately, the removal of redundancy incurs a potentially hidden dependency causing the resultant code to be less resilient to change.

WhisperingChaos avatar May 30 '19 17:05 WhisperingChaos

Alright, your idea is nice, but a little too complicated I guess. It would be also kinda hard to implement, considering that events are (no longer) strings, but their representation is structs. I'm not sure what the exact implementation would be, but if I understand you correctly, it would either have to be a linked list of events or a key event would have to have a slice of keys. Either way, it wouldn't make it easier to use.

Here's what I was thinking. Feel free to criticize that, of course.

  1. Emit a key down/up/repeat event when pressing a letter/symbol key. This doesn't currently happen, those events only get emitted for special keys.
  2. Extend all the key event structs with a new field, a slice of modifiers: Modifiers.
  3. When emitting a key down/up/repeat, add all pressed modifiers to the slice.
  4. Add a helper method to those structs like this: Modifier(mod string) bool, which returns true if the specified modifier is in the slice.

Then you'd use it like this:

switch event := event.(type) {
case win.KeyDown:
    if event.Modifier("ctrl") && event.Key == 's' {
        // save the file
    }
}
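For readers who want to try the proposal outside the library, here is a minimal, self-contained sketch of steps 2 to 4. The `KeyDown` struct and the `describe` helper below are hypothetical stand-ins written for illustration; they are not the actual `win` package types.

```go
package main

import "fmt"

// KeyDown is a hypothetical stand-in for the proposed key-down event struct.
type KeyDown struct {
	Key       rune
	Modifiers []string // modifiers held when the key went down
}

// Modifier reports whether mod is among the held modifiers.
func (e KeyDown) Modifier(mod string) bool {
	for _, m := range e.Modifiers {
		if m == mod {
			return true
		}
	}
	return false
}

// describe shows how a handler might branch on a modifier.
func describe(e KeyDown) string {
	if e.Modifier("ctrl") && e.Key == 's' {
		return "save"
	}
	return "type " + string(e.Key)
}

func main() {
	fmt.Println(describe(KeyDown{Key: 's', Modifiers: []string{"ctrl"}}))
	fmt.Println(describe(KeyDown{Key: 's'}))
}
```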

faiface avatar May 31 '19 00:05 faiface

too complicated I guess

It's important to understand the definition of simplicity in order to characterize an approach as being complex. At least to me, simplicity can be defined as the following: The minimal number of orthogonal abstractions required to formulate/encode a solution. Therefore, it's critical to identify a solution's core abstractions and enforce their semantics, otherwise, additional tangential concepts surface in layers where they shouldn't, obfuscating and encumbering the core ones. When obfuscation occurs, it becomes difficult for a software developer to identify core abstractions and reliably apply them without also understanding tangential concepts that must sometimes be implemented in certain contexts. Therefore, to me, it's the tangential concepts that bubble up through layers which represent unnecessary complexity.

When extending a solution to incorporate a new "feature", given the understanding above, the coding required to realize the feature should make every attempt to transform its potentially foreign/discordant concept(s) into the core abstractions already supported by the solution. A transform that encapsulates this conversion requiring only the minimal amount of information, that's naturally available/expressed in the core layer, to me, represents a simple solution. For example, a properly encapsulated transform isolates the coupling of tangential concepts to core ones, preventing the rampant replication of this coupling throughout the solution's code base.

Notice the above discussion doesn't measure simplicity via metrics like lines of code, changes to structs, or the incorporation of slices. It focuses on maintaining the expressiveness of the existing core abstractions. Therefore, especially in situations where a new feature diverges, requiring the transformation of difficult tangential concerns to core ones in order to seamlessly map this new feature to core abstractions, it shouldn't surprise anyone that the encoding of this transform will probably be labeled "complex" when measured, for example, by lines of code. Unfortunately, a transform's aesthetics, such as lines of code, tend to dominate design decisions instead of focusing on the resulting simplicity it offers which enables the seamless inclusion of the new feature.

Why is the above important in the context of this discussion? A core abstraction of your project is a concept called a Key. It has behaviors expressed by the following interfaces KbUp, KbDown, and KbRepeat. Every concept considered a Key should implement these interfaces without introducing other abstractions. To align the proposed solution with your project's Key abstraction, my original post equated the semantics of a key combination to a "single" key. In other words, although a user may physically express an intention using two or more physical keys, for example "Alt+f" , the combination of these physical keys should result in a "single", logical key - open file menu.

If you are swayed by the discussion to consider a solution that applies the existing Key abstraction to key combinations by:

  • encoding a stateful function as described in my prior post,
  • and the extension of this transform to accept a function that maps physical keystrokes to logical ones

then I can guarantee the following:

  • the existing Key abstraction would not change at all,
  • and it would be unnecessary to introduce the event.Modifier abstraction.

Regarding the tasks proposed in your reply:

  1. Emit a key down/up/repeat event when pressing a letter/symbol key. This doesn't currently happen, those events only get emitted for special keys.

Absolutely! Every key conforms to the Key abstraction adding uniformity to the processing of any key. This simplifies the key processing model presented to a developer using your package. There are no "surprises" like the one I experienced when typing simple characters, as I expected simple characters to adhere to the Key abstraction. Also, the model presented to a developer has to be assimilated and reconstructed within the developer's mind in order for the developer to effectively apply it. A simple model, one that exposes fewer abstractions and more importantly no "exceptions" when applied in different contexts is simple to remember and use.

  2. Extend all the key event structs with a new field, a slice of modifiers: Modifiers.
  3. When emitting a key down/up/repeat, add all pressed modifiers to the slice.
  4. Add a helper method to those structs like this: Modifier(mod string) bool, which returns true if the specified modifier is in the slice.

I would not support adding the notion of a "Modifier" for many reasons:

  • A Modifier represents a tangential concept which requires an additional interface to determine a key's "purpose". Shouldn't purpose be innate to Key without exposing another concept and its interface? Every physical key has its own unique key code to express purpose. Can you think of a means to assign logical keys their own unique code (purpose) without changing Key's interface? Essentially, the Modifier abstraction breaks the current semantics of Key.
  • Coding becomes obfuscated not only by the call to Modifier(), but the dependency between the if statements below that implement a form of operator precedence to disambiguate the role of "s" when it acts as a member of a key combination or otherwise - as itself.
switch event := event.(type) {
case win.KeyDown:
    if event.Modifier("ctrl") && event.Key == 's' {
        // save the file
    } else if event.Key == 's' {
        // the letter "s"
    }
}

Now compare the above to the code below, which assumes key combinations have been mapped to a single logical Key:

switch event.String() {
case "kb/down/filesave":
    // save the file
case "kb/down/s":
    // the letter 's'
}

Notice the absence of any dependency between the case statements. Not only is this form easier to comprehend, the statements can be unilaterally reordered without introducing a bug.

  • If you decide to implement the Modifier abstraction, note that the order of key combinations may matter depending on the developer/platform. Therefore, the Modifier interface must bubble up another concept, the order of the modifying keys. For example, "ALT+SHIFT+Tab" isn't the same as "SHIFT+ALT+Tab" on the Ubuntu desktop. These combinations, however, are equivalent when the Windows 7 Desktop has focus.
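The switch over logical keys shown earlier presumes some function that maps physical keystrokes to logical ones. One possible shape for that mapping is a plain lookup table, sketched below. The table contents and the names `logicalKeys`, `ToLogical`, and `handle` are assumptions made for illustration only.

```go
package main

import "fmt"

// logicalKeys is a hypothetical table mapping composite physical key events
// to logical ones. An application would supply its own entries.
var logicalKeys = map[string]string{
	"kb/down/ctrl/s": "kb/down/filesave",
}

// ToLogical returns the logical event name for a physical one, or the input
// unchanged when no mapping exists.
func ToLogical(physical string) string {
	if logical, ok := logicalKeys[physical]; ok {
		return logical
	}
	return physical
}

// handle dispatches on the logical name only; the cases are independent of
// each other and can be reordered freely.
func handle(event string) string {
	switch ToLogical(event) {
	case "kb/down/s":
		return "insert s"
	case "kb/down/filesave":
		return "save the file"
	}
	return "ignored"
}

func main() {
	fmt.Println(handle("kb/down/ctrl/s"))
	fmt.Println(handle("kb/down/s"))
}
```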

I hope the above has demonstrated my notion of simplicity in meaningful terms of software design: preserving the semantics of core abstractions, encoding transforms to convert a foreign abstraction to an existing core one, encapsulation/layering to prevent the escape of tangential concerns when encoding transforms, and the reduction in dependencies between exposed concepts especially when a design successfully encapsulates tangential ones.

If you wish to continue this collaboration, let me know and (I'll or you?) can provide an interface to the transform (stateful function) required to adhere to the existing interface of Key when processing key combinations.

WhisperingChaos avatar May 31 '19 23:05 WhisperingChaos

Just right before you replied I had a thought and I realized that your idea is actually pretty good :D. It takes time sometimes.

Now, I must say that one of the reasons I didn't initially appreciate your idea was that you write really long texts and it's easy to get lost in them so I probably didn't fully understand the idea. Sorry.

Now, I have an idea how to implement this nicely, which is a little different (or perhaps not?) from your exact suggestion, so let me explain.

The win.KeyDown/Up/Repeat events would remain the same except that we add codes for new keys like letters, like I already said.

We add a new event type called win.KeyCombo, which would look like this:

type KeyCombo []Key

And would have one method:

func (kc KeyCombo) Is(keys ...Key) bool {
    // checks if the keys are the same including order
}

The window would store an internal stack of keys. Each time a key is pressed, it is added to the stack. Each time a key is released, everything is popped until that key in the stack (including it).

Also, each time a key is pressed, the current key stack gets copied into a new KeyCombo event that gets produced.
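A minimal sketch of the stack behaviour described above, with `Is` filled in, might look like the following. The `Key` type here is a hypothetical string stand-in for the library's key type, and `keyStack` with its `Press` and `Release` methods is an illustrative helper, not an existing API.

```go
package main

import "fmt"

// Key is a hypothetical stand-in for the library's key type.
type Key string

// KeyCombo lists the held keys in the order they were pressed.
type KeyCombo []Key

// Is reports whether the combo consists of exactly the given keys, in order.
func (kc KeyCombo) Is(keys ...Key) bool {
	if len(kc) != len(keys) {
		return false
	}
	for i := range kc {
		if kc[i] != keys[i] {
			return false
		}
	}
	return true
}

// keyStack tracks held keys in press order.
type keyStack struct {
	keys []Key
}

// Press pushes k and returns a copy of the stack as a KeyCombo event.
func (s *keyStack) Press(k Key) KeyCombo {
	s.keys = append(s.keys, k)
	return append(KeyCombo(nil), s.keys...)
}

// Release pops everything from the top of the stack down to and including k.
// Releasing a key that is not on the stack does nothing.
func (s *keyStack) Release(k Key) {
	for i := len(s.keys) - 1; i >= 0; i-- {
		if s.keys[i] == k {
			s.keys = s.keys[:i]
			return
		}
	}
}

func main() {
	var s keyStack
	s.Press("ctrl")
	combo := s.Press("s")
	fmt.Println(combo.Is("ctrl", "s"), combo.Is("s", "ctrl"))
	s.Release("ctrl") // pops "s" as well
	fmt.Println(s.Press("s").Is("s"))
}
```

Copying the stack on every press keeps an already delivered KeyCombo from changing when later keys are pressed or released.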

There's no need for any Envs other than the window to synthesize these events, because they'll simply receive them from the window already synthesized and retransmit them further.

So, this is how I'd implement it. I know it's not exactly your original idea. Let me know what you think about it.

I'll probably reply later, because I gotta go sleep now :D

faiface avatar May 31 '19 23:05 faiface

Also, the combo would format like "kb/combo/ctrl/s", so you'd be able to do the kind of switch with strings you described.

I don't think higher level events like "file save" should be transmitted through the events channel. Those I think should go through their separate channels, like you can see in the examples. If they were to go through the events channel, then everyone would try and shove everything in there in their app and it'd become a mess.

faiface avatar May 31 '19 23:05 faiface

Now, I must say that one of the reasons I didn't initially appreciate your idea was that you write really long texts and it's easy to get lost in them so I probably didn't fully understand the idea. Sorry.

  • Your ability to converse in English overshadows my capacity to express myself in any other language. I'm grateful that you're willing to conduct this discussion in English.
  • I intentionally express my ideas in what I would consider a formal, concise manner. However, developing software for execution by a mindless machine requires extreme attention to detail. I've been trained, through my interactions with it, to be not only explicit but also verbose by human measures.
  • There's a reluctance, in the context of today's staccato communication, to engage in thoughtful presentation of ideas, especially when explaining their motivation. Between this and work place demands to provide quick solutions, there's a pervasive attitude of "impatience" towards detail. I'd suggest that this attitude is somewhat unconscious. Therefore, we're all afflicted by it.

type KeyCombo []Key

Adding this abstraction is certainly an improvement when compared to exposing the Modifier concept. KeyCombo isolates the use of an ordinary key from its participation within a combination eliminating the coding dependency required when using Modifier to disambiguate these contexts. However, realizing this abstraction adds another concept to the core Key abstraction. Do you really want to expose the notion of key combinations to an application that only needs to receive a signal expressing a purpose and not how that purpose is generated?

There's no need for any Envs other than the window to synthesize these events, because they'll simply receive them from the window already synthesized and retransmit them further.

Do you intend to always apply a filter in the Win package to replace individual key events with the KeyCombo event type? If so, implementing in this manner would prevent a developer from encoding a custom key processing widget. I would suggest, if you plan to implement KeyCombo that it be an "option" specified when creating a Win.

Also, the combo would format like "kb/combo/ctrl/s", so you'd be able to do the kind of switch with strings you described.

Just to clarify, all Key interfaces would be implemented, such as:

  • kb/up/combo/ctrl/s
  • kb/down/combo/ctrl/s
  • kb/repeat/combo/ctrl/s

I don't think higher level events like "file save" should be transmitted through the events channel. Those I think should go through their separate channels, like you can see in the examples. If they were to go through the events channel, then everyone would try and shove everything in there in their app and it'd become a mess.

I need to consider this more thoughtfully before replying.

WhisperingChaos avatar Jun 01 '19 12:06 WhisperingChaos

Well, I think the art is to express yourself shortly and to the point. It's probably much more difficult than expressing verbosely. Also, the more complicated words you use, the less exact your message is. This is because the likelihood of our definition being different increases with less common words.

Anyway, to the issue :)

You slightly misunderstood me. The kb/down/<key>, kb/up/<key>, and kb/repeat/<key> events would remain exactly as they are now. However, there would be one additional kind of event: combo events. They'd look like this:

kb/combo/a
kb/combo/shift/a
kb/combo/ctrl/shift/a
kb/combo/a/b
kb/combo/space/a/ctrl
kb/combo/a/b/c/d/e/f

A combo event would happen any time you pressed a key. The contents of it would be the list of all currently pressed keys in the order they were pressed. That's it. The Is method I outlined in the previous message is simply a way to easily check whether the pressed keys correspond to some expected combo.

There would be no filters or anything, just one more kind of event firing every time a kb/down/* event happens.
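To make the resulting event stream concrete, the sketch below lists what a single key press would emit under this scheme: the usual kb/down event plus one combo event naming every held key in press order. The helper `emitOnPress` is a hypothetical illustration, not part of the library.

```go
package main

import (
	"fmt"
	"strings"
)

// emitOnPress returns the events produced when key is pressed while the keys
// in held are already down. held is left unmodified.
func emitOnPress(held []string, key string) []string {
	pressed := append(append([]string(nil), held...), key)
	return []string{
		"kb/down/" + key,
		"kb/combo/" + strings.Join(pressed, "/"),
	}
}

func main() {
	for _, ev := range emitOnPress([]string{"ctrl", "shift"}, "a") {
		fmt.Println(ev)
	}
}
```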

Do you really want to expose the notion of key combinations to an application that only needs to receive a signal expressing a purpose and not how that purpose is generated?

Yes, I think events should simply correspond to the raw events of the environment. They should not represent any higher-level concepts. That should be left to the programmer of the application and their own channels.

PS: I'm sorry I haven't replied earlier, I've been traveling the whole day.

faiface avatar Jun 02 '19 22:06 faiface

Also, the more complicated words you use, the less exact your message is. This is because the likelihood of our definition being different increases with less common words.

Thanks for stating this insight, especially your reasoning!

Regarding the semantics of KeyCombo:

  • When adding a new abstraction, like KeyCombo, as a public one to an existing set, it should offer mechanism(s) (value) that cannot be achieved through the combination of the existing abstractions, nor should it encumber them. What value does this abstraction offer?
  • Why wouldn't key combinations generate KeyDown, KeyUp, and KeyRepeat? How would a developer encode navigating browser tabs using the key combination CTRL+Tab without synthesizing these events for key combinations?

Given the clarifying statements of your post above, KeyCombo shares the same problematic (IMO) semantics as the Modifier solution - it exposes the concept of a key combination that encumbers the processing of each single key that's considered a member of the combination. Try writing the key processing code for a text window that accepts CTRL+I to enable italics while it concurrently accepts "I" as text.

It seems we disagree on the semantics of a key combination. One either characterizes physical key combinations as a single one and encapsulates this decision or exposes the concept of a key combination.

When encapsulating key combinations:

  • Preserves the existing Key abstractions and seamlessly applies them to key combinations. The resulting uniformity minimizes a developer's effort to both understand and apply Key.
  • Avoids extending the existing set of Key abstractions with another one. One less abstraction to learn and potentially interfere with current or future ones.
  • Flexibly incorporates support, as an encapsulated transform (stateful transform) can be inserted when and where it's needed. For example, it can be called to process Events in any goroutine that reads from an Event channel. When unwanted, the elemental Keys are streamed, enabling a developer to encode a custom key processor.

When exposing key combinations:

  • Disrupts the semantics of Key abstractions. Therefore, important fundamental abstractions, like KeyUp, KeyRepeat and KeyDown become "exceptions" as they don't apply to key combinations. Exceptions require more effort for a developer to remember, code, and test.
  • Adds a new abstraction which changes the meaning of other already existing ones and may affect the adoption of other future ones.
  • There seems to be a preference to always apply KeyCombo within Win. Therefore, all goroutines muxed to Win will receive this Event. If a developer wishes to encode a custom keyboard handler, should the developer be forced to eliminate the synthesized KeyCombo event?

A stated goal of the gui project is the notion of "Super minimal". I would suggest that limiting its framework abstraction set to the essential ones promotes this objective. Therefore, instead of adding abstractions to gui when implementing a new feature, encourage the use of transforms to, when possible, convert the concepts required by the new feature to the ones already present.

WhisperingChaos avatar Jun 03 '19 18:06 WhisperingChaos

Okay, I get it. The problem is that receiving both "key down" and "key combo" events at the same time makes it cumbersome to use because you always want to react to either one or another, but not both. Correct.

I'm sorry for wasting so much of your time on this issue :D. I see the whole of its complexity now and I'm not sure about the right solution.

The correct solution actually might be to add a simple thing like this:

func InterceptEvents(env gui.Env, intercept func(Event) Event) gui.Env

And perhaps add a simple interceptor for general key combinations:

func KeyCombinations() func(Event) Event

An addition of another KeyCombo event type will probably be necessary anyway, though.
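As a rough sketch of how such an interceptor could work, the example below wraps an event channel in a goroutine that passes each event through a function. `Event` and `Env` here are simplified, hypothetical stand-ins for the library's types (the drawing side of an Env is omitted), and dropping an event when the interceptor returns nil is an assumption of this sketch.

```go
package main

import "fmt"

// Event and Env are simplified, hypothetical stand-ins for the library's
// types; only the event side of an Env is modelled.
type Event interface {
	String() string
}

type Env interface {
	Events() <-chan Event
}

// strEvent is a trivial Event backed by a string.
type strEvent string

func (e strEvent) String() string { return string(e) }

// chanEnv is an Env that serves events from a channel.
type chanEnv chan Event

func (c chanEnv) Events() <-chan Event { return c }

// InterceptEvents returns an Env whose events are those of env passed through
// intercept. In this sketch a nil result drops the event.
func InterceptEvents(env Env, intercept func(Event) Event) Env {
	out := make(chanEnv)
	go func() {
		defer close(out)
		for ev := range env.Events() {
			if ev = intercept(ev); ev != nil {
				out <- ev
			}
		}
	}()
	return out
}

// saveShortcut maps ctrl+s to a logical "filesave" key and passes every
// other event through unchanged.
func saveShortcut(ev Event) Event {
	if ev.String() == "kb/down/ctrl/s" {
		return strEvent("kb/down/filesave")
	}
	return ev
}

func main() {
	src := make(chanEnv, 2)
	src <- strEvent("kb/down/ctrl/s")
	src <- strEvent("kb/down/s")
	close(src)

	for ev := range InterceptEvents(src, saveShortcut).Events() {
		fmt.Println(ev)
	}
}
```

Since the interceptor is applied by wrapping an Env, it can be inserted at the window or at any subordinate environment, or left out entirely when a goroutine wants the elemental key events.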

Anyway, it's possible I'm still missing something. I think the best way to proceed with this proposal will be that you actually implement something, which I trust you're competent enough to do since you understood the problem from the beginning much better than I did. Debating code will be much easier than debating ideas and I'm sure we'll soon converge on the solution. I think it's quite probable I'll accept your first solution.

faiface avatar Jun 06 '19 21:06 faiface

Hello!

As a user of this library, I want to share how I would like to write client code that handles both normal key events and key events with modifiers.

I stuck to the use case in WhisperingChaos's comment: "Try writing the key processing code for a text window that accepts CTRL+I to enable italics while it concurrently accepts "I" as text."

Key press (including modifiers) and key combo can be emitted as individual events.

Let's imagine a text editor where a user presses "h", releases it, then presses "ctrl+i" to enable italics, releases, then presses "i".

Here, the library emits 4 KbDown events

  1. KbDown(key: h, modifiers: nil)
  2. KbDown(key:ctrl, modifiers: [ctrl])
  3. KbDown(key:i, modifiers: [ctrl])
  4. KbDown(key:i, modifiers: nil)

Editor code might look like this:

for event := range w.Events() {
    switch e := event.(type) {
    case win.KbDown:
        if e.modifiers != nil {
            // handle shortcuts
        } else {
            // editor.type(e.key)
        }
    }
}

And client code doesn't have to check whether e.modifiers != nil if it doesn't care about key combinations. So, it wouldn't break existing code.

Please let me know if I'm missing something

sanan-fataliyev avatar Feb 17 '23 22:02 sanan-fataliyev