Support Wayland

Open joshgoebel opened this issue 3 years ago • 58 comments

Basically we just need a way for Wayland to provide us:

  • currently focused application name
  • currently focused application window title #2
  • preferably hooks or signals when either of these things changes

If this is possible, hooking it up should be largely trivial, as should adding a way for a user to tell the keymapper whether they are using X or Wayland (perhaps we could even auto-detect based on ENV).

joshgoebel avatar Jun 14 '22 08:06 joshgoebel

This would be a really huge addition to see & a big motivator for me to get Kinto over to keyszer yesterday if this gets added lol. I am not sure if I would daily drive Wayland or not still, but as it stands I can't even consider it realistically till this occurs but will likely have to occur on the DE level, aka Budgie, Gnome, KDE, Mate, XFCE, etc.

And just to link this back to an earlier thread on this topic over at xkeysnail. https://github.com/mooz/xkeysnail/issues/108

rbreaves avatar Jun 15 '22 03:06 rbreaves

but will likely have to occur on the DE level, aka Budgie, Gnome, KDE, Mate, XFCE, etc.

Yeah I saw that longer thread the other day... my current understanding is we're waiting on the Wayland people/DEs to make this all possible... if the APIs and libraries existed it should be trivial... our X11 integration code is only like 20 lines.

We'll need some type of abstraction layer to a generic idea of a "window manager", but it makes no sense to think about that very hard until we know what the Wayland interface will be... obviously anything that gets us "app name" and "window name" would be great, and if that was pushed to us rather than us having to poll it, even better.

joshgoebel avatar Jun 15 '22 03:06 joshgoebel

Yea imo the best possible solution is they add the API to Wayland or XDG I think it is/was. Regardless some layer that will be accessible to all DEs & apps therein.

rbreaves avatar Jun 15 '22 12:06 rbreaves

@joshgoebel

It seems like it's going to be years before something like XDG or wlroots will add the ability to get the window class/name in "Wayland" in general, but there are existing ways to access the information in specific environments like GNOME (DBus calls) or sway. Meanwhile, even though Wayland is still a bit buggy and incomplete, a lot of users are seeing benefits like better multi-monitor support or high refresh rates, so I really can't fault them for wanting to use Wayland over X11 already.

I ran into a keymapper project that claims to provide per-application remapping abilities in both X11 and Wayland, although they appear to use the techniques specific to limited environments like GNOME, sway, and hyprland that already provide this information in some way.

Unfortunately this project is written in Rust rather than Python, so it would not exactly be a simple copy-paste job to try and bring the methods into keyszer to get some limited usefulness in Wayland. But, on the other hand, I don't think the techniques are particularly complicated, so it should be feasible to just look at the methods they are tapping into and do something similar in keyszer.

Question is, how open are you to starting to integrate some Wayland solutions for per-application mappings that currently will only work in a few different Desktop Environments? I get the feeling that the X11 user base is going to start shrinking pretty quickly now, between all the distros that are starting to use Wayland as the default, and users that are actually moving to and liking Wayland on their own for one reason or another. Feels like the balance is really starting to shift lately.

A large chunk of Linux users are on popular distros like Ubuntu and Fedora, which use GNOME by default, so it seems like a lot of users would be served by at least supporting GNOME's DBus method to get the window info in Wayland.

The project is here:

https://github.com/k0kubun/xremap

Doesn't look like they've implemented matching on the window "name" (title) as opposed to the application "class" just yet, but I think that's more of them needing to implement the logic rather than not knowing how to get the window title. With keyszer already having the working logic in place to do matching on the window "name" property, I'm hoping it will be possible to just feed that existing logic the window title from Wayland windows just as easily as getting the Wayland window class info.

Thoughts?

RedBearAK avatar Feb 05 '23 02:02 RedBearAK

I feel like we'd just go with modules/classes for the WM... so in your config you'd specify which module to use and that module would be responsible for providing the window context into the KeyContext. So anyone can use whatever WM they want, just so long as they can provide a module that (from Python) is able to learn a few key details about the windows (name and class hopefully?). I'm not sure the best way to structure that in Python off the top of my head, but it should be fairly trivial. Just a matter of how you inject/connect that module with KeyContext.

Assuming they both have concepts like "name" and "class" you could seamlessly switch between them [window managers] by just changing one line of your config file (or perhaps even auto-detecting, though I'm not sure I'm interested in building that into the key mapper itself)...

Right now you could start just by hacking the existing _query_window_context function to use Wayland/Gnome instead of X... you'd probably add a wm/wayland_gnome.py for the actual wrapper that talks to Wayland... (and move xorg there also, etc)...

Once you get that working we could circle back to how to make which WM module to use a configurable choice.
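A minimal sketch of the "one module per WM" idea described above. The registry, decorator, and provider names here are hypothetical and not part of keyszer; only the dictionary keys (wm_class, wm_name, x_error) are taken from the debug output and discussion later in this thread. The provider bodies are placeholders standing in for real Xlib or D-Bus lookups.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a config-selected name to a provider function.
_PROVIDERS: Dict[str, Callable[[], dict]] = {}


def register_provider(name: str):
    """Decorator that registers a window-context provider under a name."""
    def decorator(func: Callable[[], dict]) -> Callable[[], dict]:
        _PROVIDERS[name] = func
        return func
    return decorator


@register_provider("xorg")
def xorg_context() -> dict:
    # Placeholder: a real provider would ask Xlib for the focused window.
    return {"wm_class": "", "wm_name": "", "x_error": False}


@register_provider("wayland_gnome")
def wayland_gnome_context() -> dict:
    # Placeholder: a real provider would query GNOME Shell (e.g. over D-Bus).
    return {"wm_class": "", "wm_name": "", "x_error": False}


def query_window_context(provider_name: str) -> dict:
    """Return the window context from whichever provider the config selected."""
    try:
        provider = _PROVIDERS[provider_name]
    except KeyError:
        raise ValueError(f"unknown window context provider: {provider_name!r}")
    return provider()
```

With something like this, switching window managers would be a one-line config change (the provider name), and every provider hands the same dictionary shape to the rest of the keymapper.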

Assuming they both have concepts like "name" and "class"

If it turns out they are VERY different, then that would be a larger discussion.

joshgoebel avatar Feb 05 '23 02:02 joshgoebel

Apps need to be able to display unique window or tab titles, so I have little doubt that info is in Wayland the same way it is in X11. Not too worried about that.

(or perhaps even auto-detecting, though I'm not sure I'm interested in building that into the key mapper itself)

I mildly disagree with this.

Since users do sometimes have good reason (at this early juncture in Wayland's life cycle) to need to switch back and forth between X11 and Wayland sessions, I feel like we should at least make some attempt to do auto-detection. It really shouldn't be that hard. There was at least one environment variable I found that seemed to reliably hold info on whether the session is Wayland or X11. Something that would have fixed a problem the Kinto installer sometimes has with failing to detect Wayland, but it was never merged.

A manual config setting as a backup, in case the auto-detect isn't working for some reason, shouldn't be too hard.

Right now you could start just by hacking the existing _query_window_context function to use Wayland/Gnome instead of X... you'd probably add a wm/wayland_gnome.py for the actual wrapper that talks to Wayland... (and move xorg there also, etc)...

Yeah, I was going to look at that, study how it gets into KeyContext, etc. But also looking at the methods and seeing if I can get the info and just have it show up in the log, to start with. Bit by bit.

As long as you're cool with a growing collection of patchwork solutions rather than waiting for a general "Wayland" solution to appear. Once the framework for adaptation is in place, it should allow a more general solution to just drop in place and replace the patchwork stuff, eventually. Seems like a good thing to waste some time on.

RedBearAK avatar Feb 05 '23 04:02 RedBearAK

As long as you're cool with a growing collection of patchwork solutions rather than waiting for a general "Wayland" solution to appear.

When things finally get organized (in the Wayland ecosystem) it would just be a matter of writing another short module to handle the "official" API - or upgrading the existing one... should be pretty simple. Since it's easy to add/remove/upgrade I'm not sure why I should oppose.

joshgoebel avatar Feb 05 '23 04:02 joshgoebel

My new AI overlord says:

The XDG_SESSION_TYPE environment variable in Linux is used to specify the type of desktop session. The value of this variable is set by the desktop environment, and its possible values depend on the implementation. However, some common values for XDG_SESSION_TYPE include:

    x11: for X11-based desktop sessions
    wayland: for Wayland-based desktop sessions
    mir: for Mir-based desktop sessions

Note that the exact values may vary depending on the Linux distribution and desktop environment. It's also possible that some desktop environments use custom values for XDG_SESSION_TYPE.

In my testing, the values have always been either x11 or wayland, whereas XDG_SESSION_DESKTOP or XDG_CURRENT_DESKTOP often have customized values that mix the session type with the DE, like "gnome-xorg".
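A minimal sketch of the auto-detection discussed here, assuming XDG_SESSION_TYPE is the primary signal. The function name is hypothetical, and the fallback to WAYLAND_DISPLAY and DISPLAY is an added assumption rather than something tested in this thread. WAYLAND_DISPLAY is checked before DISPLAY because XWayland usually sets DISPLAY inside Wayland sessions too.

```python
import os


def detect_session_type(environ=None):
    """Return 'x11', 'wayland', or 'unknown' based on environment variables."""
    env = os.environ if environ is None else environ

    # Primary signal: XDG_SESSION_TYPE, normalized to lowercase.
    value = env.get("XDG_SESSION_TYPE", "").strip().lower()
    if value in ("x11", "wayland"):
        return value

    # Fallback heuristic (assumption): look at the display socket variables.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(detect_session_type())
```

A manual config setting could simply override the returned value when the detection gets it wrong.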

Since it's easy to add/remove/upgrade I'm not sure why I should oppose.

My thoughts exactly. Alright, I'll start chipping away at what I can, when I have the time. Probably will need some pointers now and then, as usual, if my own research fails me.

RedBearAK avatar Feb 05 '23 05:02 RedBearAK

@joshgoebel

Looks like xremap is relying on a GNOME extension (also called xremap), specific to their project, to allow it to monitor focus changes and access the window attributes. But, and this is very interesting to me at first glance, there is a PyWayland module that appears to provide similar functionality. Both monitoring the changes to the focused Wayland "surface", and getting the window attributes (which look to have the same names as the X11 attributes).

import pywayland.server

def focus_handler(surface, event):
    # Get the WM_NAME and WM_CLASS properties of the focused surface
    wm_name = surface.get_label("WM_NAME")
    wm_class = surface.get_label("WM_CLASS")

    # Handle the focus event here, using the wm_name and wm_class variables
    pass

# Connect to the Wayland compositor
display = pywayland.server.Display()
display.add_global_listener(pywayland.server.Seat.FOCUS, focus_handler)
display.run()

Doesn't seem to be all that complicated to set it up, but it would need to be another installed Python module since it's not a built-in module at this time.

Any thoughts on which road to go down? I didn't write any of the code in the example, and haven't started testing anything yet, I'm still just looking into how it's supposed to work. Looks like either way there would need to be something "external" added into the project.

I feel like it can't possibly be as simple as it seems at first glance with PyWayland, because my source keeps saying "depends on the compositor", but it might at least work with GNOME shell right off the bat.

https://pywayland.readthedocs.io/en/latest/

Relevant sections might be (if they still exist):

pywayland.server.wl_surface.WlSurface pywayland.server.wl_shell_surface.WlShellSurface

RedBearAK avatar Feb 06 '23 00:02 RedBearAK

@joshgoebel

I have the oddest thing happening. I've got the window class and name in Wayland with dbus and the help of a GNOME extension that exposes the focused window attributes. But...

What seems to be happening is... The mapped combo will go through, but the input combo will also go through. I can set up a keymap for a specific application like GNOME Terminal, and that shortcut mapping will only work in that application, so the window matching is obviously working. But the keys from the input side end up not being suppressed, although the output keys also come out, as they should.

So this remap of CapsLock you see below definitely types the output string, and only does it in GNOME Terminal, yet it also toggles CapsLock at the same time. Still in GNOME Terminal. And the tab nav shortcuts (Shift-Cmd-Braces) end up doing tab movement (Shift-Ctrl-PgUp/PgDn) instead of tab navigation, because the Shift key press is leaking through to mingle with the Ctrl-PgUp/PgDn of the output combo.

keymap("What is wrong with gnome-terminal-server", {
    C("Shift-RC-Left_Brace"):    C("C-Page_Up"),
    C("Shift-RC-Right_Brace"):   C("C-Page_Down"),
    C("CapsLock"):              ST("What the heck"),
}, when = matchProps(cls="^gnome-terminal.*$"))

Perhaps this is happening in all circumstances, but in other cases it doesn't seem to cause an issue.

I'm just not sure how it's possible the output combo can get fired off without the input being properly suppressed. It should be doing either one or the other, not both. Is the suppression of a matching input combo decided in input.py, or in transform.py? I made a couple of minor modifications to transform, but I reverted even those minor changes and still get this odd behavior. I haven't touched input that I know of.

The new context module passes the exact same dictionary of information back to KeyContext when the "get context" function is called. It works just like xorg, but just pulling the class and name from the DBus connection to the shell extension. And, the window matching seems to work.

If I can get past this weird side effect I should be able to clean this up without too much more trouble. The only thing missing for now is the ability to walk up the window tree to the "parent" of a window with no WM_CLASS and WM_NAME, which means it wouldn't work with JetBrains quite yet. I don't know of any other app that requires that particular workaround.

The extension is called "Window Calls Extended". https://extensions.gnome.org/extension/4974/window-calls-extended/

I tried to work with PyWayland, and pydbus to do this. Things did not go well in either case. Mainly I had the strangest problem with Python seeming to be unable to find most of the methods that I was trying to use from the modules, giving me constant AttributeError issues no matter how I did the installs or imports. Eventually I gave up, and managed to convert[*] some working gdbus terminal commands into what works with the regular dbus module. The gdbus commands could also be used directly via subprocess.run, but that seemed to be pretty slow and not very usable.

[*] (With a LOT of help from a certain, suddenly very popular, AI language model.)
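For readers wondering what the dbus-based query might look like, here is a rough sketch, not the actual module from this work. The bus name, object path, interface, and the FocusClass/FocusTitle method names are assumptions about what the "Window Calls Extended" extension exposes and should be checked against the extension itself. The function name and the call_method parameter are hypothetical; the parameter exists only so the logic can be exercised without a running GNOME session.

```python
# Assumed D-Bus coordinates for the "Window Calls Extended" extension.
BUS_NAME = "org.gnome.Shell"
OBJECT_PATH = "/org/gnome/Shell/Extensions/WindowsExt"
INTERFACE = "org.gnome.Shell.Extensions.WindowsExt"


def get_focused_window(call_method=None):
    """Return a context dict shaped like the one the xorg module provides."""
    if call_method is None:
        import dbus  # dbus-python, imported lazily so the sketch loads without it

        bus = dbus.SessionBus()
        proxy = bus.get_object(BUS_NAME, OBJECT_PATH)
        iface = dbus.Interface(proxy, INTERFACE)

        def call_method(name):
            # dbus.Interface exposes remote methods as attributes.
            return getattr(iface, name)()

    try:
        wm_class = str(call_method("FocusClass"))
        wm_name = str(call_method("FocusTitle"))
    except Exception:
        # Extension missing, shell not running, etc.: report an error state.
        return {"wm_class": "", "wm_name": "", "x_error": True}
    return {"wm_class": wm_class, "wm_name": wm_name, "x_error": False}
```

Keeping one session bus connection open and reusing it would avoid spawning a gdbus process per query, which may be why the subprocess route felt slow.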

RedBearAK avatar Feb 11 '23 10:02 RedBearAK

Are you running keyszer at boot BEFORE the window manager? Perhaps Wayland tries to grab the keyboard directly itself or something. Otherwise no idea. We do not pass the raw input (other than some of the non key events as you already know).

joshgoebel avatar Feb 11 '23 14:02 joshgoebel

Almost all the heavy lifting is in transform.

joshgoebel avatar Feb 11 '23 14:02 joshgoebel

Are you running keyszer at boot BEFORE the window manager? Perhaps Wayland tries to grab the keyboard directly itself or something. Otherwise no idea. We do not pass the raw input (other than some of the non key events as you already know).

No, the usual venv setup for testing the constant changes. I can see all the keystrokes in the log. It’s definitely all going through keyszer. It shows the correct combo in the right keymap being triggered, but the original keystrokes also show in the log as if there is no remapping happening. More keystrokes than when I use the same combo in X11. I don’t recall ever seeing anything like it before.

It’s my understanding that the grabbing of the keyboard device should be independent of the display server, with the only interaction with the display server being the context queries to see what the attributes of the focused window are at the moment you press the keys. The dict is sending x_error: False to KeyContext just like the xorg module. So it’s pretty strange. The logic should be stopping the unmapped keys from coming out.

I’ll have to litter transform with debugging output and see if I can catch it going somewhere it shouldn’t. Then figure out why.

RedBearAK avatar Feb 11 '23 19:02 RedBearAK

but the original keystrokes also show in the log as if there is no remapping happening.

I'm not sure what this means... the log always shows input and output... so it's hard to imagine what you're seeing... a small log of a single combo might be nice to glance at. (X11 vs Wayland)

that the grabbing of the keyboard device should be independent of the display server

That would also have been my assumption. But Wayland is a whole other thing, in many ways not like X11 at all - perhaps it's ALSO grabbing the keyboard - which is why I was asking about load order.

The logic should be stopping the unmapped keys from coming out.

Well there is no explicit logic to do that - we just only output what we choose... grabbing the input means that nothing else can hear it ... unless we proxy it - like we do in the case of non-key events, etc.

joshgoebel avatar Feb 11 '23 20:02 joshgoebel

(--) Autodetecting all keyboards (--device not specified)
(+K) Grabbing AT Translated Set 2 keyboard (/dev/input/event1)

It shouldn't be possible for anything but evdev(?) to see the input from the real keyboard device after it's been grabbed, correct?

Maybe I'm misinterpreting what the logs are showing. There are "(OO)" lines for X11 for the input keys too. But in X11, the input keys don't actually end up doing anything.

Actually, now that I think about it, it may be more like the app is ONLY seeing the real input, and NOT seeing the mapped output. Even though the correct keymap seems to be getting triggered and supposedly the output keys should be going out. In the other apps like Firefox, shortcuts like this are probably working OK because the shortcut works after just modmapping the modifiers, without needing to be transformed further. That would explain why it also isn't working correctly in GNOME Text Editor.

Probably need to take a closer look at the actual window classes... If they're all slightly different from the X11 names... 😫

But wait, like I said, the string output I set up to output in the terminal absolutely works. Although, it should output "What the heck", but outputs "what the heck", with a lowercase "w".

These log examples show different keymaps being triggered, but that's just because I set up a special keymap for the terminal in Wayland to try and figure out what's going on. Before that the log showed it triggering on the same shortcut from "General GUI" just like it should.

This is one press of the physical keys Alt-Shift-Left_Brace (logical after modmap: RC-Shift-Left_Brace). Wayland. Has some added debugging lines.

(II) in LEFT_ALT (press)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) modmap: LEFT_ALT => RIGHT_CTRL [Conditional modmap - General GUI - not in remotes or terminals]
(DD) on_key RIGHT_CTRL press
(DD) suspending keys [RCtrl<Key.RIGHT_CTRL>]

(II) in LEFT_SHIFT (press)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key LEFT_SHIFT press
(DD) resuspending keys
(DD) suspending keys [RCtrl<Key.RIGHT_CTRL>, LShift<Key.LEFT_SHIFT>]
(DD) resuming keys: [<Key.RIGHT_CTRL: 97>, <Key.LEFT_SHIFT: 42>]
(OO) press RIGHT_CTRL 1676153137.7347488
(OO) press LEFT_SHIFT 1676153137.7348669

(II) in RIGHT_BRACE (press)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key RIGHT_BRACE press

(DD) WM_CLS: 'gnome-terminal-server' | WM_NME: 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'
(DD) DVN: 'AT Translated Set 2 keyboard' | CLK: 'False' | NLK: 'False'
(DD) KMAPS: ['User hardware keys', 'Wordwise - not vscode',
(DD)         'What is wrong with gnome-terminal-server', 'General Terminals',
(DD)         'General GUI']
(DD) COMBO: RCtrl-LShift-RIGHT_BRACE => Ctrl-PAGE_DOWN in KMAP: ['What is wrong with gnome-terminal-server']
(DD) spent modifiers []
(OO) release LEFT_SHIFT 1676153137.767943
(OO) press PAGE_DOWN 1676153137.768032
(OO) release PAGE_DOWN 1676153137.7680573
(OO) press LEFT_SHIFT 1676153137.768091

(II) in RIGHT_BRACE (release)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key RIGHT_BRACE release

(II) in LEFT_SHIFT (release)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key LEFT_SHIFT release
(DD) resume because of mod release
(OO) release LEFT_SHIFT 1676153137.893625

(II) in LEFT_ALT (release)
(DD) ######### ## ###  ctx_gnome_dbus_test.py:
	wm_class = 'gnome-terminal-server'
	wm_name = 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto-keyszer/kinto.py'

(DD) on_key RIGHT_CTRL release
(DD) resume because of mod release
(OO) release RIGHT_CTRL 1676153137.9142215

This is in X11:

(II) in LEFT_ALT (press)
(DD) modmap: LEFT_ALT => RIGHT_CTRL [Conditional modmap - Terminals]
(DD) on_key RIGHT_CTRL press
(DD) suspending keys [RCtrl<Key.RIGHT_CTRL>]
(DD) resuming keys: [<Key.RIGHT_CTRL: 97>]
(OO) press RIGHT_CTRL 1676153295.8804917

(II) in LEFT_SHIFT (press)
(DD) on_key LEFT_SHIFT press
(OO) press LEFT_SHIFT 1676153295.888439

(II) in LEFT_BRACE (press)
(DD) on_key LEFT_BRACE press

(DD) WM_CLS: 'Gnome-terminal' | WM_NME: 'clear && ./bin/keyszer --flush -w -v -c ~/.config/kinto/kinto.py'
(DD) DVN: 'AT Translated Set 2 keyboard' | CLK: 'False' | NLK: 'False'
(DD) KMAPS: ['User hardware keys', 'Wordwise - not vscode',
(DD)         'General Terminals', 'General GUI']
(DD) COMBO: RCtrl-LShift-LEFT_BRACE => Ctrl-PAGE_UP in KMAP: ['General GUI']
(DD) spent modifiers []
(OO) release LEFT_SHIFT 1676153295.9391887
(OO) press PAGE_UP 1676153295.9392915
(OO) release PAGE_UP 1676153295.939376
(OO) press LEFT_SHIFT 1676153295.9395306

(II) in LEFT_BRACE (release)
(DD) on_key LEFT_BRACE release

(II) in LEFT_SHIFT (release)
(DD) on_key LEFT_SHIFT release
(DD) resume because of mod release
(OO) release LEFT_SHIFT 1676153296.0417325

(II) in LEFT_ALT (release)
(DD) on_key RIGHT_CTRL release
(DD) resume because of mod release
(OO) release RIGHT_CTRL 1676153296.0687592

RedBearAK avatar Feb 11 '23 22:02 RedBearAK

There is definitely a trend of the same app using a different class/name in Wayland vs X11. Annoying. Or maybe it's more of a difference between Fedora and Ubuntu? But no, I have an app that was installed from Flathub in both cases, and GNOME Terminal has always shown something different in X11 on either distro, so I think it's more of some apps choosing to show a different class and "title" for Wayland.

But that should just be a matter of adding additional patterns to match those apps. Which was already a thing that often needed to be done for different distros somehow using a different WM_CLASS for the same app, and native packages vs Flatpak or other sources having a different WM_CLASS. Shouldn't be a big deal.

And again, I can activate a special keymap for some app like GNOME Text Editor, with the class it's showing me in Wayland (org.gnome.TextEditor vs gnome-text-editor) and those mappings will only work in that app window. So the matching on the window attributes (in Wayland) is absolutely working. But, things just don't come out quite right.

The string output when I press Grave in Text Editor is set up to be (after it clears isDoubleTap):

You have double-tapped the Grave key!

But instead what I get is:

you have double-tapped the grave key1

All lower case. It's very consistent. But there isn't even a Shift key involved in the input combo. So why are the shifted characters consistently losing their shifted state, while the unshifted characters remain unshifted? And the result is the same whether I have CapsLock enabled or disabled. (This is a new pull of keyszer, without the unmerged fixes for the string and Unicode processors to adjust behavior for CapsLock LED status). The lack of the CapsLock fix doesn't explain what's happening, since toggling CapsLock has no effect on the output.

I'm at a total loss to understand this so far.

RedBearAK avatar Feb 11 '23 23:02 RedBearAK

Straining my brain here... It's as if the modifier keys (and only the modifier keys) in the output side of the mapping are not actually making it to output. Meanwhile, "normal" keys (letters, numbers, punctuation keys) that aren't modifier keys are making it to output and being "seen".

But, like those Shift key presses and releases that are supposed to be part of the string output, they are all there in the log. Just somehow they don't seem to be getting emitted by the virtual keyboard. So what the app sees is just lowercase letters, the number "1" instead of "Shift-Key_1", and so on.

RedBearAK avatar Feb 11 '23 23:02 RedBearAK

This is really making no sense to me. It's like something is "catching" the output keystrokes (and input keystrokes?), but only when they are modifier keys. The events are happening just the way they are supposed to, AFAICT.

Press LEFT_SHIFT, press "Y", release "Y", release LEFT_SHIFT, etc.

(DD) DVN: 'AT Translated Set 2 keyboard' | CLK: 'False' | NLK: 'False'
(DD) KMAPS: ['OptSpecialChars toggles', 'OptSpecialChars - US',
(DD)         'User hardware keys', 'GNOME Text Editor',
(DD)         'Wordwise - not vscode', 'Cmd+Dot not in terminals',
(DD)         'General GUI']
(DD) COMBO: GRAVE => <function isDoubleTap.<locals>._isDoubleTap at 0x7f1213e7dfc0> in KMAP: ['GNOME Text Editor']
(DD) spent modifiers []
(DD) ## isDoubleTap: 
	Time diff (just right): 
	_tapTime - tapTime1=0.18475580215454102
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.LEFT_SHIFT: 42>, <Action.PRESS: 1>)
(OO) press LEFT_SHIFT 1676158231.0707545
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.Y: 21>, <Action.PRESS: 1>)
(OO) press Y 1676158231.0709589
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.Y: 21>, <Action.RELEASE: 0>)
(OO) release Y 1676158231.0710757
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.LEFT_SHIFT: 42>, <Action.RELEASE: 0>)
(OO) release LEFT_SHIFT 1676158231.0711682
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.O: 24>, <Action.PRESS: 1>)
(OO) press O 1676158231.0713558
(DD) ### ### ### output.py:
	self, key, action = (<keyszer.output.Output object at 0x7f1213d4f940>, <Key.O: 24>, <Action.RELEASE: 0>)
(OO) release O 1676158231.0714197

Maybe I should ask the evdev folks.

RedBearAK avatar Feb 11 '23 23:02 RedBearAK

Re: Shifting for "faking typing"... output looks like it's hitting shift, if that's not registering in Wayland, no idea.

Re: X11 vs Wayland, both outputs look like "what I'd expect" IF you have suspend turned down super low... the original keys DO get sent thru right away - then they are lifted before the real combos... turning suspend up is what prevents this, and it's what I always preferred. I don't see anything unexpected happening at a glance in either log.

joshgoebel avatar Feb 12 '23 04:02 joshgoebel

IF you have suspend turned down super low... the original keys DO get sent thru right away - then they are lifted before the real combos... turning suspend up is what prevents this

Working in a Boxes VM on the same laptop with the touchpad that requires it be turned down for any kind of Mod+click to work. Thought maybe the device inside the VM would act more like a regular mouse, but doesn't seem to.

So... if the original keys are always getting sent through with a low suspend timeout, why don't they ever seem to actually do anything, at least when things are working normally? I know there are combos I'm using where the input combo would make something else happen, if it was getting through to the app. Seems like I still don't precisely understand what happens to the input combo. Or is it just that the original modifier key presses get through, then released, before the full "combo" with the regular key and transformed modifiers are pressed?

output looks like it's hitting shift, if that's not registering in Wayland, no idea

That does seem to be the case, unfortunately. Like a selective filter.

RedBearAK avatar Feb 12 '23 11:02 RedBearAK

Or is it just that the original modifier key presses get through, then released, before the full "combo" with the regular key and transformed modifiers are pressed?

Yes. Most software doesn't seem to care about modifiers until a non-modifier is hit - and only the modifiers held down at that moment seem to matter.
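A small simulation of that rule, as an illustration only: it models an application that records a shortcut each time a non-modifier key is pressed, together with whichever modifiers are held at that instant. The function name and the modifier set are hypothetical; the event list is copied from the (OO) output lines of the X11 log earlier in this thread.

```python
MODIFIERS = {
    "LEFT_SHIFT", "RIGHT_SHIFT",
    "LEFT_CTRL", "RIGHT_CTRL",
    "LEFT_ALT", "RIGHT_ALT",
}


def combos_seen(events):
    """Return (held_modifiers, key) for every non-modifier key press."""
    held = set()
    combos = []
    for action, key in events:
        if key in MODIFIERS:
            # Modifier presses and releases only update the held set.
            if action == "press":
                held.add(key)
            else:
                held.discard(key)
        elif action == "press":
            # A non-modifier press is the moment a shortcut is evaluated.
            combos.append((frozenset(held), key))
    return combos


# The (OO) lines from the X11 log above, in order.
x11_output = [
    ("press", "RIGHT_CTRL"),
    ("press", "LEFT_SHIFT"),
    ("release", "LEFT_SHIFT"),
    ("press", "PAGE_UP"),
    ("release", "PAGE_UP"),
    ("press", "LEFT_SHIFT"),
    ("release", "LEFT_SHIFT"),
    ("release", "RIGHT_CTRL"),
]

print(combos_seen(x11_output))
```

Under this model the only shortcut the application registers is RIGHT_CTRL plus PAGE_UP: the early Shift press is released before PAGE_UP arrives, so it never takes part in a combo.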

joshgoebel avatar Feb 12 '23 14:02 joshgoebel

Yes. Most software doesn't seem to care about modifiers until a non-modifier is hit - and only the modifiers held down at that moment seem to matter.

I see, better now than before.

But the "normal" non-modifier key that completes the input combo, that doesn't end up going through before the input modifiers are released, right? (Or at all, if it's not in the transformed combo.) Or at least it's not supposed to? Because that would actually make the app do whatever the input combo would do. But that normally doesn't happen. So there is ultimately one key press & release for each combo that should be getting suppressed, right?

RedBearAK avatar Feb 12 '23 23:02 RedBearAK

Cross-posting from evdev issue thread:

Wait, adding a 0.05s delay to a part of the output module does have a significant effect. As in, it makes the intended transformed set of keystrokes "work" as they should.

    def __send_sync(self):
        time.sleep(0.05)
        _uinput.syn()

But there is still the very strange problem where the input combo, that the app window is never supposed to "see", is still being "seen" in addition to the transformed keystrokes. I've been working with this keymapper for a couple of years in X11 environments and have never seen this phenomenon. Which is why I came over here to see if there is a chance the input is not being successfully isolated from Wayland when the keymapper "grabs" the device, which has always worked fine in X11.

Alright, I slowed things WAAAY down with a 0.5s delay [in output.py sync function], and got this:

WWhhaatt  tthhee  hheecckk

The macro string is supposed to be:

What the heck

The CapsLock key that triggers the macro is "seen" by the app immediately (I see an on-screen notification for CapsLock and NumLock keys) and then this macro string is slowly typed out, with each character typed twice. Maybe because the long delay is before sending a sync, and something is interpreting it as the key being tapped again? But there are two capital "W", which would require Shift-W in both cases.

I'm not completely familiar with how the event code sequences work, and why this long delay causes this doubling up of the characters.

RedBearAK avatar Feb 12 '23 23:02 RedBearAK

Well, good news, in a way. The issue with CapsLock was actually a red herring. For some reason, when Boxes has the keyboard focus inside the VM window, it isn't really isolating the keyboard input within the VM. I guess it's doing the "shared keyboard" kind of thing, even though I have to move the mouse up to the window's top bar to Cmd+Tab away from the Boxes window. The CapsLock notifications were from the HOST outside the VM. Not inside. Which is why the output of the strings inside the VM never changed. Duh.

Adding the delay in the sync function, if the delay is long enough, completely fixes the behavior of transformed shortcuts not doing what they are supposed to do. But it has to be pretty long. Like 0.01 to 0.05 or so [Edit: After removing an additional 0.1s delay from the Unicode function, the minimum delay really seems to be at least 0.05s for any kind of reliability]. So once again we're talking about a pretty significant delay if you want to spit out a macro string or something. And it's still not 100% reliable. Stops part way through macro strings sometimes, even with the delay.

But, in a very technical sense, you could say that I have actually succeeded in bringing support for app-specific mappings using keyszer on Wayland+GNOME.

Yay me. I'm amazing.

It just really seems to have a major reliability problem, like when I was trying to get my Option-key special characters to work correctly in Kubuntu, and the only thing that seemed to help was a similar delay (before the Enter keystroke in the sequence that comes back from the Unicode processor) or disabling the sync entirely.

😞

I'm kind of expecting this to work better on a bare metal install, but have no direct evidence to support that yet. Will have to try that next.

RedBearAK avatar Feb 13 '23 00:02 RedBearAK

😡 😠 👿 Of course, the sleep delay is so long that it actually gets in the way of normal typing, even when not dealing with transformed shortcuts or macros. Without finding an explanation as to why everything is so unreliable without the delay in certain situations, this is an unusable solution.

RedBearAK avatar Feb 13 '23 00:02 RedBearAK

How is this even possible, when the keystrokes that should produce the "!" character should be intrinsically connected to each other as a single combo, and the Shift key shouldn't even be pressed until after the "y" key is released?

You have double-tapped the Grave keY1

Seriously:

You have double-tapped THE Grave key1

I feel like this is a powerful clue.

RedBearAK avatar Feb 13 '23 02:02 RedBearAK

"You have double-tapped THE Grave key1" should be intrinsically connected to each other as a single combo [emphasis mine]

No such thing at the low level, it's just a sequence of keyboard events - combos aren't really separated by anything... that said it should still be sequential, so that is quite confusing to me also - but what does the output log show for that?

I imagine that could happen if you ripped out the SYNC events... so I'm not sure if you are still hacking such things... Because using sync is how you signal multiple events happened at the same time... so if you weren't SYNCing you could wind up where it's entirely ambiguous when shift was pressed if it was just part of a huge set of characters...
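A minimal sketch of that point, not taken from the keyszer source: the event-type and key-code constants are the standard Linux input values, while `split_into_frames` is a hypothetical helper that only illustrates how a consumer might group events between SYN_REPORT markers.

```python
# Standard Linux input event constants (linux/input-event-codes.h).
EV_SYN, EV_KEY = 0x00, 0x01
SYN_REPORT = 0
KEY_W, KEY_LEFTSHIFT = 17, 42
PRESS, RELEASE = 1, 0

SYN = (EV_SYN, SYN_REPORT, 0)


def split_into_frames(events):
    """Hypothetical helper: group (type, code, value) events into frames,
    where each frame is closed by a SYN_REPORT event."""
    frames, current = [], []
    for ev_type, code, value in events:
        if ev_type == EV_SYN and code == SYN_REPORT:
            frames.append(current)
            current = []
        else:
            current.append((code, value))
    if current:  # trailing events that were never synced
        frames.append(current)
    return frames


# Shift-W with a sync after every key event: four separate frames.
synced = [
    (EV_KEY, KEY_LEFTSHIFT, PRESS), SYN,
    (EV_KEY, KEY_W, PRESS), SYN,
    (EV_KEY, KEY_W, RELEASE), SYN,
    (EV_KEY, KEY_LEFTSHIFT, RELEASE), SYN,
]

# Same key events with the intermediate syncs stripped: one big frame.
unsynced = [ev for ev in synced if ev != SYN] + [SYN]

print(len(split_into_frames(synced)))    # 4
print(len(split_into_frames(unsynced)))  # 1
```

In the second case a receiver that treats a frame as "everything that happened at once" has nothing to say whether Shift went down before or after W.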

joshgoebel avatar Feb 14 '23 05:02 joshgoebel

Logs are fine. Both the keyszer log, and the log showing what evtest sees coming out of the virtual keyboard device. Perfect order. But whatever is receiving the keystrokes seems to be jumbling things if they are too close together time-wise. Whether that's X11/Wayland, the window manager/shell, the kernel, or just the app window, I have no idea yet how it works on that side of things.

I'm leaving the sync event alone in send_key_action, leaving it disabled in send_event since the real keys send their own sync events through there, as far as I understand. So that's not the problem.

It really feels like a router with a buggy algorithm failing to put packets back together in the correct transmission sequence.

Anyway, I found that regular typing can still be usable if we move the delay into send_combo. A delay wrapped around the "normal" key press-release is pretty effective, with the same delay before the modifier press and after the release making it slightly more effective. But the delay needs to be at least 0.03s in this situation, or things start to go wrong.

This way, shortcuts are working as expected, with no obvious delay. Macros are quite slow, about like a 40wpm typist, but they don't get screwed up anymore even with sequences of several shifted characters together, and normal typing input is not messed with because the delay is not attached to the sync function.

        key_delay_testing = 0.03    # delay to insert between mod+key press/release

        for key in mod_keys_we_need_to_lift:
            # time.sleep(key_delay_testing)
            self.send_key_action(key, RELEASE)
            released_mod_keys.append(key)
            time.sleep(key_delay_testing)

        for key in [mod.get_key() for mod in mods_we_need_to_press]:
            time.sleep(key_delay_testing)
            self.send_key_action(key, PRESS)
            pressed_mod_keys.append(key)
            # time.sleep(key_delay_testing)

        # normal key portion of the combo
        time.sleep(key_delay_testing)
        self.send_key_action(combo.key, PRESS)
        self.send_key_action(combo.key, RELEASE)
        time.sleep(key_delay_testing)

I've tried many different iterations and this is about the best I've achieved. Tried leaving out the delay if there's no modifier involved, but the results were always bad.

RedBearAK avatar Feb 14 '23 06:02 RedBearAK

Nope, not reliable at 0.03s after further testing. Had to push it to 0.04s. And the only delay lines needed are around the "normal" key press-release event. The others didn't really help.

        key_delay_testing = 0.04    # delay to insert between mod+key press/release

        for key in mod_keys_we_need_to_lift:
            self.send_key_action(key, RELEASE)
            released_mod_keys.append(key)

        for key in [mod.get_key() for mod in mods_we_need_to_press]:
            self.send_key_action(key, PRESS)
            pressed_mod_keys.append(key)

        # normal key portion of the combo
        time.sleep(key_delay_testing)
        self.send_key_action(combo.key, PRESS)
        self.send_key_action(combo.key, RELEASE)
        time.sleep(key_delay_testing)
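For anyone who wants to try the timing outside of keyszer, here is a self-contained sketch of the same idea. `SketchOutput` is a hypothetical stand-in, not the real output class: it records key actions in a list instead of writing to a uinput device, and it simplifies the modifier handling to "press mods, tap key, release mods".

```python
import time

PRESS, RELEASE = 1, 0


class SketchOutput:
    """Hypothetical stand-in for the keymapper's output class."""

    def __init__(self, key_delay=0.04, sleep=time.sleep):
        self.key_delay = key_delay  # seconds; 0.04 is the value tested above
        self.sleep = sleep          # injectable so the delay can be stubbed out
        self.sent = []              # recorded (key, action) pairs

    def send_key_action(self, key, action):
        # The real method would write the event plus a sync to the device.
        self.sent.append((key, action))

    def send_combo(self, mods, key):
        for mod in mods:
            self.send_key_action(mod, PRESS)

        # Delay only around the "normal" key press-release.
        self.sleep(self.key_delay)
        self.send_key_action(key, PRESS)
        self.send_key_action(key, RELEASE)
        self.sleep(self.key_delay)

        for mod in reversed(mods):
            self.send_key_action(mod, RELEASE)


out = SketchOutput(key_delay=0.04)
out.send_combo(["LEFT_SHIFT"], "W")
print(out.sent)
```

Passing a different `sleep` callable makes it easy to compare delay values without touching the combo logic itself.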

RedBearAK avatar Feb 14 '23 07:02 RedBearAK

capslock 🌹🌹€€‡‡ÿÿ‡e‡-tapped11111

I was really confused by this phenomenon, but I think I figured out why this duplication of Unicode characters happens when the timing issue is particularly bad. Something is interpreting the keystroke sequence as somehow being both Unicode entry methods:

  • Shift-Ctrl-U, hold modifiers, type Unicode address, release modifiers (no Enter key)
  • Shift-Ctrl-U, release modifiers, type Unicode address, hit Enter key

Thus, two identical Unicode characters sometimes. I can't replicate it manually, but this is the only possible explanation.
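The two entry methods listed above can be written out as key sequences to make the difference concrete. This is only an illustration: the key names are placeholder strings, not keyszer's key constants, and the helper functions are hypothetical.

```python
PRESS, RELEASE = "press", "release"


def tap(key):
    """Press and release a single key."""
    return [(key, PRESS), (key, RELEASE)]


def hex_taps(codepoint):
    """Tap each hex digit of the Unicode code point."""
    seq = []
    for digit in format(codepoint, "x").upper():
        seq += tap(digit)
    return seq


def unicode_held_modifiers(codepoint):
    """Shift-Ctrl-U, hold modifiers, type address, release modifiers."""
    seq = [("SHIFT", PRESS), ("CTRL", PRESS)] + tap("U")
    seq += hex_taps(codepoint)
    seq += [("CTRL", RELEASE), ("SHIFT", RELEASE)]
    return seq


def unicode_with_enter(codepoint):
    """Shift-Ctrl-U, release modifiers, type address, hit Enter."""
    seq = [("SHIFT", PRESS), ("CTRL", PRESS)] + tap("U")
    seq += [("CTRL", RELEASE), ("SHIFT", RELEASE)]
    seq += hex_taps(codepoint)
    seq += tap("ENTER")
    return seq


for step in unicode_with_enter(0x20AC):  # U+20AC is the euro sign
    print(step)
```

The only differences are where the modifier releases fall relative to the hex digits and whether Enter is sent, so a receiver that reorders closely spaced events could plausibly see something that looks like either form.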

Submitted a description of the overall issue in the libinput GitLab. We'll see if anyone has a response.

RedBearAK avatar Feb 15 '23 01:02 RedBearAK