Not clear how to change language/keyboard layout on the fly for the 'insert into active window' STT feature on Wayland
I'm using Gnome on Wayland with ydotool.
There is a 'keyboard layout' field in the advanced settings available under Wayland. When set to a layout name that corresponds to the language of the speaker (e.g. 'en' for English speech), correct output is inserted into the target application.
When left empty, no letter characters are inserted into the focused application.
When set to an incorrect layout name (e.g. 'ru' while speaking English), the output looks as if the user were typing with the wrong keyboard layout. For example, saying 'testing' produces 'Lbcnby', which, typed with the Cyrillic layout, would be 'Дистин' ('Distin'). That is not a real word, but it is how the algorithm interprets what was spoken when the wrong language is assumed.
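The 'Lbcnby' / 'Дистин' mismatch can be reproduced with a small sketch. It assumes the standard US QWERTY and Russian ЙЦУКЕН key positions, and the function name `keys_for_russian_text` is made up for illustration. This is not how Speech Note generates keystrokes internally; it only models pressing the keys for Russian text while the system reads them with the English (US) layout.

```python
# Physical key positions, row by row, for two common layouts.
# Assumption: plain US QWERTY and Russian (ЙЦУКЕН) letter keys only;
# shifted punctuation is not modelled accurately.
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,./"
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбю."

# Map each Cyrillic character to the Latin character on the same key.
_RU_TO_US = str.maketrans(JCUKEN + JCUKEN.upper(), QWERTY + QWERTY.upper())


def keys_for_russian_text(text: str) -> str:
    """Return what appears when the keys for `text` (Russian layout)
    are interpreted with the English (US) layout instead."""
    return text.translate(_RU_TO_US)


print(keys_for_russian_text("Дистин"))  # Lbcnby
```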
Is it possible to paste into the 'keyboard layout' field something that would allow both keyboard layouts, so that when the user speaks Russian, the Cyrillic 'ru' layout is used and when the user speaks English, the Latin 'en' layout is used?
For example, something like localectl status | grep 'VC Keymap' | awk '{print $NF}' on my GNOME system reports the current layout. Obviously, something like this is not meant to be pasted into the 'keyboard layout' field.
Is there a way to do what I want? Am I missing something?
As a sidenote, it seems that clipboard on Wayland doesn't work either. Whatever language I speak, nothing is copied there. But for that I should file a separate issue, I suppose.
With X11, both clipboard and active window features work just fine and both the language and layout are automatically chosen correctly. Guess I will use that for now.
Thank you for your effort with this app, otherwise it seems great.
Hi. Thanks for the questions.
There is a 'keyboard layout' field in the advanced settings available under Wayland.
The "Insert into active window" feature converts text into a set of simulated keystrokes that should generate the same text using the currently configured keyboard layout. If you have multiple layouts configured, it uses the first layout. You can change this by specifying the "keyboard layout" in the advanced settings. For some reason, your first layout is not working.
Could you please start the app with the --verbose option and try to use "Insert into active window" with an empty "keyboard layout" setting?
flatpak run net.mkiol.SpeechNote --verbose
The interesting traces are between these lines:
init_ydo:394 - using ydo fake-keyboard and make_compose_table:381 - trying compose file:
Can you share them?
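One way to cut that span out of a saved log is sketched below. It assumes the verbose output was first captured to a file or string; the helper name `traces_between` is hypothetical, and the two default marker strings are the ones named in the request.

```python
def traces_between(log_text: str,
                   start: str = "using ydo fake-keyboard",
                   end: str = "trying compose file:") -> list[str]:
    """Return the lines from the first line containing `start`
    through the first following line containing `end` (inclusive)."""
    out: list[str] = []
    capturing = False
    for line in log_text.splitlines():
        if not capturing and start in line:
            capturing = True
        if capturing:
            out.append(line)
            if end in line:
                break
    return out
```

For example, `"\n".join(traces_between(open("speechnote.log").read()))` would give the block to paste into the thread, where `speechnote.log` is a placeholder file name.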
Is it possible to paste into the 'keyboard layout' field something that would allow both keyboard layouts, so that when the user speaks Russian, the Cyrillic 'ru' layout is used and when the user speaks English, the Latin 'en' layout is used?
This is a very interesting use case that I overlooked - sorry. When everything works correctly, the layout set as active in the system should be selected. Now I see that perhaps more intelligence should be added here. The layout should be selected based on the language of the currently selected model. I don't know why I didn't implement it this way. I'm adding this to the list of things to do in the next version.
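A minimal sketch of the selection order described here: an explicit "keyboard layout" setting wins, otherwise the layout matching the language of the selected model is used, otherwise the first layout. The function name `pick_layout`, the `LANG_TO_LAYOUT` table, and the way a language code is matched to a layout name are all assumptions for illustration, not Speech Note's actual code.

```python
from typing import Optional, Sequence

# Hypothetical mapping from a language code to a layout name.
# A real implementation would need a far more complete table.
LANG_TO_LAYOUT = {"en": "English (US)", "ru": "Russian", "de": "German"}


def pick_layout(layouts: Sequence[str],
                model_lang: str,
                override: Optional[str] = None) -> int:
    """Return the index into `layouts` that should be used.

    Order of preference: explicit override, model language, first layout.
    """
    for code in (override, model_lang):
        name = LANG_TO_LAYOUT.get(code or "")
        if name in layouts:
            return layouts.index(name)
    return 0  # fall back to the first configured layout


layouts = ["English (US)", "Russian", "English (US)"]
print(pick_layout(layouts, model_lang="ru"))                 # 1
print(pick_layout(layouts, model_lang="en"))                 # 0
print(pick_layout(layouts, model_lang="en", override="ru"))  # 1
```

Falling back to index 0 keeps the behaviour described earlier in this reply when nothing matches.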
As a sidenote, it seems that clipboard on Wayland doesn't work either. Whatever language I speak, nothing is copied there.
This does not work "by design". Wayland does not allow access to clipboard data when the application is not active. This is a security feature, and there is nothing I can do to resolve this issue. I wish I could.
With X11, both clipboard and active window features work just fine and both the language and layout are automatically chosen correctly.
Indeed, it works correctly. Due to the lack of any security measures, X11 allows the app to access almost everything :) Moreover, X11 has a dedicated API for triggering key events, which makes "Insert into active window" much easier to implement.
Hey @mkiol, thanks for working on this project! I also have this issue with a German Keyboard. I ran it with --verbose.
[W] 11:32:58.737680683.737 0x778d63f56d00 wly_keyboard_keymap:894 - map shm failed: 1
**xkbcommon: ERROR: [XKB-822] Failed to parse input xkb string**
**[E] 11:32:58.737706266.737 0x778d63f56d00 wly_keyboard_keymap:914 - failed to get xkb_keymap from wayland**
[D] 11:32:58.737713117.737 0x778d63f56d00 connect_wayland:806 - wayland roundtrip done
**[D] 11:32:58.737717134.737 0x778d63f56d00 init_ydo:434 - fallback, create standard US keymap**
**[D] 11:32:58.739127602.739 0x778d63f56d00 init_ydo:453 - keyboard layouts:**
**[D] 11:32:58.739140644.739 0x778d63f56d00 init_ydo:463 - 0:English (US)**
[D] 11:32:58.739145582.739 0x778d63f56d00 init_ydo:475 - keyboard layout to use: 0
[D] 11:32:58.739149479.739 0x778d63f56d00 get_l3_shift_keycode:184 - l3_shift is mapped to keycode: 92 0
[D] 11:32:58.739159506.739 0x778d63f56d00 make_compose_table:381 - trying compose file: /usr/share/X11/locale/en_GB.UTF-8/Compose
[W] 11:32:58.739166047.739 0x778d63f56d00 make_compose_table:389 - can't open compose file: /usr/share/X11/locale/en_GB.UTF-8/Compose
[D] 11:32:58.739168611.739 0x778d63f56d00 make_compose_table:381 - trying compose file: /usr/share/X11/locale/C/Compose
xkbcommon: ERROR: Couldn't read Compose file (unknown file): Invalid argument
[D] 11:32:58.741037485.741 0x778d63f56d00 () - stt intermediate text decoded: *** "en" 0
[D] 11:32:58.741288685.741 0x778d63f56d00 () - app service state: listening-manual => idle
[D] 11:32:58.741360716.741 0x778d63f56d00 operator():349 - connected ydo socket: /tmp/.ydotool_socket
[D] 11:32:58.741444756.741 0x778d63f56d00 operator():349 - connected ydo socket: /tmp/.ydotool_socket
[W] 11:32:58.743818345.743 0x778d63f56d00 () - no available mnt langs
[W] 11:32:58.743875250.743 0x778d63f56d00 () - no available mnt out langs
[W] 11:32:58.743884195.743 0x778d63f56d00 () - no available tts models for in mnt
[W] 11:32:58.743887961.743 0x778d63f56d00 () - no available tts models for out mnt
[D] 11:32:58.743896786.743 0x778d63f56d00 () - app task state: processing => idle
This would explain why it falls back to an English keyboard layout, causing my QWERTZ keyboard to be mapped to QWERTY.
Any ideas for a quick fix on the same version without switching back to X11? 😅 X11 really sucks with my external monitor... I believe this warrants a bug label.
Many thanks for your work!
@mkiol strange note here... While I was down the debug rabbit hole, I tried everything and thought I had hit a dead end and would have to wait for upstream fixes... Then I switched back to X11 and back to Wayland, and it's working again all of a sudden...
Do with that info what you can! Best
I apologize for not responding 3 weeks ago, somehow I missed the email and forgot to regularly check the thread for responses.
Here are the traces with an empty 'keyboard layout' field:
[D] 15:03:47.403401534.403 0x7f62dc5c5d00 init_ydo:394 - using ydo fake-keyboard
[D] 15:03:47.403469280.403 0x7f62dc5c5d00 operator():349 - connected ydo socket: /tmp/.ydotool_socket
[D] 15:03:47.408239984.408 0x7f62dc5c5d00 init_ydo:453 - keyboard layouts:
[D] 15:03:47.408289083.408 0x7f62dc5c5d00 init_ydo:463 - 0:English (US)
[D] 15:03:47.408301794.408 0x7f62dc5c5d00 init_ydo:463 - 1:Russian
[D] 15:03:47.408318416.408 0x7f62dc5c5d00 init_ydo:463 - 2:English (US)
[D] 15:03:47.408333013.408 0x7f62dc5c5d00 init_ydo:475 - keyboard layout to use: 0
[D] 15:03:47.408357387.408 0x7f62dc5c5d00 get_l3_shift_keycode:184 - l3_shift is mapped to keycode: 92 0
[D] 15:03:47.408377990.408 0x7f62dc5c5d00 get_l3_shift_keycode:184 - l3_shift is mapped to keycode: 134 0
[D] 15:03:47.408423666.408 0x7f62dc5c5d00 make_compose_table:381 - trying compose file: /usr/share/X11/locale/en_US.UTF-8/Compose
I'm adding this to the list of things to do in the next version.
Thank you, that's very cool to hear
@NoHara42 If it's not too much trouble, could you also send me the traces containing wl global: interface=? You should see them when you try to use "insert into active window" in Wayland. I want to compare the API versions. I suspect that the Wayland protocol on your system is newer than the one I used for testing. That's why the implementation in Speech Note is not fully compatible.
The example:
[D] 17:57:22.405592210.405 0x7fa205fe1d00 connect_wayland:781 - connect wayland
[D] 17:57:22.405788917.405 0x7fa205fe1d00 wly_global_callback:829 - wl global: interface=wl_compositor version=6
[D] 17:57:22.405799016.405 0x7fa205fe1d00 wly_global_callback:829 - wl global: interface=zwp_tablet_manager_v2 version=2
[D] 17:57:22.405803925.405 0x7fa205fe1d00 wly_global_callback:829 - wl global: interface=zwp_keyboard_shortcuts_inhibit_manager_v1 version=1
[D] 17:57:22.405808724.405 0x7fa205fe1d00 wly_global_callback:829 - wl global: interface=zxdg_decoration_manager_v1 version=1
[D] 17:57:22.405813383.405 0x7fa205fe1d00 wly_global_callback:829 - wl global: interface=wp_viewporter version=1
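For comparing the advertised protocol versions between two systems, a sketch like the following could turn such traces into a table. It assumes the `wl global: interface=... version=...` line format shown in the example; the function names `wl_globals` and `version_diff` are hypothetical helpers, not part of Speech Note.

```python
import re

# Matches the "wl global" traces printed in verbose mode.
_WL_GLOBAL = re.compile(r"wl global: interface=(\S+) version=(\d+)")


def wl_globals(log_text: str) -> dict[str, int]:
    """Map each advertised Wayland interface name to its version."""
    return {m.group(1): int(m.group(2)) for m in _WL_GLOBAL.finditer(log_text)}


def version_diff(a: dict[str, int], b: dict[str, int]) -> dict[str, tuple]:
    """Interfaces whose version differs, or that appear in only one log.

    Values are (version_in_a, version_in_b); a missing interface is None.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }
```

Calling `version_diff(wl_globals(my_log), wl_globals(other_log))` lists only the interfaces worth looking at.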
@mkiol I thought perhaps giving you the whole run of me testing my mic to fake-keyboard input with shortcuts would help. https://pastebin.com/B1mqaREE