
Influence of keyboard layout (xdotool limitation)

Open papoteur-mga opened this issue 3 years ago • 17 comments

Hello, thanks for this tool, which makes vosk easier to use. I have done some tests in French. What surprised me is that the dictation before the first output seems to be affected by the keyboard layout. For example, I got "Ceci est un essqi" instead of "Ceci est un essai", which is what I said. The following sentences are transcribed correctly. I haven't explored the code yet.

papoteur-mga avatar Jun 03 '21 06:06 papoteur-mga

Could you run this in a terminal and check the output?

nerd-dictation begin --timeout=2.0 --output=STDOUT

If printed result is still not what you expect, this should be reported upstream to VOSK-SDK.

ideasman42 avatar Jun 03 '21 07:06 ideasman42

The result is now OK:

ceci est un autre essai pour vérifier si la lettre a est transcrite en cœur

In the meantime, I saw another error during a longer dictation, where the letter é was transcribed as 2; both share the same key on the French keyboard. This error is not consistent, because I also saw the letter é transcribed correctly.

papoteur-mga avatar Jun 03 '21 07:06 papoteur-mga

Could it be that you're holding down the shift key to activate this functionality?

If so, this could cause a different key to be used.

ideasman42 avatar Jun 03 '21 10:06 ideasman42

Hello, no, I didn't use any shift key. I added some debug output to be more precise: logging.debug(f"Del:{text_block}") after line 834 and logging.debug(text_block) after line 846. Then I said élévation three times.

DEBUG:root: il est
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:élévation
DEBUG:root: élévation
DEBUG:root: il
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:élévation

The text is recognized correctly, but the typed output sometimes needs to be corrected. In LibreOffice, I get the correct word only when it was not output directly, but typed after backspaces.

papoteur-mga avatar Jun 03 '21 14:06 papoteur-mga

But the bug is surely in xdotool:

xdotool type "élévation"
2l2vqtion

I found https://stackoverflow.com/questions/17853287/xdotool-and-keyboard-layout A workaround is to run the command setxkbmap fr before launching nerd-dictation. As this bug has existed since 2009, I think we can't expect a fix. So I think this workaround has to be integrated, doesn't it?
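
As a rough illustration of that workaround (not code from nerd-dictation itself), a small Python wrapper could apply the layout and then start the dictation command. The function name `run_with_layout` and the injectable `runner` parameter are hypothetical, introduced only for this sketch; the two commands are the ones mentioned in this thread.

```python
import subprocess


def run_with_layout(layout, command, runner=subprocess.run):
    """Apply an X keyboard layout with setxkbmap, then run `command`.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    sequence of commands can be inspected without executing them.
    """
    # Re-apply the layout for the current X session, e.g. `setxkbmap fr`.
    runner(["setxkbmap", layout], check=True)
    # Start the dictation command only after the layout has been set.
    return runner(command, check=True)
```

Calling `run_with_layout("fr", ["nerd-dictation", "begin"])` would then mirror the manual steps described above.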

papoteur-mga avatar Jun 03 '21 14:06 papoteur-mga

If it can be done in a way that isn't likely to cause other problems, then I don't see why not.

ideasman42 avatar Jun 03 '21 14:06 ideasman42

On my Mageia system, this is what I have in xorg.conf.d/00-keyboard.conf: Option "XkbLayout" "fr"

papoteur-mga avatar Jun 03 '21 15:06 papoteur-mga

If this information can be integrated into nerd-dictation, it should be fine to do so.

ideasman42 avatar Jun 03 '21 15:06 ideasman42

I will try something with:

setxkbmap -query
rules:      evdev
model:      pc105
layout:     fr
options:    compose:rwin

Where is the best place for that?
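
A minimal sketch of how the layout could be read from that output in Python, assuming the `key: value` format shown above. The helper names `parse_setxkbmap_query` and `query_layout` are made up for this example and are not part of nerd-dictation.

```python
import subprocess


def parse_setxkbmap_query(output):
    """Parse `setxkbmap -query` output into a dict of field name to value."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as
        # "compose:rwin" keep their own colon.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def query_layout():
    """Return the current layout (e.g. "fr"), or None if it is not reported."""
    result = subprocess.run(
        ["setxkbmap", "-query"], capture_output=True, text=True, check=True
    )
    return parse_setxkbmap_query(result.stdout).get("layout")
```

The parsing is kept separate from the subprocess call so it can be checked against a captured sample without an X session.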

papoteur-mga avatar Jun 03 '21 15:06 papoteur-mga

Try adding this at the start of main_begin; once it's working, some details can be ironed out.

ideasman42 avatar Jun 03 '21 23:06 ideasman42

What do you think about this? https://github.com/autokey/autokey/wiki/API-Examples#send-keys

papoteur-mga avatar Jun 05 '21 05:06 papoteur-mga

Seems fun as an optional dependency. We could have a command line argument --simulate-input-method= ... that accepts xdotool, autokey, etc.
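
One way such an option might look with argparse, as a sketch only: the flag name is taken from the comment above, while the parser name, the choice list, and the default are assumptions for illustration and do not describe the actual nerd-dictation command line.

```python
import argparse


def create_parser():
    """Build a parser with the proposed input-method option (sketch)."""
    parser = argparse.ArgumentParser(prog="dictation-sketch")
    parser.add_argument(
        "--simulate-input-method",
        dest="simulate_input_method",
        # Assumed choices, taken from the tools named in this thread.
        choices=("xdotool", "autokey"),
        # Assumed default: keep the existing xdotool behaviour.
        default="xdotool",
        help="Program or library used to simulate keyboard input.",
    )
    return parser
```

Restricting the value with `choices` makes argparse reject unknown backends early, before any audio is recorded.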

ideasman42 avatar Jun 05 '21 07:06 ideasman42

Hello, I had no results with autokey. It doesn't seem to be usable as a module. I also tried keyboard, but that one needs root access, which is not an option. Finally, I have something working with pynput, which is thus a new dependency.
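
For readers unfamiliar with pynput, typing a string through it could look roughly like the following. This is a generic sketch, not the patch discussed here; `type_text` and its optional `controller` argument are hypothetical names, while `pynput.keyboard.Controller` and its `type` method are part of pynput.

```python
def type_text(text, controller=None):
    """Type `text` through a pynput-style keyboard controller."""
    if controller is None:
        # Imported lazily so this module still loads where pynput
        # is not installed or cannot initialise its backend.
        from pynput.keyboard import Controller

        controller = Controller()
    # Controller.type sends the characters of the string in order.
    controller.type(text)
```

Calling `type_text("élévation")` in a desktop session would send the word to the focused window.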

papoteur-mga avatar Jun 06 '21 16:06 papoteur-mga

Hey there, just to say that I'm using non-English vosk models, along with non-qwerty keyboards that are heavily customized (think caps lock and num lock always active, 8 levels of modifiers, modifier keys remapped, several modifiers on one key). I run into a LOT of issues with xdotool. To make nerd-dictation work, I have to set my keyboard layout to qwerty while voice-typing, which is not practical because I need to type on the keyboard to correct the dictation's output. Using ydotool improves the situation a little, but it's still far from perfect.

By contrast, with pynput, everything works like a charm, even with my special keyboard layout. I'd suggest switching to pynput and getting rid of xdotool / ydotool. I'm not using Wayland, but pynput has a uinput backend (https://github.com/moses-palmer/pynput/issues/184), so I guess it works there too?

If it's of any interest, I've adapted papoteur-mga's code to the current version of nerd-dictation: https://github.com/ideasman42/nerd-dictation/compare/master...mklcp:master

mklcp avatar Jun 03 '22 08:06 mklcp

In this case you're probably better off using ydotool, which simulates a separate keyboard. It generally seems to have fewer issues and works on X11/Wayland.

ideasman42 avatar Jun 03 '22 09:06 ideasman42

ydotool also had issues with my setup.

mklcp avatar Jun 18 '22 07:06 mklcp

Quoting papoteur-mga:

But the bug is surely in xdotool:

xdotool type "élévation"
2l2vqtion

I found https://stackoverflow.com/questions/17853287/xdotool-and-keyboard-layout A workaround is to run the command setxkbmap fr before launching nerd-dictation. As this bug has existed since 2009, I think we can't expect a fix. So I think this workaround has to be integrated, doesn't it?

I tried this on Void Linux:

$ xdotool type "élévation"
lvation

I found this solution:

$ xdotool type --delay 55 "élévation"
élévation
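
If the delay were to be applied from Python, the command line above could be assembled along these lines. This is only a sketch: `xdotool_type_command` and `xdotool_type` are hypothetical helpers, and the default of 55 ms is simply the value reported to work in this comment, not a recommended setting.

```python
import subprocess


def xdotool_type_command(text, delay_ms=55):
    """Build the argument list for `xdotool type --delay <ms> <text>`."""
    # The delay is the pause between keystrokes, in milliseconds.
    return ["xdotool", "type", "--delay", str(delay_ms), text]


def xdotool_type(text, delay_ms=55):
    """Run xdotool to type `text` with the given inter-key delay."""
    subprocess.run(xdotool_type_command(text, delay_ms), check=True)
```

Passing the arguments as a list avoids shell quoting problems with accented characters or spaces in the dictated text.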

Eloitor avatar Jan 05 '23 23:01 Eloitor