nerd-dictation
Influence of keyboard layout (xdotool limitation)
Hello,
Thanks for this tool which improves the usage of vosk.
I have done some tests in French. What surprised me is that the dictation before the first output does not seem to take the keyboard layout into account.
For example I got:
Ceci est un essqi
instead of
Ceci est un essai
which is what I said.
The following sentences are transcribed correctly.
I haven't explored the code yet.
Could you run this in a terminal and check the output?
nerd-dictation begin --timeout=2.0 --output=STDOUT
If the printed result is still not what you expect, this should be reported upstream to VOSK-SDK.
The result is now OK
ceci est un autre essai pour vérifier si la lettre a est transcrite en cœur
In the meantime, I saw another error during a longer dictation, where the letter é was transcribed as 2, both being on the same key on the French keyboard. This error is not consistent, because I also saw the letter é transcribed correctly.
Could it be that you're holding down the shift key to activate this functionality?
If so, this could cause a different key to be used.
Hello,
No, I didn't use the Shift key.
I have added some debug output to be more accurate. I added logging.debug(f"Del:{text_block}") after line 834 and logging.debug(text_block) after line 846. Then I said élévation 3x.
DEBUG:root: il est
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:élévation
DEBUG:root: élévation
DEBUG:root: il
DEBUG:root:Del:
DEBUG:root:Del:
DEBUG:root:élévation
The text is recognized well, but sometimes needs to be corrected. In LibreOffice, I get the correct word only when it was not output directly, but printed after Backspaces.
But the bug is surely in xdotool:
xdotool type "élévation"
2l2vqtion
I found https://stackoverflow.com/questions/17853287/xdotool-and-keyboard-layout
A workaround is to use the command
setxkbmap fr
before launching nerd-dictation.
As this bug has existed since 2009, I don't think we can expect a fix. So I think this workaround has to be integrated, doesn't it?
If it can be done in a way that isn't likely to cause other problems, then I don't see why not.
On my Mageia system, this is what I have in xorg.conf.d/00-keyboard.conf: Option "XkbLayout" "fr"
If this information can be integrated into nerd-dictation, it should be fine to do so.
I will try something with:
setxkbmap -query
rules: evdev
model: pc105
layout: fr
options: compose:rwin
Where is the best place for that?
Try adding this at the start of main_begin; once it's working, some details can be ironed out.
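A minimal sketch of that idea, assuming the layout can be read from `setxkbmap -query` and re-applied with `setxkbmap -layout`. The helper names `parse_xkb_layout` and `reapply_current_layout` are hypothetical and not part of nerd-dictation; the layout variant, if any, is ignored here.

```python
import subprocess
from typing import Optional


def parse_xkb_layout(query_output: str) -> Optional[str]:
    """Return the value of the 'layout:' line from `setxkbmap -query` output."""
    for line in query_output.splitlines():
        # Split on the first colon only, so values such as "compose:rwin" stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "layout":
            return value.strip() or None
    return None


def reapply_current_layout() -> None:
    """Hypothetical helper: query the active layout and set it again explicitly."""
    result = subprocess.run(
        ["setxkbmap", "-query"], capture_output=True, text=True, check=True
    )
    layout = parse_xkb_layout(result.stdout)
    if layout:
        # Equivalent in spirit to running `setxkbmap fr` by hand before dictating.
        subprocess.run(["setxkbmap", "-layout", layout], check=True)
```

Calling `reapply_current_layout()` once at the start of `main_begin` would be the place suggested above; whether this has side effects on other setups is untested.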
What do you think about that? https://github.com/autokey/autokey/wiki/API-Examples#send-keys
Seems fun as an optional dependency. We could have a command-line argument --simulate-input-method= ... xdotool, autokey ... etc.
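As a rough illustration of the proposed argument, here is a sketch using `argparse`. The flag name comes from the comment above; the list of choices, the default, and the function name `build_parser` are assumptions made for the example.

```python
import argparse

# Illustrative set of back-ends; the real list is an open question in this thread.
INPUT_METHODS = ("xdotool", "autokey")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the proposed --simulate-input-method option."""
    parser = argparse.ArgumentParser(prog="nerd-dictation-sketch")
    parser.add_argument(
        "--simulate-input-method",
        dest="simulate_input_method",
        choices=INPUT_METHODS,
        default="xdotool",  # assumed default: the tool currently in use
        help="Program or library used to simulate keystrokes.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.simulate_input_method)
```

Restricting the value with `choices` makes an unsupported method fail at argument parsing rather than later, when the first keystroke is sent.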
Hello,
I had no results with autokey. It doesn't seem to be usable as a module.
I also tried keyboard, but that one needs root access, which is a no-go.
Finally, I have something working with pynput, which is thus a new dependency.
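For readers who want to try the same approach, a minimal sketch of typing text through pynput follows. It is not the code from the patch mentioned here; the function name `type_text` and the optional `controller` parameter are assumptions added so the function can be exercised without a display.

```python
from typing import Any, Optional


def type_text(text: str, controller: Optional[Any] = None) -> None:
    """Type `text` with pynput, or with any object exposing a type(str) method."""
    if controller is None:
        # Imported lazily so pynput stays an optional dependency.
        from pynput.keyboard import Controller

        controller = Controller()
    # Controller.type sends the characters of the string one by one.
    controller.type(text)
```

Calling `type_text("élévation")` without a controller requires pynput to be installed and a running graphical session.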
Hey there, just to say that I'm using non-English vosk models, along with non-QWERTY keyboards that are heavily customized (think Caps Lock and Num Lock always active, 8 levels of modifiers, remapped modifier keys, several modifiers on one key). I run into a LOT of issues with xdotool. To make nerd-dictation work, I have to set my keyboard layout to QWERTY while voice-typing, which is not feasible because I need to type with the keyboard to correct the dictation's output. Using ydotool improves the situation a little, but it's still far from perfect.
By contrast, with pynput, everything works like a charm, even with my special keyboard layout. I'd suggest switching to pynput and getting rid of xdotool / ydotool. I'm not using Wayland, but pynput has a uinput backend (https://github.com/moses-palmer/pynput/issues/184), so I guess it works there too?
If it's of any interest, I've adapted papoteur-mga's code to the current version of nerd-dictation: https://github.com/ideasman42/nerd-dictation/compare/master...mklcp:master
In this case you're probably better off using ydotool, which simulates a separate keyboard. It generally seems to have fewer issues and works on X11/Wayland.
Ydotool also had issues with my setup
But the bug is surely in xdotool:
xdotool type "élévation"
2l2vqtion
I found https://stackoverflow.com/questions/17853287/xdotool-and-keyboard-layout
A workaround is to use the command
setxkbmap fr
before launching nerd-dictation.
As this bug has existed since 2009, I don't think we can expect a fix. So I think this workaround has to be integrated, doesn't it?
I tried on Void Linux:
$ xdotool type "élévation"
lvation
I found this solution:
$ xdotool type --delay 55 "élévation"
élévation
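If the delay helps on other systems too, the same call could be made from Python roughly as follows. This is only a sketch: the function names are hypothetical, and 55 ms is simply the value that worked in the test above, not a recommended constant.

```python
import subprocess
from typing import List


def xdotool_type_command(text: str, delay_ms: int = 55) -> List[str]:
    """Build the argument list for `xdotool type --delay <ms> -- <text>`."""
    # "--" ends option parsing, so text starting with "-" is not read as a flag.
    return ["xdotool", "type", "--delay", str(delay_ms), "--", text]


def type_with_xdotool(text: str, delay_ms: int = 55) -> None:
    """Run xdotool with an explicit per-keystroke delay (requires xdotool and X11)."""
    subprocess.run(xdotool_type_command(text, delay_ms), check=True)
```

Passing the text as a separate list element avoids shell quoting problems with accented or special characters.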