qubes-issues icon indicating copy to clipboard operation
qubes-issues copied to clipboard

sys-usb kworker events CPU consumption

Open neowutran opened this issue 5 years ago • 30 comments

Qubes OS version R4.0

Affected component(s) or functionality USB device handeling

Brief summary Rarely (~ once a month), the system become extremely slow, and mouse/keyboard are extremly laggy.

Looking at CPU consumption it come from the usb qube ("sys-usb" in my case), consuming over 50% of total CPU (Probably eating everything it is allowed to). When I look at the process list, the issue is "kworker/*:*events". The journal logs are filled with the same message: "hid-generic XXXXXXX: control queue full"

My "sys-usb" qube is a static disposableVM based on Debian 10 ( and currently using kernel 5.6.11-1 but it is not related to kernel version since I got the same issue months ago when this kernel version didn't existed yet). The issue is not related to plugging or unplugging USB devices.

To Reproduce No idea

Solutions you've tried Restarting the "sys-usb" qube. Problem is gone after that.

Relevant documentation you've consulted None Found this error referenced on the archlinux documentation: https://wiki.archlinux.org/index.php/APC_UPS And on gentoo forum: https://forums.gentoo.org/viewtopic-p-7569028.html

Related, non-duplicate issues None

neowutran avatar May 14 '20 16:05 neowutran

On Thu, May 14, 2020 at 09:04:31AM -0700, Yukikoo wrote:

Brief summary Rarely (~ once a month), the system become extremely slow, and mouse/keyboard are extremly laggy.

Looking at CPU consumption it come from the usb qube ("sys-usb" in my case), consuming over 50% of total CPU (Probably eating everything it is allowed to). When I look at the process list, the issue is "kworker/*:*events".

I see this issue too. Thanks for reporting it!

-- cheers, Holger


           holger@(debian|reproducible-builds|layer-acht).org
   PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C

h01ger avatar May 14 '20 23:05 h01ger

I've seen it few times (definitely rarer than once a month), and re-plugging HID devices (keyboard/mouse etc) fixed it, without restarting sys-usb. But I don't have idea (yet) why it happens.

marmarek avatar May 14 '20 23:05 marmarek

On Thu, May 14, 2020 at 04:16:55PM -0700, Marek Marczykowski-Górecki wrote:

re-plugging HID devices (keyboard/mouse etc) fixed it, without restarting sys-usb.

thank you very much for this workaround, I've just used it successfully! :)

-- cheers, Holger


           holger@(debian|reproducible-builds|layer-acht).org
   PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C

h01ger avatar May 17 '20 17:05 h01ger

I see a similar issue. Re-plugging the HID devices does not help. However, I know what causes it in my case for sure: it happens whenever I connect my 2 TB external USB3 drive (backup). If I disconnect it the problem goes away. If I reconnect it, it comes back. The drive is encrypted. As soon as I entered the passphrase to mount it, the issue becomes less severe but doesn't disappear.

Let me know if there are any logs I should share.

SvenSemmler avatar May 20 '20 18:05 SvenSemmler

Do you confirm that this is the same issue ? Aka: "hid-generic XXXXXXX: control queue full" spammed in the system logs (dmesg)

If it is the same issue, and that you can reproduce the issue at will, maybe you could try something like this ? https://wiki.archlinux.org/index.php/APC_UPS#hid-generic_:_control_queue_full ? ( using distributions kernel [https://www.qubes-os.org/doc/managing-vm-kernel/] )

neowutran avatar May 21 '20 07:05 neowutran

I am sorry for the noise. I had the issue for many weeks and reproduced it 4-5 times before writing the post yesterday. Now I can no longer make it happen! This was either an issue with my external disk that has self-repaired or one of the many updates I installed yesterday evening.

SvenSemmler avatar May 21 '20 17:05 SvenSemmler

hid-generic 0003:1532:0064.0003: control queue full same issue using fedora-30 as template for sys-usb and kernel 4.19.107-1.

Solutions I've tried unplugging and plugging usb mouse and problem gone away for a while

Liquidator2048 avatar May 29 '20 21:05 Liquidator2048

I have similar issues occurring quite often (CPU consumption). The reason for me most of the time is that sys-usb starts swapping causing high CPU (kworker is related the swapping in my case). I have very low available RAM on sys-usb when the other Qubes consume all the RAM, and somehow sys-usb is not able to get enough RAM to work properly.

ptitdoc avatar Jun 02 '20 08:06 ptitdoc

I tried changing my USB-vm ram from 300 to 400MB. I probably need to use it several weeks before being sure that it is the problem.

My high memory usage may be because in my case I'm using some tools from within the sys-usb VM, because USB-passthrough does not work for all devices, and especially for mobile phones (probably because they are resetting USB or changing USB mode all the time, it disconnects the USB device from the attached VM before being able to use it).

ptitdoc avatar Jul 03 '20 07:07 ptitdoc

I suggest for those where it occurs to run top from dom0 to get more insight of the problem: [dom0 ~]$ qvm-run --pass-io sys-usb 'top -b'

ptitdoc avatar Jul 03 '20 07:07 ptitdoc

+1 i also have this issue and have had it sporadically for a while. f'ing heisenbug. its just infrequent enough that i havent bothered tracking it down properly until now. it is bad enough right now that i cant type faster than 1 char per second without it missing input characters. oterwie my typingloks lik this. i normally just reboot sys-usb to reboot, as physically getting to usb is a nuisance. can confirm it can be fixed temporarily by unplugging all usb things and seems to be fixed with removal of usb kbs and related hubs only (possibly important, does not correct just removing 'real kb', must also remove yubikey(s) and seemingly mouse, so all hid devices?) and re-plugging them in. it seems to happen somewhat regularly when i leave my terminal alone for many hours. i did a full re-install (on a different network, with a different m2 drive) for unrelated reasons - and the fresh install without my backup qubes being restored has same issue 'out of the box'. removing usb kvm switch and hubs does not fix issue - aka. same issue without usb hubs. also using debian as default vm. when this occurs, only noticeable behavioral problem is slow input, everything else is fine. can sometimes create a false problem at login screen, as the typing at normal speed without realizing issue is active misses characters of password without really noticing. until i clued into this, i thought i must have just forgotten my hard to remember long pw and was super mad at self.

fyi: top in dom0 only shows 5-6% cpu usage.

in sys-usb it chews (kworker /1:2-events) 90+ cpu and is spikey. Also in sys-usb top MiB total Mem shows 220.4 total and 3.9 free. swap has ~900 free in all cases so i dont think it's a swap cpu issue for me.

dmesg is full of hid-generic xxxx:xxxx:xxxx.0009: control queue full msgs. i'm not particularly technical, but if i'm reading the log right, there are roughly around 100 control queue full messages in a burst, every 200,000 micro seconds (eg. message 0 is [xxxxx.023595], message ~99 is [xxxxx.028887], then message ~100 is [xxxxx.223171], message ~199 is [xxxxx.228899], message ~200 is [xxxx.420379], message ~299 is [xxxxx.425594].

also: after disconnecting things one at a time and going through many pages of messages, i think it might be one specific target device. when disconnecting i get [time] usb x-y: USB disconnect, device number z
[time] hid-generic xxxx:xxxx:xxxx.0009: usb_submit_urb(ctrl) failed: -19

then the control queue full messages stop.

Other notes: I just unplugged and plugged back in all usb devices without a sys-usb reboot and it's back to normal. Xorg is leading cpu usage with 0.3%. free mem: 6.0. swap free: 908.5

im going to try to correct by increasing sys-usb to 400mb. i'll post in a while (possibly days/weeks) if it does or doesnt come back.

ghost avatar Aug 05 '20 08:08 ghost

I have something similar. My sys-usb is running fedora 30 though. I tried increasing ram to 600mb, no change. When I check dmesg it gives the control queue full message. Top from dom0 shows kworker using a lot of CPU (80-95%). It only happens occasionally, maybe once a month, and unplugging and re-plugging in the keyboard fixes it. Looked at the solutions from the original post - I don't have a UPS or anything other than a keyboard and mouse (and sometimes a mic) plugged in.

random265 avatar Aug 09 '20 20:08 random265

I have similar issues occurring quite often (CPU consumption). The reason for me most of the time is that sys-usb starts swapping causing high CPU (kworker is related the swapping in my case). I have very low available RAM on sys-usb when the other Qubes consume all the RAM, and somehow sys-usb is not able to get enough RAM to work properly.

For information, after changing to 400MB this occured less frequently, but this still occurs. I still ends up with high CPU, kworker showing cpu usage in TOP.

ptitdoc avatar Nov 05 '20 14:11 ptitdoc

I have the same issue, and it is very frequently (on Qubes 4.1 alpha). After restarting sys-usb, it only takes a few minutes for the CPU usage to rise again. It's kworker/1:0+events and systemd-journal (I guess due to the kernel log spam).

free -h shows: free: 15M buff/cache: 118M available: 120M Swap used: 19M

Is this due to swapping? I thought that as long as "available" is not zero, the system should not swap so intensely. Actually, Swap now increased to 32M, so something is going on there...

What could I do to provide more detailed information?

csarn avatar Jan 01 '21 21:01 csarn

I just had this problem again and started searching the internet. I found this post: https://forums.gentoo.org/viewtopic-p-7569028.html And I ran in sys-usb:

echo on > /sys/class/usbmisc/hiddev0/power/control

This did not change the cpu load immediately, but after that, I ran

usbhid-dump

And the high CPU usage stopped!

I'm not sure if this was a late effect of the first command, or if both commands are needed, or the second one would help on its own, but I'll try the second command solo when the problem appears again.

Just wanted to tell you that it does not need a sys-usb restart :) Maybe the first command could be integrated in /rw/config/rc.local for sys-usb, if this was the one solving the problem.

EDIT: A few minutes later, CPU usage spiked again. A single usbhid-dump stops that, without the first command. usbhid-dump seems to actually reset the /sys/class/usbmisc/hiddev0/power/control to "auto".

csarn avatar Jan 27 '21 18:01 csarn

I had the best luck with this just using a different mouse. A Microsoft wireless mouse was giving me the problem and the Logitech wired model that is in there now seems to be not triggering the kworker problem.

jsbmg avatar Apr 28 '21 21:04 jsbmg

The same issue when reconnecting mouse via KVM (in my case Logitech wireless). If the mouse is connected (via KVM) for the first time it works fine, but right after the first switch to another PC and back it starts lagging (sys-usb high CPU usage and the same error as reported above).

usbhid-dump works for me too without restarting sys-usb. My take that it happens because usbhid-dump detaches and then reattaches kernel drivers for the interfaces.

It would be great it someone can implement a permanent fix...

PetrVladimirov avatar May 10 '21 16:05 PetrVladimirov

+1 with Qubes 4.1, disposable sys-usb

  • kworker/events using a lot of CPU cycles
  • journalctl spammed with hid-generic [...] control queue full
  • fixed with restarting sys-usb (or usbhid-dump)

I could fairly reliably provoke it by plugging in my Yubikey with my finger on the gold contact (or avoid it by carefully avoiding to do that), which triggers emulated keyboard strokes.

lkubb avatar Oct 07 '21 10:10 lkubb

Qubes 4.1, disposable sys-usb Just wanted to say I have been plagued by the same issue and have been issuing: qvm-run sys-usdb 'sudo subhid-dump' from dom0, to instantly fix the scenario. As an experiment, I swapped out my mouse for another, and the problem has not come back. My previous mouse was a Razor DeathAdder Elite.

jandrews377 avatar Jan 23 '22 18:01 jandrews377

Update: I started having the same problem when I configured my yubikey 5 nano according to instructions. More precisely, after configuring everything and in particular xscreensaver to unlock via yubikey, the following happens. Power the computer on, insert yubikey, lock screen. insert yubikey password, touch yubikey -> screen unlocks, cpu 1%. Lock screen again, insert yubikey password, touch yubikey -> screen unlocks, cpu 1%. Third time -> screen unlocks, cpu 70%. Restart sys-usb -> everything normal. Repeat as above: the computer starts lagging always at the third unlock operation! How weird!

ghost avatar Mar 21 '23 00:03 ghost

I might have an easy way to reproduce this, if I put my corne keyboard into bootloader mode a few times or disconnect and reconnect I see the same type of problems as others are describing.

I am using qubes 4.2.0 with a default-dvm(fedora-38) as sys-usb

will first see if increasing memory will make it harder to trigger.

jimhansson avatar Nov 27 '23 22:11 jimhansson

with 400MB assigned to sys-usb i have around 60MB free after a reboot of sys-usb, After reseting the keyboard once it dips down to 30 free, reseting once more and I see it going under 10MB and it starts hogging the CPU, cursor is lagging and even some keyboard events are missed. I replicated this 3 times

with 600MB assigned to sys-usb i have around 125 mb free after a reboot of sys-usb, on the first reset i got the lagging and hoging of CPU but still had over 100MB free, second time i start with only 28 free but a lot in buff/cache, starts lagging after one reset, After third reboot i have over 100 mb free, starts lagging after one reset.

so increasing the memory by 50% leads to being able to trigger the problem on only one reset of the keyboard. not what I expected. I also noticed that after a reboot of sys-usb i had around 200 MB in swap

Now I try with 6GB of memory assigned to sys-usb. trying to rule out if this has anything todo with memory constraints. first try, triggers the problem after two resets of the keyboard. second try, triggers the problem after three resets of the keyboard. third try, sys-usb boots up in this state directly and I have to restart it again to be able to write this. so 0 reboots Every time i press the reset on the keyboard I see about 50MB of free memory gets taken, but with this amount of memory and because the problem is still happening within one-three resets of the keyboard it should not be a memory amount problem.

jimhansson avatar Nov 27 '23 23:11 jimhansson

after provoking the system by running sudo usbhid-dump a few times in sys-usb i get this

[user@sys-usb ~]$ sudo usbhid-dump 
003:003:002:Failed to retrieve report descriptor: Operation timed out
003:003:002:Failed to attach to the kernel driver: Entity not found
003:003:001:Failed to attach to the kernel driver: Entity not found
003:003:000:Failed to attach to the kernel driver: Entity not found
001:004:001:Failed to release: No such device (it may have been disconnected)
001:004:001:Failed to attach to the kernel driver: No such device (it may have been disconnected)
001:004:000:Failed to release: No such device (it may have been disconnected)
001:004:000:Failed to attach to the kernel driver: No such device (it may have been disconnected)
[user@sys-usb ~]$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 04d9:a081 Holtek Semiconductor, Inc. [unknown]
Bus 001 Device 003: ID 1050:0407 Yubico.com Yubikey 4/5 OTP+U2F+CCID
Bus 001 Device 005: ID 4653:0001 [unknown] [unknown]
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 047d:2041 Kensington SlimBlade Trackball
Bus 003 Device 004: ID 3297:4974 [unknown] [unknown]
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 002: ID 0627:0001 Adomax Technology Co., Ltd QEMU Tablet
Bus 005 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 002: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 006 Device 003: ID 2109:2812 VIA Labs, Inc. VL812 Hub
Bus 007 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 007 Device 002: ID 2109:0812 VIA Labs, Inc. VL812 Hub
Bus 007 Device 003: ID 2109:0812 VIA Labs, Inc. VL812 Hub
[user@sys-usb ~]$ 

and a lot of these in dmesg [ 1558.711295] hid-generic 0003:3297:4974.0061: control queue full [ 1558.711314] hid-generic 0003:3297:4974.0061: control queue full [ 1558.711335] hid-generic 0003:3297:4974.0061: control queue full [ 1558.711350] hid-generic 0003:3297:4974.0061: control queue full [ 1558.711364] hid-generic 0003:3297:4974.0061: control queue full [ 1558.711378] hid-generic 0003:3297:4974.0061: control queue full

but that looks like it is my ergodox_ez and not my corne 1750.611698] usb 3-4: New USB device found, idVendor=3297, idProduct=4974, bcdDevice= 0.01 [ 1750.611716] usb 3-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 1750.611726] usb 3-4: Product: ErgoDox EZ

the corne has these [ 1858.501513] usb 1-4: New USB device found, idVendor=4653, idProduct=0001, bcdDevice= 0.01 [ 1858.501531] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 1858.501541] usb 1-4: Product: Corne

both are running QMK firmware(both are also presenting themself as both mouse AND keyboard) and I could try to load some sort of usb debug thing on them, maybe.

jimhansson avatar Nov 27 '23 23:11 jimhansson

removing yubico keys and ergodox_ez did not seems to have any effect, maybe one more reset of corne before it started lagging

jimhansson avatar Nov 27 '23 23:11 jimhansson

I suspect this is an interrupt storm: the hardware attached to sys-usb is creating so many interrupts that it is making the overall system lag. The reason is that sys-usb is usually assigned only 2 vCPUs (it has 4 on my system) and another VM with more vCPUs can use them all without harming overall responsiveness. However, if the system is overwhelmed processing interrupts, then Xen might not get a chance to schedule dom0 and system responsiveness could drop.

DemiMarie avatar Nov 28 '23 07:11 DemiMarie

doing some more testing after increasing vCPU to 4

After removing every usb device except one keyboard and plugging in one extra usb-device one-by-one, I have found out that different devices gives very different result.

  1. My USB gaming mouse (an arrogant bastard vortex, "Holtek Semiconductor" in my lsusb output) was the worst offender. it was giving me lagging after only 3 resets
  2. Next was the yubico yubikey, giving me problems around 5 resets tried adding a hub between the computer and the yubikey, did not help
  3. The ergodox_ez seems to be somewhat fine(had to do alot of resets before system started lagging), but was consistantly giving me this error when running usbhid-dump 004:004:002:Failed to retrieve report descriptor: Operation timed out

Tried to isolate devices on different buses, did not experience any improvement from that.

the USB mouse I can live without and only connect if needed. the yubikey is worse but still manageable(for me) by using workaround (usbhid-dump) the ergodox_ez, (guessing it is an atmega32u controller) will be writing a bug to QMK that there seems to something unstable in the USB implementation Kensington slimBlade Trackball seems fine Corne with rp2040(helios) controller with QMK firmware seems fine

Some devices seems to be trigger this alot, some other a lot less. knowing which could maybe be of some use.

For now I am out of ideas on what to try next. will be removing the mouse and only plugin yubikey when needed.

jimhansson avatar Nov 29 '23 00:11 jimhansson

It's been working ok for a year. No hardware changes. And a week ago or two (i guess a major update of 4.2 to rc5 bug appeared) After 3-5 reconnects of USB keyboard and mouse to sys-usb it starts lagging and cpu load of sys-usb goes to 50%

https://forum.qubes-os.org/t/4-2-new-patch-break-sys-usb/22636/1

dalaylapka avatar Dec 09 '23 10:12 dalaylapka

I suspect this is something that https://github.com/QubesOS/qubes-app-linux-input-proxy/ is doing wrong, but I have no idea what.

marmarek avatar Mar 27 '24 12:03 marmarek

I don't know if it helps, but a very easy way to reproduce this is to active caps four times and having a non-sys-usb window selected.

boreasForum avatar Mar 27 '24 14:03 boreasForum

I've noticed something interesting about this issue. I've tried 3 different keyboard/mouse setup, (logitech,microsoft and dell) and found out that the dell keyboard doesn't have the issue (LITEON Dell Wireless Device). Not sure how would this help but it seems that it also depends on devices plugged to sys-usb.

ZYousef avatar Jun 27 '24 16:06 ZYousef