kvdo
High CPU load caused by indexW kernel threads while vdo volume is unused
Hello team,
I have noticed that the indexW kernel threads consume a lot of CPU time while the VDO volume is idle (for example, when it has just been started and has not been used since):
CPU time increase, as reported by 'top':
$ top -bw | grep indexW
6608 root 20 0 0 0 0 S 6.2 0.0 0:01.69 kvdo0:indexW
6609 root 20 0 0 0 0 S 6.2 0.0 0:01.69 kvdo0:indexW
6611 root 20 0 0 0 0 S 6.2 0.0 0:01.69 kvdo0:indexW
6612 root 20 0 0 0 0 S 6.2 0.0 0:01.66 kvdo0:indexW
6613 root 20 0 0 0 0 S 6.2 0.0 0:01.68 kvdo0:indexW
6610 root 20 0 0 0 0 S 0.0 0.0 0:01.69 kvdo0:indexW
6611 root 20 0 0 0 0 S 4.6 0.0 0:01.83 kvdo0:indexW
6608 root 20 0 0 0 0 S 4.3 0.0 0:01.82 kvdo0:indexW
6609 root 20 0 0 0 0 S 4.3 0.0 0:01.82 kvdo0:indexW
6610 root 20 0 0 0 0 S 4.3 0.0 0:01.82 kvdo0:indexW
6612 root 20 0 0 0 0 S 4.3 0.0 0:01.79 kvdo0:indexW
6613 root 20 0 0 0 0 S 4.3 0.0 0:01.81 kvdo0:indexW
6610 root 20 0 0 0 0 S 4.6 0.0 0:01.96 kvdo0:indexW
6613 root 20 0 0 0 0 S 4.6 0.0 0:01.95 kvdo0:indexW
6608 root 20 0 0 0 0 S 4.3 0.0 0:01.95 kvdo0:indexW
6609 root 20 0 0 0 0 S 4.3 0.0 0:01.95 kvdo0:indexW
6611 root 20 0 0 0 0 S 4.3 0.0 0:01.96 kvdo0:indexW
6612 root 20 0 0 0 0 S 4.0 0.0 0:01.91 kvdo0:indexW
6613 root 20 0 0 0 0 S 4.6 0.0 0:02.09 kvdo0:indexW
6608 root 20 0 0 0 0 S 4.3 0.0 0:02.08 kvdo0:indexW
6609 root 20 0 0 0 0 S 4.3 0.0 0:02.08 kvdo0:indexW
6610 root 20 0 0 0 0 S 4.3 0.0 0:02.09 kvdo0:indexW
6611 root 20 0 0 0 0 S 4.3 0.0 0:02.09 kvdo0:indexW
6612 root 20 0 0 0 0 S 4.3 0.0 0:02.04 kvdo0:indexW
VDO statistics after collecting the usage above:
# vdostats --verbose | grep -e 'bios in\|bios out'
bios in read : 0
bios in write : 0
bios in discard : 0
bios in flush : 0
bios in fua : 0
bios in partial read : 0
bios in partial write : 0
bios in partial discard : 0
bios in partial flush : 0
bios in partial fua : 0
bios out read : 0
bios out write : 0
bios out discard : 0
bios out flush : 0
bios out fua : 0
bios out completed read : 0
bios out completed write : 0
bios out completed discard : 0
bios out completed flush : 0
bios out completed fua : 0
bios in progress read : 0
bios in progress write : 0
bios in progress discard : 0
bios in progress flush : 0
bios in progress fua : 0
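As a sanity check that the volume really is idle, all of the bios counters above can be summed; any non-zero total means some I/O reached the volume. A minimal sketch (sample lines pasted inline; in practice pipe the `vdostats --verbose | grep` output above into the awk):

```shell
# Sum every "bios in"/"bios out" counter; a non-zero sum means the volume saw I/O.
# Sample lines pasted inline; in practice:
#   vdostats --verbose | grep -e 'bios in\|bios out' | awk -F: ...
printf '%s\n' \
  'bios in read : 0' \
  'bios out write : 0' |
awk -F: '{ sum += $2 } END { print (sum == 0 ? "idle: no bios submitted" : "I/O observed: " sum) }'
# → idle: no bios submitted
```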
System information: Gentoo GNU/Linux (amd64); kvdo version 6.2.3.114 (kvdo-corp) on a 5.8.0 kernel (amd64); GCC 10.2.
The VDO device is started on top of a dm-crypt encrypted partition, i.e.:
vdo_storage: !VDOService
_operationState: finished
ackThreads: 1
activated: enabled
bioRotationInterval: 64
bioThreads: 4
blockMapCacheSize: 128M
blockMapPeriod: 16380
compression: enabled
cpuThreads: 2
deduplication: enabled
device: /dev/mapper/cryptstorage
hashZoneThreads: 1
indexCfreq: 0
indexMemory: 0.25
indexSparse: disabled
indexThreads: 0
logicalBlockSize: 4096
logicalSize: 1T
logicalThreads: 1
maxDiscardSize: 4K
name: vdo_storage
physicalSize: 306742596K
physicalThreads: 1
slabSize: 2G
uuid: null
writePolicy: async
Hardware specs of the machine: Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz; 32 GB memory; disk partition on an NVMe disk.
There is currently no virtualization in use, though KVM and linux-containers are compiled in as modules.
If there is anything else you need to know, do let me know.
Hello,
Thanks for the report; I was able to reproduce this behavior on a system running Fedora 32 with a 5.7 kernel.
Here's some output from "vmstat 1" after creating a new VDO volume directly on a test block device (i.e., no layers below the VDO volume) with the command:
vdo create --name=vdo1 --device=<testdevice> --vdoLogicalSize=1T
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 30105680 66792 1406196 0 0 0 388 85875 185522 0 1 99 0 0
1 0 0 30105680 66792 1406204 0 0 0 0 85105 184946 0 1 99 0 0
2 0 0 30105680 66792 1406204 0 0 0 0 93577 193874 0 1 99 0 0
1 0 0 30105680 66792 1406204 0 0 0 0 99219 199766 0 1 99 0 0
1 0 0 30105680 66792 1406204 0 0 0 0 99274 199817 0 1 99 0 0
1 0 0 30105680 66792 1406204 0 0 0 0 99055 199591 0 1 99 0 0
1 0 0 30105680 66792 1406204 0 0 0 0 99294 199809 0 1 99 0 0
1 0 0 30105680 66800 1406204 0 0 0 16 97547 197933 0 2 98 0 0
Note the high number of context switches per second ("cs"). If you run "vmstat 1" on your system with the VDO volume remaining idle, do you see something similar?
(In my case, the VDO volume's index also had 6 zones; the test system has a total of 12 CPUs.)
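To put a number on the "cs" column, the samples can be averaged. The snippet below is a sketch with two of the rows above pasted inline; in practice you would pipe something like `vmstat 1 10 | tail -n +3` into the awk:

```shell
# Average the context-switch rate ("cs", field 12) across vmstat samples.
# Sample rows pasted inline; in practice: vmstat 1 10 | tail -n +3 | awk ...
printf '%s\n' \
  '1 0 0 30105680 66792 1406204 0 0 0 0 85105 184946 0 1 99 0 0' \
  '1 0 0 30105680 66792 1406204 0 0 0 0 99219 199766 0 1 99 0 0' |
awk '{ sum += $12; n++ } END { printf "avg cs/s: %d\n", sum / n }'
# → avg cs/s: 192356
```

Comparing this average with the VDO volume stopped versus started makes the overhead attributable to the device stand out clearly.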
Hello Bryan,
Confirmed: I see an immediate increase in the number of context switches, approximately +180,000/s, just by bringing up the VDO device. Turning it off brings the number of context switches back to normal.
The same issue also happens on a 5.5.10 amd64 kernel with uds version 8.0.0.84 and kvdo version 6.2.2.117.
Hi @nkichukov,
Thanks for confirming. We are investigating this further and have opened BZ1870660 to track this.
We believe this to be due to a bug that we recently fixed in another branch and will be working to further confirm this and apply the fix to the necessary releases.
I've tested with package vdo-6.2.4.14-14.el8 and the issue still persists: the high rate of CPU context switching still occurs.