bcache-tools
kernel cmdline parameters for bcache devices
For experimenting (or production) it may be useful to set bcache parameters at an early stage during boot, for example to tune boot performance when the root fs is on a bcache device. The best moment is right before the root fs is actually mounted, which means it may need to be done in the initramfs.
The bcache kernel driver does not support kernel cmdline parameters. This program can be executed from udev rules to take care of that: it changes bcache parameters through the /sys interface right after a bcache device is brought up. This works both in the initramfs and later.
It recognizes parameters like these:
bcache.0=sco:0,crdthr:0,cwrthr:0
This means:
- parameters are set for bcache device #0: bcache0
- sequential_cutoff (sco) is set to 0
- cache/congested_read_threshold_us (crdthr) is set to 0
- cache/congested_write_threshold_us (cwrthr) is set to 0
Because of kernel cmdline limitations all parameters are based on a short alias, which represents the long /sys filename. Currently 3 parameters are supported that need to be set prior to root fs mount to directly impact performance in the early boot stage.
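To make the syntax above concrete, here is a minimal Python sketch of how such a token could be parsed. It is not the program from this pull request. The function names are made up for illustration, and the base directory /sys/block/<device>/bcache/ is an assumption about where the listed files live.

```python
import re

# Short aliases from the description, mapped to the long sysfs file names.
ALIASES = {
    "sco": "sequential_cutoff",
    "crdthr": "cache/congested_read_threshold_us",
    "cwrthr": "cache/congested_write_threshold_us",
}

# Matches one "bcache.<N>=<alias>:<value>,..." token.
TOKEN = re.compile(r"^bcache\.(\d+)=(.+)$")


def parse_cmdline(cmdline):
    """Return {device_name: {sysfs_relative_path: value}}."""
    result = {}
    for token in cmdline.split():
        match = TOKEN.match(token)
        if not match:
            continue  # not a bcache parameter
        device = "bcache" + match.group(1)
        settings = result.setdefault(device, {})
        for pair in match.group(2).split(","):
            alias, sep, value = pair.partition(":")
            if sep and alias in ALIASES:  # silently skip unknown aliases
                settings[ALIASES[alias]] = value
    return result


def sysfs_path(device, relative, root="/sys/block"):
    # Assumption: the parameters live under /sys/block/<device>/bcache/.
    return f"{root}/{device}/bcache/{relative}"
```

A udev helper could feed the contents of /proc/cmdline to parse_cmdline() and write each value to the path returned by sysfs_path().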
Just a quick remark: the numbering in sysfs isn't stable. I think you need the cset uuid. If the /proc/cmdline limit matters (how long is it exactly?): you could also use a wildcard, or encode the configuration in udev files, or some other configuration files (you already know how to copy them to the initrd), rather than the kernel command line.
(I think the above limit is 4k: COMMAND_LINE_SIZE in https://www.kernel.org/doc/Documentation/kernel-parameters.txt)
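For reference, the cset uuid idea could look roughly like the Python sketch below. It assumes that /sys/block/bcacheN/bcache/cache is a symlink to the cache set directory named after the cset uuid (please verify on a real system), and devices_for_cset is a hypothetical helper name.

```python
import glob
import os


def devices_for_cset(cset_uuid, sysfs_block="/sys/block"):
    """Return the bcache device names whose cache link points at cset_uuid."""
    names = []
    pattern = os.path.join(sysfs_block, "bcache*", "bcache", "cache")
    for link in sorted(glob.glob(pattern)):
        # Assumption: the link target's last path component is the cset uuid.
        target = os.path.realpath(link)
        if os.path.basename(target) == cset_uuid:
            # .../<device>/bcache/cache -> take the <device> component
            names.append(link.split(os.sep)[-3])
    return names
```

With something like this, the settings would follow the cache set even if bcache0 and bcache1 swap numbers between boots.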
Thanks for your feedback. I won't mind doing another attempt, but it would be nice if that would be the last one :-)
Suppose we introduce a file /etc/bcache.conf which contains the parameters. That file is copied into the initramfs, but is also used afterwards from its normal rootfs position. That would work, but it's rather static. Every time you'd like to try a different setting you need to rebuild the initramfs. I wouldn't be in favour of that.
We could do a combination of both: have a /etc/bcache.conf file and parsing of kernel cmdline to overrule /etc/bcache.conf. That's more complex however: parsing both, store in memory, apply. I would be OK with that, if it would contribute to the final attempt.
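A rough Python sketch of the "parse both, store in memory, apply" idea, to show how much complexity is involved. The "alias=value" file format is purely hypothetical, since no format for /etc/bcache.conf has been agreed on, and the function names are made up.

```python
def parse_conf(text):
    """Parse hypothetical 'alias=value' lines; '#' starts a comment."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def parse_cmdline_overrides(cmdline, device_number=0):
    """Collect 'alias:value' pairs from 'bcache.<N>=' tokens for one device."""
    prefix = f"bcache.{device_number}="
    overrides = {}
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        for pair in token[len(prefix):].split(","):
            alias, sep, value = pair.partition(":")
            if sep:
                overrides[alias] = value
    return overrides


def effective_settings(conf_text, cmdline, device_number=0):
    """Config file values first, then the cmdline overrules them."""
    merged = parse_conf(conf_text)
    merged.update(parse_cmdline_overrides(cmdline, device_number))
    return merged
```

For example, a file containing "sco=4M" and "crdthr=2000" combined with bcache.0=sco:0 on the cmdline would yield sco 0 and crdthr 2000.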
I heard about limitations on the kernel cmdline size when creating the dracut module. Can't find the details on the web though, but the idea was that one can add additional params in the dracut config to get around these limitations. I'll do some testing to find out.
Did some testing: I passed a 3589-byte cmdline in grub, but according to the kernel the cmdline was "only" 2040 bytes long:
[root@home07 ~]# grep fake=0123 /etc/grub2.cfg
linux /vmlinuz-3.18.3-201.fc21.x86_64 root=/dev/BCACHE/ROOTFS ro nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off LANG=en_US.UTF-8 libata.force=noncq bcache.0=sco:0,crdthr:0,cwrthr:0 fake=0123 fake=0123 fake=0123 [... "fake=0123" repeated ...]
[root@home07 ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.18.3-201.fc21.x86_64 root=/dev/BCACHE/ROOTFS ro nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off LANG=en_US.UTF-8 libata.force=noncq bcache.0=sco:0,crdthr:0,cwrthr:0 fake=0123 fake=0123 fake=0123 [... "fake=0123" repeated ...]
[root@home07 ~]# grep fake=0123 /etc/grub2.cfg | wc
1 350 3596
[root@home07 ~]# wc /proc/cmdline
1 193 2040 /proc/cmdline
[root@home07 ~]#
So the effective limit is ~2k. Not a serious limitation.
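The practical question behind this test is whether a given parameter still arrives when the cmdline gets cut off. A small Python sketch of that check follows. The limit is passed in by the caller (2040 is just the value observed above, not a documented constant), and survives_truncation is a hypothetical helper name.

```python
def survives_truncation(cmdline, token, limit):
    """True if `token` is still present as a whole word after the
    cmdline is cut to `limit` bytes."""
    raw = cmdline.encode()
    words = raw[:limit].decode(errors="ignore").split()
    if len(raw) > limit and words:
        # Conservative: the last word may have been cut in half, so drop it.
        words.pop()
    return token in words
```

Parameters placed early on the cmdline, like the bcache.0=... token in the test above, are unaffected; only whatever sits beyond the cut is lost.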
But when entering the cmdline arguments by hand during boot, it may be convenient to have shortcuts. Entering the cset uuid by hand won't be an option either. The (optional) label would be an option though.
How about something that applies to all devices?
That would be really easy to create, and would do for me (I only want to experiment with bootup times) but may not suffice in general. Would applying it to all devices be OK with you?
Maybe later a combined /etc/bcache.conf + cmdline version, but depending on the new bcache Kent is working on that may not be needed in the end.
Applying to all disks, and doing the minimum necessary to get a reliable feature is fine by me.
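A minimal Python sketch of what "apply to all devices" could boil down to, assuming the per-device directory is /sys/block/bcache*/bcache and that settings are already mapped to their long sysfs names. The function name and the sysfs_block parameter (there so it can be pointed at a test directory) are made up for illustration.

```python
import glob
import os


def apply_to_all(settings, sysfs_block="/sys/block"):
    """Write each setting to every bcache device; return the paths written."""
    written = []
    # Assumption: each bcache device exposes its knobs in <device>/bcache/.
    pattern = os.path.join(sysfs_block, "bcache*", "bcache")
    for device_dir in sorted(glob.glob(pattern)):
        for relative, value in settings.items():
            path = os.path.join(device_dir, relative)
            if not os.path.exists(path):
                continue  # e.g. cache/ is missing when no cache set is attached
            with open(path, "w") as handle:
                handle.write(str(value))
            written.append(path)
    return written
```

Skipping missing files keeps the helper harmless on devices where a parameter does not apply, and per-disk settings could later take priority before this loop runs.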
As far as forward compat: If someone wants to make things more granular, the udev helper could give priority to per-disk settings if/when they are implemented. That helper can also be the place where conffile support is added. The new userland tools (I just had a quick look) don't have sequential_cutoff for some reason. I don't think the new bcache adds extra configuration in the superblock either (it would complicate multi-disk scenarios; see lvm). So this doesn't seem to clash with new work, but feel free to ping the list.
Did some testing on Fedora, seems to work OK:
- built an RPM
- installed it
- created a new initramfs (dracut)
- specified some (newly) supported params on the cmdline
- rebooted
- noted that all params were processed
Gabriel, could you respond please?
My apologies. I didn't have enough time to review and it slipped my mind. I'll try to get to it tomorrow, and you may ping me if I don't do it this week.
A (polite) ping...