cassandra-image
cassandra-image copied to clipboard
ensure tuned for SSD from Al Tolberts tuning guide
From https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
discovery
When getting acquainted with a new machine, one of the first things to do is discover what kind of storage is installed. Here are some handy commands:
blockdev --report
fdisk -l
ls -l /dev/disk/by-id
lspci -v # pciutils
sg_inq /dev/sda # sg3-utils
ls /sys/block
IO elevator, read-ahead, IO merge
Folks spend a lot of time worrying about tuning SSDs, and that's great, but on modern kernels these things usually only make a few % difference at best. That said, start with these settings as a default and tune from there.
Use deadline if no Docker
When in doubt, always use the deadline IO scheduler. The default IO scheduler is CFQ, which stands for "Completely Fair Queueing". This is the only elevator that supports IO prioritization via cgroups, so if Docker or some other reason for cgroups is in play, stick with CFQ. In some cases it makes sense to use the noop scheduler, such as in VMs and on hardware RAID controllers, but the difference between noop and deadline is small enough that I only ever use deadline. Some VM-optimized kernels are hard-coded to only have noop and that's fine.
echo 1 > /sys/block/sda/queue/nomerges # SSD only! 0 on HDD
echo 8 > /sys/block/sda/queue/read_ahead_kb # up to 128, no higher
echo deadline > /sys/block/sda/queue/scheduler
I usually start with read_ahead_kb at 8 on SSDs and 64 on hard drives (to line up with Cassandra <= 2.2's sstable block size). With mmap IO in <= 2.2 and all configurations >= 3.0. Setting readahead to 0 is fine on many configurations but has caused problems on older kernels, making 8 a safe choice that doesn't hurt latency.
Beware: setting readahead very high (e.g. 512K) can look impressive from the system side by driving high IOPS on the storage while the client latency degrades because the drives are busy doing wasted IO. Don't ask me how I know this without buying me a drink first.