ZFS (DRAID) zabbix - Disk read/write request responses are too high

Open temask88 opened this issue 4 years ago • 4 comments

The pool is a dRAID with 24 HDDs and 2 NVMe cache devices.

While data is being written to the pool, Zabbix raises alerts about read latency on the pool, even though no reads are taking place at that moment.

[screenshot]

If I start reading from the pool while the write is in progress, the problem disappears.

The problem occurs only when the pool is being written to and nothing is reading from it.

Zabbix trigger: min(/zfs-store/vfs.dev.read.await[sdd],15m) > {$VFS.DEV.READ.AWAIT.WARN:"sdd"} or min(/zfs-store/vfs.dev.write.await[sdd],15m) > {$VFS.DEV.WRITE.AWAIT.WARN:"sdd"}

Macro values: {$VFS.DEV.READ.AWAIT.WARN} = 20, {$VFS.DEV.WRITE.AWAIT.WARN} = 20

[screenshot]

It is not clear whether this is a problem with how Zabbix collects the data or a problem with ZFS tuning / optimization.

This is what iostat shows at that moment: no reads, only writes.

[screenshot]

The question is: is this a normal situation? Is it related to ZFS and CoW? If so, what is the best way to monitor disk performance? If not, what is causing it?
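To make the trigger logic concrete, here is a minimal Python sketch of how the expression above evaluates. It is an illustration only, not Zabbix code: the function names and the sample values are hypothetical, and each list stands for the await samples (in the same units as the thresholds) that Zabbix collected for sdd during the last 15 minutes.

```python
from typing import Sequence


def window_exceeds(samples: Sequence[float], warn: float) -> bool:
    """True when the minimum of the window is above the threshold,
    i.e. every collected sample in the window exceeded it."""
    return len(samples) > 0 and min(samples) > warn


def trigger_fires(read_await: Sequence[float],
                  write_await: Sequence[float],
                  read_warn: float = 20.0,
                  write_warn: float = 20.0) -> bool:
    """Mirror of: min(read.await,15m) > READ.WARN or min(write.await,15m) > WRITE.WARN."""
    return (window_exceeds(read_await, read_warn)
            or window_exceeds(write_await, write_warn))


# Hypothetical samples: read await stays high for the whole window,
# write await stays low. The read half alone is enough to fire.
print(trigger_fires([35.0, 42.0, 28.0], [3.0, 4.0, 2.5]))

# A single low read sample inside the window keeps the trigger quiet.
print(trigger_fires([35.0, 0.0, 28.0], [3.0, 4.0, 2.5]))
```

Because the expression uses min() over the window, one low sample is enough to keep the trigger from firing, so an alert means the read await stayed above the threshold for the entire 15 minutes.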

temask88, Sep 07 '21 12:09

Depending on your hardware this could be normal. Those read I/O latencies are high, but the HDDs may be prioritizing writes when there are only infrequent reads. It's hard to say. You can also check zpool iostat -l to get a better idea of how long I/Os are queued in ZFS before being issued, and how long it takes the disk to handle them once issued.

One way to monitor disk performance is with the zpool_influxdb utility which was included in the 2.1 release.

behlendorf, Sep 08 '21 19:09

Here is a moment when the pool is being actively written to and nothing is reading from it:

[screenshot]

temask88, Sep 13 '21 09:09

There may not be any application reads, but internally ZFS may need to read some data if it's not cached. It looks like the HDDs may be slow to service some of those reads, possibly because there are also a large number of outstanding writes.

behlendorf, Sep 13 '21 22:09
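A small sketch of why a handful of internal reads can produce a high read await even when read throughput looks like zero. It assumes the Zabbix await items are derived the way iostat derives r_await and w_await on Linux, from the /proc/diskstats counters: time spent on requests divided by requests completed over the sampling interval. That is an assumption about the template, not something confirmed in this thread. The two diskstats lines and the helper names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    reads_completed: int
    ms_reading: int
    writes_completed: int
    ms_writing: int


def parse_diskstats_line(line: str) -> DiskSample:
    """Pick the four counters needed for await out of one /proc/diskstats line.
    Layout: major minor name reads merged sectors ms_reading
            writes merged sectors ms_writing ..."""
    f = line.split()
    return DiskSample(int(f[3]), int(f[6]), int(f[7]), int(f[10]))


def await_ms(prev_ios: int, prev_ms: int, curr_ios: int, curr_ms: int) -> float:
    """Average time per completed request between two samples, in ms."""
    ios = curr_ios - prev_ios
    if ios <= 0:
        return 0.0  # nothing completed in the interval
    return (curr_ms - prev_ms) / ios


# Made-up samples for sdd taken a few seconds apart during a heavy write:
# only 3 reads complete, but they wait behind 3000 queued writes.
before = parse_diskstats_line(
    "8 48 sdd 1000 0 8000 5000 200000 0 1600000 400000 0 300000 405000")
after = parse_diskstats_line(
    "8 48 sdd 1003 0 8024 5150 203000 0 1624000 412000 0 303000 417150")

read_await = await_ms(before.reads_completed, before.ms_reading,
                      after.reads_completed, after.ms_reading)
write_await = await_ms(before.writes_completed, before.ms_writing,
                       after.writes_completed, after.ms_writing)

print(f"read await:  {read_await:.1f} ms over 3 reads")
print(f"write await: {write_await:.1f} ms over 3000 writes")
```

Since await is a per-request average, it says nothing about volume: three slow reads are enough to push it past a 20 ms threshold while a throughput graph for the same interval shows essentially no read traffic.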

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

stale[bot], Sep 16 '22 01:09