boa
Include smartmontools package
I recently had a drive failure. Fortunately I was on RAID 1, so nothing got lost.
However, I discovered that smartctl was not installed. I think it is very useful; moreover, it could easily be run from cron to report drive failures.
I suggest adding it to the standard install package list, and I recommend adding it to some frequent (hourly?) test procedure.
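To make the cron idea concrete, here is a minimal sketch, assuming the usual `smartctl -H` output ("test result: PASSED" for ATA drives, "SMART Health Status: OK" for SCSI/SAS drives). The function names `smart_health_ok` and `check_drive` are made up for illustration and are not part of BOA or smartmontools.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Read `smartctl -H` output on stdin; succeed only if it reports a healthy drive.
smart_health_ok() {
  grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Print a warning when the health check for device $1 (e.g. /dev/sda)
# does not pass. This also fires if smartctl is missing or cannot read the drive.
check_drive() {
  if ! smartctl -H "$1" | smart_health_ok; then
    echo "SMART health check did not pass for $1"
  fi
}
```

Calling `check_drive /dev/sda` from an hourly cron job would only produce output when something looks wrong, and cron normally mails any job output to the address set in MAILTO, assuming local mail delivery works.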
The package can be added to any BOA installation via .barracuda.cnf with _EXTRA_PACKAGES="smartmontools". Including it as a standard install package could easily be justified as well, though.
As for adding a regular test procedure, that could get problematic, as some servers have RAID controllers that add complexity and variability to smartctl commands (I'm thinking of MegaRAID in particular, where a command looks more like smartctl -d megaraid,13 -a /dev/sda).
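One way to cope with that variability might be to let `smartctl --scan` report each device together with the `-d` type it expects. The sketch below assumes the scan output looks like `/dev/sda -d scsi # /dev/sda, SCSI device`, and whether drives behind a MegaRAID controller show up at all depends on the smartmontools version. `scan_to_args` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: assumes `smartctl --scan` prints "<device> -d <type> # <comment>".

# Read scan output on stdin, drop the trailing "# ..." comment and empty
# lines, and print just "<device> -d <type>" for each detected drive.
scan_to_args() {
  sed -e 's/[[:space:]]*#.*$//' -e '/^$/d'
}
```

A loop such as `smartctl --scan | scan_to_args | while read -r dev flag type; do smartctl -H -d "$type" "$dev"; done` could then query each drive with the type the scan suggested, instead of hard-coding something like megaraid,13.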
The omega8cc folks may have more thoughts here.
@macmladen Thank you for the suggestion. It is a good idea; however, as @pricejn2 pointed out, it may not work on all systems, especially within a VM instance, depending on the master system and its hardware configuration and restrictions. For example, it will not work on Linux VServer based VM instances, and it will not work on HP machines with Adaptec RAID cards, even outside of a VM.
That said, smartmontools provides the smartd daemon, which comes with the DEVICESCAN directive, which is able to detect supported hardware / drives, so you don't need to guess how to run smartctl to make it work.
We could simply attempt to run smartd and watch the output. If it fails, we know that we can't use it: it will fail to start with the message "Unable to monitor any SMART enabled devices".
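A rough sketch of such a probe, assuming smartd's `-q onecheck` mode (register devices, run one check, then exit) prints its messages to the terminal. The function names `smartd_unusable` and `probe_smartd` are hypothetical, and smartd may of course fail for reasons this check does not cover.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Read smartd output on stdin; succeed if it contains the
# "nothing to monitor" startup error quoted above.
smartd_unusable() {
  grep -q 'Unable to monitor any SMART enabled devices'
}

# Run smartd for a single check cycle and report what was seen.
probe_smartd() {
  command -v smartd >/dev/null 2>&1 || { echo "smartd is not installed"; return 2; }
  if smartd -q onecheck 2>&1 | smartd_unusable; then
    echo "smartd cannot monitor any device on this system"
    return 1
  fi
  echo "smartd did not report the 'unable to monitor' error"
}
```

The return code of `probe_smartd` could decide whether the installer enables the daemon or skips it quietly.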
It comes with its own alerting, so we would just need to make sure it sends its messages to the correct email address and not to root.
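For the alert address, smartd reads the recipient from the `-m` directive in smartd.conf. The sketch below just prints a DEVICESCAN line for a given address; `smartd_conf_line` is a hypothetical helper, and the location of the config file (often /etc/smartd.conf) is an assumption that would differ for a build from source.

```shell
#!/bin/sh
# Sketch only: print a smartd.conf line that monitors every detected drive
# with the default checks (-a) and mails warnings to the address in $1.
smartd_conf_line() {
  printf 'DEVICESCAN -a -m %s\n' "$1"
}
```

For example, `smartd_conf_line admin@example.com` prints a line that could be written to the config file. Appending `-M test` to that line asks smartd to send a test message on startup, which is a simple way to confirm that mail actually leaves the box.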
We would need to build it from source, though, because the packaged versions are often a few years old and can't detect the drives and RAID configurations in use today.
I found out that mdadm (the software RAID tool) was able to notify me, but it wasn't configured to do so, and the failure mail was left on the local host.
Just a side note for those who are using software RAID.
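For anyone checking their own setup: mdadm takes the alert recipient from the MAILADDR line in mdadm.conf. A small sketch that shows which address is configured follows; `mdadm_mailaddr` is a hypothetical helper, the config path (commonly /etc/mdadm/mdadm.conf on Debian and Ubuntu) is an assumption, and the match only covers the full uppercase spelling of the keyword.

```shell
#!/bin/sh
# Sketch only: print the first MAILADDR value found in the mdadm config
# file given as $1. Prints nothing if no MAILADDR line is present.
mdadm_mailaddr() {
  awk '$1 == "MAILADDR" { print $2; exit }' "$1"
}
```

If this prints root or nothing, alerts are probably staying on the local host. After changing the address, `mdadm --monitor --scan --oneshot --test` should send a test alert for each array.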