Ask HN: Monitoring Hard-Disk Drives and Arrays Health in Linux in 2022?
5 points by metadat on July 16, 2022 | 3 comments
What is the best way to automate monitoring and alerting on the health and SMART status of disks and arrays on a Linux machine?

The most complex case I have is a machine housing 15 hard-disk drives along with 1x RAID-0, 1x RAID-1, and 1x RAID-5 arrays.

Curious what a highly-effective / cutting-edge / best-practices end-to-end setup looks like in 2022.

Thank you!



Quick and easy: run smartctl [1] at least once an hour. Add up the bad sectors (reallocated, pending, offline uncorrectable) and alert if the total grows by about 10 in one day, or if it reaches about 100. Also alert if any of the other attributes report as failed. If you've got a helium drive, there's an attribute for that and you might want a threshold on it, but I don't have enough experience there.

If you really want to spend time on it, you could monitor disk transfer speeds and seek times and alert if the speeds drop or the seek times increase. But I'd guess that's unlikely to be worth the time.

[1] Or the controller's own utility, if the controller gets in the way and you have to use that instead.
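
For anyone who wants a starting point, the bad-sector check described above could be sketched in Python roughly like this. It assumes ATA drives whose `smartctl -A` table uses the usual attribute names (Reallocated_Sector_Ct, Current_Pending_Sector, Offline_Uncorrectable) with the raw value in the tenth column. The function names are made up for illustration, and the thresholds are just the "10 a day / 100 total" rule of thumb from the comment, not a vetted policy.

```python
import subprocess

# Attribute names as printed by `smartctl -A` on typical ATA drives (assumption:
# some vendors name or number these differently).
BAD_SECTOR_ATTRS = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def read_attributes(device):
    """Run `smartctl -A <device>` and return its text output (usually needs root)."""
    # smartctl uses its exit status as a bit mask, so a non-zero code is not
    # necessarily an error; do not raise on it.
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def bad_sector_total(smartctl_output):
    """Sum the raw values of the three bad-sector attributes."""
    total = 0
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID, name, flag, value, worst, thresh, type,
        # updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in BAD_SECTOR_ATTRS:
            raw = fields[9]
            if raw.isdigit():
                total += int(raw)
    return total


def should_alert(total_now, total_a_day_ago, daily_growth=10, hard_limit=100):
    """Alert on fast growth or on a large absolute count."""
    grew_fast = (total_now - total_a_day_ago) >= daily_growth
    return total_now >= hard_limit or grew_fast
```

Run hourly from cron or a systemd timer, you would store each device's total somewhere (a small file is enough), compare it with the value from 24 hours earlier, and send mail or a page when `should_alert` returns True.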


Thank you both for the question and answer. I’ve been trying to figure out when I should be swapping disks in a RAID10 array. I’ll have to tinker with it to get it to work with my RAID controller (PERC H310) but this is exactly what I’m looking for.


A quick look says you'll likely need to install the perccli tool from Dell to get the disk info. I'm not going to read its manual, but I'd be surprised if it won't give you the SMART data if you do the right incantations.
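
One thing that may be worth trying before perccli: smartctl has a `-d megaraid,N` device type for disks behind LSI MegaRAID-based controllers. Whether it works with the H310 in its current mode is an assumption to verify on the actual machine. A minimal sketch that builds the commands to try, with a hypothetical helper name and an assumed block device of /dev/sda:

```python
def megaraid_smartctl_commands(device_ids, device="/dev/sda"):
    """Build one smartctl argv per physical disk behind the controller.

    device_ids are the controller's physical device IDs. They are not
    guaranteed to be contiguous or to start at 0, so pass the real ones.
    """
    return [
        ["smartctl", "-a", "-d", f"megaraid,{n}", device]
        for n in device_ids
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be checked first.
    for argv in megaraid_smartctl_commands(range(4)):
        print(" ".join(argv))
```

If a command returns attribute data for a given ID, the same output can be fed to whatever parsing you already use for directly attached disks.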



