Critical mdstat_disks alert on 4-bay Synology NAS with 3 drives (SHR1)

Hi, @pk1966. The fix has two parts:

  • exclude your RAID device from the default alarm;
  • create a custom alarm for your RAID device (trigger only if down > 1).

To do that, edit the “health.d/mdstat.conf” file: copy the “mdstat_disks” template, give the copy its own name, and add a charts filter to both, so you end up with:

 template: mdstat_disks
       on: md.disks
    class: Errors
     type: System
component: RAID
   charts: !*md0* *
    units: failed devices
    every: 10s
     calc: $down
     crit: $this > 0
     info: number of devices in the down state for the $family array. \
           Any number > 0 indicates that the array is degraded.
       to: sysadmin

 template: mdstat_disks_md0
       on: md.disks
    class: Errors
     type: System
component: RAID
   charts: *md0* !*
    units: failed devices
    every: 10s
     calc: $down
     crit: $this > 1
     info: number of devices in the down state for the $family array. \
           Any number > 0 indicates that the array is degraded.
       to: sysadmin
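
In case the two charts filters look cryptic: they are space-separated patterns read left to right, the first pattern that matches decides the result, and a leading "!" turns that match into a rejection. Below is a rough Python sketch of that rule, not Netdata's actual code. The function name and the chart ids are made up for illustration, and fnmatchcase is only a stand-in for the "*" wildcard (it also treats "?" and "[" specially, which these chart names do not contain).

```python
from fnmatch import fnmatchcase


def simple_pattern_match(pattern_list: str, name: str) -> bool:
    """Approximate the first-match-wins rule used by the charts filter.

    Hypothetical helper for illustration only, not part of Netdata.
    """
    for token in pattern_list.split():
        negative = token.startswith("!")
        glob = token[1:] if negative else token
        if fnmatchcase(name, glob):
            # The first matching pattern decides: "!" means "exclude".
            return not negative
    return False  # no pattern matched at all


# Hypothetical chart ids; the real ids on your NAS may look different.
for chart in ("mdstat.md0_disks", "mdstat.md2_disks"):
    default_alarm = simple_pattern_match("!*md0* *", chart)
    md0_alarm = simple_pattern_match("*md0* !*", chart)
    print(f"{chart}: mdstat_disks={default_alarm} mdstat_disks_md0={md0_alarm}")
```

So the default alarm skips md0 and still covers every other array, while the md0 copy applies to md0 only.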