Configuring mountpoint alerts in a parent/child setup

I have a host running netdata-2.2, streaming to a host running 2.1.1. I see graph data for the mountpoint in question, at around 69% full. On the master node in /etc/netdata/health.d/disks.conf I have this config, but I am not getting any alerts. Note that I get other alerts sent to ‘sysadmin’ via a slack integration that works for things like high memory.

    template: disk_space_usage
          on: disk.space
       class: Utilization
        type: System
   component: Disk
 host labels: _os=linux freebsd
chart labels: mount_point=!/dev !/dev/* !/run !/run/* *
        calc: $used * 100 / ($avail + $used)
       units: %
       every: 1m
        warn: $this > (($status >= $WARNING ) ? (60) : (90))
        crit: ($this > (($status == $CRITICAL) ? (90) : (98))) && $avail < 5
       delay: up 1m down 5m multiplier 1.5 max 1h
     summary: Disk ${label:mount_point} space usage
        info: Total space utilization of disk ${label:mount_point}
          to: sysadmin

If I read this right, it should trigger a warning when above 60%, is this correct? When I restart netdata, all I see in the health.log is a transition from UNINITIALIZED to CLEAR.

In a streaming setup, are all the alerts configured on the master Netdata host?

Hi @EdSchernau,
I see in your configuration you defined the alert to run on a host with specific host labels.
Could you confirm that the host actually has the required host labels?

Thanks. If I hit http://HOST-IP:19999/api/v1/info, I see a host label “_os” set to “linux”. Since the status changes from UNINITIALIZED to CLEAR in the health.log on the parent node, it seems to know there should be an alert, but perhaps the math isn’t triggering somehow? The filesystem is definitely over 60% full.

Ok, to understand what’s happening, could you:

  • Share the payload from this endpoint:
    http://HOST-IP:19999/api/v3/alerts?options=instances,values&nodes={YOUR-NODE-ID}&alert=disk_space_usage
  • In that payload, find the alert in question and look for the cfg key.
  • Share the payload from this endpoint:
    http://HOST-IP:19999/api/v3/alert_config?config={VALUE-FROM-CFG-KEY}
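
The steps above can be sketched in Python. This is only an illustration, not an official client: the helper names are hypothetical, port 19999 and the query parameters are copied from the URLs above, and the assumption that instances sit in a top-level "alert_instances" list with "nm" and "cfg" keys comes from the payload excerpts later in this thread.

```python
from urllib.parse import urlencode


def alerts_url(host, node_id, alert="disk_space_usage"):
    """Build the /api/v3/alerts URL from the first step (hypothetical helper)."""
    query = urlencode(
        {"options": "instances,values", "nodes": node_id, "alert": alert},
        safe=",",  # keep the comma in "instances,values" unescaped
    )
    return f"http://{host}:19999/api/v3/alerts?{query}"


def alert_config_url(host, cfg):
    """Build the /api/v3/alert_config URL from the last step (hypothetical helper)."""
    return f"http://{host}:19999/api/v3/alert_config?{urlencode({'config': cfg})}"


def find_cfg_ids(payload, alert="disk_space_usage"):
    """Collect the distinct cfg values of all instances of one alert.

    Assumes the decoded JSON payload has an "alert_instances" list whose
    items carry "nm" (alert name) and "cfg" (config hash) keys.
    """
    instances = payload.get("alert_instances", [])
    return sorted({i["cfg"] for i in instances if i.get("nm") == alert})
```

Fetching the first URL (for example with urllib.request.urlopen), decoding the JSON, and passing the result to find_cfg_ids would give the value to substitute into the second URL.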

Thanks, this is good stuff! payload:

{
    "name":null,
    "config_hash_id":"a0be182f-336e-4525-b1b1-ae4a8ad7035e",
    "selectors":{
        "type":"template",
        "on":"disk_space_usage",
        "families":null,
        "host_labels":null,
        "chart_labels":"mount_point=!/dev !/dev/* !/run !/run/* !HarddiskVolume* *"
    },
    "value":{
        "units":"%",
        "update_every":60,
        "calc":"$used * 100 / ($avail + $used)"
    },
    "status":{
        "warn":"$this > (($status >= $WARNING ) ? (60) : (90))",
        "crit":"($this > (($status == $CRITICAL) ? (90) : (98))) && $avail < 5"
    },
    "notification":{
        "type":"agent",
        "exec":null,
        "to":"sysadmin",
        "delay":"up 60s down 300s multiplier 1.5 max 3600s",
        "repeat":null,
        "options":null
    },
    "class":"Utilization",
    "component":"Disk",
    "type":"System",
    "info":"Total space utilization of disk ${label:mount_point}",
    "summary":"Disk ${label:mount_point} space usage"
}

From a config standpoint, everything looks correct.
Could you also share the payload from the first endpoint, please?

Under alert_instances, there’s this section. Both the referenced config file on the node and the /etc/netdata/health.d/disks.conf file on the master server the node streams to have thresholds set to warn at 60. In a streaming setup, are any of the client health alerting config files relevant?

,{
            "ni":0,
            "nm":"disk_space_usage",
            "ch":"disk_space./res-sandbox",
            "ch_n":"disk_space./res-sandbox",
            "units":"%",
            "fami":"/res-sandbox",
            "info":"Total space utilization of disk ${label:mount_point}",
            "sum":"Disk /res-sandbox space usage",
            "ctx":"disk.space",
            "st":"CLEAR",
            "tr_i":"c1503885-ebe7-46cd-9caa-a6cd7933611a",
            "tr_v":69.649669,
            "tr_t":1738678305,
            "cfg":"a0be182f-336e-4525-b1b1-ae4a8ad7035e",
            "src":"line=10,file=/usr/lib/netdata/conf.d/health.d/disks.conf",
            "to":"sysadmin",
            "tp":"System",
            "cm":"Disk",
            "cl":"Utilization",
            "gi":1738678305998889,
            "v":70.9539163,
            "t":1738703865
        }

No. I think what you shared is enough.
We can see that the alert value is at "v":70.9539163 and the alert is in "st":"CLEAR", although from the config it should be WARNING: "warn":"$this > (($status >= $WARNING ) ? (60) : (90))".
Let me ask for help from a specialist in agent alerts.

Thanks for looking at this. If I’ve crossed up the configs I apologize.

As a side note, I just filled a different filesystem and it alerted exactly as I’d expect.
Might this be because this is a Lustre (network) filesystem?

If I read this right, it should trigger a warning when above 60%, is this correct?

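No, it doesn’t. See Special use of the conditional operator. The threshold depends on the current status: while the alert is below WARNING, the value has to exceed 90 to raise it; once it is in WARNING, it stays raised for as long as the value is above 60. A value around 70 with the alert in CLEAR therefore stays in CLEAR.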
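
To make the status-dependent threshold concrete, here is a minimal Python sketch of the warn expression only. It is not Netdata code: the numeric status constants are placeholders (only their ordering matters here), the function names are hypothetical, and the sample values are made up around the roughly 70% reading from this thread.

```python
# Placeholder status values; assumed only to satisfy CLEAR < WARNING < CRITICAL.
CLEAR, WARNING, CRITICAL = 0, 1, 2


def warn_threshold(status):
    """Mirror of: ($status >= $WARNING) ? (60) : (90)"""
    return 60 if status >= WARNING else 90


def next_status(value, status):
    """Mirror of: warn: $this > (threshold chosen by the current status)"""
    return WARNING if value > warn_threshold(status) else CLEAR


def replay(values, status=CLEAR):
    """Feed a series of usage percentages through the warn expression."""
    history = []
    for value in values:
        status = next_status(value, status)
        history.append(status)
    return history


if __name__ == "__main__":
    names = {CLEAR: "CLEAR", WARNING: "WARNING"}
    samples = [69.6, 70.9, 91.0, 70.9, 59.0, 70.9]  # illustrative values
    for value, status in zip(samples, replay(samples)):
        print(f"{value:5.1f}% -> {names[status]}")
```

In this sketch the 70.9% reading is CLEAR before the value has ever passed 90%, WARNING after it has, and CLEAR again once the value has dipped to 60% or below. That matches the explanation above: 60 is the level at which the warning clears, and 90 is the level at which it is raised.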