I have a host running netdata-2.2, streaming to a host running 2.1.1. I see graph data for the mountpoint in question, at around 69% full. On the master node in /etc/netdata/health.d/disks.conf I have this config, but I am not getting any alerts. Note that I get other alerts sent to ‘sysadmin’ via a slack integration that works for things like high memory.
template: disk_space_usage
on: disk.space
class: Utilization
type: System
component: Disk
host labels: _os=linux freebsd
chart labels: mount_point=!/dev !/dev/* !/run !/run/* *
calc: $used * 100 / ($avail + $used)
units: %
every: 1m
warn: $this > (($status >= $WARNING ) ? (60) : (90))
crit: ($this > (($status == $CRITICAL) ? (90) : (98))) && $avail < 5
delay: up 1m down 5m multiplier 1.5 max 1h
summary: Disk ${label:mount_point} space usage
info: Total space utilization of disk ${label:mount_point}
to: sysadmin
If I read this right, it should trigger a warning when above 60%, is this correct? When I restart netdata, all I see in the health.log is a transition from UNINITIALIZED to CLEAR.
In a streaming setup, are all the alerts configured on the master netdata host?
Hi @EdSchernau,
I see in your configuration you defined the alert to run on a host with specific host labels.
Could you confirm that the host does indeed have the required host labels?
Thanks, I see if I hit http://HOST-IP:19999/api/v1/info that there’s a host label “_os” and it’s set to “linux”. Since I can see the status change from UNINITIALIZED to CLEAR when I look at the health.log on the parent node, it’s like it knows there should be an alert, but perhaps the math isn’t triggering somehow? The filesystem is definitely over 60% full.
Under alert_instances, there’s this section. Both the referenced config file on the node and the /etc/netdata/health.d/disks.conf file on the master server the node streams to have thresholds set to warn at 60. In a streaming setup, are any of the client health alerting config files relevant?
,{
"ni":0,
"nm":"disk_space_usage",
"ch":"disk_space./res-sandbox",
"ch_n":"disk_space./res-sandbox",
"units":"%",
"fami":"/res-sandbox",
"info":"Total space utilization of disk ${label:mount_point}",
"sum":"Disk /res-sandbox space usage",
"ctx":"disk.space",
"st":"CLEAR",
"tr_i":"c1503885-ebe7-46cd-9caa-a6cd7933611a",
"tr_v":69.649669,
"tr_t":1738678305,
"cfg":"a0be182f-336e-4525-b1b1-ae4a8ad7035e",
"src":"line=10,file=/usr/lib/netdata/conf.d/health.d/disks.conf",
"to":"sysadmin",
"tp":"System",
"cm":"Disk",
"cl":"Utilization",
"gi":1738678305998889,
"v":70.9539163,
"t":1738703865
}
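To rule out the label filters, the two pattern lines from the config can be checked by hand. The sketch below is an approximation, not Netdata's own matcher: it assumes the usual simple-pattern semantics (space-separated terms, `*` as the only wildcard, a leading `!` marking a negative match, and the first matching term deciding the result). The function name `simple_pattern_match` is hypothetical.

```python
import re


def simple_pattern_match(pattern: str, value: str) -> bool:
    """Approximate a Netdata-style simple pattern (assumed semantics)."""
    for term in pattern.split():
        negative = term.startswith("!")
        if negative:
            term = term[1:]
        # Turn the term into a regex where '*' matches any run of characters.
        regex = ".*".join(re.escape(part) for part in term.split("*"))
        if re.fullmatch(regex, value):
            # The first term that matches decides the outcome.
            return not negative
    return False


# Patterns copied from the alert config in this thread.
HOST_OS_PATTERN = "linux freebsd"
MOUNT_POINT_PATTERN = "!/dev !/dev/* !/run !/run/* *"

print(simple_pattern_match(HOST_OS_PATTERN, "linux"))
print(simple_pattern_match(MOUNT_POINT_PATTERN, "/res-sandbox"))
print(simple_pattern_match(MOUNT_POINT_PATTERN, "/run/user/1000"))
```

Under these assumptions, both the `_os=linux` host label and the `/res-sandbox` mount point pass the filters, which fits with the alert instance showing up at all.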
No. I think what you shared is enough.
We can see that the alert value is "v":70.9539163 and the status is "st":"CLEAR", although from the config it should be WARNING: "warn":"$this > (($status >= $WARNING ) ? (60) : (90))".
Let me ask for help from a specialist on agent alerts.
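It may help to evaluate the `warn` and `crit` lines by hand. The ternary on `$status` looks like a hysteresis rule: if the alert is already at WARNING or above, the threshold is 60, otherwise it is 90. The sketch below mirrors the two expressions from the config in plain Python. The numeric status values and the `avail` figure are illustrative assumptions (only the ordering CLEAR < WARNING < CRITICAL matters), and the function names are hypothetical.

```python
# Illustrative status values; only their ordering is assumed to match Netdata.
CLEAR, WARNING, CRITICAL = 1, 3, 4


def warn(this: float, status: int) -> bool:
    # warn: $this > (($status >= $WARNING) ? (60) : (90))
    threshold = 60 if status >= WARNING else 90
    return this > threshold


def crit(this: float, status: int, avail: float) -> bool:
    # crit: ($this > (($status == $CRITICAL) ? (90) : (98))) && $avail < 5
    threshold = 90 if status == CRITICAL else 98
    return this > threshold and avail < 5


def evaluate(this: float, status: int, avail: float) -> int:
    """Return the next status given the current value and current status."""
    if crit(this, status, avail):
        return CRITICAL
    if warn(this, status):
        return WARNING
    return CLEAR


# "v" from the alert instance above; avail=100.0 is a made-up placeholder.
print(evaluate(70.9539163, CLEAR, avail=100.0) == CLEAR)
print(evaluate(91.0, CLEAR, avail=100.0) == WARNING)
print(evaluate(70.9539163, WARNING, avail=100.0) == WARNING)
```

Read this way, an alert that starts from CLEAR would only be raised above 90%, and would then stay in WARNING until usage drops to 60% or below. If that reading is right, a value near 70% reported as CLEAR is what the expression asks for, and swapping the two numbers (or using 60 for both) would be needed to warn at 60%.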
As a side note, I just filled a different filesystem and it alerted exactly as I’d expect.
Might this be because this is a Lustre (network) filesystem?