I have a file, cpu.conf, that I put together
after reviewing "Configure health alarms | Learn Netdata" and doing some tests…
alarm: cpu_template
on: system.cpu
lookup: average -10s percentage foreach system,user,nice
every: 10s
warn: $this > 60
crit: $this > 90
repeat: warning 7200s critical 3630s
to: sysadmin
The alarm goes off even though the value has not yet exceeded the warning percentage, or before the time I configured for it has elapsed. Example:
Could it be that controlling this with intervals in seconds is a bit risky because of small variations? Have I misconfigured it? Thank you very much, community, for your time.
The value is 68.2 and the warning threshold is 60, so there is no problem here: the alarm is raised correctly.
The values on the chart are correct too. The default aggregation algorithm is average, so the values you see are averages over the last 10+ minutes. The more you zoom out (the bigger the timeframe), the more the values are averaged. Choose MAX instead of AVG (the "each as" setting) and you will see 68.2.
Because the lookup window is short and there is no delay option, you can get flooded with notifications: when the value keeps fluctuating close to the threshold, even small variations push it above and below it repeatedly.
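As a rough sketch of how the flooding could be reduced, the alarm below adds a `delay` line and hysteresis on the thresholds through `$status`. Both are described in the Netdata health configuration reference, but every number here (the 1-minute lookup, the 55/85 release thresholds, the delay durations) is an assumption for illustration, not a recommendation, and has not been tested against your setup.

```
alarm: cpu_template
on: system.cpu
# Assumption: a longer window (1 minute instead of 10 seconds) to smooth short spikes
lookup: average -1m percentage foreach system,user,nice
every: 10s
# Hysteresis: raise WARNING above 60, but only clear it once the value drops to 55 or below
warn: $this > (($status >= $WARNING) ? (55) : (60))
# Same idea for CRITICAL: raise above 90, clear at 85 or below
crit: $this > (($status == $CRITICAL) ? (85) : (90))
# Assumption: hold notifications 30s when escalating and 5m when recovering
delay: up 30s down 5m multiplier 1.5 max 1h
repeat: warning 7200s critical 3630s
to: sysadmin
```

With something like this, a value hovering around 59 to 61 should stay in one state instead of flipping on every 10-second check, and the `delay` line holds back the notification until the status has settled.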