Hi,
I think everything is in the title. I would like to create a custom alarm for my ZooKeeper (Zk) nodes that fires when the request latency is too high. Could somebody help me, please? (I’m sorry, but I still have not properly understood the alarm configuration language.)
Best,
Jerome
ilyam8
March 11, 2021, 11:58am
Hi, here is an example; adjust the trigger conditions to your needs.
If you have any questions about the health alarm syntax, check our docs.
template: zookeeper_requests_latency
on: zookeeper.requests_latency
lookup: average -1m unaligned of avg
units: milliseconds
every: 10s
warn: $this < (($status >= $WARNING) ? (150) : (200))
crit: $this < (($status == $CRITICAL) ? (200) : (300))
delay: up 1m down 15m multiplier 1.5 max 1h
info: average requests latency for the last minute
to: sysadmin
Thanks a lot!
Your documentation is really well written, but it takes time to understand all the subtleties.
Hey @jrevillard !
We know that alarm configuration is challenging and we are working hard to improve on that front.
Thanks for the kind words on our docs (kudos to @joel for his phenomenal work).
I tried to set up the alarm. I put it in the health.d directory, but I do not see it in the dashboard. How can I debug this, please?
[root@zk3 [RCC] netdata-configs]# ls -al health.d/
total 8
drwxr-xr-x. 2 netdata netdata 41 Mar 11 15:48 .
drwxr-xr-x. 10 netdata netdata 4096 Mar 11 15:03 ..
-rw-r--r--. 1 root root 391 Mar 11 15:48 zookeeper_custom_alarm.conf
[root@zk3 [RCC] netdata-configs]# cat health.d/zookeeper_custom_alarm.conf
template: zookeeper_requests_latency
on: zookeeper.requests_latency
lookup: average -1m unaligned of avg
units: milliseconds
every: 10s
warn: $this < (($status >= $WARNING) ? (150) : (200))
crit: $this < (($status == $CRITICAL) ? (200) : (300))
delay: up 1m down 15m multiplier 1.5 max 1h
info: average requests latency for the last minute
to: webmaster
PS: I set it up on the Netdata agents and am looking at the master Netdata dashboard.
ilyam8
March 11, 2021, 4:19pm
By the way, it should be >, not <. If greater than …
warn: $this > (($status >= $WARNING) ? (150) : (200))
crit: $this > (($status == $CRITICAL) ? (200) : (300))
It works
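To make the conditional thresholds in the corrected warn/crit lines easier to follow, here is a rough Python sketch of the hysteresis they express. It is only an illustration, not how Netdata evaluates alarms internally: the numeric status values and the helper names `evaluate` and `run` are assumptions, and it assumes that a true crit condition takes precedence over warn.

```python
# Illustrative status levels. Only their ordering matters for the comparison
# `$status >= $WARNING`; the actual numeric values are an assumption.
CLEAR, WARNING, CRITICAL = 0, 1, 2


def evaluate(value, status):
    """Return the next status for one latency reading (in milliseconds)."""
    # warn: $this > (($status >= $WARNING) ? (150) : (200))
    warn_threshold = 150 if status >= WARNING else 200
    # crit: $this > (($status == $CRITICAL) ? (200) : (300))
    crit_threshold = 200 if status == CRITICAL else 300

    if value > crit_threshold:  # assumed: critical wins over warning
        return CRITICAL
    if value > warn_threshold:
        return WARNING
    return CLEAR


def run(samples):
    """Feed a series of readings through evaluate(), starting from CLEAR."""
    status = CLEAR
    history = []
    for value in samples:
        status = evaluate(value, status)
        history.append(status)
    return history
```

The point of the two thresholds per level is that an alarm is raised at the higher value (200 ms for warning, 300 ms for critical) but is only cleared once latency drops below the lower one (150 ms and 200 ms respectively), which avoids flapping around a single limit.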
I set it up on the Netdata agents and am looking at the master Netdata dashboard.
That could be the reason. Put it on the master and restart it.
OK, it’s not working on the master either.
Should I look for something in the logs somewhere, please?
ilyam8
March 11, 2021, 7:20pm
[root@zk3 [RCC] netdata-configs]# ls -al health.d/
netdata-configs
What is that directory? Do you run netdata in a docker container?
Nope, it was installed with the install script:
[root@zk3 [RCC] netdata]# pwd
/opt/netdata
[root@zk3 [RCC] netdata]# ls -al
total 0
drwxrwxr-x. 10 netdata netdata 248 Mar 11 17:30 .
drwxr-xr-x. 4 root root 35 Feb 16 08:56 ..
drwxrwxr-x. 3 netdata netdata 145 Feb 9 12:31 bin
drwxrwxr-x. 3 netdata netdata 32 Feb 9 12:31 etc
drwxr-xr-x. 3 netdata netdata 18 Feb 9 12:27 include
drwxr-xr-x. 3 netdata netdata 58 Feb 9 12:27 lib
lrwxrwxrwx. 1 netdata netdata 11 Mar 11 17:30 netdata-configs -> etc/netdata
lrwxrwxrwx. 1 netdata netdata 15 Mar 11 17:30 netdata-dbs -> var/lib/netdata
lrwxrwxrwx. 1 netdata netdata 15 Mar 11 17:30 netdata-logs -> var/log/netdata
lrwxrwxrwx. 1 netdata netdata 17 Mar 11 17:30 netdata-metrics -> var/cache/netdata
lrwxrwxrwx. 1 netdata netdata 19 Mar 11 17:30 netdata-plugins -> usr/libexec/netdata
lrwxrwxrwx. 1 netdata netdata 21 Mar 11 17:30 netdata-web-files -> usr/share/netdata/web
lrwxrwxrwx. 1 netdata netdata 3 Mar 11 17:30 sbin -> bin
drwxrwxr-x. 7 netdata netdata 66 Feb 9 12:31 share
drwxrwxr-x. 2 netdata netdata 216 Feb 9 12:31 system
drwxrwxr-x. 5 netdata netdata 81 Mar 11 17:30 usr
drwxrwxr-x. 6 netdata netdata 52 Feb 9 12:31 var
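One possible way to check whether the agent actually loaded the template, independent of the dashboard, is to ask the agent for the alarms it knows about. The sketch below is written under several assumptions: that the agent listens on the default port 19999, that the `/api/v1/alarms?all` endpoint returns JSON with an `alarms` object whose entries carry a `name` field, and that the helper names `fetch_alarms` and `find_alarm` are purely illustrative.

```python
import json
from urllib.request import urlopen


def fetch_alarms(base_url="http://localhost:19999"):
    """Download the list of configured alarms from a Netdata agent.

    The base URL and the ?all query are assumptions; adjust them to your setup.
    """
    with urlopen(f"{base_url}/api/v1/alarms?all", timeout=5) as response:
        return json.load(response)


def find_alarm(payload, name):
    """Return every alarm entry in the payload whose 'name' equals `name`."""
    alarms = payload.get("alarms", {})
    return [entry for entry in alarms.values() if entry.get("name") == name]


# Example usage on the host where the health.d file was placed:
#   matches = find_alarm(fetch_alarms(), "zookeeper_requests_latency")
#   print(matches or "template not loaded")
```

If the list comes back empty, the template was probably not picked up at all. In that case it may be worth confirming that the file sits in the health.d directory under the agent's real config directory (here /opt/netdata/etc/netdata, per the netdata-configs symlink above) and looking through the files under netdata-logs for messages that mention the file name.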