modifying alarms via Health API

Luis_Johnstone · February 8, 2023, 10:09pm

Hi there,

I’m a bit confused about what should happen when disabling or silencing alarms via the agent health API.

So I have some active, existing alarms and I’d like to control them by either silencing or disabling the checks (the latter is the end goal).

So on the host I go to:

http://ns1:19999/api/v1/alarms

It lists my 4 active alarms, which are raised via the ping collector.
So for example, one of them looks like this (abridged):

"id": 1671721153,
"config_hash_id": "58be95da-e922-93b8-f56c-cded51e3bc9a",
"name": "ping_host_reachable",
"chart": "ping_ping_hosts.host_tor1_fritz_box_packet_loss",
"family": "packet loss",
"class": "Errors",
"component": "Network",
"type": "Other",
"active": true,
"disabled": false,
"silenced": false,

And then on the node where the collection and alerting are happening I run:

curl "http://ns1:19999/api/v1/manage/health?cmd=DISABLE&chart=1671721153" -H "X-Auth-Token: REDACTED"

which returns:

Health checks disabled for alarms matching the selectors
Alarm selector added

And if I look in “/var/lib/netdata/health.silencers.json” I see the following:

{
  "all": false,
  "type": "DISABLE",
  "silencers": [
    {
      "chart": "1671721153"
    }
  ]
}

However, if I reload the alarms page the status of the alarm with that ID number has not changed:

active: true,
disabled: false,
silenced: false,

Moreover, the agent web UI shows no removal of the alert.

I have also tried this using chart-name, with the same results:

curl "http://ns1:19999/api/v1/manage/health?cmd=DISABLE&chart=ping_ping_hosts.host_tor1_fritz_box_packet_loss" -H "X-Auth-Token: REDACTED"

Is this expected behaviour?

Thanks,

Luis

Christopher_Akritid1 · February 9, 2023, 3:11pm

The "Health API Calls" page on Learn Netdata says that the chart selector is used with chart ids/names, as shown on the dashboard. These are matched against the "on" entry of a configured alarm.

That numeric identifier doesn’t appear in an alarm configuration. The specific alert is generated from the following template, found in ping.conf on the master branch of the netdata/netdata repository on GitHub:

 template: ping_host_reachable
 families: *
       on: ping.host_packet_loss
    class: Errors
     type: Other
component: Network
   lookup: average -30s unaligned of loss
     calc: $this != nan AND $this < 100
    units: up/down
    every: 10s
     crit: $this == 0
    delay: down 30m multiplier 1.5 max 2h
     info: network host ${label:host} reachability status
       to: sysadmin

The on directive in templates is at the context level, not the chart id/name.

So to disable these alerts/notifications the selector should be either alarm=ping_host_reachable or context=ping.host_packet_loss.
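
As a rough sketch of what those calls could look like, reusing the host and redacted token from the original post: health_url is a hypothetical helper written for this example (it is not part of Netdata), and NETDATA_URL is an assumed variable pointing at the agent. The script only prints the URLs; the curl lines that would actually send them are left commented out.

```shell
#!/bin/sh
# Assumption: the agent is reachable at the address used earlier in the thread.
NETDATA_URL="${NETDATA_URL:-http://ns1:19999}"

# Hypothetical helper: builds a health management URL from a command
# (e.g. DISABLE or SILENCE) followed by any number of key=value selectors.
health_url() {
  cmd="$1"
  shift
  url="${NETDATA_URL}/api/v1/manage/health?cmd=${cmd}"
  for selector in "$@"; do
    url="${url}&${selector}"
  done
  printf '%s\n' "$url"
}

# Select by alarm name (the template name in ping.conf):
health_url DISABLE "alarm=ping_host_reachable"

# Or select by context (the template's "on" line):
health_url DISABLE "context=ping.host_packet_loss"

# To actually send a request, pass the URL to curl with the API token:
# curl "$(health_url DISABLE "alarm=ping_host_reachable")" -H "X-Auth-Token: REDACTED"

# The earlier chart=1671721153 selector is still stored in health.silencers.json.
# cmd=RESET should clear all stored selectors before adding the new one:
# curl "$(health_url RESET)" -H "X-Auth-Token: REDACTED"
```

Either selector applies to every alert created from the template, so all four ping hosts would be affected rather than a single one.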

Now that we have the ability to use labels in alarm configurations, we might be able to extend the interface to make it more specific (e.g. so that a selector matches only when the label host has a given value).
