Modifying alarms via the Health API

Hi there,

I’m a bit confused about what should happen when disabling or silencing alarms via the agent health API.

So I have some active, existing alarms and I’d like to control them by either silencing or disabling the checks (the latter is the end goal).

So on the host I go to:

http://ns1:19999/api/v1/alarms

It lists my 4 active alarms, which are raised via the ping collector.
So for example, one of them looks like this (abridged):

"id": 1671721153,
"config_hash_id": "58be95da-e922-93b8-f56c-cded51e3bc9a",
"name": "ping_host_reachable",
"chart": "ping_ping_hosts.host_tor1_fritz_box_packet_loss",
"family": "packet loss",
"class": "Errors",
"component": "Network",
"type": "Other",
"active": true,
"disabled": false,
"silenced": false,

Then, on the node where the collection and alerting are happening, I run:

curl "http://ns1:19999/api/v1/manage/health?cmd=DISABLE&chart=1671721153" -H "X-Auth-Token: REDACTED"

which returns:

Health checks disabled for alarms matching the selectors
Alarm selector added

And if I look in “/var/lib/netdata/health.silencers.json” I see the following:

{
  "all": false,
  "type": "DISABLE",
  "silencers": [
    {
      "chart": "1671721153"
    }
  ]
}

However, if I reload the alarms page the status of the alarm with that ID number has not changed:

active: true,
disabled: false,
silenced: false,

Moreover, the agent web UI shows no removal of the alert.

I have also tried this using chart-name, with the same results:

curl "http://ns1:19999/api/v1/manage/health?cmd=DISABLE&chart=ping_ping_hosts.host_tor1_fritz_box_packet_loss" -H "X-Auth-Token: REDACTED"

Is this expected behaviour?

Thanks,

Luis

The Health API Calls page on Learn Netdata says that the chart selector is used with chart ids/names, as shown on the dashboard. These match the "on" entry of a configured alarm.

That numeric identifier doesn't appear in any alarm configuration. The specific alert is generated from the following template, found in ping.conf on the master branch of the netdata/netdata repository on GitHub:

 template: ping_host_reachable
 families: *
       on: ping.host_packet_loss
    class: Errors
     type: Other
component: Network
   lookup: average -30s unaligned of loss
     calc: $this != nan AND $this < 100
    units: up/down
    every: 10s
     crit: $this == 0
    delay: down 30m multiplier 1.5 max 2h
     info: network host ${label:host} reachability status
       to: sysadmin  

The "on" directive in a template refers to a context, not to a chart id/name, so a chart selector will not match it.

So to disable these alerts/notifications, the selector should be either alarm=ping_host_reachable or context=ping.host_packet_loss.
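
As a rough sketch of what those requests could look like, reusing the host (ns1:19999) and the redacted token from the question: the health_url and health_cmd helpers below are hypothetical names made up for illustration, not part of Netdata. The RESET step is my assumption about how to clear the chart=1671721153 selector that is still stored in health.silencers.json before adding a new one.

```shell
#!/bin/sh
# Values taken from the question; adjust them for your own node.
NETDATA_URL="http://ns1:19999"
AUTH_TOKEN="REDACTED"   # placeholder, the real token is not shown in this thread

# health_url (hypothetical helper): build a health management URL.
#   $1 = command, e.g. DISABLE, SILENCE or RESET
#   $2 = optional selector, e.g. alarm=ping_host_reachable
health_url() {
    if [ -n "$2" ]; then
        printf '%s/api/v1/manage/health?cmd=%s&%s\n' "$NETDATA_URL" "$1" "$2"
    else
        printf '%s/api/v1/manage/health?cmd=%s\n' "$NETDATA_URL" "$1"
    fi
}

# health_cmd (hypothetical helper): send the request with the auth header.
# It is only defined here, not called, so running this script sends nothing.
health_cmd() {
    curl "$(health_url "$1" "$2")" -H "X-Auth-Token: $AUTH_TOKEN"
}

# Print the two URLs suggested above so they can be checked before use.
health_url DISABLE "alarm=ping_host_reachable"
health_url DISABLE "context=ping.host_packet_loss"

# Possible usage once the URLs look right:
#   health_cmd RESET                                   # drop the old chart selector
#   health_cmd DISABLE "alarm=ping_host_reachable"     # match by alarm name
#   health_cmd DISABLE "context=ping.host_packet_loss" # or match by context
```

Matching on the alarm name affects only ping_host_reachable, while matching on the context would also catch any other alarm defined on ping.host_packet_loss, so the alarm selector is probably the narrower choice here.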

Now that we have the ability to use labels in alarm configurations, we might be able to extend the interface to make selectors more specific (e.g. so that a selector applies only when the label host has a given value).