Silencer not matching or disabling notification

I have a server that is doing a fstrim command every night, and during that time, I want to silence the disk alerts.

So I’m adding silencers:

Alarms list: {
    "all": false,
    "type": "SILENCE",
    "silencers": [
        {
            "alarm": "disk_util.nvme3n1"
        },
        {
            "context": "disk_util.nvme3n1.10min_disk_utilization"
        }
    ]
}

(I’m adding both an alarm and a context silencer, just trying to get it to match something. I’ve tried other names, but I can’t get anything to work.)

And this is the alarm that’s still going off:

Alarms list: {
    "hostname": "fqdn",
    "latest_alarm_log_unique_id": 1647905719,
    "status": true,
    "now": 1648377809,
    "alarms": {
        "disk_util.nvme3n1.10min_disk_utilization": {
            "id": 1647905528,
            "name": "10min_disk_utilization",
            "chart": "disk_util.nvme3n1",
            "family": "nvme3n1",
            "active": true,
            "disabled": false,
            "silenced": false,
            "exec": "/usr/libexec/netdata/plugins.d/alarm-notify.sh",
            "recipient": "sysadmin",
            "source": "130@/etc/netdata/health.d/disks.conf",
            "units": "%",
            "info": "the percentage of time the disk was busy, during the last 10 minutes",
            "status": "WARNING",
            "last_status_change": 1648376654,
            "last_updated": 1648377794,
            "next_update": 1648377854,
            "update_every": 60,
            "delay_up_duration": 0,
            "delay_down_duration": 900,
            "delay_max_duration": 3600,
            "delay_multiplier": 1.200000,
            "delay": 0,
            "delay_up_to_timestamp": 1648376654,
            "warn_repeat_every": "0",
            "crit_repeat_every": "0",
            "value_string": "87%",
            "last_repeat": "0",
            "db_after": 1648377195,
            "db_before": 1648377794,
            "lookup_method": "average",
            "lookup_after": -600,
            "lookup_before": 0,
            "lookup_options": "unaligned",
            "warn":"$this > $green * (($status >= $WARNING) ? (0.7) : (1))",
            "warn_parsed":"(${this} > (${green} * ((${status} >= ${WARNING}) ? 0.7 : 1)))",
            "crit":"$this > $red * (($status == $CRITICAL) ? (0.7) : (1))",
            "crit_parsed":"(${this} > (${red} * ((${status} == ${CRITICAL}) ? 0.7 : 1)))",
            "green":90,
            "red":98,
            "value":87.0127922
        }
    }
}

It’s still saying ‘silenced: false’ and still sending a notification. I know it’s going to be something simple, but what am I missing?
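For anyone comparing selector values against the output above, here is a minimal sketch that strips the alarm JSON down to the fields that look relevant for matching. It assumes the JSON comes from the Agent's /api/v1/alarms endpoint and is pretty-printed with one key per line, as in the post. The function name alarm_match_fields and the host name node are made up for illustration. As far as I understand the Health Management API docs, the alarm selector is compared against the alarm name and the chart selector against the chart id; the chart's context does not appear in this output at all.

```shell
#!/bin/sh
# Print only the fields that are useful when choosing silencer selectors.
# Reads pretty-printed alarm JSON on stdin; uses grep/sed so no JSON tool is needed.
alarm_match_fields() {
    grep -E '^[[:space:]]*"(name|chart|family|silenced)":' |
        sed -e 's/^[[:space:]]*//' -e 's/,$//'
}

# Usage (hypothetical host name "node"):
#   curl -s "http://node:19999/api/v1/alarms" | alarm_match_fields
```

Run against the alarm above, this leaves the name, chart, family and silenced lines, which makes it easier to see that the alarm name is 10min_disk_utilization rather than the longer disk_util.nvme3n1.10min_disk_utilization key.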

Also, as far as I can tell, you can’t see in the GUI if something is silenced, is that right? That would be really handy to see if this would work properly.

I’m using the Netdata Docker image: netdata:v1.29.3.

Thanks,
Chris.

Hi there,

Let me start from the beginning.

I have a server that is doing a fstrim command every night, and during that time, I want to silence the disk alerts.

For your issue, the ideal solution would be for the Agent to silence these alerts (the disk ones) during a specific time frame. In your version of the Agent (and in the current one), you can’t schedule an alert to be silenced for a set period each day. If you want this, you should open a feature request in the Netdata GitHub repo.

Now, a workaround to silence the disk alerts (permanently) is:

Option 1 (easy way):

Set the particular alert (by default in /etc/netdata/health/health.d/specific_alert_conf_file.conf) to silent.

Option 2 (a little more advanced):
Use the Health API (see “Health API Calls” on Learn Netdata).

Let’s try Option 1 and see if it fixes your issue.
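As a rough sketch of what Option 1 could look like for the alarm in the original post: the values below are reconstructed from the alarm JSON posted earlier (lookup, green, red, delay, info), and the file path comes from that alarm's "source" field. The "template:" and "on: disk.util" lines are assumptions based on the stock disks.conf and may differ in your version. The only intended change is the last line, where the recipient becomes silent.

```
# /etc/netdata/health.d/disks.conf  (path taken from the alarm's "source" field)
# "template" and "on: disk.util" are assumed from the stock file; check your copy.
 template: 10min_disk_utilization
       on: disk.util
   lookup: average -10m unaligned
    units: %
    every: 1m
    green: 90
      red: 98
     warn: $this > $green * (($status >= $WARNING) ? (0.7) : (1))
     crit: $this > $red * (($status == $CRITICAL) ? (0.7) : (1))
    delay: down 15m multiplier 1.2 max 1h
     info: the percentage of time the disk was busy, during the last 10 minutes
       to: silent
```

The Agent has to reload its health configuration before the change takes effect, and the alert stays silent for every disk until the line is changed back.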

Also, as far as I can tell, you can’t see in the GUI if something is silenced, is that right? That would be really handy to see if this would work properly.

For this, you can also make a feature request.

For example, this alert is set to silent (and the Agent respects it), but you still see role: sysadmin, so you can’t tell from the GUI that the alert is silenced.

Hi,

Thanks for the reply.

I am trying to use the API to temporarily silence the alert, but it’s not matching, and I’m still getting notifications about the alert.

Am I using the wrong name or wrong flags when I do this?

The details of the alert are in my original post.

    "alarms": {
        "disk_util.nvme3n1.10min_disk_utilization": {
            "id": 1647905528,
            "name": "10min_disk_utilization",
            "chart": "disk_util.nvme3n1",

Should I be using context = disk_util.nvme3n1.10min_disk_utilization, or alarm = disk_util.nvme3n1.10min_disk_utilization, or something else?

Thanks,
Chris.

Try the following:

curl "http://node:19999/api/v1/manage/health?cmd=SILENCE&context=10min_disk_utilization" -H "X-Auth-Token: <YOUR_TOKEN>"

This will silence the 10min_disk_utilization alerts for all disks. I am not sure whether you can silence this alert for just one drive (e.g. nvme3n1).

Let me know if this helped your situation.
Tasos.

Thanks for the suggestion.

I eventually did

curl "http://node:19999/api/v1/manage/health?cmd=SILENCE&alarm=10min_disk_utilization"

which seems to have worked (for a couple of weeks).

Using a context (instead of an alarm) didn’t seem to match, so it didn’t work for me.
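For the original nightly fstrim case, here is a minimal sketch of wrapping the job so the silencer only exists while it runs. It uses the alarm= selector that worked above and the X-Auth-Token header from Tasos's example. NETDATA_URL, NETDATA_TOKEN and the function names are placeholders I made up. As far as I understand the Health Management API docs, cmd=RESET clears all silencers, not just this one, so treat that as an assumption to check if you have other silencers set.

```shell
#!/bin/sh
# Sketch: silence one alarm while a maintenance command runs, then clear silencers.
# NETDATA_URL and NETDATA_TOKEN are placeholders; adapt them to your setup.
NETDATA_URL="${NETDATA_URL:-http://node:19999}"

# Build a health management URL: health_url CMD [selector=value ...]
health_url() {
    url="${NETDATA_URL}/api/v1/manage/health?cmd=$1"
    shift
    for selector in "$@"; do
        url="${url}&${selector}"
    done
    printf '%s\n' "$url"
}

# Send the request with the management token; -f makes curl fail on HTTP errors.
health_cmd() {
    curl -fsS -H "X-Auth-Token: ${NETDATA_TOKEN}" "$(health_url "$@")"
}

# with_silenced_alarm ALARM_NAME COMMAND [ARGS...]
# Silences the alarm, runs the command, resets, and returns the command's status.
with_silenced_alarm() {
    alarm="$1"
    shift
    health_cmd SILENCE "alarm=${alarm}" || echo "warning: could not set silencer" >&2
    "$@"
    rc=$?
    health_cmd RESET || echo "warning: could not reset silencers" >&2
    return "$rc"
}

# Usage from cron (the fstrim flags are only an example):
#   with_silenced_alarm 10min_disk_utilization fstrim -a
```

Running health_cmd LIST while the job is in progress should show the active silencer, which helps given that the GUI does not display it.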