Alarms set to "silent" or commented out keep sending notifications

Kevin_Thierry · November 13, 2023, 10:22am

Hello,

I am facing the following issue:

Notifications are sent even when alarms are set to “silent”
Notifications are sent even when alarms are fully commented out

The notifications are sent to a Telegram group and we want to disable the ones that are less relevant to us because they are spamming the channel, hiding the important ones.

This happens on multiple servers with health checks from the following files:

/etc/netdata/health.d/tcp_listen.conf → commenting the alarms seems to have worked here
/etc/netdata/health.d/web_log.conf → commenting the alarms only worked for some hosts
/etc/netdata/health.d/postgres.conf → commenting the alarms doesn’t seem to work at all

Observed with the following versions of Netdata:

v1.43.2
v1.43.0-210-gc672d8ab1

I restarted the netdata service after every change made to the health check files but I keep receiving notifications for those alarms.

Surely I’m doing something wrong or forgetting something but I can’t put my finger on it.

Any idea what I could do to investigate this further?

Thank you

Manolis_Vasilakis · November 13, 2023, 1:11pm

Hi @Kevin_Thierry

Do note that even if you comment out or delete alerts in /etc/netdata/health.d, their respective stock alerts will still be loaded from the default directory (usually /usr/lib/netdata/conf.d/health.d/).

A better way to disable some alerts is to use the enabled alarms config option (more info here → Configure alerts | Learn Netdata)

Kevin_Thierry · November 14, 2023, 1:30am

Thanks for the quick reply @Manolis_Vasilakis, I will check the enabled alarms config option.

Kevin_Thierry · November 14, 2023, 10:52am

What about the fact that the “to: silent” parameter does not work?

I also tried removing the critical alert and leaving only the warning one for those health checks, and I still receive critical alert notifications…

I thought that my custom postgres.conf was being ignored, but I have some alerts from a custom plugin inside it and those are working, so this is not the issue.

Kevin_Thierry · November 15, 2023, 2:18am

I also added the following to /etc/netdata/netdata.conf:

[health]
enabled alarms = !postgres_* *

And I still get notifications from the postgres_* alarms…
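To make the pattern above easier to reason about, here is a minimal sketch of how a list such as `!postgres_* *` is read, assuming Netdata's simple-pattern rules: space-separated globs, checked left to right, first match wins, and a leading `!` turns a match into a rejection. The function `alarm_enabled` is a hypothetical helper, not part of Netdata, the alarm names other than the one quoted in this thread are illustrative, and Python's `fnmatchcase` is only an approximation of Netdata's own matcher. It says nothing about which agent evaluates the setting: the option only affects the agent whose netdata.conf contains it.

```python
from fnmatch import fnmatchcase


def alarm_enabled(name: str, pattern_list: str) -> bool:
    """Approximate a simple pattern list: first matching glob decides."""
    for pattern in pattern_list.split():
        negative = pattern.startswith("!")
        glob = pattern[1:] if negative else pattern
        if fnmatchcase(name, glob):
            # A match on a '!' pattern rejects; any other match accepts.
            return not negative
    # No pattern matched: treated as "not enabled" in this sketch.
    return False


# Order matters: the negative pattern has to come before the catch-all '*'.
print(alarm_enabled("postgres_total_connection_utilization", "!postgres_* *"))  # False
print(alarm_enabled("web_log_1m_requests", "!postgres_* *"))                    # True
print(alarm_enabled("postgres_total_connection_utilization", "* !postgres_*"))  # True
```

Under these assumptions the setting shown above does exclude every `postgres_*` alarm on the agent where it is configured, so notifications that keep arriving are likely produced elsewhere.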

Manolis_Vasilakis · November 15, 2023, 7:40am

Hi @Kevin_Thierry

Ok, one thing to check. Could you check http://localhost:19999/api/v1/alarms?all ?

This should list all running alerts currently on the agent.

Also, are we talking about a single agent? Any streaming of other agents to it? Is it connected to the cloud?

Thanks

Kevin_Thierry · November 15, 2023, 8:15am

Hi @Manolis_Vasilakis,

Thank you for your reply.

This is a child node streaming to a parent node.

The API call does not return any alarms:

curl http://localhost:19999/api/v1/alarms?all?
{
	"hostname": "xxxxxx",
	"latest_alarm_log_unique_id": 1692624511,
	"status": true,
	"now": 1700035940,
	"alarms": {

	}
}

Manolis_Vasilakis · November 15, 2023, 8:21am

Ah, could you remove the final ? from the url? It should be http://localhost:19999/api/v1/alarms?all.

Also, just a note to perhaps clear up how alerts work with streaming:

Both the child and the parent run health, each for its own metrics, charts, etc.

In addition, the parent will also run health for the child.

When you configure an alert on the parent, that will apply to the parent itself and the child. However, the child itself will also run its own health configuration.

So if, for example, you set up the parent to not load any postgres alerts, the child might continue to run them if a similar configuration is not made in the child’s netdata.conf.

Is it possible that this would be the case here?

Kevin_Thierry · November 15, 2023, 8:41am

Thanks a lot for all the information.

I just found out that there are many errors like the one below about my postgres.conf file:

2023-11-15 08:25:12: netdata ERROR : HEALTH : Health configuration at line 4 of file '/etc/netdata/health.d/postgres.conf' has unknown key 'on'. Expected either 'alarm' or 'template'

But I can’t figure out what the error is. These are the top lines of the file:

# you can disable an alarm notification by setting the 'to' line to: silent

 template: postgres_total_connection_utilization
       on: postgres.connections_utilization
    class: Utilization
     type: Database
component: PostgreSQL
    hosts: *
   lookup: average -1m unaligned of used
    units: %
    every: 1m
     warn: $this > (($status >= $WARNING)  ? (50) : (55))
     crit: $this > (($status >= $CRITICAL)  ? (55) : (60))
    delay: down 15m multiplier 1.5 max 1h
     info: average total connection utilization over the last minute
       to: dba

I ran the API call without the ‘?’ at the end and got the list of alarms, which doesn’t contain any alarms from the postgres.conf file.
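The check described here (listing the loaded alarms and looking for postgres ones) can be sketched as a small filter over the JSON returned by /api/v1/alarms?all. This is only an illustration: `alarms_matching` is a hypothetical helper, the sample payload is shaped like the response shown earlier in the thread with made-up entries added, and the presence of a `name` field in each entry is an assumption, so the code falls back to the key when it is missing.

```python
import json


def alarms_matching(payload: dict, prefix: str) -> list:
    """Return the sorted keys of payload['alarms'] whose alarm name starts with prefix."""
    hits = []
    for key, alarm in payload.get("alarms", {}).items():
        # Assumption: entries carry a "name" field; otherwise use the last
        # dot-separated part of the key as the alarm name.
        name = alarm.get("name", key.rsplit(".", 1)[-1])
        if name.startswith(prefix):
            hits.append(key)
    return sorted(hits)


# Hypothetical sample; a real response would come from the agent's API.
sample = json.loads("""
{
  "hostname": "xxxxxx",
  "status": true,
  "alarms": {
    "postgres_local.connections_utilization.postgres_total_connection_utilization": {
      "name": "postgres_total_connection_utilization",
      "status": "CLEAR"
    },
    "system.cpu.10min_cpu_usage": {"name": "10min_cpu_usage", "status": "CLEAR"}
  }
}
""")

print(alarms_matching(sample, "postgres_"))
```

Against a live agent, the payload could be loaded with `json.load(urllib.request.urlopen("http://localhost:19999/api/v1/alarms?all"))`. An empty result means that agent has no postgres alarms loaded for its own host.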

As you suggest, the parent may be involved with this issue. I will check the config running on it too.

Thank you for your help

Manolis_Vasilakis · November 15, 2023, 8:43am

Please ignore that error. It’s just a by-product of you disabling it via the enabled alarms config option; we will fix it.

Kevin_Thierry · November 16, 2023, 9:00am

There are no health checks for postgres on the parent (no health.d/postgres.conf file, and “curl http://localhost:19999/api/v1/alarms?all” does not return postgres alarms), so I still don’t know where those alarms come from.
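One possible explanation for the empty result, offered as an assumption rather than something confirmed in this thread: the URL above asks the parent about its own host, while alarms the parent runs on behalf of a streamed child would be listed under that child's host path. The sketch below only builds the two URLs. `alarms_url` is a hypothetical helper, `child-node-01` is a placeholder hostname, and the `/host/<hostname>/` prefix is my understanding of how a parent exposes its children, so check it against your Netdata version.

```python
from typing import Optional
from urllib.parse import quote


def alarms_url(base: str, child: Optional[str] = None) -> str:
    """Build the alarms URL for the agent itself, or for a child it hosts."""
    base = base.rstrip("/")
    if child is None:
        return f"{base}/api/v1/alarms?all"
    # Assumption: the parent serves each streamed child under /host/<hostname>/.
    return f"{base}/host/{quote(child, safe='')}/api/v1/alarms?all"


print(alarms_url("http://localhost:19999"))
print(alarms_url("http://localhost:19999", "child-node-01"))
```

If the second URL lists postgres alarms on the parent, the notifications are coming from the stock alerts the parent loads for the child, which is consistent with the parent and child each running health as described earlier in the thread.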

Kevin_Thierry · November 16, 2023, 11:03am

@Manolis_Vasilakis, you were right about the parent throwing those alerts. I added a postgres.conf configuration on the parent with the alarms set to silent and it stopped the notifications.

Thanks again for your help!

Manolis_Vasilakis · November 16, 2023, 1:39pm

No problem, thanks for the follow up!
