setting alerts

Pavel_Rekun · March 29, 2023, 1:21pm

Hi Guys,

I’m not sure if I’m missing something obvious or if it’s just the design of the system. I’m trying to figure out how to set up a global alert for each resource. For example: if any of our monitored CPUs goes above X, send an alert.
The docs say each alert should be configured on the agent itself. Is it really that complicated? We have about 250 machines, so this is impossible to manage…
Or maybe I can make a simple API request to pull all CPUs at once and build my own alert (I didn’t find this either…).

Thanks.

Manolis_Vasilakis · March 29, 2023, 1:41pm

Hi @Pavel_Rekun

In general, yes: each agent runs its own health (alerting) engine on its own metrics. So if you need to set up an alert, you need to do it on each one separately.

If you use any special way for deployment of Netdata, you could include this custom alert as part of the deployment.
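
As a rough illustration of shipping one alert with a deployment, here is a minimal Python sketch that renders a health entity for total CPU usage as text, so a provisioning tool could drop the same file on every node. The alert name `fleet_cpu_usage`, the thresholds, and the choice of the `user` and `system` dimensions are assumptions made for the example; the target path (typically a file under `health.d/` in the Netdata config directory) depends on how the agent was installed.

```python
def render_cpu_alert(name="fleet_cpu_usage", warn=80, crit=95, window="1m"):
    """Return the text of one Netdata health entity for CPU utilization.

    The name, thresholds, and dimension list are illustrative assumptions,
    not values taken from this thread.
    """
    fields = [
        ("alarm", name),
        ("on", "system.cpu"),
        # Average the selected dimensions over the last `window`.
        ("lookup", f"average -{window} unaligned of user,system"),
        ("units", "%"),
        ("every", "1m"),
        ("warn", f"$this > {warn}"),
        ("crit", f"$this > {crit}"),
        ("info", "average CPU utilization (user + system) over the window"),
    ]
    # Right-align the keys so the colons line up, as in the stock health files.
    return "\n".join(f"{key:>6}: {value}" for key, value in fields) + "\n"
```

The deployment tool would write the returned text to a file on each agent (for example `cpu_custom.conf`, a hypothetical name) and then reload the agent's health configuration.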

The other way to achieve this is to have those agents stream their metrics to a “parent” Netdata agent. That parent can be configured to run alerts on its “children” nodes.

Are those 250 nodes currently in such a setup, or is each on its own?

Christopher_Akritid1 · March 29, 2023, 2:00pm

What’s meant by “special way for deployment” here is that, for 250 nodes, we expect you are already using a provisioning and/or configuration management (infrastructure as code) tool like Terraform, Ansible, Chef, or Puppet.

Also, for production deployments you should always set up streaming and replication. Read “Deployment strategies” on Learn Netdata; we’re improving it heavily these days.

Pavel_Rekun · March 30, 2023, 11:24am

Currently about half of them are Windows machines. We are still testing the parent setup (this is how we can monitor Windows machines; I already opened a bug about a parent crash, so it’s on hold).

As for the alarms, I’m not sure why the design is so complicated. If all the data is already in the Cloud web GUI, why not add an option to manage alerts from the Cloud? This complicates things a lot for us.
Is it possible to fetch the data with the API? For example, the CPU usage of all machines?
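
To make the question concrete, this is roughly the do-it-yourself check in mind, sketched in Python. It assumes each agent is reachable on the default port 19999 and serves the agent's `/api/v1/data` endpoint for the `system.cpu` chart with a `labels`/`data` JSON layout; the host names, the threshold, and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def build_url(host, port=19999, seconds=60):
    """URL for the average of system.cpu over the last `seconds`, as one point."""
    return (
        f"http://{host}:{port}/api/v1/data"
        f"?chart=system.cpu&after=-{seconds}&points=1&group=average&format=json"
    )


def cpu_total_percent(payload):
    """Sum the CPU dimensions of the first row, ignoring time and idle."""
    labels = payload["labels"]
    row = payload["data"][0]
    return sum(
        value
        for name, value in zip(labels, row)
        if name not in ("time", "idle") and value is not None
    )


def fetch_cpu(host, timeout=5):
    """Query one agent over HTTP and return its total CPU utilization."""
    with urlopen(build_url(host), timeout=timeout) as resp:
        return cpu_total_percent(json.load(resp))


def hosts_over(readings, threshold):
    """Return the sorted host names whose reading exceeds the threshold."""
    return sorted(host for host, value in readings.items() if value > threshold)
```

A cron job could then build `readings = {h: fetch_cpu(h) for h in hosts}` and notify on `hosts_over(readings, 80)`, but that means maintaining our own alerting on top of Netdata, which is what we hoped to avoid.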

We currently have about 25 machines connected to test out Netdata. The UI and usability are excellent, but I’m not sure what to do about the alerts now…
