Questions about notification roles

jurgenhaas · March 10, 2021, 9:42am

Some of our customers have become interested in the power of alert notifications, and we are about to enhance the configuration to meet not only our own needs but also those of clients who want to receive selective notifications. We did some research and think everything we need can be done, but we want to make sure we got it right, hence a number of questions:

1. We can define our own roles and are not limited to the stock roles like sysadmin, dba, webmaster. Is that right?
2. Each template can have multiple roles in the to field, separated by spaces, right?
3. When overwriting the to field of a template, we do that in /etc/netdata. When using edit-config, we get a file that overwrites all templates of a health category. Can we overwrite just one template in a category by creating the correct file containing only that overwritten template?
4. Can a chart overwrite the to field of the alert templates? The use case is that we have many http checks configured, but for one of them the alert should go to an additional custom role.

ilyam8 · March 10, 2021, 1:21pm

Hi @jurgenhaas

1. Yes
2. Yes
3. If I understand you correctly, no. User and stock configuration files are not merged: Netdata tries to read the user config and, only if it doesn’t exist, reads the stock config. For instance, if you create an empty /etc/netdata/health.d/cpu.conf you get no cpu alarms (the stock cpu.conf is ignored).
4. No, it cannot. That is an interesting idea, but I believe it is impossible right now (many http checks configured, with the alert for one of them going to an additional custom role). Correct me if I am wrong @Thiago_Marques_0 @vlvkobal
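
The "user file or stock file, never a merge" behaviour from answer 3 can be pictured with a small shell sketch. This is not Netdata's actual code: the function name `resolve_health_conf` and its directory arguments are made up for illustration, and the stock path in the usage line is just the usual default.

```shell
#!/bin/sh
# Illustration only: mimics "use the user file if it exists, otherwise stock".
# resolve_health_conf is a hypothetical helper, not part of Netdata.
resolve_health_conf() {
    user_dir=$1    # e.g. /etc/netdata/health.d
    stock_dir=$2   # e.g. /usr/lib/netdata/conf.d/health.d
    name=$3        # e.g. cpu.conf

    if [ -e "$user_dir/$name" ]; then
        # The user file wins even when it is empty; stock is not consulted.
        printf '%s\n' "$user_dir/$name"
    elif [ -e "$stock_dir/$name" ]; then
        printf '%s\n' "$stock_dir/$name"
    else
        return 1
    fi
}
```

Calling `resolve_health_conf /etc/netdata/health.d /usr/lib/netdata/conf.d/health.d cpu.conf` prints exactly one path, which is why an empty user cpu.conf silences all stock cpu alarms.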

Thiago_Marques_0 · March 10, 2021, 2:10pm

No, you are right: at the moment a chart cannot overwrite any field of an alert template. When health processes data, it reads the alarm values, but it does not use any chart field to dispatch messages. Like you, I also consider this a good idea.

jurgenhaas · March 10, 2021, 2:20pm

Thanks @ilyam8 and @Thiago_Marques_0 for your feedback.

As for 3, that’s a bit unfortunate. Copying the complete stock file and making customizations is not a problem. I’m only worried about Netdata updates, where we don’t want to miss out on template improvements in stock. Any suggestions on how to handle that?

As for 4, if we have individual charts that should have modified notifications, should we write our own templates for those? Is it even possible to assign specific templates to charts, or is there a better way of doing this?

Thiago_Marques_0 · March 10, 2021, 2:54pm

Hello @jurgenhaas ,

For 3:
When you use edit-config, the script copies the original stock file to /etc/netdata/health.d. Netdata never overwrites these files; updates only overwrite the content inside /usr/lib/netdata/conf.d. For example, if you run the following commands:

# cd /etc/netdata/
# ./edit-config health.d/haproxy.conf
# ls health.d/haproxy.conf

you will be able to configure your alarms without worrying about losing them.

For 4:

When we use a template, the alert is applied to a context, so it is more generic (it matches every chart of that context). If you create an alarm instead, you can apply it to a specific chart.
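
To make the difference concrete, here is a sketch that writes a hypothetical alarm file for one specific chart. The chart ID `httpcheck_shop.status`, the dimension `bad_status`, the alarm name, the threshold, the file name and the role `shop_oncall` are all assumptions for illustration; check the chart IDs and dimensions on your own dashboard before using anything like this.

```shell
#!/bin/sh
# Sketch: create a user health file with one alarm bound to a single chart.
# HEALTH_DIR defaults to a scratch directory so nothing live is touched;
# on a real host it would be /etc/netdata/health.d.
HEALTH_DIR="${HEALTH_DIR:-$(mktemp -d)}"

cat > "$HEALTH_DIR/my_httpcheck_shop.conf" <<'EOF'
# All names and values below are hypothetical examples.
# "alarm:" binds to one chart ID; "template:" would bind to a context.
   alarm: shop_bad_status
      on: httpcheck_shop.status
  lookup: average -5m unaligned percentage of bad_status
   every: 10s
   units: %
    crit: $this >= 40
    info: share of bad HTTP responses from the shop check, last 5 minutes
      to: webmaster shop_oncall
EOF

echo "wrote $HEALTH_DIR/my_httpcheck_shop.conf"
```

The `to:` line lists two roles separated by a space, as confirmed earlier in the thread. A custom role such as `shop_oncall` still needs recipients in health_alarm_notify.conf (for email, a `role_recipients_email[shop_oncall]` entry).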

Best regards!

jurgenhaas · March 10, 2021, 3:02pm

Well, that workflow for 3 is OK for a single host. But if you maintain hundreds or thousands of hosts, you can’t do it that way. Then you need a provisioning tool like Ansible, which we already have in place, so that’s not the issue. We take the stock templates from the latest version and use them as Ansible templates with parameters, which we copy to each host when rolling out Netdata, and again later when we update Netdata to a newer version. And that’s when I get worried. The new version of Netdata will come with improved templates, but we won’t benefit from those improvements unless we compare each new stock template with our custom templates or review Netdata’s Git repository. Neither option is practical, to be honest.

As for 4, I don’t yet understand how we could define individual alarm and notification settings for individual charts. Are there any examples or a guide somewhere?

ilyam8 · March 10, 2021, 7:31pm

There is no way to handle it. That is how our user/stock configs work (not health only).

Related topic - 'local' support for on/off plugins? - #8 by k0ste

@jurgenhaas there is something you can do. If the file contains several alarms and you want to overwrite only one, you can:

1. copy the stock config
2. make changes in the alarm
3. remove other alarms
4. rename the file (add my_ prefix, e.g. cpu.conf => my_cpu.conf)

So netdata loads both user/my_cpu.conf and stock/cpu.conf, but uses that one alarm from the user/my_cpu.conf.
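
The copy, trim and rename steps can be scripted roughly as below. This is a sketch, not an official tool: `extract_alert` is a hypothetical helper, and it assumes that alert definitions in the stock file are separated by blank lines and that no blank line occurs inside a definition.

```shell
#!/bin/sh
# extract_alert STOCK_FILE NAME
# Prints only the alarm/template block called NAME from STOCK_FILE.
# Assumption: blocks are separated by blank lines (awk paragraph mode).
extract_alert() {
    awk -v RS='' -v name="$2" '
        {
            n = split($0, lines, "\n")
            for (i = 1; i <= n; i++) {
                line = lines[i]
                gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim the line
                sub(/:[ \t]+/, ": ", line)          # normalise spacing after the key
                if (line == "alarm: " name || line == "template: " name) {
                    print $0
                    break
                }
            }
        }
    ' "$1"
}

# Example (paths are the usual defaults, adjust for your install):
# extract_alert /usr/lib/netdata/conf.d/health.d/cpu.conf 10min_cpu_usage \
#     > /etc/netdata/health.d/my_cpu.conf
# Then edit the "to:" line in my_cpu.conf as needed.
```

Because the new file has a different name, the stock cpu.conf keeps loading and keeps receiving upstream improvements; only the one extracted definition is overridden.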

ilyam8 · March 10, 2021, 7:41pm

See the "on" section of the alarms and templates documentation. So it is not a specific setting for a chart, but a specific alarm for a specific chart.

jurgenhaas · March 11, 2021, 9:32am

That sounds better than expected. Regarding config files: if I keep the stock config where it is, untouched, and create my own template file with a different filename, I can overwrite individual templates with my own version without losing anything from the other stock templates. @ilyam8 I hope I got this right; I’m just trying to rephrase what you explained two comments ago.

I’ll also revisit the alarms and templates documentation that you linked to. I had been there before but had issues understanding it all. I may well come back here with more specific questions then.

ilyam8 · March 11, 2021, 10:01am

Yes, you got it right. Different filename, but the alarm/template in there has the same name as in the stock config file => it overwrites that specific alarm/template.

k0ste · March 11, 2021, 6:24pm

And this is impossible for turning plugins off/on? It works only for alarms?

jurgenhaas · March 17, 2021, 7:30am

Just to let you all know, we’ve achieved what we needed to with the help from Netdata people in the chat, thanks a lot. As we do everything in Ansible, we’ve also documented our approach and you can read about it at Ansible Role NetData - DevOps Tools

ilyam8 · March 17, 2021, 8:20am

I checked, and it is possible to disable go.d.plugin modules by adding an empty moduleName.conf. This is how it works:

1. Check if moduleName is enabled.
2. If yes, then load it and look for moduleName.conf in the user config dir.
3. If found, load it.
4. If not found, look for moduleName.conf in the stock config dir.
5. If found, load it.

If the config has no jobs - nothing to run.

But python.d.plugin behaves differently in the case of an empty (no jobs) config: it creates one job with default parameters.
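
The go.d.plugin lookup order described above can be sketched in shell as follows. This only illustrates the flow and is not go.d.plugin's implementation: `plan_module` is a hypothetical helper, the enabled flag is passed in by the caller, and "has jobs" is approximated by looking for a `- name:` list item in the config.

```shell
#!/bin/sh
# plan_module ENABLED USER_DIR STOCK_DIR MODULE
# Prints what would happen for MODULE according to the steps described above.
plan_module() {
    enabled=$1; user_dir=$2; stock_dir=$3; module=$4

    [ "$enabled" = "yes" ] || { echo "disabled"; return 0; }

    # User config takes precedence; stock is only a fallback.
    if [ -e "$user_dir/$module.conf" ]; then
        conf="$user_dir/$module.conf"
    elif [ -e "$stock_dir/$module.conf" ]; then
        conf="$stock_dir/$module.conf"
    else
        # Not covered in the thread; present only so every case prints something.
        echo "no config"; return 0
    fi

    # Crude heuristic for "the config defines at least one job".
    if grep -Eq '^[[:space:]]*-[[:space:]]*name:' "$conf"; then
        echo "run jobs from $conf"
    else
        echo "nothing to run ($conf has no jobs)"
    fi
}
```

With an empty user moduleName.conf the function reports nothing to run, which matches the go.d.plugin behaviour described here. The python.d.plugin case is deliberately not modelled.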

k0ste · March 17, 2021, 9:29am

You mean if we touch /etc/netdata/conf.d/python.d/samba.conf netdata will enable this module?

ilyam8 · March 22, 2021, 1:57pm

@jurgenhaas we add httpcheck_ prefix to the httpcheck collector alarms/templates

jurgenhaas · March 22, 2021, 2:19pm

@ilyam8 that’s a great move! It means we have to adjust our Ansible scripts when we update to that new version eventually, but it makes things more readable and avoids confusion, I agree.

OdysLam · March 23, 2021, 3:32pm

Hey @jurgenhaas,

I find it very interesting that you use Ansible. Perhaps you would like to share your setup in our community repo (GitHub - netdata/community: Netdata-powered applications and examples. For the community, by the community.)? I could help you write a small guide in our #community-guides. I am sure that a lot of users will be interested, especially if you also tackle configuration management and not only deployment/setup.

jurgenhaas · March 29, 2021, 8:26am

@OdysLam I’d be happy to write a little guide, but I wouldn’t want to replicate the source code repository on GitHub because our code is already publicly available on our GitLab instance. So if it is OK with you that we keep the code where it currently is and link to it from the guide, then that’s OK for us too, and we could provide you with something by the end of the week.

OdysLam · March 29, 2021, 11:20am

That sounds like a plan @jurgenhaas ! Let me know if I can help you in any way

k0ste · June 8, 2023, 9:53am

@OdysLam, if you are interested in an Ansible Netdata role, I have been developing and using one since 2017. The link is GitHub - k0ste/ansible-role-netdata: Role for deploy netdata and netdata configuration
