Exclude Cgroup from alert



  • I’d like to exclude all cgroups that start with abc* from the cgroup ram in use alert.

    the telegram alert comes in like this

    [image: screenshot of the Telegram alert]

    i tried

    hosts: !abc* *
    hosts: !cgroup_abc* *
    

    but it doesn’t stop the alerts from coming in (yes I did restart netdata after the config change)


  • Staff

    Hi Eddie,

    Documentation: https://learn.netdata.cloud/docs/agent/health/reference#health-entity-reference

    The hosts line filters the machine hostnames on which the alarm runs. I think what you want is to define a template, set the context it applies to with on:, and then use families to select the families of that context the alarm should run against.
    e.g.

    template: apache_last_collected_secs
          on: cgroup.mem_usage_limit
    families: !abc* *
      ...
    

    Please do tell me if my suggestion worked! We work towards improving our story around alarms 🙂

    Cheers



  • thanks.

    can I use the hostname of the LXC container? That’s easier than creating a template


  • Staff

    No, I don’t think that’s possible. In general, when we refer to hostnames, we mean that of the host (where Netdata runs).



  • when we refer to hostnames, we mean that of the host (where Netdata runs)

    this clarifies why it won’t work. thanks!


  • Staff

    Hello @Eddie,

    One more piece of information to help you with alarms: whenever you want to change an alarm, you can check the variables available for a chart with a request like this:

    curl -o uptime.json "http://localhost:19999/api/v1/alarm_variables?chart=apps.uptime"
    

    The command above stores in uptime.json all the variables available for the chart apps.uptime.
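
    If you would rather script this request, here is a minimal Python sketch that uses only the standard library. The helper names alarm_variables_url and fetch_alarm_variables are hypothetical, and the sketch assumes the agent answers plain HTTP on localhost:19999; pass a different base if your setup differs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def alarm_variables_url(chart, base="http://localhost:19999"):
    """Build the alarm_variables URL for one chart (base is an assumption)."""
    return f"{base}/api/v1/alarm_variables?{urlencode({'chart': chart})}"


def fetch_alarm_variables(chart, base="http://localhost:19999"):
    """Fetch and decode the JSON document for one chart; needs a running agent."""
    with urlopen(alarm_variables_url(chart, base), timeout=5) as response:
        return json.load(response)


# Example use against a running agent:
#   variables = fetch_alarm_variables("apps.uptime")
#   print(sorted(variables))  # top-level keys of the returned document
```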

    Best regards!



  • This is good info for understanding what is going on in general, thanks @thiago-marques-0. But is this meant to help me build the template itself?

    @odyslam, is the code above complete? where would i drop the template file? does it get picked up automatically after a restart?

    thank you both

    PS: whatever forum system you guys use here congrats, the code stylesheet is awesome!


  • Staff

    Hello @Eddie,

    First, I apologize for the delay; I was busy with another PR.

    I will give you some examples; if anything is unclear, please let me know. You can also read more details about simple patterns here.

    For the first example you used:

    bash-5.0# netdata -W simple-pattern '!abc *' 'abc'
    RESULT: NOT MATCHED - pattern '!abc *' does not match 'abc', wildcarded ''
    bash-5.0# netdata -W simple-pattern '!abc *' 'cde'
    RESULT: MATCHED - pattern '!abc *' matches 'cde', wildcarded ''
    

    As you can see, this pattern never matches abc, but it does match other text. The same thing happens with cgroup_abc:

    bash-5.0# netdata -W simple-pattern '!cgroup_abc *' 'abc'
    RESULT: MATCHED - pattern '!cgroup_abc *' matches 'abc', wildcarded ''
    bash-5.0# netdata -W simple-pattern '!cgroup_abc *' 'cgroup_abc'
    RESULT: NOT MATCHED - pattern '!cgroup_abc *' does not match 'cgroup_abc', wildcarded ''
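
    These results follow from how a simple pattern list is evaluated: the space-separated patterns are tried from left to right, * is a wildcard, a leading ! makes a pattern negative, and the first pattern that matches decides the outcome. Below is a rough Python sketch of that behaviour. It is not Netdata's implementation, and the function name simple_pattern_match is hypothetical.

```python
import re


def simple_pattern_match(pattern_list, text):
    """Return True if text is accepted by a space-separated simple pattern list."""
    for pattern in pattern_list.split():
        negative = pattern.startswith("!")
        if negative:
            pattern = pattern[1:]
        # Treat '*' as "any run of characters" and everything else literally.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, text):
            # The first pattern that matches decides the result.
            return not negative
    # Nothing matched, so the text is not accepted.
    return False


for name in ("abc", "abc_web", "cde"):
    print(name,
          simple_pattern_match("!abc *", name),
          simple_pattern_match("!abc* *", name))
```

    In this sketch, abc_web is accepted by '!abc *' but rejected by '!abc* *', so the trailing * after abc is what excludes every name that starts with abc.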
    

    Please note that if two hosts lines are given in a template, only the last one is used, because it overwrites the first.

    You can identify the names used by Netdata by running /usr/libexec/netdata/plugins.d/system-info.sh.

    You can store all your templates and alarms in /etc/netdata/health.d. They are not discarded on restart; files in that directory are loaded every time you restart Netdata.

    Finally, to build a template, you can start from this example:

    template: dev_dim_template
          on: system.cpu
          os: linux
      lookup: sum -3s at 0 every 3 percentage foreach system,user,nice
       units: %
       every: 1s
        warn: $this > 1
        crit: $this > 4
       hosts: !abc *
    

    This is a modified version of an example from the Netdata tests directory.

    This example applies an alarm to all hosts that are not named abc on charts with context system.cpu.

    Best regards!



  • thanks, Thiago (or should say obrigado?). i just added families: !abc* * below hosts and it seems to have done the trick for this time.

    but i will get into your instructions to learn some more.


  • Staff

    Well @Eddie, about “(or should say obrigado?)”: feel free to use the language you prefer. 😄

    I am glad we could help. Please let us know if you need any other information.

    Best regards!

