Disable cgroup_ram_in_use alerts for specific service

Slind14 · October 27, 2024, 5:43pm

Hi there,

We want to fully disable the cgroup_ram_in_use alert for promtail running through Docker Swarm.

How do we achieve this? We made multiple attempts but couldn’t find a way that works.

Here is the most recent config we tried:

/etc/netdata/health.d/cgroup_mem.conf

template: cgroup_ram_in_use
      on: cgroup.mem_usage
   class: Utilization
    type: Cgroups
component: Memory
host labels: _os=linux
chart labels: cgroup_name=!*promtail* *
    calc: ($ram) * 100 / $memory_limit
   units: %
   every: 10s
    warn: $this > (($status >= $WARNING) ? (80) : (90))
    crit: $this > (($status == $CRITICAL) ? (90) : (98))
   delay: down 15m multiplier 1.50 max 1h
 summary: Cgroup ${label:cgroup_name} memory utilization
    info: Cgroup ${label:cgroup_name} memory utilization
      to: silent

car12o · November 18, 2024, 12:30pm

Hi @Slind14,
If you want to disable it for all the nodes, you can just comment out the whole block.
Another option is to filter by host labels.

Disable on all operating systems, i.e. disable completely:

host labels: !_os=*

Disable on a specific node:

host labels: !_hostname=my-node-hostname

Enable only on a specific node:

host labels: _hostname=my-node-hostname

Slind14 · November 18, 2024, 4:49pm

We want to disable it for a specific cgroup name. See the code above. This isn’t working though. It only works when applied through the UI.

car12o · November 19, 2024, 1:14pm

Sorry, I misunderstood at first.

I tried it myself and was able to filter the alert out.

Initially, the alert was running.

Then I updated the config to exclude by chart label.
I found the correct label to filter on by checking it in my node’s single-node view.

Config:

    template: cgroup_ram_in_use
          on: cgroup.mem_usage
       class: Utilization
        type: Cgroups
   component: Memory
 host labels: _os=linux
chart labels: cgroup_name=!*data_master* *
        calc: ($ram) * 100 / $memory_limit
       units: %
       every: 10s
        warn: $this > (($status >= $WARNING)  ? (80) : (90))
        crit: $this > (($status == $CRITICAL) ? (90) : (98))
     summary: Cgroup ${label:cgroup_name} memory utilization
        info: Cgroup ${label:cgroup_name} memory utilization
          to: sysadmin

After config change, I restarted the agent and the alert is gone.
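
To illustrate why the order in `cgroup_name=!*data_master* *` matters, here is a minimal Python sketch of how Netdata simple patterns are documented to behave: the patterns are space-separated, `*` is a wildcard, a leading `!` makes a pattern negative, and the first pattern that matches decides the result. The helper name `simple_pattern_matches` and the sample cgroup names are hypothetical, and this is only a model of the matching rules, not Netdata's actual implementation.

```python
import re


def simple_pattern_matches(pattern_list: str, value: str) -> bool:
    """Hypothetical model of a Netdata simple pattern list.

    Patterns are checked left to right; the first one that matches
    decides the outcome. A leading '!' turns a match into an exclusion.
    If nothing matches, the value is treated as not matched.
    """
    for pattern in pattern_list.split():
        negative = pattern.startswith("!")
        body = pattern[1:] if negative else pattern
        # Translate '*' into '.*' and escape everything else literally.
        regex = ".*".join(re.escape(part) for part in body.split("*"))
        if re.fullmatch(regex, value):
            return not negative
    return False


# Sample (made-up) cgroup names checked against the pattern from the config.
for name in ("monitoring_promtail.1.abc", "web_nginx"):
    print(name, simple_pattern_matches("!*promtail* *", name))
```

Under this model, putting the catch-all `*` first (`* !*promtail*`) would match every cgroup before the exclusion is ever checked, which is why the negative pattern comes first.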

Bear in mind that on Cloud, alert removals take a while to propagate (up to 10 minutes).

If you are still having issues after that period, let me know which agent version you are running.

Slind14 · November 20, 2024, 11:10am

Where did you place the file with the single alert override?

car12o · November 20, 2024, 11:29am

Placed in:

/etc/netdata/health.d/cgroups.conf

Slind14 · November 20, 2024, 12:34pm

When placed there, does it override the other cgroup alerts?

car12o · November 20, 2024, 12:48pm

No, it just overrides the alerts matching the template/alarm key.
