OOM Kills happening on host are shown in Netdata running inside an LXC container

philip · December 14, 2022, 8:14am

Problem/Question

OOM kills are reported by Netdata (which is installed inside the container) even when the system (an LXC container) has a lot of RAM available, i.e. the container never reached its RAM limit.

We believe the Netdata instance inside the container is actually showing the OOM kills happening on the host.

Relevant docs you followed/actions you took to solve the issue

We checked the LXC container's RAM usage using htop. The RAM usage is very low, and the container will never run out of RAM, as the only thing running is Uptime Kuma (a lightweight website uptime monitor).

What I expected to happen

The OOM kills should not be shown as they are not happening inside the container.

ilyam8 · December 14, 2022, 8:58am

Hi, @philip. Netdata gets the number of OOM kills from /proc/vmstat. The same is true for other metrics we get from procfs (there are a lot). I am not sure how this can be fixed for LXC containers.
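To see the same counter Netdata reads, here is a minimal Python sketch that extracts the oom_kill field from /proc/vmstat. It is only an illustration, not Netdata's actual code: the function names parse_oom_kills and read_oom_kills are made up for this example, and it assumes a Linux kernel recent enough to expose oom_kill in that file.

```python
from pathlib import Path
from typing import Optional


def parse_oom_kills(vmstat_text: str) -> Optional[int]:
    """Return the oom_kill counter from /proc/vmstat-style text.

    Each line of /proc/vmstat is "<name> <value>". Returns None when
    the oom_kill field is absent (for example on older kernels).
    """
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "oom_kill":
            return int(fields[1])
    return None


def read_oom_kills(path: str = "/proc/vmstat") -> Optional[int]:
    """Read the counter from procfs (hypothetical helper for this sketch)."""
    return parse_oom_kills(Path(path).read_text())
```

Calling read_oom_kills() on the host and again inside the container and comparing the two numbers is a quick way to check whether the container is seeing the host's counter.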

Have you considered installing Netdata on the host system and monitoring VMs/containers with cgroups.plugin? It gathers CPU, memory, disk, and network statistics for every VM/container.

philip · December 14, 2022, 9:23am

@ilyam8

Yes, we already do that.

We also want to monitor the LXC container from within, so we installed Netdata inside it, but it showed OOM kills and we wondered where those kills were coming from.

ilyam8 · December 14, 2022, 9:48am

As I mentioned, the system metrics come from reading procfs. I checked on a Proxmox server: reading procfs from inside a container returns the host metrics, which is not the case for qemu/kvm VMs. I think this is by design (shared kernel) and not a bug. Netdata reports whatever the kernel reports (procfs).

philip · December 14, 2022, 10:08am

@ilyam8 Yes, understood.

Thanks for your assistance.

philip · January 8, 2023, 1:14pm

Hello

@ilyam8

I just wanted to let you guys know that there's a change in LXD that added a new metric for OOM kills. We thought it might be helpful for you.

ilyam8 · January 9, 2023, 12:22pm

Thanks, @philip. Netdata gathers container/VM metrics by reading the /sys/fs/cgroup/* directory (see cgroups.plugin). It is easy to add an oom_kills metric/chart, but in my experience it is not really useful: if the OOM killer kills the main process in the container (which is likely), that counter doesn't get incremented.
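For anyone who wants to look at the per-container counter directly, here is a rough Python sketch that parses a cgroup's memory.events file. It assumes cgroup v2, where that file holds "<key> <value>" lines including oom_kill. The function names and the cgroup directory you pass in are placeholders for this example, and this is not how cgroups.plugin is implemented.

```python
from pathlib import Path
from typing import Dict


def parse_memory_events(text: str) -> Dict[str, int]:
    """Parse cgroup v2 memory.events content into a {key: count} dict."""
    events: Dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            events[fields[0]] = int(fields[1])
    return events


def read_cgroup_oom_kills(cgroup_dir: str) -> int:
    """Return oom_kill for one cgroup directory (path supplied by the caller).

    Assumption: cgroup_dir points at the container's cgroup v2 directory
    somewhere under /sys/fs/cgroup; the exact location depends on the setup.
    """
    text = (Path(cgroup_dir) / "memory.events").read_text()
    return parse_memory_events(text).get("oom_kill", 0)
```

Keep in mind the caveat above: the number can stay unchanged even though the container's main process was killed.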

