LXC Containers Stats Are Not Shown

Problem/Question

As stated in the title, LXC container metrics are not visible in the Netdata dashboard. We’re using v1.37.1. Netdata is installed on the host, and several LXC containers run on that same host. Netdata shows metrics for the host itself but not for the LXC containers.

Interestingly, we have the same version of Netdata on another host and there we do not see this problem.

Relevant docs you followed/actions you took to solve the issue

We uninstalled Netdata completely and reinstalled it.

What I expected to happen

The dashboard should show lxc container metrics.

Below I’ve shared the errors we see in the error.log file on the host where container metrics are not shown.

(I’m hitting the body character limit, hence the pastebin link)

On host where the issue is:

On host where we see Netdata showing container info but some error related to cgroup-network:

Please ask if you need more information, happy to provide!

@ilyam8

Can you please look into this?

Hi, @philip. Let’s do the following:

  • set errors flood protection period to 0 (netdata.conf, [logs] section).
  • clear error log (sudo cp /dev/null /opt/netdata/var/log/netdata/error.log, remove /opt/netdata if you don’t use the install prefix).
  • restart netdata service
  • wait 1 minute
  • grep cgroup related logs (grep -i cgroup /opt/netdata/var/log/netdata/error.log, remove /opt/netdata if you don’t use the install prefix).
  • share the grep output here please
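
The steps above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, `systemctl` assumes a systemd host, and `LOG` defaults to the prefixed path (override it if you don’t use the install prefix). The flood protection setting still has to be changed in netdata.conf by hand first.

```shell
# Assumed default path; set LOG=/var/log/netdata/error.log if you don't
# use the /opt/netdata install prefix.
LOG="${LOG:-/opt/netdata/var/log/netdata/error.log}"

# Truncate the error log so only fresh messages remain.
clear_error_log() {
  sudo cp /dev/null "$LOG"
}

# Print every cgroup-related line, case-insensitively.
cgroup_log_lines() {
  grep -i cgroup "$LOG"
}

# Full sequence: clear the log, restart, wait a minute, then grep.
# Assumption: the service is managed by systemd and is named "netdata".
collect_cgroup_logs() {
  clear_error_log
  sudo systemctl restart netdata
  sleep 60
  cgroup_log_lines
}
```

Running `collect_cgroup_logs > cgroup.txt` would leave the output in a file that is easy to share.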

@ilyam8

Here is the output from /var/log/netdata/error.log

2022-12-14 11:14:16: ebpf.plugin INFO  : EBPF CGROUP : thread with task id 452551 finished
2022-12-14 11:14:16: netdata INFO  : MAIN : EXIT: Stopping main thread: PLUGIN[cgroups]
2022-12-14 11:14:16: netdata INFO  : PLUGIN[cgroups] : cleaning up...
2022-12-14 11:14:16: netdata INFO  : PLUGIN[cgroups] : stopping discovery thread worker
2022-12-14 11:14:16: netdata INFO  : PLUGIN[cgroups] : waiting for discovery thread to finish...
2022-12-14 11:14:19: netdata INFO  : PLUGIN[cgroups] : thread with task id 452394 finished
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : thread created with task id 610283
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : set name of thread 610283 to PLUGIN[cgroups]
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : cgroups v2 (unified cgroups) is available but are disabled on this system.
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : use unified cgroups false
2022-12-14 11:14:23: ebpf.plugin INFO  : EBPF CGROUP : thread created with task id 610446
2022-12-14 11:14:23: ebpf.plugin INFO  : EBPF CGROUP : set name of thread 610446 to EBPF CGROUP

@philip do you have any other group logs after 11:14:23?

@ilyam8

You mean “cgroup”? No.

Hmm…okay. I did a grep for “group” and got this:

2022-12-14 11:14:16: ebpf.plugin INFO  : EBPF CGROUP : thread with task id 452551 finished
2022-12-14 11:14:16: netdata INFO  : MAIN : EXIT: Stopping main thread: PLUGIN[cgroups]
2022-12-14 11:14:16: netdata INFO  : PLUGIN[cgroups] : cleaning up...
2022-12-14 11:14:16: netdata INFO  : PLUGIN[cgroups] : stopping discovery thread worker
2022-12-14 11:14:16: netdata INFO  : PLUGIN[cgroups] : waiting for discovery thread to finish...
2022-12-14 11:14:19: netdata INFO  : PLUGIN[cgroups] : thread with task id 452394 finished
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : thread created with task id 610283
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : set name of thread 610283 to PLUGIN[cgroups]
2022-12-14 11:14:23: apps.plugin ERROR : MAIN : PROCFILE: Cannot open file '/etc/netdata/apps_groups.conf' (errno 2, No such file or directory)
2022-12-14 11:14:23: apps.plugin INFO  : MAIN : Cannot read process groups configuration file '/etc/netdata/apps_groups.conf'. Will try '/usr/lib/netdata/conf.d/apps_groups.conf'
2022-12-14 11:14:23: apps.plugin INFO  : MAIN : Loaded config file '/usr/lib/netdata/conf.d/apps_groups.conf'
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : cgroups v2 (unified cgroups) is available but are disabled on this system.
2022-12-14 11:14:23: netdata INFO  : PLUGIN[cgroups] : use unified cgroups false
2022-12-14 11:14:23:  INFO  : MAIN : Cannot read process groups configuration file '/etc/netdata/apps_groups.conf'. Will try '/usr/lib/netdata/conf.d/apps_groups.conf'
2022-12-14 11:14:23: ebpf.plugin INFO  : EBPF CGROUP : thread created with task id 610446
2022-12-14 11:14:23: ebpf.plugin INFO  : EBPF CGROUP : set name of thread 610446 to EBPF CGROUP
2022-12-14 11:14:24: go.d ERROR: prometheus[kafka_consumer_group_exporter_local] Get "http://127.0.0.1:9208/metrics": dial tcp 127.0.0.1:9208: connect: connection refused
2022-12-14 11:14:24: go.d ERROR: prometheus[kafka_consumer_group_exporter_local] check failed
2022-12-14 11:14:24: go.d ERROR: prometheus[exporter_for_grouped_process_local] Get "http://127.0.0.1:9644/metrics": dial tcp 127.0.0.1:9644: connect: connection refused
2022-12-14 11:14:24: go.d ERROR: prometheus[exporter_for_grouped_process_local] check failed

Yep, I meant to say cgroup.

  • Some logs can be missing if you haven’t disabled log flood protection (errors flood protection period). If you did it - I don’t know what is happening.
  • We can add cgroups.plugin debug logging but it will require compiling Netdata with debugging. An alternative solution is to create a Docker container with a custom Netdata image (I can do it).

Additionally, can you send me a snapshot to ilya@netdata.cloud?

Hello @ilyam8

I’ve disabled it, as you said, set it to zero (before it was 1200).

With cgroup, there were no other logs. The ones I posted were the only cgroup logs; there are none after 11:14:23.

As for the Docker thing you mentioned: do you mean you’ll build a custom Docker image for us so that we can deploy Netdata on our systems using Docker?

As you asked, I emailed the snapshot to you.

Thank you very much for your assistance!

Yes, just to test. I built a Netdata image with debugging.

  • Create netdata.conf with the following content:
[logs]
  debug flags = 0x0000000000100000
  • Run this docker command
docker run -d --name=netdata_test_cgroups \
-p 20000:19999 \
-v netdatacache_test_cgroups:/var/cache/netdata \
-v /etc/passwd:/host/etc/passwd:ro \
-v /etc/group:/host/etc/group:ro \
-v /proc:/host/proc:ro \
-v /sys:/host/sys:ro \
-v $(pwd)/netdata.conf:/etc/netdata/netdata.conf \
-v /etc/os-release:/host/etc/os-release:ro \
--restart unless-stopped \
--cap-add SYS_PTRACE \
--security-opt apparmor=unconfined \
  ilyam8/netdata-test-for-github
  • wait 1 minute
  • to get the logs
docker logs netdata_test_cgroups 2>&1 | grep -i cgroup | grep -v "UUID"

When done, remove the test container, volume, and image:

docker stop netdata_test_cgroups
docker rm netdata_test_cgroups
docker volume rm netdatacache_test_cgroups
docker rmi ilyam8/netdata-test-for-github:latest

@ilyam8

Here are the logs you requested. I’ve uploaded them to Google Drive, as pastebin limits the content size to 512 KB.

Can you also share the tree -d /sys/fs/cgroup/ output?

@ilyam8

Here:

Interesting, and you don’t see the container metrics?

It looks like all lxc.payload.X groups have been found and are being collected.

  • found and renamed
Downloads $ grep "lxc\.payload\..*" netdata_docker_logs.txt | grep "is called"
2022-12-15 10:39:27: cgroup-name.sh: INFO: cgroup 'lxc.payload.hydrogen' is called 'hydrogen'
2022-12-15 10:39:28: cgroup-name.sh: INFO: cgroup 'lxc.payload.planb' is called 'planb'
2022-12-15 10:39:30: cgroup-name.sh: INFO: cgroup 'lxc.payload.microsrvc2' is called 'microsrvc2'
2022-12-15 10:39:32: cgroup-name.sh: INFO: cgroup 'lxc.payload.mynode7' is called 'mynode7'
2022-12-15 10:39:37: cgroup-name.sh: INFO: cgroup 'lxc.payload.sapphirecap4' is called 'sapphirecap4'
2022-12-15 10:39:39: cgroup-name.sh: INFO: cgroup 'lxc.payload.emailserver' is called 'emailserver'
2022-12-15 10:39:41: cgroup-name.sh: INFO: cgroup 'lxc.payload.reverseprox1' is called 'reverseprox1'
2022-12-15 10:39:43: cgroup-name.sh: INFO: cgroup 'lxc.payload.butterflyeu' is called 'butterflyeu'
2022-12-15 10:39:49: cgroup-name.sh: INFO: cgroup 'lxc.payload.peng' is called 'peng'
2022-12-15 10:39:51: cgroup-name.sh: INFO: cgroup 'lxc.payload.bipradix' is called 'bipradix'
2022-12-15 10:39:52: cgroup-name.sh: INFO: cgroup 'lxc.payload.ireallydo' is called 'ireallydo'
2022-12-15 10:39:54: cgroup-name.sh: INFO: cgroup 'lxc.payload.ads' is called 'ads'
...
  • reading metrics
Downloads $ grep "lxc\.payload\..*" netdata_docker_logs.txt | grep "reading metrics for cgroups"
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.hydrogen'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.planb'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.microsrvc2'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.mynode7'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.sapphirecap4'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.emailserver'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.reverseprox1'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.butterflyeu'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.peng'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.bipradix'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.ireallydo'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.ads'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.kindao'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.viecondocker'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.iran'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.aclserver'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.sapphirecap2'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.deloittefond'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.authservice'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.valtameri'
2022-12-15 10:43:31: netdata DEBUG : PLUGIN[cgroups] : (1613@collectors/cgroups.p:read_cgroup    ): reading metrics for cgroups '/lxc.payload.darpa'
...
  • the following got filtered out
Downloads $  grep "disabled by" netdata_docker_logs.txt | grep -Eo "name '[^']+'" | sort | uniq | sed -e "s/^name '//" -e "s/'$//"
/
dev-hugepages.mount
dev-mqueue.mount
docker
lxc.monitor.aclserver
lxc.monitor.ads
lxc.monitor.adw
lxc.monitor.androphedia
lxc.monitor.authservice
lxc.monitor.bdcloudapps
lxc.monitor.bhtwebdock
lxc.monitor.bigcheese
lxc.monitor.bipradix
lxc.monitor.bnw
lxc.monitor.bot
lxc.monitor.butterflyeu
lxc.monitor.bytesahead
lxc.monitor.capitalcom
lxc.monitor.cardataapi
lxc.monitor.cdmain
lxc.monitor.chat
lxc.monitor.checkworf
lxc.monitor.cms
lxc.monitor.coreconnect
lxc.monitor.darpa
lxc.monitor.deloittefond
lxc.monitor.dhke
lxc.monitor.docker1
lxc.monitor.documotor
lxc.monitor.durarerp
lxc.monitor.dy11
lxc.monitor.emailserver
lxc.monitor.entegrali
lxc.monitor.escisnew
lxc.monitor.fourfpstudio
lxc.monitor.fynbusweb3
lxc.monitor.healthsy
lxc.monitor.hobbyserver
lxc.monitor.hordalan2023
lxc.monitor.hydrogen
lxc.monitor.india1
lxc.monitor.iran
lxc.monitor.ireallydo
lxc.monitor.istudyturkiy
lxc.monitor.jaczulloisno
lxc.monitor.jsq
lxc.monitor.justix
lxc.monitor.kindao
lxc.monitor.komeil1
lxc.monitor.lurianbase
lxc.monitor.maghalejoo2
lxc.monitor.maxthon
lxc.monitor.mc
lxc.monitor.microsrvc1
lxc.monitor.microsrvc2
lxc.monitor.microsrvc3
lxc.monitor.microsrvc4
lxc.monitor.mnd
lxc.monitor.moseacg
lxc.monitor.musrv
lxc.monitor.mynode6
lxc.monitor.mynode7
lxc.monitor.nako
lxc.monitor.nanodeck
lxc.monitor.ouwgredmine
lxc.monitor.oxen2
lxc.monitor.pbserver
lxc.monitor.peng
lxc.monitor.phs
lxc.monitor.planb
lxc.monitor.pre002finlan
lxc.monitor.projectzomb
lxc.monitor.puoti2022
lxc.monitor.randsomar
lxc.monitor.raspberry
lxc.monitor.redbullhel
lxc.monitor.reverseprox1
lxc.monitor.robertoarred
lxc.monitor.sapphirecap1
lxc.monitor.sapphirecap2
lxc.monitor.sapphirecap3
lxc.monitor.sapphirecap4
lxc.monitor.sapphirecapi
lxc.monitor.sapphirecapn
lxc.monitor.server6
lxc.monitor.shiftplanner
lxc.monitor.strapi
lxc.monitor.suissebankke
lxc.monitor.testrecovery
lxc.monitor.tntrr
lxc.monitor.ubuntu5
lxc.monitor.v2ui
lxc.monitor.valtameri
lxc.monitor.viecondocker
lxc.monitor.vldirsrv002
lxc.monitor.vm
lxc.monitor.vps210thread
lxc.monitor.vps410thread
lxc.monitor.vps55thread
lxc.monitor.weather
lxc.monitor.webdev
lxc.monitor.wienerslol
lxc.monitor.wp221967
lxc.monitor.wwsszz
lxc.pivot
proc-sys-fs-binfmt_misc.mount
sys-fs-fuse-connections.mount
sys-kernel-config.mount
sys-kernel-debug.mount
sys-kernel-tracing.mount
system.slice
system.slice/boot-efi.mount
system.slice/boot.mount
system.slice/dev-disk-by/x2did-dm/x2duuid/x2dlvm/x2di01qb1bwuulokcjaambdkxtzbizwa3euwk6uf32e4ey4bwndrlsbwzqortha8epe.swap
system.slice/docker.socket
system.slice/snap-core20-1405.mount
system.slice/snap-lxd-22923.mount
system.slice/snap-lxd-23680.mount
system.slice/snap-snapd-15534.mount
system.slice/snapd.socket
system.slice/system-getty.slice
system.slice/system-getty.slice/getty_tty1.service
system.slice/system-lvm2/x2dpvscan.slice
system.slice/system-modprobe.slice
system.slice/system-systemd/x2dfsck.slice
user.slice
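
The two grep checks above can be combined to see whether any container was renamed but never read. A rough sketch, assuming the log line formats shown above; the function names are hypothetical, and the log file name is whatever you saved the docker logs as.

```shell
# Names that cgroup-name.sh resolved (the "is called" lines).
renamed_containers() {
  grep "lxc\.payload\..*is called" "$1" \
    | sed -e "s/.*cgroup 'lxc\.payload\.\([^']*\)'.*/\1/" | sort -u
}

# Names the cgroups plugin reports reading metrics for.
collected_containers() {
  grep "reading metrics for cgroups '/lxc\.payload\." "$1" \
    | sed -e "s/.*'\/lxc\.payload\.\([^']*\)'.*/\1/" | sort -u
}

# Names that were renamed but never read; empty output means that
# every renamed container is also being collected.
missing_containers() {
  tmp_a=$(mktemp) tmp_b=$(mktemp)
  renamed_containers "$1" > "$tmp_a"
  collected_containers "$1" > "$tmp_b"
  comm -23 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}
```

For example, `missing_containers netdata_docker_logs.txt` prints one name per line for anything that was found but is not being read.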

@ilyam8

No. We don’t see the container metrics.

Do you mean the Netdata docker container (netdata_test_cgroups) has no container metrics? (<IP>:20000)

@ilyam8

Yes, after creating the container I visited the agent dashboard and I didn’t find the container metrics (usually I would find them in the bottom right)

@ilyam8

I followed the instructions here and uncommented those lines and restarted Netdata, still container metrics are not shown.

I uncommented these lines (that “lxc.monitor” pattern line as well) and restarted Netdata but for some reason the container metrics are not shown.

Check the screenshot to see the changes I made.

Any input on this?

I think you need lxc.payload.* (no config changes are needed), not lxc.monitor.*. I don’t know what is happening. I see no problems in the logs. I will add more debug logs and build another custom image.

Is this correct? I am not sure what the right syntax is.

 search for cgroups in subpaths matching =  !*/init.scope  !*-qemu  !*.libvirt-qemu  !/init.scope  !/system !/systemd  !/user  !/user.slice  !/lxc/*/*  !/lxc.monitor  !/lxc.payload/*/*  /lxc.payload.*  *

@ilyam8

Sorry to bother but is that the right syntax? I guess an “!” means not to consider that, so I removed it and restarted Netdata but still nothing - no information on container metrics.
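
On the syntax question: as far as I understand Netdata’s simple patterns, the list is space-separated and read left to right, the first pattern that matches decides the result, and a leading “!” makes that pattern a negative match (the cgroup is skipped). The sketch below only approximates this with shell `case` globbing, which also accepts `?` and `[...]`; `simple_match` is a made-up helper, not part of Netdata.

```shell
# simple_match PATTERNS VALUE
# Exit 0 if the first matching pattern is positive, 1 if it is negative
# or if nothing matches. Approximation only, not Netdata's own matcher.
simple_match() {
  patterns=$1
  value=$2
  set -f                      # no filename expansion while splitting the list
  for p in $patterns; do
    negative=0
    case $p in
      '!'*) negative=1; p=${p#\!} ;;   # strip the leading "!"
    esac
    case $value in
      $p) set +f; return "$negative" ;;  # first match wins
    esac
  done
  set +f
  return 1
}

patterns='!/user.slice !/lxc.payload/*/* /lxc.payload.* *'
for cg in /lxc.payload.hydrogen /user.slice; do
  if simple_match "$patterns" "$cg"; then
    echo "$cg: collected"
  else
    echo "$cg: skipped"
  fi
done
```

Read this way, removing a “!” only turns that one exclusion into an inclusion. It should not change how /lxc.payload.* is handled, and the logs earlier in the thread show those groups are already being found and read.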