LXC Containers Stats Are Not Shown

@Christopher_Akritid1

Netdata has started now. The error log showed that it couldn’t bind to the default Netdata port; it seems some processes were still running. I killed the already running processes, and now Netdata is running.
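For anyone hitting the same bind error, here is a minimal sketch of checking whether something is already listening on the Netdata port before starting the service. It assumes the default port 19999 and a listener reachable on 127.0.0.1; the function name `port_in_use` is made up for illustration and is not part of Netdata.

```python
import socket

# Assumption: Netdata is using its default web port.
NETDATA_DEFAULT_PORT = 19999


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(NETDATA_DEFAULT_PORT):
        print("Port is taken; an old Netdata process may still be running.")
    else:
        print("Port looks free.")
```

If the port is reported as taken while the service is stopped, a leftover process is a likely cause, which matches what happened here.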

The metrics are SHOWN.

However, I added the same search pattern on another host that’s also affected (and running v1.37.1), but there the metrics are still missing.

I suspect that Netdata wasn’t properly restarted, which is another possible bug we have on your specific OS/kernel.

The same pattern working on one machine and not the other screams permission issues, but let’s not go further here; this is complicated enough to require a real-time discussion. See Ilya’s message above and get on a Discord conversation with him. We’ll probably have to replicate the setup in our lab.

@ilyam8 Could you please check your Discord. I’ve created a post and tagged you and my colleague.

Thanks for your time.

@ilyam8

The issue is back, Ilya. The host (Kes) that we made the changes on at the end of the call doesn’t seem to work for some reason. We haven’t made any changes since then, but it suddenly stopped working.

So we uninstalled and followed the procedure to install the static-only version from the stable channel again; still no luck. There’s no cgroup find worker process running. We also tried the nightly version, again as a static install, with the same results.

Here I attached the error log. Please have a look.

Thank you.

@ilyam8

Some more information.

On one host with the static version installed, the home dir was not set to /opt/netdata; it was still /var/lib/netdata:

netdata:x:115:118::/var/lib/netdata:/usr/sbin/nologin

So maybe changing the home dir after installing the static version wasn’t the fix. Maybe this is why, on another host, even the static version didn’t show the LXC container metrics.
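To compare hosts quickly, the passwd entry above can be parsed and its home dir checked against the static-install location mentioned in this thread. This is only a sketch: `parse_passwd_line` and `PasswdEntry` are hypothetical helper names, and /opt/netdata as the expected home is taken from the discussion above, not verified against the installer.

```python
from typing import NamedTuple

# Assumption from this thread: a static install expects this home dir.
EXPECTED_STATIC_HOME = "/opt/netdata"


class PasswdEntry(NamedTuple):
    name: str
    uid: int
    gid: int
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one /etc/passwd line into its seven colon-separated fields."""
    name, _password, uid, gid, _gecos, home, shell = line.strip().split(":")
    return PasswdEntry(name, int(uid), int(gid), home, shell)


# The entry posted above.
entry = parse_passwd_line("netdata:x:115:118::/var/lib/netdata:/usr/sbin/nologin")

if entry.home != EXPECTED_STATIC_HOME:
    print(f"home dir is {entry.home}, expected {EXPECTED_STATIC_HOME}")
```

On a live host, `pwd.getpwnam("netdata").pw_dir` from the standard library returns the same field without parsing the file by hand.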

Just wanted to update you with this.

Netdata fails to start if the home dir is wrong. What’s the current status? Do you have problems with LXC containers or not?

It still runs even if the home dir is incorrect. No, container metrics are not shown.

That’s not possible. If

  • the home dir is /var/lib/netdata and
  • it is a static install

then Netdata fails to start, and systemctl status netdata shows the main PID as (code=exited, status=1/FAILURE).
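A rough way to check for that failure state across several hosts is to pull the exit code out of the `systemctl status` output. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the sample line uses a made-up PID, and the regular expression only targets the `(code=..., status=...)` fragment quoted above.

```python
import re
import subprocess
from typing import Optional, Tuple

# Matches fragments such as "(code=exited, status=1/FAILURE)".
MAIN_PID_RE = re.compile(
    r"\(code=(?P<code>\w+), status=(?P<status>\d+)(?:/(?P<name>\w+))?\)"
)


def parse_main_pid_result(status_text: str) -> Optional[Tuple[str, int, Optional[str]]]:
    """Return (code, status number, status name), or None if not present."""
    match = MAIN_PID_RE.search(status_text)
    if match is None:
        return None
    return match.group("code"), int(match.group("status")), match.group("name")


def netdata_status_text() -> str:
    """Run `systemctl status netdata` and return its stdout (needs systemd)."""
    result = subprocess.run(
        ["systemctl", "status", "netdata", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Hypothetical sample line; the PID is invented for illustration.
sample = "Main PID: 1234 (code=exited, status=1/FAILURE)"
print(parse_main_pid_result(sample))
```

A `None` result only means the fragment was not found, for example because the service is still running; it does not by itself prove the service is healthy.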


The issue is back, Ilya. The host (Kes) that we made the changes on at the end of the call doesn’t seem to work for some reason. We haven’t made any changes since then, but it suddenly stopped working.

Can you please elaborate a bit, do I understand correctly:

  • you switched from native packages (deb) to static install on N servers (stable version).
  • it worked (we switched 2 servers during our call).
  • after some time all container metrics just disappear? On all servers? And Netdata restart doesn’t help?

Well, I’m not sure then. The one that’s working with the static install has the /var/lib/netdata home dir. I’m not gonna touch that as it’s working, haha.

Yes, it worked on host Kes, the one we changed just before we ended the call. That one has stopped working now. Reinstalling didn’t help either.

Yes, static and stable version.

@ilyam8 Any update on this?

Let me know if you need any logs or something that may help you with debugging.

@ilyam8 Any update on this issue?

I see this issue even on the latest stable version of Netdata. The container metrics suddenly disappear. Had to reinstall Netdata. Even then, after some time or after a few days, the metrics disappear suddenly. This recently happened on two of our hosts.

Hey, @philip. I am not sure how to find the problem:

  • you are the only user who reports it (and the cgroups collector is running on almost every Netdata instance out there!).
  • we have never been able to reproduce it.

If you can give us access to a VM/host where “the metrics disappear suddenly” (after it has happened), it will help. Otherwise, :man_shrugging:

@ilyam8

Sure, will remember this and will get back to you when it happens again.