Incorrect "Available Memory" values with LXD Containers

Problem/Question

An incorrect Available Memory value is shown.

Relevant docs you followed/actions you took to solve the issue

We provide LXD VPS hosting and have a couple of hosts that use various versions of LXD. Initially, we thought the incorrect available memory values were shown because of LXD, so we installed Netdata on hosts with different LXD versions. But irrespective of the LXD version, incorrect values are shown.

So, we instead started downgrading the Netdata version. We went all the way down to v1.33.1, and with this version, the value is shown correctly on the hosts where we previously saw incorrect values.

Is there something we’re missing or is this in fact something to do with Netdata?

Please see the attached screenshots.

With the latest version of Netdata on a server with 20 GB RAM:

Using v1.33.1 on the same machine (the value is shown correctly here):

Please let me know if you guys need more information - happy to provide it!

Thanks.

Hi, @philip. The avail value comes from /proc/meminfo:

$ grep MemAvail /proc/meminfo
MemAvailable:   15396028 kB

The only additional processing we do with the value is adding (if available) the ZFS ARC shrinkable cache size. See this PR (the change appeared in v1.35.0).
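
Roughly, the idea is: avail = MemAvailable + max(0, ARC size - ARC minimum). As a sketch of that calculation (not Netdata's actual code, and assuming the usual arcstats layout where size and c_min are in bytes), the shrinkable part can be computed on a ZFS host like this:

$ awk '$1 == "size" {size=$3} $1 == "c_min" {min=$3} END {print (size > min ? size - min : 0) / 1024, "kB of shrinkable ARC"}' /proc/spl/kstat/zfs/arcstats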

Can you check /proc/meminfo?

@ilyam8

We see the correct value. Here’s the output from /proc/meminfo:

$ grep MemAvail /proc/meminfo
MemAvailable:   19212896 kB

And we do indeed use ZFS, but why add the ZFS ARC cache to available memory? It’s not usable RAM, right?

It’s not usable RAM, right?

It is usable (it becomes available when there is strong enough demand).

When a system’s workload demand for memory fluctuates, the ZFS ARC caches data during periods of weak demand and then shrinks during periods of strong demand.

I also submitted a patch to htop (it has been accepted); see consider only shrinkable ZFS ARC as cache by ilyam8 · Pull Request #1003 · htop-dev/htop · GitHub.

server with 20GB RAM
avail 70.47

It means that the shrinkable ZFS ARC cache size is ~50 GB (zfs_arc_max - zfs_arc_min).
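
For reference, the ARC limits can be read from arcstats (assuming the usual kstat layout, values in bytes), something like:

$ awk '$1 == "c_min" || $1 == "c_max" {printf "%s: %.1f GiB\n", $1, $3/1024/1024/1024}' /proc/spl/kstat/zfs/arcstats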

Where does ZFS store its cache if not in RAM in your case?

I think there is some confusion here. You are adding the ZFS ARC shrinkable cache to available RAM on the grounds that, if there is strong enough demand, the ARC will free that memory for use. What you are missing is that we are talking about LXD containers. These are virtual machines which have a set memory limit on how much RAM they can consume. What seems to be happening is that you read the ZFS ARC cache value from the host system and add it to the RAM available to the VPS. In that case this is wrong, because the ZFS ARC cache only applies to the host, not to the guest containers.
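
For example, each container's limit is enforced through its memory cgroup. Roughly (exact paths depend on cgroup v1 vs v2; "netdatatest" is just an example container name), it can be checked like this:

$ lxc config get netdatatest limits.memory          # on the host
$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes   # inside the container, cgroup v1
$ cat /sys/fs/cgroup/memory.max                     # inside the container, cgroup v2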

Indeed, there is confusion. Can you describe your setup, please?

  • Is Netdata running inside a container?
  • If yes, I guess you want Netdata to monitor the host system, not the container it is running in, correct?
  • In addition to the host system, you want Netdata to monitor LXD containers (their mem %/usage), right? All of them are separate sections on the dashboard (with cpu, mem, net subsections).

  • Yes, Netdata is installed directly inside a container (LXD Ubuntu Jammy)

  • No. We want to monitor the container, not the actual host it is running on.

Basically, we installed Netdata in an LXD Ubuntu container and we want to monitor it, not the actual host.

Usually, this can be achieved by not mounting the host system’s procfs/sysfs inside the container. I assume you didn’t do that, and yet /proc/spl/kstat/zfs/arcstats exists and returns the same statistics as it would if you read it on the host system.
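
A quick sanity check from inside the container:

$ test -r /proc/spl/kstat/zfs/arcstats && echo "host ZFS ARC stats are visible" || echo "no arcstats"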

It is not clear from the Netdata view that those metrics don’t really belong to the monitored node (the LXD container). I guess a quick fix is disabling both ZFS collectors in netdata.conf:

[plugin:proc]
        /proc/spl/kstat/zfs/arcstats = no
        /proc/spl/kstat/zfs/pool/state = no
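
To apply this, you would typically edit netdata.conf with the edit-config helper and restart the agent (assuming a standard install where the config lives in /etc/netdata):

$ cd /etc/netdata
$ sudo ./edit-config netdata.conf   # add the two lines above under [plugin:proc]
$ sudo systemctl restart netdata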

I am not sure what the proper solution would be (additional checks based on what? The current logic is straightforward: if the file exists and is readable, read it). In general, taking the ZFS ARC cache into account when calculating RAM usage/available memory is correct (common sense; htop does it too).

Ok, I guess I know what we can do.

htop src code:

   if (lpl->zfs.enabled != 0 && !Running_containerized) {
      // ZFS does not shrink below the value of zfs_arc_min.
      unsigned long long int shrinkableSize = 0;
      if (lpl->zfs.size > lpl->zfs.min)
         shrinkableSize = lpl->zfs.size - lpl->zfs.min;
      // Shift the shrinkable part of the ARC out of the "used" value and into "cache" and "available".
      this->values[0] -= shrinkableSize;
      this->values[3] += shrinkableSize;
      this->values[4] += shrinkableSize;
   }

So they do this only when NOT running containerized. The fun fact is that I was aware of this check (fixed container detection) but somehow forgot to implement it in Netdata :see_no_evil:
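
For reference, a quick way to check from a shell whether an environment is containerized (not necessarily what htop or Netdata use internally):

$ systemd-detect-virt --container   # prints e.g. "lxc" inside an LXC/LXD container, "none" otherwise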

@philip can you share the output of cat /proc/1/mounts? I want to see if this check is enough. I can only test in an LXC container.

After some googling, it turns out that was a silly question. Technically, these are LXC containers; there is no such thing as LXD containers.

Sorry about that. We use those terms interchangeably.

As you asked, here is the content of /proc/1/mounts:

lxd/containers/netdatatest / zfs rw,relatime,xattr,posixacl 0 0
none /dev tmpfs rw,relatime,size=492k,mode=755,uid=1000000,gid=1000000,inode64 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
udev /dev/fuse devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
udev /dev/net/tun devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tracefs /sys/kernel/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
proc /dev/.lxc/proc proc rw,relatime 0 0
sys /dev/.lxc/sys sysfs rw,relatime 0 0
tmpfs /dev/lxd tmpfs rw,relatime,size=100k,mode=755,inode64 0 0
/dev/sda2 /dev/ppp ext4 rw,relatime 0 0
/dev/sda2 /dev/net/tun ext4 rw,relatime 0 0
tmpfs /dev/.lxd-mounts tmpfs rw,relatime,size=100k,mode=711,inode64 0 0
lxcfs /proc/cpuinfo fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/diskstats fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/loadavg fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/meminfo fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/slabinfo fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/stat fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/swaps fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /proc/uptime fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
lxcfs /sys/devices/system/cpu fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
udev /dev/full devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
udev /dev/null devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
udev /dev/random devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
udev /dev/tty devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
udev /dev/urandom devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
udev /dev/zero devtmpfs rw,nosuid,noexec,relatime,size=198008088k,nr_inodes=49502022,mode=755,inode64 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=1000005,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/ptmx devpts rw,nosuid,noexec,relatime,gid=1000005,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/console devpts rw,nosuid,noexec,relatime,gid=1000005,mode=620,ptmxmode=666,max=1024 0 0
none /proc/sys/kernel/random/boot_id tmpfs ro,nosuid,nodev,noexec,relatime,size=492k,mode=755,uid=1000000,gid=1000000,inode64 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,uid=1000000,gid=1000000,inode64 0 0
tmpfs /run tmpfs rw,nosuid,nodev,size=79228520k,nr_inodes=819200,mode=755,uid=1000000,gid=1000000,inode64 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k,uid=1000000,gid=1000000,inode64 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755,uid=1000000,gid=1000000,inode64 0 0
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,name=systemd 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/misc cgroup rw,nosuid,nodev,noexec,relatime,misc 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/rdma cgroup rw,nosuid,nodev,noexec,relatime,rdma 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset,clone_children 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
fuse-overlayfs /var/lib/docker/fuse-overlayfs/fc465ee4c16adb21e02ca6dc892b94f25fcb5d14e5f545948e00c6ba8a710b74/merged fuse.fuse-overlayfs rw,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
nsfs /run/docker/netns/0daac479e697 nsfs rw 0 0
tmpfs /run/user/1001 tmpfs rw,nosuid,nodev,relatime,size=1953124k,nr_inodes=488281,mode=700,uid=1001001,gid=1000027,inode64 0 0

The same issue is on GitHub. I think I understand what is happening.
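
Presumably the relevant signal in the output above is the lxcfs FUSE mounts over parts of /proc (e.g. /proc/meminfo, /proc/stat); a simple grep pulls those lines out:

$ grep fuse.lxcfs /proc/1/mounts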

Fixed in fix(packaging): fix cpu/memory metrics when running inside LXC container as systemd service by ilyam8 · Pull Request #14255 · netdata/netdata · GitHub

@ilyam8 Thanks for the update. Much appreciated.