Monitoring LXD VMs

We’ve been able to monitor LXD containers using Netdata and wanted to know if there’s a way to monitor the VMs as well, since LXD VMs use QEMU.

Saw this:

But in the /sys/fs/cgroup tree I didn’t see any qemu slices. Does this mean it’s not possible? Or do the VMs’ cgroups reside in another location?

I’ve created a post on the LXD forum too. Here’s someone who also says that a Proxmox VM is visible but an LXD VM is not. Any input on this would be nice.

We have several hosts that use Netdata for monitoring LXC containers, and we would be more than happy to see Netdata being able to monitor LXD VMs as well.

Hi @philip - I commented here about using the /metrics endpoint and the Netdata Prometheus collector as one potentially quick solution.

Maybe @Austin_Hemmelgarn might know more about how Netdata can or cannot natively monitor something like an LXD VM or other QEMU based systems on a host?

In general, all that Netdata can collect about VMs when running on the host system is what the host system itself sees (IOW, what the host-side VM processes are doing), irrespective of what the VM tool stack is. How well this gets picked up really depends on the tool stack, though. For example, libvirt actually does provide individual cgroups for each VM, so Netdata automatically picks up libvirt-managed VMs.
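To check where the host-side QEMU processes actually sit in the cgroup tree, something like the following might help. It is only a rough sketch: it assumes the VM processes have a command name starting with "qemu" (which may not hold for every LXD packaging), and the function names are made up for illustration. It reads /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:path.

```python
import os


def parse_proc_cgroup(text):
    """Parse the content of /proc/<pid>/cgroup.

    Each line looks like 'hierarchy-ID:controller-list:path'.
    On cgroup v2 the line is '0::<path>' (empty controller list).
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            (int(hierarchy_id), controllers.split(",") if controllers else [], path)
        )
    return entries


def find_qemu_cgroups(proc_root="/proc", comm_prefix="qemu"):
    """Map PID -> parsed cgroup entries for processes whose comm starts with comm_prefix.

    ASSUMPTION: the VM processes are named 'qemu...'; adjust comm_prefix if not.
    """
    results = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
            if not comm.startswith(comm_prefix):
                continue
            with open(os.path.join(proc_root, entry, "cgroup")) as f:
                results[int(entry)] = parse_proc_cgroup(f.read())
        except OSError:
            # Process exited or is not readable; skip it.
            continue
    return results


if __name__ == "__main__":
    for pid, entries in find_qemu_cgroups().items():
        for hierarchy_id, controllers, path in entries:
            print(pid, hierarchy_id, ",".join(controllers), path)
```

If every QEMU process reports the same shared path rather than its own per-VM cgroup, that would be consistent with Netdata's cgroup collector not showing the VMs individually.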

For LXD specifically, the metrics endpoint that @andrewm4894 mentioned is probably the easiest ‘quick’ solution, and likely also the best solution generically, though I believe that will show all the instances managed by LXD (both containers and VMs). Alternatively, if there is some way to get it to use cgroups for VMs like libvirt does, that would also provide you with per-VM host-side metrics without needing to configure anything new on Netdata.
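Since the endpoint covers both containers and VMs, you may want to filter by label after scraping. Below is a minimal sketch of parsing Prometheus text exposition output and summing one metric per instance. The sample text, the metric name, and the `name`/`type` labels with a `virtual-machine` value are assumptions for illustration; check what your LXD version actually emits. The parser is deliberately simple and is not a full implementation of the exposition format.

```python
import re
from collections import defaultdict

# metric_name{label="value",...} value   (the label block is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def sum_by_instance(samples, metric, instance_type=None):
    """Sum one metric per 'name' label, optionally keeping only one 'type'."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name != metric:
            continue
        if instance_type is not None and labels.get("type") != instance_type:
            continue
        totals[labels.get("name", "")] += value
    return dict(totals)


# Hypothetical sample output, made up for this example.
EXAMPLE = """\
# TYPE lxd_cpu_seconds_total counter
lxd_cpu_seconds_total{cpu="0",mode="user",name="vm1",type="virtual-machine"} 12.5
lxd_cpu_seconds_total{cpu="1",mode="user",name="vm1",type="virtual-machine"} 7.5
lxd_cpu_seconds_total{cpu="0",mode="user",name="c1",type="container"} 3.0
"""

if __name__ == "__main__":
    print(sum_by_instance(parse_samples(EXAMPLE), "lxd_cpu_seconds_total", "virtual-machine"))
```

In practice the Netdata Prometheus collector does the scraping for you; this is only meant to show how the VM entries could be told apart from the container entries.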

That said, there’s also the option of setting up Netdata in each VM and configuring it as a minimalistic child node streaming to the host as a parent node. Based on personal experience, this gets nicely detailed information about each VM (and plays nice with Netdata Cloud), but requires a bit more setup than either approach I mentioned above.

Also, for completeness’ sake, there are technically three other options, in increasing order of complexity:

  • Create a new collector to scrape metrics from the virtual hardware directly via the info command for the QEMU monitor. I believe this is where LXD would be getting the info for its /metrics endpoint, though it would obviously only work for QEMU VMs if scraping it yourself.
  • Set up SNMP in each guest system, then configure Netdata’s SNMP collector to scrape metrics from them. This would get you data from inside the VM itself, though would be limited to what you can find MIBs for (there are MIBs for most basic system stats though, and IIRC net-snmp does include them by default).
  • Install the QEMU guest agent in each VM, and create a new collector to scrape metrics from that. This will get you data from inside the VM itself, but it’s likely to be a serious pain to set up, and would only work for some guest platforms.

I don’t really recommend any of the above options though: they’re complex, somewhat error prone, and likely to be far more effort than they are worth for you (though I can attest that the SNMP approach does work, even if it is a serious pain to get set up).
