While trying to figure out why some of our physical servers were seeing unusually high load for an equally unusually long time, I noticed that netdata consumes at least 100% of one core on the VMs running on those physical servers.
Those are just 4 of the VMs affected, but I saw the same phenomenon on every single VM that runs netdata. I’ve since updated the VMs and rebooted them, all to no avail: netdata still eats at least one core.
Any ideas what’s wrong there?
The affected VMs run an up-to-date Debian Bullseye, and the same goes for netdata:
$ dpkg -l netdata
ii netdata 1.44.0-205-nightly amd64 real-time charts for system monitoring
Difficult to say, but from what I can see in the limited historical metrics that Proxmox offers, this seems to have been going on for a couple of weeks, if not longer.
In the netdata console, “Systemd Services CPU utilization” is where I can see that netdata is eating 100% CPU, but I couldn’t find a way to visualize historical data there.
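In case it helps anyone reproduce this, I could pull a bit of history out of the agent’s own REST API instead. Rough sketch only; `services.cpu` is my guess for the id of the Systemd Services CPU chart, so check /api/v1/charts for the exact name on your install:

$ # list available charts and look for the systemd services CPU one
$ curl -s 'http://localhost:19999/api/v1/charts' | grep -o '"services\.cpu[^"]*"'
$ # fetch the last 24 hours of that chart, averaged into 300 points, as CSV
$ curl -s 'http://localhost:19999/api/v1/data?chart=services.cpu&after=-86400&points=300&format=csv'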
Hey, it seems I’m experiencing the same problem here.
One of my agents spends 100% of a single core in an ACLKSYNC thread.
Netdata 1.44.1, RockyLinux 9.
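For reference, this is how I spotted the busy thread; it’s just plain per-thread CPU accounting, nothing netdata-specific:

$ # live per-thread CPU usage of the netdata process; thread names like ACLKSYNC show up in the COMMAND column
$ top -H -p $(pidof netdata)
$ # or a one-shot view, sorted by CPU usage
$ ps -T -o spid,comm,pcpu -p $(pidof netdata) | sort -k3 -nr | head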
Just registered to follow this. We have netdata on a few dozen servers in an LSF environment: CentOS 7-based 64-core EPYC servers with 1 TB+ memory, generally running 100% loaded. I see netdata at 100-300% CPU usage and memory sometimes up to around 50 GB.
Tried clearing out /var/cache/netdata and initial results are promising; any thoughts on what we’re losing by clearing that out?
Edit – I see that what I’m losing is my history of usage :). That’s a bit of a loss.
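In case anyone wants to try the same thing, here is roughly what I ran. This assumes the default cache path and a systemd-managed service; I moved the directory rather than deleting it so the history could be restored later:

$ sudo systemctl stop netdata
$ sudo mv /var/cache/netdata /var/cache/netdata.bak    # dbengine history lives here
$ sudo install -d -o netdata -g netdata /var/cache/netdata
$ sudo systemctl start netdata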