netdata eating a lot of CPU

udotirol · January 18, 2024, 11:23am

While trying to figure out why some of our physical servers were seeing unusually high load for an unusually long time, I noticed that netdata consumes at least 100% of one core on the VMs running on those physical servers.

See for yourself:

Those are just 4 of the VMs affected, but I saw the same phenomenon on every single VM that runs netdata. I’ve since updated the VMs and rebooted them, all to no avail: netdata still eats at least one core.

Any ideas what’s wrong there?

The affected VMs run an up-to-date Debian bullseye, and netdata is up to date as well:

$ dpkg -l netdata
ii  netdata        1.44.0-205-nightly amd64        real-time charts for system monitoring

ilyam8 · January 18, 2024, 11:25am

Hey, consider using htop to find out which thread that is.
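
If htop isn't installed, something along these lines should give a similar per-thread view. This is only a rough sketch: the helper name busiest_threads is made up for illustration, and it assumes the procps version of ps found on most Linux distributions. Note that the pcpu column from ps is averaged over the running time, so it is not a live reading like in htop.

```shell
# Hypothetical helper: list the five busiest threads of a process.
# $1: PID of the process to inspect
busiest_threads() {
    # -L lists one line per thread; tid, pcpu and comm are standard
    # procps output columns (thread ID, CPU percentage, thread name).
    # The trailing "=" suppresses the header line.
    ps -L -o tid=,pcpu=,comm= -p "$1" | sort -k2,2nr | head -n 5
}
```

Usage would be something like `busiest_threads "$(pidof -s netdata)"`, assuming the daemon's process name is netdata.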

udotirol · January 18, 2024, 11:31am

sure, here you go:

ilyam8 · January 18, 2024, 11:38am

How long has it been at 100%?

udotirol · January 18, 2024, 11:57am

Difficult to say, but from what I see in the limited historical metrics that Proxmox offers, this seems to have been going on for a couple of weeks, if not longer.

In the netdata console, “Systemd Services CPU utilization” is where I see that netdata is eating 100% CPU, but I didn’t find a way to view historical data there.

avh · January 22, 2024, 11:15am

Hey, I’m experiencing the same problem here it seems.
One of my agents spends 100% of a single core in an ACLKSYNC thread.
Netdata 1.44.1, RockyLinux 9.

avh · January 22, 2024, 11:21am

rm -rf /var/cache/netdata/* solved the problem.

ilyam8 · January 23, 2024, 1:48pm

@udotirol hey, do you see 100% right after Netdata restart or after some time?

udotirol · January 23, 2024, 2:01pm

It happens practically instantly after I reboot the entire VM or just restart Netdata.

ilyam8 · January 23, 2024, 2:40pm

@udotirol are you on Discord? If so, could you please join us there? The same issue has been reported on Discord, and it will be easier to debug it there.

udotirol · January 23, 2024, 10:07pm

yes, I’ve just joined the server right now. But as @avh mentioned, a workaround seems to be to remove /var/cache/netdata

I haven’t done so yet, in case you want to debug the issue. If so, I am happy to proceed on Discord

ilyam8 · January 24, 2024, 10:11am

udotirol wrote: “in case you want to debug the issue”

Nice, because Jacob (Discord OP) removed /var/cache/netdata.

ilyam8 · January 24, 2024, 10:17am

@udotirol can you please send your /var/cache/netdata/netdata-meta.db to stelios@netdata.cloud?

udotirol · January 24, 2024, 12:16pm

ok, I’ve just sent the email.

phr3dly · January 25, 2024, 4:34pm

Just registered to follow this. We have netdata on a few dozen servers in an LSF environment: CentOS 7-based 64-core Epyc servers with 1TB+ memory, generally running 100% loaded. I see netdata at 100%-300% CPU usage, with memory sometimes up to around 50GB.

Tried clearing out /var/cache/netdata and initial results are promising; any thoughts on what we’re losing by clearing that out?

Edit – I see that what I’m losing is my history of usage :). That’s a bit of a loss.
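
For anyone who wants to try the workaround without throwing the old files away, here is a minimal sketch that moves the cache contents aside instead of deleting them. The function name set_aside_cache and the backup location are made up for illustration, and it assumes GNU find and mv (as on Debian and Rocky). Whether history can actually be restored from the moved files afterwards is not something this thread confirms.

```shell
# Hypothetical helper: move everything out of a cache directory
# into a backup directory, leaving the (now empty) cache directory
# itself in place so its ownership and permissions are unchanged.
# $1: cache directory, e.g. /var/cache/netdata
# $2: backup directory (created if missing)
set_aside_cache() {
    cache_dir="$1"
    backup_dir="$2"
    mkdir -p "$backup_dir" || return 1
    # -mindepth 1 skips the directory itself; -maxdepth 1 moves only
    # top-level entries (dotfiles included), each with its full subtree.
    find "$cache_dir" -mindepth 1 -maxdepth 1 -exec mv -t "$backup_dir" {} +
}
```

As a precaution it is probably best to stop the agent first, for example `systemctl stop netdata`, then `set_aside_cache /var/cache/netdata /var/cache/netdata.bak`, then `systemctl start netdata`. The unit name netdata and the backup path are assumptions here. Keeping the backup on the same filesystem makes the move quick even for large caches.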
