OS: Debian Linux, Bullseye (testing)
Netdata version: v1.31.0-56-nightly
CPU: VPS, with 3 cores of an E5-2680 v2 (2.8 GHz)
Problem/Question
Netdata’s apps.plugin, as well as the netdata executable itself, are taking a large amount of CPU power on one of my machines, on average around 8-9% for apps.plugin and 4-5% for Netdata itself.
This seems excessive, given the very low CPU usage I’ve seen in the past. This is problematic because this VPS has “fair use” CPU and only has 33% CPU dedicated (that is, I can use one core constantly, and have occasional spikes of usage higher than that), and Netdata is taking quite a big chunk of that.
On another one of my systems, apps.plugin only takes 0.7% CPU and Netdata itself only takes 1%. The configuration is very similar on that machine.
I raised the update every setting for both the apps and ebpf plugins from its default of 1 to 10, which helped with the CPU usage of those plugins (they still use a bit of CPU, just less frequently). However, the CPU usage of netdata itself is still quite high.
I looked at the snapshot you sent. There are ~1500 active processes on your machine, and new processes are created constantly. So increased load, as mentioned in #11164, is expected.
I ran 30 docker containers on my VM and increased the number of processes using stress --vm 1000 --vm-bytes 1M. In this scenario, apps.plugin consumes pretty much the same amount of CPU time as in your case.
I couldn’t reproduce your results for the netdata process itself, but I suspect there is some dynamic load on your machine that causes slightly increased CPU utilization.
The main factor that affects apps.plugin CPU usage is the number of active processes. For instance, if I start 20k processes (sleep), apps.plugin uses about 60% of a single core.
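To compare two machines on this factor, you can count the processes currently visible under /proc. This is only a rough sketch: the function name count_processes is made up for illustration, and it assumes a Linux /proc where each process appears as an all-digit directory. It does not reproduce what apps.plugin does internally.

```python
import os

def count_processes(proc_root="/proc"):
    """Count directories under proc_root whose names are all digits.

    On Linux, each such directory corresponds to one process
    (kernel threads show up here as well).
    """
    return sum(
        1
        for entry in os.scandir(proc_root)
        if entry.name.isdigit() and entry.is_dir()
    )
```

Calling count_processes() on both systems gives a quick number to set next to the apps.plugin CPU usage on each.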
@Daniel the apps.plugin CPU usage you get is expected for this workload (number of processes, rate of new process creation, etc.). Increasing update every for apps.plugin was the right decision.
We will think about possible optimizations for apps.plugin.
On another one of my systems, apps.plugin only takes 0.7% CPU and Netdata itself only takes 1%.
Compare the workload on both systems, as well as the number of collected metrics, charts, and alarms.
The first three columns measure CPU and IO utilization of the last one, five, and 15 minute periods. The fourth column shows the number of currently running processes and the total number of processes. The last column displays the last process ID used. (source)
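The column layout described above is the one used by /proc/loadavg on Linux. As a small sketch (the names LoadAvg and parse_loadavg are made up for illustration), the line can be split into its fields like this:

```python
from typing import NamedTuple

class LoadAvg(NamedTuple):
    load1: float    # load average over the last 1 minute
    load5: float    # load average over the last 5 minutes
    load15: float   # load average over the last 15 minutes
    running: int    # fourth column, part before the slash
    total: int      # fourth column, part after the slash
    last_pid: int   # most recently used process ID

def parse_loadavg(text):
    """Parse one line in the /proc/loadavg format into a LoadAvg tuple."""
    fields = text.split()
    running, total = fields[3].split("/")
    return LoadAvg(
        float(fields[0]),
        float(fields[1]),
        float(fields[2]),
        int(running),
        int(total),
        int(fields[4]),
    )
```

On a Linux machine, parse_loadavg(open("/proc/loadavg").read()).total gives the "total" half of the fourth column.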
The chart is System Overview->processes->system.active_processes.
I googled a bit, and it seems that value (the fourth column of loadavg) counts processes plus kernel threads plus user threads.
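One way to check this yourself is to add up the "Threads:" field from every /proc/<pid>/status file and compare the sum with the total in the fourth column. This is a hedged sketch: count_threads is a made-up name, and because processes come and go while the scan runs, the result should land close to the loadavg value rather than match it exactly.

```python
import os

def count_threads(proc_root="/proc"):
    """Sum the 'Threads:' field of every <pid>/status file under proc_root."""
    total = 0
    for entry in os.scandir(proc_root):
        if not entry.name.isdigit():
            continue  # skip non-process entries such as 'self' or 'net'
        try:
            with open(os.path.join(entry.path, "status")) as f:
                for line in f:
                    if line.startswith("Threads:"):
                        total += int(line.split()[1])
                        break
        except (FileNotFoundError, NotADirectoryError,
                PermissionError, ProcessLookupError):
            continue  # the process exited mid-scan or is not readable
    return total
```

If the sum is much larger than the number of processes shown in htop with threads hidden, that would explain the gap between what htop shows and what the chart reports.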
Wow, I didn’t know about this! Thanks for the link.
I usually turn off display of threads (both kernel threads and user threads) in htop because it bloats the display a bit, as some apps have a lot of threads. Good to know.