10min_cpu_usage

10min_cpu_usage

OS: Linux & FreeBSD

This alert calculates an average on CPU utilization over a period of 10 minutes, excluding iowait, nice
and steal values.

Note that on FreeBSD, the alert excludes only nice.

The alert is generally self-explanatory, but to clarify any confusion that might exist in the iowait, nice and steal metrics:

iowait is the percentage of time the CPU waits on a disk for an I/O; it happens when the former is getting bottlenecked by the latter. At this point the CPU is being idle, waiting only on the I/O.

nice value of a processor is the time it has spent on running low priority processes. Low priority processes are those with a ‘nice’ value greater than 0 (on UNIX-like systems, a higher ‘nice’ value indicates a lower priority).

steal, in a virtual machine, is the percentage of time that particular virtual CPU has to wait for an available host CPU to run on. If this metric goes up, it means that your VM is not getting the processing power it needs.

Troubleshooting section

Processes slowing down your CPU

There are two primary cases in which this alert is raised, and determining which applies to you requires understanding your own scenario.

  • Generally, if you have high CPU utilization alongside a high nice value, it means that the system is running through all the low priority processes, and if some high priority process needs CPU time, it can get it at any time.
  • On the flipside, if you have high CPU utilization with low nice value, this means that the CPU is used on high priority processes and new ones will not be able to take CPU time, and they will have to wait.

The latter scenario is worth investigating if there is a process slowing down your CPU. We suggest you go to your node on Netdata Cloud and click the nice dimension under the Total CPU Utilization chart to see the value. You can then pair it with the value of this alert to see what’s your case of the two described.