Netdata consuming a high amount of RAM

I installed Netdata in a virtual machine running Debian 10 for test purposes. The VM has 2 GB of RAM configured and uses around 200 MB when running. As soon as I install Netdata, usage jumps to almost 2 GB. I have read that Netdata is supposed to be a lightweight monitoring agent. I have uninstalled and reinstalled it with no luck on the RAM issue. I have already tried the documented tips for decreasing memory consumption, but they don't work.
Maybe it is the Netdata server-side installation and I just need to install the agent? Sorry, I am new to Netdata, but I really like the product.


Hi @Lcarrillo.

As soon as I install Netdata, usage jumps to almost 2 GB.

That is not expected. How did you check RAM usage?

Could you share apps.mem (Applications → mem → apps.mem)?
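If it's easier than a screenshot, the same numbers can also be pulled from the Agent's data API; a minimal sketch, assuming the default port 19999:

# per-application memory for the last 60 seconds, as JSON
curl -s 'http://localhost:19999/api/v1/data?chart=apps.mem&after=-60&format=json'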

I have the Netdata Agent installed on DO (a 1 GB RAM VM).

In the System Overview → RAM section, it shows that 1.75 GiB of physical memory is in use and only 0.08 GiB is free.
I installed another tool called Glances to check whether the RAM was actually consumed, and it showed the same usage. As soon as I uninstall Netdata, this problem goes away.

@Lcarrillo

To make it clear: the problem is that used memory is 1.75 GiB? Why do you think it is consumed by Netdata?

Can you run the following while Netdata is running?

# system free/used memory
free -h

# top 30 memory consumers
ps -eo pmem,pcpu,vsize,pid,cmd | sort -k 1 -nr | head -30

@ilyam8

When I ran those commands, this is the info I got. Sorry for all the trouble, and thanks for your help.
I believe the Netdata go.d.plugin is causing this RAM consumption issue.

It is not go.d.plugin; it uses 12 MB (you can see it on the Netdata dashboard in apps.mem, and in the ps output above it is 0.5% (pmem, the first column)).
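For a quick spot check of just that plugin's resident memory, something like this should work (assuming the process is named go.d.plugin):

# RSS (KiB), %mem, and command for the go.d plugin process
ps -o rss=,pmem=,cmd= -C go.d.plugin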

I'm also seeing the Netdata Agent using a lot more RAM recently.

I have an Ubuntu server dedicated to an HAProxy instance with 3 GB of RAM. Without the Netdata Agent running, memory usage sits around 600 MB total; with the Agent running, it is around 2.3 GB.

The Agent documentation says memory usage should be around 100 MB without the database; is that still correct?

$ netdata -v
netdata v1.29.3-172-nightly
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:        20.04
Codename:       focal
$ free -h
              total        used        free      shared  buff/cache   available
Mem:          2.9Gi       2.3Gi       143Mi       1.0Mi       503Mi       393Mi
Swap:         3.0Gi        16Mi       3.0Gi
$ ps -eo pmem,pcpu,vsize,pid,cmd | sort -k 1 -nr | head -10
10.7 18.9 538684 3053127 /usr/sbin/netdata -P /var/run/netdata/netdata.pid -D
 3.0  0.0 285316     488 /lib/systemd/systemd-journald
 1.1  9.8  55252 3053328 /usr/bin/python3 /usr/libexec/netdata/plugins.d/python.d.plugin 1
 0.9  0.8 144132     969 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock
 0.9  0.0 141948     967 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock
 0.7  3.5 725444 3053333 /usr/libexec/netdata/plugins.d/go.d.plugin 1

htop shows Netdata using only 10.9% of memory, but similar to the OP, total memory usage drops from 2.3 GB to 600 MB when I run systemctl stop netdata.

I just checked my other HAProxy server and it is only using ~50 MB on the same Netdata version. This server is configured very similarly; the only notable difference is that it has 2 GB of RAM.

$ netdata -v
netdata v1.29.3-172-nightly
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:        20.04
Codename:       focal
$ free -h
              total        used        free      shared  buff/cache   available
Mem:          1.9Gi       503Mi       1.2Gi       0.0Ki       228Mi       1.3Gi
Swap:         2.0Gi       371Mi       1.6Gi
$ ps -eo pmem,pcpu,vsize,pid,cmd | sort -k 1 -nr | head -10
 2.6  1.6 564284  564164 /usr/sbin/netdata -P /var/run/netdata/netdata.pid -D
 1.3  0.2  71396  564324 /usr/bin/python3 /usr/libexec/netdata/plugins.d/python.d.plugin 1
 0.9  0.5 129876     504 /lib/systemd/systemd-journald
 0.9  0.0 345824     764 /sbin/multipathd -d -s
 0.6  0.6 1227108    899 /usr/lib/snapd/snapd
 0.6  0.1 725444  564329 /usr/libexec/netdata/plugins.d/go.d.plugin 1

@ilyam8 Both of these servers are now having the high memory usage issue :confused:

I found this after looking through open GitHub issues, but I'm not sure if it's related because I do not have metrics streaming to a central Netdata server: High RAM usage by netdata-master · Issue #10471 · netdata/netdata · GitHub

Hi @haydenseitz

  • What is your retention storage setting dbengine multihost disk space? (It should be in the [global] section of /etc/netdata/netdata.conf.)

  • How many metrics are you collecting?
    The counts should be at the very bottom of the output of http://localhost:19999/api/v1/charts

Similar to:

"charts_count": 386,
"dimensions_count": 2567,
"alarms_count": 95,
"rrd_memory_bytes": 2998876,
"hosts_count": 5,

Also, could you share the output of http://localhost:19999/api/v1/chart?chart=system.cpu
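If it helps, both can be pulled from the command line; a rough sketch, assuming the Agent listens on the default localhost:19999 and jq is installed:

# metric totals (the counters sit at the end of the charts payload)
curl -s http://localhost:19999/api/v1/charts | jq '{charts_count, dimensions_count, alarms_count, rrd_memory_bytes, hosts_count}'

# metadata for a single chart
curl -s 'http://localhost:19999/api/v1/chart?chart=system.cpu' | jq .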

Retention is set to roughly 7 days:

[global]
    dbengine multihost disk space = 2048

Most of my installations have around 2000 metrics. Here is one having the issue:

...
	"charts_count": 337,
	"dimensions_count": 2436,
	"alarms_count": 48,
	"rrd_memory_bytes": 2700032,
	"hosts_count": 1,
...

I have been switching installations to the stable release channel and have not yet seen the high memory issue when running netdata v1.29.3.

@Stelios_Fragkakis I am seeing the high memory usage behavior in Netdata stable v1.30.0 now. A freshly installed Ubuntu 20.04 server goes from 500 MB of memory usage to 1.8 GB when the Netdata Agent starts.

Do you think it could be related to the issue here? High RAM usage by netdata-master · Issue #10471 · netdata/netdata · GitHub

Thanks for your help on this!


This does not seem to be related. I am investigating the changes from v1.29.3-172-nightly onwards to try to pinpoint this.

@haydenseitz to help us investigate the problem, can you send a snapshot to ilya@netdata.cloud?


The dbengine metadata usage looks normal (pages on disk ~2.7 million)


Thanks @Stelios_Fragkakis and @ilyam8 for looking into this. Perhaps it would be useful to post any findings here; maybe a contributor gets inspired and helps us debug it even faster :stuck_out_tongue:

Something I don't fully understand is that tools like htop still show Netdata using only ~500 MB of virtual memory and ~300 MB of resident memory, yet system memory usage changes by more than 1.5 GB when the Netdata Agent starts. Are there any other tools I can use to track down exactly which process is filling up memory?

While running:

$ ps aux | head -n1; ps aux | grep netdata
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
netdata  2103535  6.0  8.3 586400 333792 ?       Ssl  08:44   0:02 /usr/sbin/netdata -P /var/run/netdata/netdata.pid -D
netdata  2103545  0.0  0.0  26864  2312 ?        Sl   08:44   0:00 /usr/sbin/netdata --special-spawn-server
netdata  2103924  0.1  0.0   4108  3280 ?        S    08:44   0:00 bash /usr/libexec/netdata/plugins.d/tc-qos-helper.sh 1
netdata  2103928  1.2  0.5 724168 23300 ?        Sl   08:44   0:00 /usr/libexec/netdata/plugins.d/go.d.plugin 1
netdata  2103931  3.5  0.1  56152  6308 ?        R    08:44   0:01 /usr/libexec/netdata/plugins.d/apps.plugin 1
root     2103933  5.6  0.0 380172  3820 ?        Sl   08:44   0:02 /usr/libexec/netdata/plugins.d/ebpf.plugin 1
netdata  2103935  1.7  0.6  48356 27976 ?        Sl   08:44   0:00 /usr/bin/python /usr/libexec/netdata/plugins.d/python.d.plugin 1
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3907        2883         256           7         767         745
Swap:          3907          98        3809

While stopped:

$ sudo systemctl stop netdata
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3907        1304        1841           7         761        2325
Swap:          3907          98        3809

Yeah, that keeps me wondering too :exploding_head:
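Since no process's RSS accounts for the difference, it might be memory allocated on the kernel side, which free counts as used but ps cannot attribute to any process. A quick way to look, assuming slabtop (from procps) is available:

# kernel-side memory that ps won't attribute to any process
grep -E 'Slab|KernelStack|PageTables|Percpu' /proc/meminfo

# top slab caches (needs root)
sudo slabtop -o | head -15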

Let's try disabling the external plugins one by one and see if that changes anything (see the [plugins] section of netdata.conf).
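As a sketch, that section could look like this while testing; the plugin names below match the processes from the ps output above, and each one can be re-enabled after checking:

[plugins]
    # disable one plugin at a time, restart Netdata, then watch free -m
    ebpf = no
    # go.d = no
    # python.d = no
    # apps = no

Then restart with sudo systemctl restart netdata and compare free -m before and after.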

@haydenseitz here are the docs to disable plugins:

Thank you for helping us figure this out!

Thanks, I'll try disabling them and see if I find a suspect.
Another oddity: sometimes I stop and restart the agent and memory usage stays at the expected amount, but at some point, usually just a day or two later, it jumps up.
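In the meantime I'll log used memory periodically so I can see exactly when the jump happens; a simple sketch:

# append a timestamped used-memory figure (MiB) every 5 minutes
while true; do echo "$(date -Is) $(free -m | awk '/^Mem:/{print $3}')" >> netdata-mem.log; sleep 300; done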