Debian 10/11, latest stable netdata from the official netdata repository (1.36.1). The database is local on each server; the web dashboard is accessed through an nginx proxy.
For me, good monitoring software should have two main attributes:
- collect the required data correctly, then store/present/correlate it etc. (obviously)
- not interfere (too much) with system performance (CPU, memory, disk usage)
Clearly we cannot compare "old school" monitoring software like Munin or Cacti (5-minute samples, tens of metrics) with netdata (1-second samples, thousands of metrics). But …
Installed with defaults, the netdata daemon showed insane memory usage: from 200-400 MB on regular servers to 700-1000+ MB on KVM hypervisors. USED memory. And that's after just a single month.
With some tuning it's marginally better, but still far from the < 100 MB "promised" in the documentation.
Examples:
- hypervisors: 450-500 MB (since 14 Sept)
- an almost unused nginx server (restarted a few days earlier)
netdata 564 0.6 8.4 648972 172568 ? SNsl Oct01 16:58 /usr/sbin/netdata -D -P /var/run/netdata/netdata.pid - nginx proxy for netdata
netdata 837 0.9 24.6 1156840 246228 ? SNsl Sep14 271:51 /usr/sbin/netdata -D -P /var/run/netdata/netdata.pid - almost unused nginx server
netdata 807 1.1 27.4 1724912 274908 ? SNsl Sep10 363:39 /usr/sbin/netdata -D -P /var/run/netdata/netdata.pid - unused vpn server
netdata 1078 1.0 30.3 1593188 303828 ? SNsl Sep14 284:23 /usr/sbin/netdata -D -P /var/run/netdata/netdata.pid
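For anyone who wants to reproduce these numbers without parsing ps output, here is a minimal sketch that reads a process's resident set size straight from /proc; the pgrep pattern for the netdata daemon is an assumption about your process name:

```shell
# rss_mb: print a process's resident set size (VmRSS) in MB, read from
# /proc/<pid>/status. Works for any PID you can read.
rss_mb() {
    awk '/^VmRSS:/ { printf "%.1f\n", $2 / 1024 }' "/proc/$1/status"
}

rss_mb "$$"                          # this shell's own RSS, as a smoke test
# rss_mb "$(pgrep -o -x netdata)"    # the netdata daemon (assumed process name)
```

VmSize in the same status file is the per-process figure behind the "committed" numbers discussed below.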
As a matter of fact, on some servers under memory pressure (e.g. database servers) netdata's memory usage is "good" (under 100 MB); but overall it behaves like Java: the more memory a server has available, the more memory netdata uses.
The "committed" values are also insane; the maximum is almost 10 GB (!). As you know, committed memory is worth watching especially on app servers (PHP, Java, etc.), since it represents a worst-case scenario for memory allocation. With 1-2 GB committed on its own, netdata skews every server's memory monitoring, and the memory graphs on older monitoring systems become unusable.
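To see how much headroom those committed values eat system-wide, the kernel's own accounting can be read from /proc/meminfo (standard Linux fields, nothing netdata-specific):

```shell
# Print the kernel's overcommit limit and the total committed address space,
# both converted from kB to MB. Committed_AS is the "worst case" figure
# referred to above.
awk '/^(CommitLimit|Committed_AS):/ { printf "%-14s %.1f MB\n", $1, $2 / 1024 }' /proc/meminfo
```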
Please tell me: what am I doing wrong? How can I reduce the memory used and allocated (committed) by netdata?
The configuration follows:
[global]
run as user = netdata
process scheduling policy = idle # run with least priority
OOM score = 1000
enable metric correlations = no
[db]
update every = 1
mode = dbengine
storage tiers = 1
dbengine page cache size MB = 32
dbengine disk space MB = 256
[logs]
access = none
[ml]
enabled = no
[web]
bind to = *
allow connections from = localhost $ip_of_nginx
[registry]
enabled = no
registry to announce = $url_to_announce
[plugins]
# ebpf = no
charts.d = no
fping = no
python.d = no
statsd = no
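After editing, a small sanity check before restarting can save a round trip. This sketch assumes the default Debian config path /etc/netdata/netdata.conf:

```shell
# check_mode: confirm the on-disk netdata.conf actually sets dbengine mode
# before bouncing the daemon (the path below is the Debian package default).
check_mode() {
    grep -q '^[[:space:]]*mode = dbengine' "$1" && echo "dbengine mode set"
}

# check_mode /etc/netdata/netdata.conf && sudo systemctl restart netdata
# The running agent also serves its effective (merged) configuration at
# http://localhost:19999/netdata.conf for a final check.
```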