We use Netdata's statsd collector on our Kubernetes nodes to gather our custom statsd metrics and send them to Graphite. We have a very large number of metrics (about 200,000 after Netdata has been running for 20 days). It looks like all of those metrics are kept in memory, because Netdata's memory usage after 20 days is 9.7 GB.
Netdata version is 1.39.1
This documentation (StatsD | Learn Netdata) is outdated. In this version it is not possible to use a separate “memory mode” for statsd, so we use the default “dbengine” mode. Furthermore, it is not possible to set a “private charts history” limit. According to the source code, the default RRD_DEFAULT_HISTORY_ENTRIES is 3600.
My questions are:
- Every statsd metric creates a private chart. Is it possible to somehow set retention for private charts?
- Is it possible to set a history limit for private charts?
@victorvoronin thanks for raising this with us. Considering that 1.39.1 is a few versions old, is there a way you could upgrade to a more recent version? Several releases since then have included performance improvements.
In the meantime, I will try to get answers to your questions.
@hugo Honestly, I’ve also tested version 1.42.1 and looked through the statsd.c source code. I don’t see any changes there that relate to my problem.
Furthermore, I’ve looked deeper and figured out that we have some incorrect configuration parameters.
At the moment I think our main problem is statsd private chart retention. As far as I understand, there are no options to set a lifetime for private charts, and private charts (statsd metrics) can only be added, never removed or expired. In our case, where every new container adds about 2000 statsd metrics, we will exceed the private charts limit after some time and will have to restart Netdata periodically. If we set a huge value for “max private charts hard limit”, we will have high memory usage, because the private charts' data is stored in memory.
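To put rough numbers on this, here is a back-of-the-envelope sketch in Python using only the figures quoted in this thread (about 200,000 metrics after 20 days, 9.7 GB of memory). It assumes, as a simplification, that growth is linear and that all of the memory is attributable to statsd metrics, which overstates the per-metric cost. The hard limit of 1,000,000 is a hypothetical value chosen for illustration, not a Netdata default, and the function names are made up for this example.

```python
def metrics_per_day(total_metrics: int, days: int) -> float:
    """Average number of new statsd metrics per day, assuming linear growth."""
    return total_metrics / days


def days_until_limit(hard_limit: int, current: int, per_day: float) -> float:
    """Days left before the chart count reaches the hard limit."""
    remaining = max(hard_limit - current, 0)
    return remaining / per_day


def bytes_per_metric(total_bytes: int, metric_count: int) -> float:
    """Upper-bound estimate: treats all process memory as metric storage."""
    return total_bytes / metric_count


# Figures from the thread: ~200,000 metrics after 20 days, 9.7 GB (decimal) of RAM.
growth = metrics_per_day(200_000, 20)                      # 10000.0 metrics/day
per_metric = bytes_per_metric(9_700_000_000, 200_000)      # 48500.0 bytes
# Hypothetical hard limit of 1,000,000 private charts (an assumption).
days_left = days_until_limit(1_000_000, 200_000, growth)   # 80.0 days

print(growth, per_metric, days_left)
```

At roughly 2000 metrics per container, 10,000 new metrics per day corresponds to about five new containers per day, so any fixed limit only postpones the restart unless idle charts can be dropped.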
So, to recap, this looks like a new feature request: “Add the ability to expire a private chart if it receives no new values for a configurable period of time”.
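To make the request concrete, here is a minimal Python sketch of the behaviour I have in mind. It is only an illustration of idle-based expiry, not how Netdata's statsd.c works today; the class name `IdleExpiringRegistry`, its methods, and the `idle_timeout` parameter are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IdleExpiringRegistry:
    """Tracks when each metric last received a value and drops idle ones."""

    idle_timeout: float  # seconds without new values before a metric expires
    _last_seen: Dict[str, float] = field(default_factory=dict)

    def touch(self, name: str, now: float) -> None:
        """Record that `name` received a value at time `now`."""
        self._last_seen[name] = now

    def expire(self, now: float) -> List[str]:
        """Remove and return metrics idle for at least `idle_timeout` seconds."""
        stale = [
            name
            for name, seen in self._last_seen.items()
            if now - seen >= self.idle_timeout
        ]
        for name in stale:
            del self._last_seen[name]
        return stale

    def __len__(self) -> int:
        return len(self._last_seen)


registry = IdleExpiringRegistry(idle_timeout=3600.0)
registry.touch("app.requests", now=0.0)    # from a container that later stopped
registry.touch("app.errors", now=3000.0)   # still receiving values
removed = registry.expire(now=3600.0)
print(removed, len(registry))
```

With something like this, metrics from containers that no longer exist would drop out after the timeout, so the chart count would track the number of live containers instead of growing without bound.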