My server froze every morning, and then I discovered that the cause is high memory consumption by cc1plus processes started by netdata-updater.sh.
And it always froze at the same point. Sadly, I can no longer remember where it stopped.
So I updated the instance to t3a.small which has 2 GB RAM instead of just 1, and manually started the update again. And after some time it finally finished.
So then I decreased the RAM back to 1 GB because even 1 GB is too much for the server’s standard usage.
I was hoping that this was a one-time huge Netdata update and that the updater is usually more than fine with 1 GB of memory.
But I was wrong and the next morning it froze the instance again. Is this normal?
Thanks in advance for all the help!
What I expected to happen
That the daily updater doesn’t freeze the instance completely.
Just tried it on an Ubuntu 20 VM and it took around 700MiB at peak. I believe this was identified before and we have an option (or a PR to add an option) to run it with lower memory requirements. @Austin_Hemmelgarn will provide more info soon.
We have no real way to lower the memory requirements. It's almost entirely a side effect of us bundling protobuf now (it takes a huge amount of resources to build).
Did some tests. If you don’t need to connect that node to the cloud and don’t use the Prometheus remote write exporter, you can run the following:
bash <(curl -Ss https://my-netdata.io/kickstart.sh) --reinstall --disable-cloud --disable-backend-prometheus-remote-write
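If you want to decide per machine whether the reduced build is needed, here is a minimal sketch that only prints the command to run and does not execute the installer. The `kickstart_flags` helper and the 1536 MiB cutoff are my own assumptions for illustration, not a documented Netdata limit; the flags themselves are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: choose kickstart flags from total RAM given in kB.
# The 1536 MiB (1572864 kB) cutoff is an assumption, not a Netdata limit.
kickstart_flags() {
    mem_kb="$1"
    if [ "$mem_kb" -lt 1572864 ]; then
        # Low-memory host: skip the cloud and Prometheus remote write builds.
        echo "--reinstall --disable-cloud --disable-backend-prometheus-remote-write"
    else
        echo "--reinstall"
    fi
}

# MemTotal in /proc/meminfo is reported in kB on Linux.
# Fall back to 0 (treated as low memory) if the file cannot be read.
mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || echo 0)"

# Print the suggested command instead of running it.
echo "bash <(curl -Ss https://my-netdata.io/kickstart.sh) $(kickstart_flags "$mem_kb")"
```

Review the printed command before running it, and adjust the cutoff to whatever you observe on your own instance.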
“connect that node to the cloud”
By cloud, do you mean the Netdata Cloud dashboard?
So if I’ve understood correctly, the only way to reduce the memory usage of the update is to disable the feature that lets me watch the server through the browser?
The Netdata agent has its own UI. The cloud allows monitoring of the entire infrastructure, but you can use the agent without it.
A more proper way to do it is to use streaming/replication from agents running with very limited capabilities (search learn.netdata.cloud for "high performance netdata" or "IoT") to a “parent” agent running on a beefier machine. You can have all the child charts displayed on that parent too; that’s how streaming has always worked. What doesn’t currently work is simply connecting that parent to the cloud and getting the cloud’s infrastructure-level view with all the nodes. We are correcting that in a major redesign that we will be releasing in October.
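As a rough sketch of the child side of that setup, the snippet below prints a minimal stream.conf. It assumes the `[stream]` section layout (`enabled`, `destination`, `api key`) from the Netdata streaming documentation; the `make_child_stream_conf` helper, the address, and the key are placeholders I made up, so check the option names against the docs for your version.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal child-side stream.conf.
# $1 = parent address as host:port, $2 = API key (a UUID shared with the parent).
make_child_stream_conf() {
    parent_addr="$1"
    api_key="$2"
    cat <<EOF
[stream]
    enabled = yes
    destination = ${parent_addr}
    api key = ${api_key}
EOF
}

# Placeholder values for illustration only; substitute your own.
make_child_stream_conf "203.0.113.10:19999" "11111111-2222-3333-4444-555555555555"
```

On the parent, the matching stream.conf would need a section named after the same API key with `enabled = yes`. Generate a real key with `uuidgen` rather than reusing the placeholder.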