I have three Linux machines acting as Child nodes, each streaming ~5000 Netdata metrics to one Parent Node. Each of these three machines is consuming very high bandwidth, about 110 GB/month (~45 kB/s).
They all run the same Netdata version (v2.1.1) and the same Linux firmware.
The Child nodes are remote and on different networks. They use a VPN service called Tailscale to stream data to the Parent Node, which is a virtual machine in Google Cloud Platform (GCP).
Relevant docs you followed/actions you took to solve the issue
Netdata Data Retention for 1000 metrics = ~60 MB/month
I used the Linux command-line tool Nethogs, which reports the kB/s consumed by Tailscale. I collected Nethogs logs for 24 hours on all six machines, both WITH and WITHOUT Netdata Child-Parent streaming:
Here are the results:
WITH Netdata Child-Parent streaming: ~45 kB/s
WITHOUT Netdata Child-Parent streaming: ~20 B/s
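For context, here is a rough back-of-the-envelope conversion of those sustained rates into monthly volume (assuming a 30-day month), which roughly matches the ~110 GB/month figure above:

```python
# Rough conversion of a sustained rate (bytes/second) into monthly volume.
# Assumes a 30-day month; real months vary slightly.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 s

def monthly_volume_gb(rate_bytes_per_s: float) -> float:
    """GB transferred per month at the given sustained rate."""
    return rate_bytes_per_s * SECONDS_PER_MONTH / 1e9

print(monthly_volume_gb(45_000))  # ~45 kB/s -> ~116 GB/month
print(monthly_volume_gb(20))      # ~20 B/s  -> ~0.05 GB/month (negligible)
```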
I expected the streaming to consume about 300 MB/month for the ~5000 metrics my Child nodes are streaming, calculated per the official Netdata documentation. [Operational Considerations Long-Term Data Storage and Retention in Netdata | Netdata]
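For reference, this is the arithmetic behind that expectation: I simply scaled the documented figure (~60 MB/month per 1000 metrics) linearly to my metric count.

```python
# My estimate: linear scaling of the documented ~60 MB/month per 1000 metrics.
mb_per_month_per_1000_metrics = 60
metrics = 5000

expected_mb_per_month = mb_per_month_per_1000_metrics * metrics / 1000
print(expected_mb_per_month)  # 300 MB/month
```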
Could you help me understand why there is such a big variation in bandwidth?
Thank you
Hi, @meghnamscs. To reduce the data volume, you will need to limit the number of metrics being collected. This can be accomplished by disabling collectors that aren’t essential to your needs. You can also adjust the data collection frequency (for example, changing from every 1 minute to every 5 minutes) to decrease the volume.
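As a rough illustration of why the collection interval matters (a sketch only, under the simplifying assumption that streamed volume scales roughly linearly with the number of samples sent), compare the samples per month at different intervals:

```python
# Illustrative only: streamed volume scales roughly with the number of samples sent.
SECONDS_PER_MONTH = 30 * 24 * 3600

def samples_per_month(metrics: int, update_every_s: int) -> int:
    """Total samples a Child would send per month at the given interval."""
    return metrics * SECONDS_PER_MONTH // update_every_s

for interval_s in (1, 5, 60):
    print(f"{interval_s:>2} s interval: {samples_per_month(5000, interval_s):,} samples/month")
# 1 s  -> 12,960,000,000 samples/month
# 5 s  ->  2,592,000,000 samples/month (5x fewer)
# 60 s ->    216,000,000 samples/month (60x fewer)
```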
I expected the streaming to consume about 300 MB/month for the ~5000 metrics
Thanks for those suggestions, but I want to know how I can calculate how much bandwidth Netdata is supposed to consume to transfer 5000 metrics from a Child to the Parent. Isn’t this documentation the right resource for arriving at my figure of 300 MB/month? Long-Term Data Storage and Retention in Netdata | Netdata
These are my testing steps, in case they help us get to the bottom of the issue:
Test 1-
Netdata Child-Parent streaming is the only service using our VPN (Tailscale) to send data to the Parent Node, and turning OFF Child-Parent streaming produced a drastic decrease in the bandwidth Tailscale uses. That confirms Netdata is indeed what is consuming the high bandwidth.
Here is the observation:
(1) Machine 1: SEM-SC
(a) WITH NETDATA CHILD-PARENT STREAMING: Mean Bandwidth=41.8 kB/s
Test 2-
I performed the same tests on two similar machines, and they showed much lower bandwidth consumption, which is strange.
(1) Machine 3: HARV
WITH NETDATA CHILD-PARENT STREAMING: Mean Bandwidth=256 B/s
The OS, firmware, and architecture of all four machines are the same, and they all run the same Netdata version: v2.1.1.
They are all remote and on different networks.
All of them are running the default netdata.conf configuration and are streaming about the same number of metrics (~5000) at a 1 s update interval. None of them has been tuned, so I would expect the bandwidth to be roughly the same across all of them. If Netdata can stream 5000 metrics at an average of ~300 B/s on one setup, why does it need ~40 kB/s on the other setups? That is the question I’m trying to answer.
As the Netdata team, you can help me get insight into why this is happening. Please let me know what you think about this problem and what you can gather from my observations.
Expecting only 300 MB/month of total traffic when streaming 5000 metrics at 1-second granularity is unrealistic. That figure is not achievable given the data volume involved, even with the ZSTD compression Netdata applies to streaming between instances.
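As a rough sanity check (the bytes-per-sample values below are illustrative assumptions, not exact figures from the streaming protocol), even a tiny effective per-sample cost adds up quickly at 1-second granularity for 5000 metrics:

```python
# Back-of-the-envelope: 5000 metrics, one sample per second each, 30-day month.
# bytes_per_sample values are assumed for illustration, not protocol constants.
SECONDS_PER_MONTH = 30 * 24 * 3600
metrics = 5000

for bytes_per_sample in (2, 10, 25):
    monthly_gb = metrics * SECONDS_PER_MONTH * bytes_per_sample / 1e9
    print(f"{bytes_per_sample:>2} B/sample -> ~{monthly_gb:.0f} GB/month")
# 2  B/sample -> ~26 GB/month
# 10 B/sample -> ~130 GB/month
# 25 B/sample -> ~324 GB/month
```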
What is a realistic bandwidth range that I can expect?
In my experiments I’m seeing two very different figures: ~600 MB/month on one machine and ~100 GB/month on the others.
My remote machines will have a limited bandwidth quota of about 2 GB/month, and I want to get Netdata streaming working within that limit. It is critical for my work to know how much bandwidth Netdata is actually supposed to consume and how I can estimate it.
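For reference, this is the sustained rate my quota allows (assuming a 30-day month), which is the budget Netdata streaming would need to fit within:

```python
# Average sustained rate that fits within a 2 GB/month quota (30-day month assumed).
SECONDS_PER_MONTH = 30 * 24 * 3600
quota_bytes = 2e9

print(quota_bytes / SECONDS_PER_MONTH)  # ~771 bytes/second on average
```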