Problem/Question
Hi,
I’m using Netdata Cloud (beta) with v1.32.0 agents in a parent-child setup to monitor a Kubernetes cluster.
I have trouble displaying timespans longer than 4-5 days, and only on the merged Kubernetes graphs (it doesn’t happen on a single-node view or with the node graphs). The query times out, or sometimes ends up showing something but with 2 nodes erroring (I have 3 nodes in the k8s cluster).
Also, after the query, all the graphs show errors, even if I set the timespan back to 15 or 5 minutes. They come back after maybe 2 minutes.
Neither the parent nor the children show any errors in their logs, and I couldn’t find a way to enable debug/verbose logging on the containers. I would like some help debugging this, if possible.
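In case it helps narrow things down, here is a minimal sketch of how the same kind of query could be timed directly against the parent, bypassing the Cloud UI. It uses the agent's `/api/v1/data` endpoint; the base URL (`http://localhost:19999`, e.g. through a port-forward) and the chart name `system.cpu` are assumptions used as placeholders, not the actual Kubernetes charts that the merged view queries.

```python
import time
import urllib.parse
import urllib.request


def build_data_url(base_url: str, chart: str, seconds_back: int, points: int = 100) -> str:
    """Build an agent /api/v1/data URL covering the last `seconds_back` seconds."""
    params = {
        "chart": chart,              # placeholder chart id, replace with a real one
        "after": -seconds_back,      # negative value = relative to now
        "points": points,
        "format": "json",
    }
    return f"{base_url}/api/v1/data?{urllib.parse.urlencode(params)}"


def time_query(url: str, timeout: float = 60.0):
    """Return (elapsed_seconds, outcome), where outcome is an HTTP status or an error string."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            outcome = resp.status
    except OSError as exc:  # covers URLError, HTTPError and socket timeouts
        outcome = f"error: {exc}"
    return time.perf_counter() - start, outcome


# Timespans matching the screenshots: 15 minutes, 4 days, 7 days.
TIMESPANS = {"15min": 15 * 60, "4d": 4 * 24 * 3600, "7d": 7 * 24 * 3600}
```

Calling `time_query(build_data_url("http://localhost:19999", "system.cpu", TIMESPANS["7d"]))` for each timespan would show whether the parent itself is slow on long ranges, or whether the timeout only appears when the query goes through Cloud.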
Screenshots
Last 15min K8S plugin
Last 4 days K8S plugin
After some secs - showing ERROR but still showing the graphs
Last 7 days K8S plugin
After some secs
After a minute or so
Last 7 days standard cpu graph
After some secs
Environment/Browser
I’m running Netdata in a parent-child architecture on 3 Kubernetes nodes. Each node has 4 cores and 16 GB of RAM. Whenever I try to show 7 days on the Kubernetes plugin, I can see the resource usage growing (as expected), but I would not expect it to crash at this level of usage. The Netdata parent has no resource limit.
1769m means 1769 milliCPU, which represents about 1.77 cores of usage.
NAME   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node   2404m        61%    9087Mi          62%

NAME                              CPU(cores)   MEMORY(bytes)
netdata-child-467pp               63m          143Mi
netdata-child-88vq8               52m          162Mi
netdata-child-q6w9q               71m          157Mi
netdata-parent-77d87f5979-7857g   1769m        2381Mi
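To make the milliCPU figures above easier to read, here is a small sketch that converts them to cores and works out how much of the total Netdata pod CPU the parent accounts for. The helper name `parse_cpu` is made up for illustration; the values are copied from the output above.

```python
def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '1769m' or '2' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # milliCPU -> cores
    return float(quantity)


# CPU usage per pod, as reported in the output above.
pod_cpu = {
    "netdata-child-467pp": "63m",
    "netdata-child-88vq8": "52m",
    "netdata-child-q6w9q": "71m",
    "netdata-parent-77d87f5979-7857g": "1769m",
}

cores = {name: parse_cpu(value) for name, value in pod_cpu.items()}
total_cores = sum(cores.values())
parent_cores = cores["netdata-parent-77d87f5979-7857g"]
parent_share = parent_cores / total_cores

print(f"parent: {parent_cores:.3f} cores")
print(f"all Netdata pods: {total_cores:.3f} cores")
print(f"parent share of Netdata CPU: {parent_share:.1%}")
```

So during the 7-day query the parent alone uses roughly 1.77 of the node's 4 cores, while the three children together stay under 0.2 cores.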
Tried with Firefox 94.0 and Chromium 96.0.4664.45
I’m also on your Discord with the same username if you want to reach me there.
Zeylos