Timeout while getting agent response if we select Last 7 days in netdata cloud for a node

Getting Timeout if we select Last 7 days graph for a node as attached. Pls help

There’s some kind of bottleneck on the specific agent, most likely RAM, or IO. Limit the time window after the long query, to show the time around the query itself and check the following:

  • The “Applications” charts for the netdata process, especially the memory ones
  • The “Disks” for the IO BW, iowait etc.
  • The “Netdata Monitoring” charts at the very bottom of the right menu. There are a lot of them, but I’d check “queries”, “dbengine” and probably “workers aclk query” to begin with

You can also run metric correlation or anomaly detection (if ML is enabled) with the time window of the query highlighted, to filter our the irrelevant metrics and see what the remaining ones show.

You’ll probably want to enable tiering, so that you don’t retrieve per second resolution for a week, if your node can’t handle it. Streaming metrics to a more powerful parent is also highly recommended.