I'm receiving a storm of reachability alerts

lordpengwin · March 24, 2021, 12:12pm

This morning I started getting alerts for every node on my network saying that it is unreachable; a minute or two later each one is reachable again. Since this affects every node, I would assume that my internet connection is having problems, but I see no sign of this. I can go to any website, including netdata.cloud, I can ping sites, internet speed tests look good, and so on.

I tried restarting Netdata on all my nodes and noticed that it had been running for a month, almost to the hour. I don’t know if this is related.

Is there any way to debug why Netdata is reporting unreachability, or to tune the parameters it uses to decide?

ilyam8 · March 24, 2021, 12:21pm

Hi @lordpengwin

Give us some details about your setup. You created the issue in Netdata Agent Support, so I assume you don’t use Netdata Cloud?

Or do you use Netdata Cloud and get those notifications from it?

lordpengwin · March 24, 2021, 12:24pm

Yes, sorry, I use Netdata Cloud. I have a variety of nodes registered: a couple of Raspberry Pis, an Intel Linux VM, and a Docker container running on my NAS. All of them started reporting reachability problems around 7am EDT this morning. I get the alerts via email.

OdysLam · March 24, 2021, 12:26pm

I am so sorry for this flood of emails. We are having intermittent issues with our Netdata Cloud backend, which loses connection with the Netdata Agents and thus believes that they are out of reach.

You can disable the reachability alarms in the Netdata Cloud dashboard.

Rest assured that your setup is OK. We are working to resolve these issues once and for all.

lordpengwin · March 24, 2021, 12:28pm

Great! Thanks for your quick response. Is there a Netdata Cloud status page?

OdysLam · March 24, 2021, 12:30pm

We have https://status.netdata.cloud, but this isn’t logged there because it’s not an incident. It’s a known bug that we are working to root-cause and fix.

In fact, we are reworking a lot of our backend, so we expect overall stability and refinement to improve in the coming months.

Finally, I just got a notice that we did some restarts of our backend services, so this might be related to that too.

OdysLam · March 24, 2021, 12:48pm

For anyone reading this, we have integrated our status page into this very forum, so in case of an incident, a popup will appear that will inform you about it.

So, no worries: by visiting our community, you can also be sure that no incident is underway without your knowing about it.
