Host shows "off" but graphs are drawn

ServerUser · September 12, 2022, 3:07pm

Hi.

I have a Netdata server configured to collect data from other hosts via streaming. One day, one of my hosts started showing as “off”, but when I open it, the graphs are drawn correctly. The agent on that host is running. I have restarted both the agent and the server, and the problem stays the same. Both the host and the server run Netdata version 1.36.1.

Thanks for the help

Christopher_Akritid1 · September 12, 2022, 4:04pm

This looks like a bug. Can you restart Netdata on host Gieta again? It looks like the cloud wasn’t properly updated with its live state after the recent restart. If it’s a one time thing we can perhaps ignore it, but if you see the issue persisting after the additional restart, we’ll need to investigate further.

ServerUser · September 12, 2022, 6:56pm

Interesting problem. If I turn off the agent on the Gieta host, the “off” badge in the panel appears on another host that has nothing to do with Gieta, and the graphs for Gieta stop being drawn. The other host, which now shows “off”, still has correctly drawn graphs.

If I turn the agent on the Gieta host back on, the other host jumps to “on” and the graphs for Gieta start being drawn again.

The Gieta host itself shows “off” all the time, and its graphs are drawn correctly after restarting the agent on this host.

Christopher_Akritid1 · September 13, 2022, 4:44pm

Scratching my head a bit here, but:

Did you by any chance clone a VM or otherwise copy entire directories between the hosts at any time? Netdata keeps some identifying info on the filesystem, so things can get a bit screwed up this way, sometimes with only one of the two allowed to connect at the same time. Specifically, under /var/lib/netdata, you’ll find a cloud.d directory and a registry directory. The contents of these two should be different for each agent. If you see either of them being identical between the two hosts, report it here and we’ll let you know how to clean it up.
You can see the hostname each agent thinks it has by calling http://localhost:19999/api/v1/info. It’s also visible in http://localhost:19999/netdata.conf. I suggest you call one of these endpoints on both hosts and see what they say.
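If it helps, here is a rough Python sketch of those two checks. It is only an illustration, not an official Netdata tool: the helper names (`fingerprint_dir`, `shared_files`, `fetch_info`) are made up for this post, and the script assumes you can read the files under /var/lib/netdata (this may need root). Run `fingerprint_dir` on each host for the cloud.d and registry directories, save the result, and compare the two. A file with the same hash on both hosts is a hint that something was cloned, not proof.

```python
import hashlib
import json
import urllib.request
from pathlib import Path


def fingerprint_dir(root):
    """Return {relative_path: sha256 hex digest} for every regular file under root."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = str(path.relative_to(root))
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def shared_files(a, b):
    """Relative paths that exist in both fingerprints with identical content."""
    return sorted(p for p in a if b.get(p) == a[p])


def fetch_info(base_url="http://localhost:19999"):
    """Fetch and decode the agent's /api/v1/info response (URL taken from the post above)."""
    with urllib.request.urlopen(base_url + "/api/v1/info", timeout=5) as resp:
        return json.load(resp)


# Example usage, run separately on each host (paths are the ones named above):
#   fp = fingerprint_dir("/var/lib/netdata/registry")
#   print(json.dumps(fp, indent=2))
# Then, with both results loaded as dicts on one machine:
#   print(shared_files(fp_host_a, fp_host_b))
# And to see what the agent reports about itself:
#   print(json.dumps(fetch_info(), indent=2))
```

The hashing is there so you can compare the directories without copying their contents between machines or pasting them into the forum.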

Forza · September 15, 2022, 4:21pm

Hi,

I have the same issue. Initially I thought it was due to corruption in the dbengine and I removed old files and restarted the parent and nodes, but the issue has come back.

Christopher_Akritid1 · September 19, 2022, 12:27pm

This problem was brought up again in Discord and the bug was identified:

From @Jacek_Kolasa :

We’ve found the issue: it was a bug on the Dashboard UI side. We were showing the wrong badges when the Agent had archived hosts. We’ll fix that and let you know when the fix is in the nightlies. Thanks a lot for submitting it here!

Forza · September 19, 2022, 3:30pm

Great! Thank you for letting us know

ServerUser · September 19, 2022, 6:13pm

Hi.

Sorry for disappearing, but unfortunately I had no time. Do you need any additional information from me, or do you have everything debugged?

Manolis_Vasilakis · September 20, 2022, 10:12am

Hi @ServerUser , I think we have what we need, thanks!

Ankit_Khandelwal · January 23, 2023, 6:18am

Is this bug fixed?
If it is fixed, in which release?
If it is not fixed, I can try fixing it, provided it is in the open source version.
