Problem/Question
About 60 servers in my Netdata Cloud keep flickering between an “Unknown Error” state and having no issue. Clicking run correlations doesn’t show anything specific. When I log into the servers and check the logs in /var/log/netdata, “error.log” is empty. Literally nothing. The rotated logs don’t contain anything from today either. As far as I can tell, the netdata daemon on these servers is running fine.
Environment/Browser
Tested with a completely updated Google Chrome & Brave.
What I expected to happen
To be able to track down why a large group of my servers are constantly flickering from “Unknown Error” to totally fine, despite nothing being left behind in error.log.
Hi @Mbrantley,
Thank you for posting the problem, and sorry for the delayed response from our side. Is the reported problem still occurring?
Kind regards,
Christos
No worries, it’s not like I’m paying for anything.
It’s still happening. Right now I have many, many unknown errors, and the exact number changes constantly.
Unfortunately, I cannot reproduce this issue. It would help if you could use the browser developer tools to see whether any API calls have failed and what error is returned.
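If it helps, here is a minimal sketch of one way to do that check: export the Network tab of the developer tools as a HAR file while the “Unknown Error” state is showing, then list the requests that failed. The function names `load_har` and `failed_requests` are made up for this example, and it only assumes the standard HAR layout (`log.entries[].request` and `log.entries[].response.status`).

```python
import json


def load_har(path):
    """Read a HAR file exported from the browser developer tools."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def failed_requests(har, min_status=400):
    """Return (status, method, url) for every HAR entry that looks failed.

    An entry counts as failed when its status is >= min_status, or when it
    is 0, which browsers typically record if no response arrived at all
    (blocked, aborted, or timed-out requests).
    """
    failures = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        status = entry.get("response", {}).get("status", 0)
        if status == 0 or status >= min_status:
            failures.append(
                (status, request.get("method", "?"), request.get("url", "?"))
            )
    return failures
```

For example, `failed_requests(load_har("cloud.har"))` would return the failing calls from a file saved as `cloud.har` (a placeholder name), which can then be pasted into the thread.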
Alternatively, feel free to drop me a message on our Discord server so that we can look into this faster.
Also, have you tried restarting any of the agents?
Thank you for your help and support, @Mbrantley! After the discussion on Discord, we have narrowed this down: it looks like the agent returns a 404 error when the cloud tries to connect to it to fetch the chart data to display.
We’ll check this further from both the cloud and the agent side to verify whether everything is working as expected. Thank you also for sharing that there are some firewalls and network configurations on the connections to these agents. These could also be what is causing the problems.
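For anyone who wants to check the agent side in the meantime, below is a rough sketch that asks an agent's own web API for a couple of endpoints and reports the HTTP status of each, so a 404 stands out. It assumes the agent's local dashboard is reachable on the default port 19999 and uses `/api/v1/info` and `/api/v1/charts` as example paths. The function names are made up for this example. It only tests the agent's local API, not the route the cloud itself uses, so a clean result here does not rule out a firewall or network issue.

```python
import urllib.error
import urllib.request

# Assumed example endpoints; adjust to whatever request is failing.
DEFAULT_PATHS = ["/api/v1/info", "/api/v1/charts"]


def http_status(url, timeout=5):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def probe_agent(base_url, paths=DEFAULT_PATHS, fetch=http_status):
    """Map each path to the status code returned by the agent at base_url."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) for path in paths}
```

Running `probe_agent("http://localhost:19999")` on one of the affected servers returns a dictionary of path to status code, with `None` meaning the agent could not be reached at all.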