We are at our last resort: none of the Netdata instances in our company can connect to app.netdata.cloud. I looked at your troubleshooting guide and first checked whether my IP is blocked, but I get “Banned By Cloud: No”. I asked IT to show me the firewall log, and I can see that the SSL session is ended because of tcp-rst-from-server. We cannot see any WebSocket connection. What can we look into?
Thank you in advance!
To help you figure out what is going on, I need a bit more information. Can you paste the lines from /var/log/netdata/error.log that mention ACLK? Also, can you provide details on the version of the Agent, the contents of /etc/lsb-release, and the output of ls -al /etc/ssl/certs | grep ISRG?
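If it helps, the requests above can be gathered in one go. This is only a rough sketch: collect_aclk_info is a made-up helper name, the default paths are the ones mentioned above, and I am assuming netdata -v prints the Agent version on your build.

```shell
#!/bin/sh
# Gather the details requested above into one report.
# collect_aclk_info is a hypothetical helper name; both arguments are optional.
collect_aclk_info() {
    log_file="${1:-/var/log/netdata/error.log}"
    cert_dir="${2:-/etc/ssl/certs}"

    echo "== ACLK lines from $log_file =="
    grep ACLK "$log_file" 2>/dev/null || echo "(no ACLK lines found or log not readable)"

    echo "== Agent version =="
    if command -v netdata >/dev/null 2>&1; then
        # Assumption: -v prints the version and exits.
        netdata -v
    else
        echo "(netdata binary not found in PATH)"
    fi

    echo "== /etc/lsb-release =="
    cat /etc/lsb-release 2>/dev/null || echo "(file not present)"

    echo "== ISRG certificates in $cert_dir =="
    ls -al "$cert_dir" 2>/dev/null | grep ISRG || echo "(none found)"
}
```

Running collect_aclk_info with no arguments uses the default paths; you can pass a different log file and certificate directory as the first and second argument.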
Thank you ralphm!
This is our error log with ACLK
# less /var/log/netdata/error.log | grep ACLK
2023-10-09 14:17:11: netdata INFO : MAIN : EXIT: Stopping main thread: ACLK_Main
2023-10-09 14:17:12: netdata INFO : ACLK_Main : thread with task id 5638 finished
2023-10-09 14:51:45: netdata INFO : MAIN : ACLK sync initialization completed
2023-10-09 14:51:45: netdata INFO : ACLKSYNC : Starting ACLK synchronization thread
2023-10-09 14:51:45: netdata INFO : ACLK_MAIN : thread created with task id 113883
2023-10-09 14:51:45: netdata INFO : ACLK_MAIN : set name of thread 113883 to ACLK_MAIN
2023-10-09 14:51:45: netdata INFO : ACLK_MAIN : Waiting for Cloud to be enabled
2023-10-11 15:52:03: netdata INFO : MAIN : ACLK sync initialization completed
2023-10-11 15:52:03: netdata INFO : ACLKSYNC : Starting ACLK synchronization thread
2023-10-11 15:52:03: netdata INFO : ACLK_MAIN : thread created with task id 116785
2023-10-11 15:52:03: netdata INFO : ACLK_MAIN : set name of thread 116785 to ACLK_MAIN
2023-10-11 15:52:03: netdata INFO : ACLK_MAIN : Waiting for Cloud to be enabled
We also don’t have any ISRG certificates, so the output of that command is empty.
This is the version of Netdata that I am testing on
But we do have more than 100 instances that show the same behavior. Even web access to https://app.netdata.cloud gives you an empty white page.
My suspicion is that your operating system doesn’t have the latest root certificates to validate our Let’s Encrypt certificate. Your mention that there are no ISRG certificates seems to confirm that. This happens on (older) platforms that didn’t update their certificates before the DST Root CA X3 certificate expired in 2021. You can find an overview of certificate compatibility, along with the affected platforms, here. Unfortunately, you didn’t provide the contents of /etc/lsb-release to confirm.
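One way to test this suspicion directly is to attempt a TLS handshake from an affected machine and look at the verification result. A minimal sketch, assuming openssl is installed and that /etc/ssl/certs is your trust store; verify_line and check_cloud_tls are made-up helper names:

```shell
#!/bin/sh
# Print the first "Verify return code" line from `openssl s_client` output on stdin.
# verify_line and check_cloud_tls are hypothetical helper names.
verify_line() {
    awk '/Verify return code/ && !seen { sub(/^[ \t]+/, ""); print; seen = 1 }'
}

# Attempt a TLS handshake and report how certificate verification went.
# Assumption: the trust store lives in /etc/ssl/certs unless a second argument is given.
check_cloud_tls() {
    host="${1:-app.netdata.cloud}"
    ca_path="${2:-/etc/ssl/certs}"
    result=$(openssl s_client -connect "$host:443" -servername "$host" \
        -CApath "$ca_path" </dev/null 2>&1 | verify_line)
    if [ -n "$result" ]; then
        echo "$result"
    else
        echo "no verify result: the connection was probably closed before the handshake completed"
    fi
}
```

Calling check_cloud_tls should print "Verify return code: 0 (ok)" when the chain validates. A non-zero code such as 10 (certificate has expired) or 20 (unable to get local issuer certificate) points at the trust store. No verify line at all would instead suggest the connection is being cut before the handshake finishes, which would fit the tcp-rst-from-server entry in the firewall log.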
If your platform is indeed affected, then I strongly suggest you upgrade. If there are vendor-provided updates for just the certificates, then applying those would be a good first step. For Debian-based systems you can do this by running sudo apt install ca-certificates. If there isn’t such an update, your systems may be vulnerable, because other security updates weren’t issued either.
Hi @kiby, did you work out the issue, or can I help you debug this further?
Hi @ralphm, we still haven’t solved the issue, and our IT side doesn’t seem to have enough bandwidth to help with troubleshooting. We had a user who travelled to an office in a different location, and it worked there, so I do think it is an issue with our firewall. But that doesn’t help us. I will write down the solution when we get to it. Meanwhile, thank you very much @ralphm for your help!