It seems that whenever I disconnect from an OpenVPN connection (via systemd unit), the Netdata agent stops showing up as alive in the cloud dashboard. The local dashboard works fine.
The local dashboard also tells me that the agent is not currently connected to the cloud.
Interestingly, reconnecting the VPN does not resolve the issue but a restart of the Netdata service does.
The error.log does not show any activity when the issue occurs either; perhaps something important has fallen over?
When the issue happened, I used netcat to check that port 443 on “api.netdata.cloud” and “mqtt.netdata.cloud” was reachable, and both came back as live.
(The issue reproduces every time, BTW.)
EDIT for update: I ran metric correlation against the time frame of an issue and saw something interesting. Right before the service restart I see IPv6 bandwidth being used and not IPv4, but right at and after the issue I see it has switched to IPv4 only.
It could be a red herring, but perhaps the agent is not handling a switch between IP protocol versions at run time?
EDIT 2: I have repro’d the same behaviour when switching from non-VPN to a VPN connection
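Since the netcat check above does not say which IP version was used, here is a minimal sketch of the same reachability test run separately over IPv4 and IPv6. The hostnames and port 443 come from this post; the `probe` helper, its parameters, and the 3-second timeout are my own assumptions for illustration and have nothing to do with how the agent itself connects.

```python
import socket

# Hostnames taken from the post; port 443 is what was probed with netcat.
HOSTS = ("api.netdata.cloud", "mqtt.netdata.cloud")


def probe(host, port=443, family=socket.AF_INET, timeout=3.0):
    """Try a TCP connect to every address of `host` in one address family.

    Returns a list of (address, ok, detail) tuples. An empty list means the
    name did not resolve in that family. (Hypothetical helper, not part of
    Netdata.)
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    results = []
    for fam, socktype, proto, _canon, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
            results.append((sockaddr[0], True, "connected"))
        except OSError as exc:
            # Covers refused connections, timeouts, and unreachable networks.
            results.append((sockaddr[0], False, str(exc)))
    return results


if __name__ == "__main__":
    families = (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6))
    for host in HOSTS:
        for label, family in families:
            outcome = probe(host, family=family)
            if not outcome:
                print(f"{host} {label}: no address resolved")
            for addr, ok, detail in outcome:
                status = "OK" if ok else "FAIL"
                print(f"{host} {label} {addr}: {status} ({detail})")
```

Running this once with the VPN up and once after disconnecting should show whether one of the two address families stops being reachable, which would help confirm or rule out the IPv4/IPv6 theory.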
Ubuntu 22.04.1 LTS (arm64)
Edge Version 108.0.1462.46 (Official build) (64-bit)
Version: netdata v1.37.0-40-nightly
Configure options: '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--libexecdir=/usr/libexec' '--libdir=/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--with-bundled-protobuf' 'CFLAGS=-O2 -pipe' 'LDFLAGS='
Install type: kickstart-build
Native HTTPS: YES
Netdata Cloud: YES
TLS Host Verification: YES
Machine Learning: YES
Stream Compression: YES
protobuf: YES (bundled)
cgroup Network Tracking: YES
Xen VBD Error Tracking: NO
AWS Kinesis: NO
GCP PubSub: NO
Prometheus Remote Write: NO
Trace Allocations: NO
What do you need from me to troubleshoot this?
I’m using the standard systemd unit file for OpenVPN on Ubuntu (I can provide it if needed).