Problem/Question:
We were able to “successfully” install a Netdata sensor on our vSphere vCenter node, but it is not reporting in to the Netdata Cloud platform. We have created the vsphere.conf file at /opt/netdata/etc/netdata/go.d/vsphere.conf and the go.d.conf file at /opt/netdata/etc/netdata/go.d.conf, and have verified through the go.d debugger that the collector is operational and pulling back the related data. systemctl status netdata reports:
● netdata.service - Real time performance monitoring
Loaded: loaded (/lib/systemd/system/netdata.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-09-28 03:30:55 UTC; 15h ago
Process: 14161 ExecStartPre=/bin/chown -R netdata:netdata /run/netdata (code=exited, status=0/SUCCESS)
Process: 14160 ExecStartPre=/bin/mkdir -p /run/netdata (code=exited, status=0/SUCCESS)
Process: 14158 ExecStartPre=/bin/chown -R netdata:netdata /opt/netdata/var/cache/netdata (code=exited, status=0/SUCCESS)
Process: 14156 ExecStartPre=/bin/mkdir -p /opt/netdata/var/cache/netdata (code=exited, status=0/SUCCESS)
Main PID: 14164 (netdata)
Tasks: 69 (limit: 19660)
Memory: 272.5M
CGroup: /system.slice/netdata.service
├─14164 /opt/netdata/bin/srv/netdata -P /run/netdata/netdata.pid -D
├─14166 /opt/netdata/bin/srv/netdata --special-spawn-server
├─14571 /opt/netdata/usr/libexec/netdata/plugins.d/apps.plugin 1
├─14578 /usr/bin/python3 /opt/netdata/usr/libexec/netdata/plugins.d/python.d.plugin 1
├─14582 /opt/netdata/usr/libexec/netdata/plugins.d/go.d.plugin 1
└─74354 bash /opt/netdata/usr/libexec/netdata/plugins.d/tc-qos-helper.sh 1
Sep 28 03:30:57 <URL REDACTED> [14561]: Cannot read process groups configuration file '/opt/netdata/etc/netdata/apps_groups.conf'. Will try '/opt/netdata/usr/lib/netdata/conf.d/apps_groups.conf'
Sep 28 03:31:42 <URL REDACTED> sendmail[15497]: NOQUEUE: SYSERR(netdata): can not chdir(/var/spool/clientmqueue/): Permission denied
Sep 28 03:32:08 <URL REDACTED> sendmail[15874]: NOQUEUE: SYSERR(netdata): can not chdir(/var/spool/clientmqueue/): Permission denied
Sep 28 04:20:49 <URL REDACTED> [53415]: Does not have a configuration file inside `/opt/netdata/etc/netdata/ebpf.d.conf. It will try to load stock file.
Sep 28 04:20:49 <URL REDACTED> [53415]: Your environment does not have BTF file /sys/kernel/btf//vmlinux. The plugin will work with 'legacy' code.
Sep 28 04:20:49 <URL REDACTED> [53415]: Name resolution is disabled, collector will not parser "hostnames" list.
Sep 28 04:20:49 <URL REDACTED> [53415]: The network value of CIDR 127.0.0.1/8 was updated for 127.0.0.0 .
Sep 28 04:20:49 <URL REDACTED> [53415]: Cannot read process groups configuration file '/opt/netdata/etc/netdata/apps_groups.conf'. Will try '/opt/netdata/usr/lib/netdata/conf.d/apps_groups.conf'
Sep 28 04:20:49 <URL REDACTED> [53415]: PROCFILE: Cannot open file '/proc/14561/status'
Sep 28 04:20:49 <URL REDACTED> [53415]: Cannot open /proc/14561/status
The /opt/netdata/var/log/netdata/error.log tail shows:
2022-09-28 18:32:50: netdata INFO : ACLK_Main : Attempting connection now
2022-09-28 18:32:50: netdata ERROR : ACLK_Main : Cert Chain verify error:num=20:unable to get local issuer certificate:depth=2:/C=US/O=Internet Security Research Group/CN=ISRG Root X1 (errno 2, No such file or directory)
2022-09-28 18:32:50: netdata ERROR : ACLK_Main : SSL_write Err: SSL_ERROR_SSL
2022-09-28 18:32:50: netdata ERROR : ACLK_Main : Couldn't write HTTP request header into SSL connection (errno 22, Invalid argument)
2022-09-28 18:32:50: netdata ERROR : ACLK_Main : Couldn't process request
2022-09-28 18:32:50: netdata ERROR : ACLK_Main : Error trying to contact env endpoint (errno 22, Invalid argument)
2022-09-28 18:32:50: netdata ERROR : ACLK_Main : Failed to Get ACLK environment
2022-09-28 18:32:50: netdata INFO : ACLK_Main : Wait before attempting to reconnect in 0.549 seconds
2022-09-28 18:32:51: netdata INFO : ACLK_Main : Attempting connection now
2022-09-28 18:32:51: netdata LOG FLOOD PROTECTION too many logs (201 logs in 16 seconds, threshold is set to 200 logs in 1200 seconds). Preventing more logs from process 'netdata' for 1184 seconds.
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : found 1 dcs, 123 folders, 1 clusters (0 dummy), 5 hosts, 1127 vms, process took 350.603896ms
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : building : removed 922 vms (not powered on)
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : building : built 1/1 dcs, 123/123 folders, 1/1 clusters, 5/5 hosts, 205/1127 vms, process took 645.283µs
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : hierarchy : set 1/1 clusters, 5/5 hosts, 205/205 vms, process took 64.816µs
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : filtering : filtered 0/5 hosts, 0/205 vms, process took 16.207µs
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : metric lists : collected metric lists for 5/5 hosts, 205/205 vms, process took 43.293µs
2022-09-28 18:36:00: go.d INFO: vsphere[<URL REDACTED>] discovering : discovered 5/5 hosts, 205/1127 vms, the whole process took 351.56398ms
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : found 1 dcs, 123 folders, 1 clusters (0 dummy), 5 hosts, 1127 vms, process took 212.485425ms
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : building : removed 922 vms (not powered on)
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : building : built 1/1 dcs, 123/123 folders, 1/1 clusters, 5/5 hosts, 205/1127 vms, process took 444.154µs
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : hierarchy : set 1/1 clusters, 5/5 hosts, 205/205 vms, process took 55.846µs
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : filtering : filtered 0/5 hosts, 0/205 vms, process took 12.62µs
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : metric lists : collected metric lists for 5/5 hosts, 205/205 vms, process took 49.414µs
2022-09-28 18:41:00: go.d INFO: vsphere[<URL REDACTED>] discovering : discovered 5/5 hosts, 205/1127 vms, the whole process took 213.228886ms
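Because the ACLK errors are quickly buried under the go.d discovery lines (and then suppressed by the log flood protection), a small script can pull just the certificate failures out of error.log. The following is a minimal sketch, not part of Netdata: the function name `summarize_cert_errors` and the regular expression are our own assumptions, inferred from the single "Cert Chain verify error" line shown above.

```python
import re
from collections import Counter

# Layout inferred from the one sample line in the log above (an assumption):
#   ... ACLK_Main : Cert Chain verify error:num=<N>:<reason>:depth=<D>:<subject> (errno ...)
CERT_ERR = re.compile(
    r"ACLK_Main : Cert Chain verify error:num=(?P<num>\d+):"
    r"(?P<reason>[^:]+):depth=(?P<depth>\d+):(?P<subject>.+?)(?: \(errno|$)"
)


def summarize_cert_errors(lines):
    """Count distinct (num, reason, depth, subject) certificate failures.

    `lines` is any iterable of log lines, e.g. an open file object.
    Lines that do not match the pattern are ignored.
    """
    counts = Counter()
    for line in lines:
        match = CERT_ERR.search(line)
        if match:
            key = (
                int(match["num"]),
                match["reason"],
                int(match["depth"]),
                match["subject"].strip(),
            )
            counts[key] += 1
    return counts
```

Calling it with an open handle on /opt/netdata/var/log/netdata/error.log would return one counter entry per distinct failure. For the log above that should be the single num=20 error ("unable to get local issuer certificate") for ISRG Root X1 at depth 2, which suggests the agent cannot find that root certificate in the trust store it is using.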
Relevant docs you followed/actions you took to solve the issue
There are other links as well, but since I’m a new user I’m limited to 5.
Environment/Browser/Agent’s version etc
netdata -W buildinfo shows:
Version: netdata v1.36.0-154-g373c97d3b
Configure options: '--prefix=/opt/netdata/usr' '--sysconfdir=/opt/netdata/etc' '--localstatedir=/opt/netdata/var' '--libexecdir=/opt/netdata/usr/libexec' '--libdir=/opt/netdata/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--enable-cloud' '--without-bundled-protobuf' '--disable-dependency-tracking' 'CFLAGS=-static -O2 -I/openssl-static/include -pipe' 'LDFLAGS=-static -L/openssl-static/lib' 'PKG_CONFIG_PATH=/openssl-static/lib/pkgconfig'
Install type: kickstart-static
Binary architecture: x86_64
Features:
dbengine: YES
Native HTTPS: YES
Netdata Cloud: YES
ACLK: YES
TLS Host Verification: YES
Machine Learning: YES
Stream Compression: YES
Libraries:
protobuf: YES (system)
jemalloc: NO
JSON-C: YES
libcap: NO
libcrypto: YES
libm: YES
tcalloc: NO
zlib: YES
Plugins:
apps: YES
cgroup Network Tracking: YES
CUPS: NO
EBPF: YES
IPMI: NO
NFACCT: NO
perf: YES
slabinfo: YES
Xen: NO
Xen VBD Error Tracking: NO
Exporters:
AWS Kinesis: NO
GCP PubSub: NO
MongoDB: NO
Prometheus Remote Write: YES
What I expected to happen
We expected the installer either to produce a working installation when it reports success, or to surface an error for this issue. We are hoping to get this fixed, as tracking vSphere / vCenter host metrics is a major objective for our team.