Monitoring Netdata itself.

Hello,

I have a general question about monitoring what Netdata monitors. I’ve been using Netdata for a while now to monitor around 40 servers for my company. From time to time, after a Netdata update or some other change on the system, I “lose” some monitoring data / graphs for various reasons, without noticing that the data is missing. For example:

  • Losing the Apache data because Netdata (or the system, not sure why) decided to use IPv6 as the default configuration for the plugin, without success. So the Apache data wasn’t available anymore. Of course, the moment I realize this data is missing is when I need it :)

  • After a Netdata update, MySQL monitoring stopped running because the plugin changed from python.d to go.d or something like that.

  • After a Netdata update, a custom plugin stopped working because a Python module was missing.

So my question is the following: what would be the best way to monitor which monitoring is enabled or not, via a Netdata plugin or externally (like a cron job running on a regular basis)? Which Netdata API could I use to write such a check?

Thanks for your feedback

DeWaRs.

Hi.

  • After a Netdata update, MySQL monitoring stopped running because the plugin changed from python.d to go.d or something like that.

We removed the MySQL Python collector in version 1.35.0 - go.d/mysql is the only MySQL collector we support. The problem is that we never explicitly disabled it in python.d.conf. This is not a problem for new installations, only for older Netdata installations. I will fix that. If you are using the stable version of Netdata, just add mysql: no to python.d.conf.


  • After a Netdata update, a custom plugin stopped working because a Python module was missing.

This I don’t understand. Which module? What do you mean by missing?


Losing the Apache data because Netdata (or the system, not sure why) decided to use IPv6 as the default configuration for the plugin

go.d.plugin discovers applications running on the host by executing the local-listeners binary, which reads the /proc/net/{tcp,tcp6,udp,udp6} files. This binary is located in the same directory as all the plugins.

Example from my host:

local-listeners output
# ./local-listeners no-udp6 no-local no-inbound no-outbound no-namespaces
TCP|0.0.0.0|22|sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
TCP6|::|22|sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
TCP6|::|19998|/usr/sbin/docker-proxy -proto tcp -host-ip :: -host-port 19998 -container-ip 172.17.0.8 -container-port 19999
UDP|127.0.0.1|8125|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
TCP6|::1|8125|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
TCP|127.0.0.1|8125|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
TCP|0.0.0.0|19998|/usr/sbin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 19998 -container-ip 172.17.0.8 -container-port 19999
TCP|0.0.0.0|19999|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
TCP6|::|19999|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
UDP|172.17.0.1|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|0.0.0.0|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|127.0.0.1|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|10.10.10.20|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|10.1.1.1|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
TCP|127.0.0.1|25|/usr/lib/postfix/sbin/master -w
TCP6|::1|25|/usr/lib/postfix/sbin/master -w
UDP|127.0.0.1|161|/usr/sbin/snmpd -LOw -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f
TCP|127.0.0.1|38373|/usr/bin/containerd
TCP6|::|3000|grafana server --homepath=/usr/share/grafana --config=/etc/grafana/grafana.ini --packaging=docker cfg:default.log.mode=console cfg:default.paths.data=/var/lib/grafana cfg:default.paths.logs=/var/log/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning
TCP6|::|9090|/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/prometheus --web.console.libraries=/usr/share/prometheus/console_libraries --web.console.templates=/usr/share/prometheus/consoles
TCP|127.0.0.1|11332|rspamd: main process; 0.1 msg/sec, 0.0 msg/sec spam, 0.1 msg/sec ham; 0.00s avg processing time
TCP6|::1|11332|rspamd: main process; 0.1 msg/sec, 0.0 msg/sec spam, 0.1 msg/sec ham; 0.00s avg processing time
TCP|127.0.0.1|11334|rspamd: main process; 0.1 msg/sec, 0.0 msg/sec spam, 0.1 msg/sec ham; 0.00s avg processing time
TCP6|::1|11334|rspamd: main process; 0.1 msg/sec, 0.0 msg/sec spam, 0.1 msg/sec ham; 0.00s avg processing time
TCP|127.0.0.1|11333|rspamd: main process; 0.1 msg/sec, 0.0 msg/sec spam, 0.1 msg/sec ham; 0.00s avg processing time
TCP6|::1|11333|rspamd: main process; 0.1 msg/sec, 0.0 msg/sec spam, 0.1 msg/sec ham; 0.00s avg processing time
TCP6|::|8123|/usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemon
TCP6|::|9000|/usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemon
TCP6|::|9004|/usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemon
TCP6|::|9005|/usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemon
TCP6|::|9363|/usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemon
TCP|127.0.0.1|6379|/usr/bin/redis-server 127.0.0.1:6379
TCP6|::1|6379|/usr/bin/redis-server 127.0.0.1:6379
TCP6|::|9009|/usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid --daemo

go.d.plugin analyzes the output and identifies applications based on the well-known port and/or binary name (config).

So, if the app listens on TCP6 - yes, it creates a data collection job with an IPv6 address. If the app listens on both v4 and v6 - it prefers v4. I think the issue with your Apache is that it (according to the Linux kernel) doesn’t listen on a v4 socket, only on a v6 one.

Yep,

root@pve-deb-work:/opt/netdata/usr/libexec/netdata/plugins.d# ss -ntlp | grep apache2
LISTEN 0      511                *:80               *:*    users:(("apache2",pid=2361104,fd=4),("apache2",pid=2361103,fd=4),("apache2",pid=2361102,fd=4))

root@pve-deb-work:/opt/netdata/usr/libexec/netdata/plugins.d# ./local-listeners | grep apache2
TCP6|::|80|/usr/sbin/apache2 -k start

I will think about what we can do about it.
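
To illustrate the v4-over-v6 preference described above, here is a minimal Python sketch that parses local-listeners style lines (PROTO|ADDRESS|PORT|CMDLINE) and picks one address per listener. It is only an illustration of the idea, not go.d.plugin's actual code: the function names parse_listeners and pick_targets are made up for this example, and wrapping IPv6 addresses in brackets is an assumption about how a URL would be built.

```python
from collections import defaultdict


def parse_listeners(output):
    """Parse 'PROTO|ADDRESS|PORT|CMDLINE' lines into tuples."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the shell prompt line starting with '#'.
        if not line or line.startswith("#"):
            continue
        # The command line is the last field; split at most 3 times
        # so a '|' inside it is preserved.
        proto, address, port, cmdline = line.split("|", 3)
        entries.append((proto, address, int(port), cmdline))
    return entries


def pick_targets(entries):
    """Choose one address per (port, cmdline) TCP listener, preferring IPv4."""
    by_app = defaultdict(dict)
    for proto, address, port, cmdline in entries:
        if proto in ("TCP", "TCP6"):
            # Keep the first address seen for each protocol.
            by_app[(port, cmdline)].setdefault(proto, address)

    targets = {}
    for key, protos in by_app.items():
        if "TCP" in protos:
            targets[key] = protos["TCP"]
        else:
            # Hypothetical formatting: bracket the IPv6 address for use in a URL.
            targets[key] = "[" + protos["TCP6"] + "]"
    return targets


if __name__ == "__main__":
    sample = (
        "TCP|0.0.0.0|22|sshd\n"
        "TCP6|::|22|sshd\n"
        "TCP6|::|80|/usr/sbin/apache2 -k start\n"
    )
    for (port, cmdline), address in pick_targets(parse_listeners(sample)).items():
        print(port, address, cmdline)
```

With the Apache line from the output above, only a TCP6 entry exists for port 80, so the sketch ends up with an IPv6 target, which matches the behaviour described.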

Hello ilyam8,

Thanks for your answer. I think you misunderstood my point. I’m aware of why things stopped working in most cases, and I found workarounds to fix them. I’m looking for a way to be informed when a graph / some data disappears for any reason, so I can fix the issue before I need the data and notice it has disappeared.

I’m thinking about comparing the results of some Netdata API call from time to time, but I’m not sure which one to look at.

Regards

DeWaRs
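
One way the comparison described above could be sketched: the agent's /api/v1/charts endpoint returns a JSON document whose "charts" object is keyed by chart ID, so a cron job can compare the current set of chart IDs against a saved baseline and report what has disappeared. This is a rough sketch under assumptions, not an official recommendation: the agent address (localhost on port 19999), the baseline file format (a JSON list of chart IDs), and the function names are all hypothetical choices for this example.

```python
import json
import urllib.request

# Assumption: the agent is reachable locally on the default port.
NETDATA_URL = "http://localhost:19999"


def fetch_chart_ids(base_url=NETDATA_URL, timeout=10):
    """Return the set of chart IDs currently exposed by the agent."""
    with urllib.request.urlopen(base_url + "/api/v1/charts", timeout=timeout) as resp:
        payload = json.load(resp)
    # The "charts" object maps chart ID -> chart metadata.
    return set(payload.get("charts", {}))


def missing_charts(baseline, current):
    """Return the chart IDs present in the baseline but absent now, sorted."""
    return sorted(set(baseline) - set(current))


def check(baseline_path, base_url=NETDATA_URL):
    """Compare a saved baseline (a JSON list of chart IDs) with the live agent."""
    with open(baseline_path) as fh:
        baseline = json.load(fh)
    return missing_charts(baseline, fetch_chart_ids(base_url))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_charts.py /path/to/baseline.json
    gone = check(sys.argv[1])
    for chart_id in gone:
        print("missing chart:", chart_id)
    # A non-zero exit code lets cron or a wrapper script raise an alert.
    sys.exit(1 if gone else 0)
```

The baseline should probably list only the charts that really matter (for example the Apache and MySQL ones), since chart IDs tied to containers, disks or network interfaces can come and go legitimately.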