Hello,
I have been using netdata v1.31.1 for a couple of months now, and most things work well apart from the fping plugin, which fails randomly.
On my netdata master, I have set up the plugin several times, as described in the documentation, for different hosts:
[root@monitoring-apps [DEV] ~]# ls -altr /opt/netdata/netdata-configs/fping*
-rw-r--r--. 1 netdata netdata 1233 Jul 9 11:27 /opt/netdata/netdata-configs/fping.conf
-rw-r--r--. 1 netdata netdata 1434 Sep 14 12:59 /opt/netdata/netdata-configs/fpingtalendrcc.conf
-rw-r--r--. 1 netdata netdata 1293 Sep 14 13:00 /opt/netdata/netdata-configs/fpingalerta.conf
-rw-r--r--. 1 netdata netdata 1299 Sep 14 13:00 /opt/netdata/netdata-configs/fpingattunityrcc.conf
-rw-r--r--. 1 netdata netdata 1277 Sep 14 13:00 /opt/netdata/netdata-configs/fpingbackupdns.conf
-rw-r--r--. 1 netdata netdata 1770 Sep 14 13:00 /opt/netdata/netdata-configs/fpingconfluentrcc.conf
-rw-r--r--. 1 netdata netdata 1302 Sep 14 13:00 /opt/netdata/netdata-configs/fpingselfservicepasswordrcc.conf
Each file has at most 8 hosts, so this is a small setup, but 95% of the time, as soon as I restart, only about half of the plugins are active.
2021-09-14 13:05:48: fpingalerta.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingalerta.conf'...
2021-09-14 13:05:48: perf.plugin INFO : MAIN : no charts enabled - nothing to do.
2021-09-14 13:05:48: fpingalerta.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 alerta.xxxxxxxxxxxxxx.local
2021-09-14 13:05:48: netdata INFO : PLUGINSD[perf] : called DISABLE. Disabling it.
2021-09-14 13:05:48: netdata INFO : PLUGINSD[perf] : PARSER ended
2021-09-14 13:05:48: netdata ERROR : PLUGINSD[perf] : '/opt/netdata/usr/libexec/netdata/plugins.d/perf.plugin' (pid 16683) disconnected after 0 successful data collections (ENDs). (errno 22, Invalid argument)
2021-09-14 13:05:48: netdata ERROR : PLUGINSD[perf] : child pid 16683 exited with code 1.
2021-09-14 13:05:48: netdata ERROR : PLUGINSD[perf] : '/opt/netdata/usr/libexec/netdata/plugins.d/perf.plugin' (pid 16683) exited with error code 1 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
2021-09-14 13:05:48: netdata INFO : PLUGINSD[perf] : thread with task id 16680 finished
So a couple of plugins generate this error, not always the same ones, and it’s REALLY complicated to get all of them running at the same time after a restart. Of course, when I run them manually, everything works fine.
Can somebody help me solve this, please?
Best,
Jerome
ilyam8
September 14, 2021, 4:53pm
2
Hi, @jrevillard. Let’s see what is happening.
The topic is about fping.plugin, but later you say you have problems with other plugins.
According to your logs, it works (or at least I don’t see it fail):
2021-09-14 13:05:48: fpingalerta.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 alerta.xxxxxxxxxxxxxx.local
I see only perf.plugin in the logs. It doesn’t work, indeed, but that is expected because it is disabled by default (this can be changed in netdata.conf).
What plugins do you mean by saying “all of them”?
ilyam8
September 14, 2021, 6:00pm
3
To see the running external plugins (fping is an external plugin), you can use ps faxu | grep "[n]etdata"
This is what I get on my VM:
ilyam@debian-s-1vcpu-1gb-fra1-01:~$ ps faxu | grep "[n]etdata"
netdata 16003 1.2 13.5 402456 138196 ? Ssl 11:23 4:47 /opt/netdata/usr/sbin/netdata -P /opt/netdata/var/run/netdata/netdata.pid -D
netdata 16027 0.0 0.2 51524 2464 ? Sl 11:23 0:00 \_ /opt/netdata/usr/sbin/netdata --special-spawn-server
netdata 16246 1.0 0.4 37676 4760 ? S 11:23 4:02 \_ /opt/netdata/usr/libexec/netdata/plugins.d/apps.plugin 1
netdata 16247 0.0 0.3 7848 3128 ? S 11:23 0:23 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.8
netdata 16248 0.5 2.3 723888 23504 ? Sl 11:23 2:00 \_ /opt/netdata/usr/libexec/netdata/plugins.d/go.d.plugin 1
netdata 16249 0.2 5.1 108904 52808 ? Sl 11:23 0:52 \_ /usr/bin/python /opt/netdata/usr/libexec/netdata/plugins.d/python.d.plugin 1
netdata 18672 0.0 0.3 36136 3356 ? S 15:23 0:00 \_ /opt/netdata/usr/libexec/netdata/plugins.d/perf.plugin 1 cycles instructions
netdata 20054 0.1 0.2 9724 2796 ? S 17:23 0:02 \_ bash /opt/netdata/usr/libexec/netdata/plugins.d/tc-qos-helper.sh 1
OK, sorry if I didn’t explain this properly… perhaps the part of the log that I extracted is not relevant after all.
My problem is only with the fping plugin (at least, that is the one I am concentrating on for the moment). As you can see in my first post, I have 6 different fping configurations, but when I check the processes, only 5 are actually running:
[root@monitoring-apps [DEV] ~]# ps faxu | grep "[n]etdata" | grep fping
netdata 25323 0.0 0.0 1288 1124 ? SN Sep14 0:35 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 ...
netdata 25327 0.0 0.0 1288 856 ? SN Sep14 0:35 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 ...
netdata 25328 0.0 0.0 1288 860 ? SN Sep14 0:30 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 ...
netdata 25333 0.0 0.0 1288 860 ? SN Sep14 0:30 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 ...
netdata 25335 0.1 0.0 1292 1124 ? SN Sep14 0:52 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 ...
This is not a configuration issue because, for instance, if I restart again, I may end up with all 6 available, or only 4, and not always the same ones… This is really annoying because at every restart it is complicated to get everything in place.
Best,
Jerome
ilyam8
September 15, 2021, 7:51am
5
OK, so the problem occurs when using Multiple fping Plugins With Different Settings. I need to test it.
By the way, according to the documentation, that feature is meant for cases like this:
For example, you may need to ping a few hosts 10 times per second, and others once per second.
From your “ps” output it looks like all the plugins have the same settings (different targets only). Have you considered using one plugin with multiple targets?
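For illustration, a single fping.conf covering several targets could look roughly like the sketch below. The variable names (hosts, update_every, ping_every, fping_opts) are the ones I recall from the stock fping.conf, so please check them against your installed copy. The host names are hypothetical placeholders, and the values simply mirror the -Q 5 -p 1000 and remaining options visible in your ps output.

```shell
# Sketch of one fping.conf with several targets.
# Variable names are assumed from the stock fping.conf; verify against your installed copy.

# all hosts to ping, separated by spaces (hypothetical placeholder names)
hosts="host-a.example.local host-b.example.local"

# chart update frequency in seconds (corresponds to -Q 5 in the ps output)
update_every=5

# interval between pings to each host, in milliseconds (corresponds to -p 1000)
ping_every=1000

# remaining fping options, copied from the command line shown in the logs
fping_opts="-R -b 56 -i 1 -r 0 -t 5000"
```

With one file like this there is a single fping process to start, which at least removes the several-plugins-starting-at-once factor while the problem is investigated.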
I need to keep them separate because of the way we deploy everything (Infrastructure as Code)… so for the moment, yes, I need separate plugins…
but this does not explain the current behaviour…
ilyam8
September 15, 2021, 4:07pm
7
It doesn’t explain it, yes. I suggested that so you can have a working fping.plugin while we are investigating the problem.
Could you guide me, please? What kind of logs do you need?
ilyam8
September 16, 2021, 11:41am
9
OK, I tried to reproduce the issue, but it worked for me.
I created 10 fping.plugin instances:
[ilyam@pc ~]$ ps faxu | grep netdata | grep fping
netdata 64694 0.0 0.0 3504 1972 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.4
netdata 64699 0.0 0.0 3504 1756 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.8
netdata 64715 0.0 0.0 3504 2000 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.9
netdata 64725 0.0 0.0 3504 1968 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.88
netdata 64732 0.0 0.0 3504 1872 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.10
netdata 64733 0.0 0.0 3504 1964 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.3
netdata 64739 0.0 0.0 3504 1900 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.2
netdata 64748 0.0 0.0 3504 1964 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.5
netdata 64754 0.0 0.0 3504 1872 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.7
netdata 64764 0.0 0.0 3504 1968 ? S 14:04 0:01 \_ /usr/bin/fping -N -l -Q 1 -p 200 -R -b 56 -i 1 -r 0 -t 5000 8.8.8.6
Then I restarted netdata.service 10 times and checked the number of running fping instances each time:
[pc ilyam]# for i in $(seq 1 10); do systemctl restart netdata.service; sleep 5; echo "run $i, fping instances: $(ps faxu | grep netdata | grep -c fping)"; done
run 1, fping instances: 10
run 2, fping instances: 10
run 3, fping instances: 10
run 4, fping instances: 10
run 5, fping instances: 10
run 6, fping instances: 10
run 7, fping instances: 10
run 8, fping instances: 10
run 9, fping instances: 10
run 10, fping instances: 10
Let’s do the following:
cd /opt/netdata/var/log/netdata/
sudo systemctl stop netdata
sudo cp /dev/null error.log
sudo systemctl start netdata
# wait for 5 seconds
grep fping error.log
Ok, so here it is:
[root@monitoring-apps [DEV] netdata]# grep fping error.log
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingconfluentrcc] : thread created with task id 20256
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingconfluentrcc] : set name of thread 20256 to PLUGINSD[fpingc
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingcruisecontrolrcc] : thread created with task id 20266
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingcruisecontrolrcc] : set name of thread 20266 to PLUGINSD[fpingc
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingalerta] : thread created with task id 20252
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingalerta] : set name of thread 20252 to PLUGINSD[fpinga
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingselfservicepasswordrcc] : thread created with task id 20254
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingselfservicepasswordrcc] : set name of thread 20254 to PLUGINSD[fpings
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingalerta] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingalerta.plugin' running on pid 20274
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingcruisecontrolrcc] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingcruisecontrolrcc.plugin' running on pid 20268
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fping] : thread created with task id 20259
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fping] : set name of thread 20259 to PLUGINSD[fping]
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingselfservicepasswordrcc] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingselfservicepasswordrcc.plugin' running on pid 20279
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingbackupdns] : thread created with task id 20253
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingbackupdns] : set name of thread 20253 to PLUGINSD[fpingb
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingconfluentrcc] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingconfluentrcc.plugin' running on pid 20267
2021-09-16 14:18:41: fpingcruisecontrolrcc.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingcruisecontrolrcc.conf'.
2021-09-16 14:18:41: fpingselfservicepasswordrcc.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingselfservicepasswordrcc.conf'.
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fping] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fping.plugin' running on pid 20288
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingbackupdns] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingbackupdns.plugin' running on pid 20306
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingattunityrcc] : 2021-09-16 14:18:41: fpingalerta.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingalerta.conf'.
2021-09-16 14:18:41: fpingconfluentrcc.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingconfluentrcc.conf'.
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingattunityrcc] : set name of thread 20255 to PLUGINSD[fpinga
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingtalendrcc] : thread created with task id 20265
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingtalendrcc] : 2021-09-16 14:18:41: tc-qos-helper.sh: WARNING: FireQoS is not installed on this system. Use FireQoS to apply traffic QoS and expose the class names to netdata. Check https://github.com/netdata/netdata/tree/master/collectors/tc.plugin#tcplugin
set name of thread 20265 to PLUGINSD[fpingt
2021-09-16 14:18:41: netdata INFO : WEB_SERVER[static1] : 2021-09-16 14:18:41: fpingselfservicepasswordrcc.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingselfservicepasswordrcc.conf'...
2021-09-16 14:18:41: fpingcruisecontrolrcc.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingcruisecontrolrcc.conf'...
2021-09-16 14:18:41: fpingconfluentrcc.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingconfluentrcc.conf'...
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingattunityrcc] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingattunityrcc.plugin' running on pid 2021-09-16 14:18:41: fpingalerta.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingalerta.conf'...
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingtalendrcc] : connected to '/opt/netdata/usr/libexec/netdata/plugins.d/fpingtalendrcc.plugin' running on pid 20329
2021-09-16 14:18:41: fpingconfluentrcc.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 kafka.........
2021-09-16 14:18:41: fpingselfservicepasswordrcc.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 self-service-password..........
2021-09-16 14:18:41: fpingbackupdns.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingbackupdns.conf'.
2021-09-16 14:18:41: fping.plugin: INFO: Loading config file '/opt/netdata/usr/lib/netdata/conf.d/fping.conf'...
2021-09-16 14:18:41: fpingcruisecontrolrcc.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 cruise-control...........
2021-09-16 14:18:41: fpingalerta.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 alerta................
2021-09-16 14:18:41: fpingbackupdns.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingbackupdns.conf'...
2021-09-16 14:18:41: fpingattunityrcc.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingattunityrcc.conf'.
2021-09-16 14:18:41: fping.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fping.conf'...
2021-09-16 14:18:41: apps.plugin INFO : MAIN : 2021-09-16 14:18:41: fpingattunityrcc.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingattunityrcc.conf'...
2021-09-16 14:18:41: fpingbackupdns.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 192.168.50.20 192.168.50.30
2021-09-16 14:18:41: fpingtalendrcc.plugin: WARNING: Cannot find file '/opt/netdata/usr/lib/netdata/conf.d/fpingtalendrcc.conf'.
2021-09-16 14:18:41: fpingattunityrcc.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 attunity...........
2021-09-16 14:18:41: fpingtalendrcc.plugin: INFO: Loading config file '/opt/netdata/etc/netdata/fpingtalendrcc.conf'...
2021-09-16 14:18:41: netdata INFO : PLUGINSD[ioping] : 2021-09-16 14:18:41: fping.plugin: FATAL: no hosts configured - nothing to do.
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fping] : called DISABLE. Disabling it.
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fping] : PARSER ended
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fping] : '/opt/netdata/usr/libexec/netdata/plugins.d/fping.plugin' (pid 20288) disconnected after 0 successful data collections (ENDs). (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fping] : child pid 20288 exited with code 1.
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fping] : '/opt/netdata/usr/libexec/netdata/plugins.d/fping.plugin' (pid 20288) exited with error code 1 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fping] : thread with task id 20259 finished
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingalerta] : read failed: end of file (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingalerta] : PARSER ended
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingalerta] : 2021-09-16 14:18:41: fpingtalendrcc.plugin: INFO: starting fping: /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 talend.............
'/opt/netdata/usr/libexec/netdata/plugins.d/fpingalerta.plugin' (pid 20274) disconnected after 0 successful data collections (ENDs). (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingalerta] : child pid 20274 exited with code 2.
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingalerta] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingalerta.plugin' (pid 20274) exited with error code 2 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingalerta] : thread with task id 20252 finished
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingattunityrcc] : read failed: end of file (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingattunityrcc] : PARSER ended
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingattunityrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingattunityrcc.plugin' (pid 20321) disconnected after 0 successful data collections (ENDs). (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingattunityrcc] : child pid 20321 exited with code 2.
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingattunityrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingattunityrcc.plugin' (pid 20321) exited with error code 2 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingattunityrcc] : thread with task id 20255 finished
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingcruisecontrolrcc] : read failed: end of file (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingcruisecontrolrcc] : PARSER ended
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingcruisecontrolrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingcruisecontrolrcc.plugin' (pid 20268) disconnected after 0 successful data collections (ENDs). (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingcruisecontrolrcc] : child pid 20268 exited with code 2.
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingcruisecontrolrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingcruisecontrolrcc.plugin' (pid 20268) exited with error code 2 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingcruisecontrolrcc] : thread with task id 20266 finished
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingtalendrcc] : read failed: end of file (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingtalendrcc] : PARSER ended
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingtalendrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingtalendrcc.plugin' (pid 20329) disconnected after 0 successful data collections (ENDs). (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingtalendrcc] : child pid 20329 exited with code 2.
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingtalendrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingtalendrcc.plugin' (pid 20329) exited with error code 2 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
2021-09-16 14:18:41: netdata INFO : PLUGINSD[fpingtalendrcc] : thread with task id 20265 finished
2021-09-16 14:18:42: go.d ERROR: prometheus[fping-exporter_local] Get "http://127.0.0.1:9605/metrics": dial tcp 127.0.0.1:9605: connect: connection refused
2021-09-16 14:18:42: go.d ERROR: prometheus[fping-exporter_local] check failed
And for this restart, only 3 fping processes are active:
[root@monitoring-apps [DEV] netdata]# ps faxu | grep "[n]etdata" | grep fping
netdata 20267 0.0 0.0 1292 4 ? SN 14:18 0:00 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 kafka........
netdata 20279 0.0 0.0 1288 4 ? SN 14:18 0:00 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 self-service-password.........
netdata 20306 0.0 0.0 1288 4 ? SN 14:18 0:00 \_ /opt/netdata/bin/fping -N -l -Q 5 -p 1000 -R -b 56 -i 1 -r 0 -t 5000 192.168.50.20 192.168.50.30
Best,
Jerome
Dear @ilyam8, would you have time to look at the problem, please?
Please configure fping.conf so that the plugin doesn’t stop with fping.plugin: FATAL: no hosts configured - nothing to do. What are the contents of the logs after that?
ilyam8
September 21, 2021, 9:22am
14
Hey, @jrevillard
I am not sure what is happening, but I suspect there is a problem with resolving DNS names.
You have several log lines like this one:
2021-09-16 14:18:41: netdata ERROR : PLUGINSD[fpingtalendrcc] : '/opt/netdata/usr/libexec/netdata/plugins.d/fpingtalendrcc.plugin' (pid 20329) exited with error code 2 and haven't collected any data. Disabling it. (errno 22, Invalid argument)
See the fping man page:
DIAGNOSTICS
Exit status is 0 if all the hosts are reachable, 1 if some hosts were unreachable, 2 if any IP addresses were not found, 3 for invalid command line arguments, and 4 for a system call failure.
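To make the exit codes in the logs easier to read, the DIAGNOSTICS text above can be turned into a small lookup. This is only a sketch: describe_fping_exit is a hypothetical helper name, not part of fping or Netdata, and each message is paraphrased from the man page excerpt.

```shell
# Hypothetical helper: translate an fping exit status into the meaning
# given in the DIAGNOSTICS section of the fping man page.
describe_fping_exit() {
  case "$1" in
    0) echo "all hosts reachable" ;;
    1) echo "some hosts unreachable" ;;
    2) echo "some IP addresses were not found" ;;
    3) echo "invalid command line arguments" ;;
    4) echo "system call failure" ;;
    *) echo "unknown exit status: $1" ;;
  esac
}
```

Assuming the plugin passes fping's exit status through unchanged, the "exited with error code 2" lines correspond to describe_fping_exit 2, which would be consistent with a name resolution problem at startup.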
Hi @ilyam8 ,
Perhaps, but what can I do about it? I already have a DNS cache, and if I run everything manually, everything works… Perhaps starting all the fping processes at the same time overloads DNS… Wouldn’t it be possible to have some kind of retry functionality?
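To make clear what I mean by a retry, here is a minimal shell sketch. It is not an existing Netdata or fping feature; retry is a hypothetical helper, and the attempt count and delay are arbitrary assumptions.

```shell
# Hypothetical helper, not part of Netdata or fping.
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, retry 5 2 getent hosts some-host.example.local would try to resolve a (hypothetical) name up to five times, two seconds apart, before giving up.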
Best,
Jerome
ilyam8
September 24, 2021, 7:01pm
16
@jrevillard let’s confirm that assumption. Can you try switching to IP addresses temporarily and see if this fixes the problem?
Thx @ilyam8, I confirm that with IPs everything works… I can also tell you that I have the same issue with the x509check and httpcheck plugins…
The thing is that DNS resolution itself works well… perhaps there is some flooding when netdata restarts?
Also, for information, I installed nscd on the server, but it does not help…