go.d.plugin error fail to create cancel reader: add reader to epoll interest list component="functions manager"

Problem/Question

I am spawning go.d.plugin from a server in order to capture its output and process it differently. It had been working well until a few days ago. I now get this error: 'fail to create cancel reader: add reader to epoll interest list component="functions manager"'

Relevant docs you followed/actions you took to solve the issue

I have explored the code in muesli/cancelreader (a cancelable reader for Go), which manager.go uses to create the new cancel reader. The cancelreader code has not changed for a couple of years.

Environment/Browser/Agent’s version etc

Visual Studio/React/JavaScript. The browser is irrelevant.

What I expected to happen

The spawned go.d.plugin should be sending my server its output through a pipe, which I then read line by line. All I am seeing is a couple of CONFIG lines and then a continuous stream of blank lines.

Hi, @Blane245. You are right, the cancel reader code hasn’t changed for a while. But that error message is misleading: it doesn’t say what the actual error is (see "Improve invalid fd error messages for epoll").

The spawned go.d.plugin should be sending my server its output through a pipe

It does. If you didn’t see CHART/SET, it means there were no data collection jobs running.

The lack of the cancel reader is not fatal; it just means functions will not work. Only dynamic configuration (a feature still in development) uses functions at the moment.

I can’t reproduce the problem. If you are willing to debug it, I’d suggest the following: build go.d.plugin manually with the updated version of cancelreader and check the errors.

A little more info: go.d.plugin works fine when I run it from the console or from a shell script. It shows the described behavior only when spawned by my server. The cancelreader message appears in the log only when the plugin is spawned, not when it is run from the console or a shell script.

I will start debugging. I was thinking about writing my own collector anyway.

As I said, the cancel reader library doesn’t provide the actual error message. The functions manager passes os.Stdin to cancelreader.NewReader(), so maybe there is a problem with stdin when your server spawns the plugin.

The described behavior is correct and expected: there is nothing in stdout apart from some CONFIG lines and newlines if there are no data collection jobs running. As for why there are none, check the log messages in stderr.
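
To make the point about stdout concrete, here is a minimal sketch of how the reading side could tell "no jobs running" apart from "data is flowing". The helper names classifyLine and hasCollectionData are hypothetical, and only the keywords mentioned in this thread (CONFIG, CHART, SET) are handled; the real plugin protocol has more commands.

```javascript
// Keywords taken from this thread only; this is not the full protocol.
const DATA_COMMANDS = new Set(['CHART', 'SET']);

// Hypothetical helper: label one line of plugin stdout.
function classifyLine(line) {
  const trimmed = line.trim();
  if (trimmed === '') return 'blank';
  // The first whitespace-separated token is the command keyword.
  const keyword = trimmed.split(/\s+/, 1)[0];
  if (keyword === 'CONFIG') return 'config';
  if (DATA_COMMANDS.has(keyword)) return 'data';
  return 'other';
}

// Hypothetical helper: true if any line carries chart data.
function hasCollectionData(lines) {
  return lines.some((line) => classifyLine(line) === 'data');
}
```

If hasCollectionData stays false while only 'config' and 'blank' lines arrive, that matches the situation described above, and the reason should be in stderr.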

My spawned process is set up to ignore stdin and to pipe stdout. Does go.d.plugin actually read anything from stdin?
The problem started very recently. Were there any changes to the plugin recently that may have affected this?

Here’s a segment of the spawn code, in JavaScript:

const { spawn } = require('node:child_process');
const readline = require('node:readline');

const plugin = './app/middleware/go.d.plugin.sh';
const source = spawn(plugin, [], {
  stdio: ['ignore', 'pipe', process.stderr],
  shell: true,
  detached: true, // this allows all processes started by spawning to be killed as a group
});
const reader = readline.createInterface({ input: source.stdout });

Yes.

  • stdin: Netdata → plugin. Netdata sends FUNCTION_* commands; the plugin reads them.
  • stderr: logging.
  • stdout: plugin → Netdata. Charts and responses to FUNCTION_* commands.

The problem started very recently.

Check stderr? Let me repeat it one more time - no data collection jobs running → no CHART/SET commands in stdout. I have no idea where the plugin is located in your case, but to start data collection jobs it needs to read configuration files or use some discovery mechanism (e.g. the local-listeners binary, which finds apps that listen on localhost).

If you need to resolve the problem, consider providing full info: where the plugin is located, whether you have custom configuration files, and which collectors you expect to be running.

A recent change: the plugin switched from port-scan discovery of running apps to reading /proc/net/tcp to find listening applications. 12152 describes it. This was implemented a few days ago. After the change, go.d.plugin uses the local-listeners binary to read and decode data from /proc/net/{tcp,tcp6,udp,udp6}.

e.g.

./local-listeners no-udp6 no-tcp6 no-local no-inbound no-outbound no-namespaces
UDP|0.0.0.0|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|127.0.0.1|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|10.10.10.20|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
UDP|10.1.1.1|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec
TCP|127.0.0.1|45603|/usr/bin/containerd
TCP|127.0.0.1|25|/usr/lib/postfix/sbin/master -w
TCP|0.0.0.0|22|sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
UDP|127.0.0.1|161|/usr/sbin/snmpd -LOw -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f
TCP|0.0.0.0|19999|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
UDP|127.0.0.1|8125|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
TCP|127.0.0.1|8125|/opt/netdata/usr/sbin/netdata -P /run/netdata/netdata.pid -D
UDP|172.17.0.1|123|/usr/sbin/ntpd -p /run/ntpd.pid -c /etc/ntpsec/ntp.conf -g -N -u ntpsec:ntpsec

And then it uses this config to identify apps and create appropriate data collection jobs.
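
For readers who want to inspect that output themselves, here is a rough sketch of a parser for the pipe-separated format shown above (protocol, address, port, command line). The function names and the field names in the returned objects are made up for illustration; this is not how go.d.plugin itself consumes the data.

```javascript
// Hypothetical helper: parse one "PROTO|ADDR|PORT|CMDLINE" line.
// Returns null for blank or malformed lines.
function parseListenerLine(line) {
  const parts = line.trim().split('|');
  if (parts.length < 4) return null;
  const [protocol, address, portText, ...rest] = parts;
  const port = Number.parseInt(portText, 10);
  if (Number.isNaN(port)) return null;
  // Re-join the tail in case the command line itself contains '|'.
  return { protocol, address, port, cmdline: rest.join('|') };
}

// Hypothetical helper: parse the whole output, skipping bad lines.
function parseListeners(output) {
  return output
    .split('\n')
    .map(parseListenerLine)
    .filter((entry) => entry !== null);
}
```

For example, parseListeners applied to the output above would yield one object per line, such as a TCP entry on 0.0.0.0 port 19999 for the netdata process.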

I don’t know enough about local-listeners to say whether it is causing the problem. Here are some more details. Please note that I think I have found a solution.

  1. I am running go.d.plugin from its installed location, /usr/libexec/netdata/plugins.d. When I use the command /usr/libexec/netdata/plugins.d/go.d.plugin -m mysql, everything works fine. I get the log output and the collection data on the terminal.

  2. I am using the standard configuration files. I have enabled the mysql module. The log from the console execution is below. In the terminal, this log output is followed by the collection data (CHART, etc.).

  3. Just to reiterate - the plugin is working fine in command mode but not as a spawned process. I see the ERR fail to create cancel reader: add reader to epoll interest list component="functions manager" in the log file of the spawned process.

  4. I just changed my spawn options so that stdin is process.stdin instead of 'ignore'. The epoll error no longer appears and I am getting collection data.

INF env HTTP_PROXY ‘’, HTTPS_PROXY ‘’ component=agent
INF instance is started component=agent
INF loading config file component=agent
INF found '/etc/netdata/go.d.conf component=agent
INF config successfully loaded component=agent
INF using config: enabled ‘true’, default_run ‘true’, max_procs ‘0’ component=agent
INF loading modules component=agent
INF enabled/registered modules: 1/77 component=agent
INF building discovery config component=agent
INF dummy/read/watch paths: 0/1/0 component=agent
INF registered discoverers: [file discovery: [file reader] service discovery] component=“discovery manager”
INF found ‘/usr/lib/netdata/conf.d/vnodes’ (0 vhosts) component=agent
INF instance is started component=“discovery manager”
INF instance is started component=discovery discoverer=file
INF instance is started component=“service discovery”
INF instance is started component=discovery discoverer=file
INF instance is stopped component=discovery discoverer=file
INF instance is started component=“job manager”
CONFIG go.d:collector:mysql create accepted template /collectors/jobs internal ‘internal’ ‘add schema enable disable test’ 0x0000 0x0000

INF instance is started component=“functions manager”
CONFIG go.d:collector:mysql:local create accepted job /collectors/jobs user ‘discoverer=file_reader,file=/etc/netdata/go.d/mysql.conf’ ‘schema get enable disable update restart test’ 0x0000 0x0000

ERR open /etc/my.cnf: no such file or directory collector=mysql job=local
ERR init failed collector=mysql job=local
CONFIG go.d:collector:mysql:local status failed

CONFIG go.d:collector:mysql:local create accepted job /collectors/jobs user ‘discoverer=file_reader,file=/etc/netdata/go.d/mysql.conf’ ‘schema get enable disable update restart test’ 0x0000 0x0000

ERR key-value delimiter not found: !includedir /etc/mysql/conf.d/
collector=mysql job=local
ERR init failed collector=mysql job=local
CONFIG go.d:collector:mysql:local status failed

CONFIG go.d:collector:mysql:local create accepted job /collectors/jobs user ‘discoverer=file_reader,file=/etc/netdata/go.d/mysql.conf’ ‘schema get enable disable update restart test’ 0x0000 0x0000

INF application version: ‘8.0.36-0ubuntu0.22.04.1’, version_comment: ‘(Ubuntu)’ collector=mysql job=local
INF check success collector=mysql job=local
CONFIG go.d:collector:mysql:local status running

INF started, data collection interval 5s collector=mysql job=local
INF instance is started component=“service discovery” pipeline=“network listeners”
INF discoverer ‘sd:net_listeners’ exited before ctx done component=“service discovery” pipeline=“network listeners”
INF all discoverers exited before ctx done component=“service discovery” pipeline=“network listeners”
INF instance is stopped component=“service discovery” pipeline=“network listeners”
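
The change described in item 4 can be sketched roughly as follows. With 'ignore', Node.js attaches /dev/null to the child's stdin, and that is probably what the epoll registration in the cancel reader rejects. The fix confirmed in this thread is passing process.stdin; the variant below uses 'pipe' instead, on the assumption that a pipe is equally acceptable to epoll and keeps the child independent of the server's own stdin. The helper names pluginSpawnOptions and startPlugin are hypothetical.

```javascript
const { spawn } = require('node:child_process');
const readline = require('node:readline');

// Hypothetical helper: spawn options that give the child a real stdin.
function pluginSpawnOptions() {
  return {
    // stdin: a pipe instead of 'ignore' (assumption: any pollable fd works),
    // stdout: a pipe we read from, stderr: shared with the parent for logs.
    stdio: ['pipe', 'pipe', 'inherit'],
    detached: true, // keeps the child in its own process group
  };
}

// Hypothetical helper: start the plugin and call onLine for each stdout line.
function startPlugin(command, args, onLine) {
  const child = spawn(command, args, pluginSpawnOptions());
  const reader = readline.createInterface({ input: child.stdout });
  reader.on('line', onLine);
  // child.stdin stays open; nothing is written to it in this sketch.
  return child;
}
```

A call such as startPlugin('/usr/libexec/netdata/plugins.d/go.d.plugin', ['-m', 'mysql'], console.log) would then print each stdout line as it arrives.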