Custom python.d agent won't start, debug runs fine, example runs fine

I’ve developed a small custom python.d module to collect data from a Sagemcom router: vaiden/nedata_sagemcom_collector (on GitHub).

I can’t get the chart to show on my Netdata dashboard.
A debug test seems to run fine:

$ /usr/lib/netdata/plugins.d/python.d.plugin debug trace 1 sagemcom
2022-05-26 23:57:04: python.d INFO: plugin: main: Using python 3
2022-05-26 23:57:04: python.d DEBUG: plugin: main: loading '/etc/netdata/python.d.conf'
2022-05-26 23:57:10: python.d DEBUG: plugin: main: module load source: 'sagemcom' => [OK]
2022-05-26 23:57:10: python.d DEBUG: plugin: main: loading '/etc/netdata/python.d/sagemcom.conf'
2022-05-26 23:57:10: python.d DEBUG: plugin: main: job initialization: 'sagemcom sagemcom_fyber' => ['OK']
2022-05-26 23:57:10: python.d DEBUG: plugin: main: module status: 'sagemcom' => [OK] (jobs: 1)
2022-05-26 23:57:10: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
CHART netdata.runtime_sagemcom_Sagemcom_Fyber '' 'Execution time for sagemcom_Sagemcom_Fyber' 'ms' 'python.d' netdata.pythond_runtime line 145000 30
DIMENSION run_time 'run time' absolute 1 1

2022-05-26 23:57:10: python.d DEBUG: sagemcom: sagemcom_fyber: create() => [OK] (charts: 1)
2022-05-26 23:57:10: python.d DEBUG: sagemcom: sagemcom_fyber: started, update frequency: 30
get_data
Starting sagemcom client 
34:49:5B:12:F1:A0 Fast5657_GPON
-18210
2972
ret_val={'o': -18210, 't': 2972}
CHART sagemcom_Sagemcom_Fyber.rxtx '' 'GPON' 'dbm' 'GPON' 'GPON' line 60000 30 '' 'python.d.plugin' 'sagemcom'
DIMENSION 'o' 'Optical Module Rx Power' absolute 1 1 ''
DIMENSION 't' 'Optical Module Tx Power' absolute 1 1 ''

BEGIN sagemcom_Sagemcom_Fyber.rxtx 0
SET 'o' = -18210
SET 't' = 2972
END

BEGIN netdata.runtime_sagemcom_Sagemcom_Fyber 0
SET run_time = 641
END

2022-05-26 23:57:29: python.d DEBUG: sagemcom: sagemcom_fyber: update => [OK] (elapsed time: 641, failed retries in a row: 0)

My python.d.conf:

...
example: yes
sagemcom: yes
...

My netdata.conf:

...
[plugins]
        python.d = yes

[plugin:python.d]
        command options = -ppython3

Example python.d collector seems to run fine.

Worth noting:

  1. The dashboard python.d section contains an empty chart.
  2. The error log contains a single mention of my collector per run:
$ sudo cat /var/log/netdata/error.log | grep sagemcom
2022-05-26 00:13:57: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 00:35:00: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 00:46:26: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 01:04:31: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 01:39:38: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 21:14:08: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 21:24:03: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 21:36:27: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 21:42:52: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
  3. The error log seems to indicate that Netdata is running check() on all of the python.d modules, even though they’re all disabled:
2022-05-26 21:42:52: python.d ERROR: redis: localipv4: [Errno 107] Transport endpoint is not connected
2022-05-26 21:42:52: python.d INFO: redis: localipv4: check() => [FAILED]
2022-05-26 21:42:52: python.d ERROR: redis: localipv6: Failed to connect to "::1", port 6379, error: [Errno 111] Connection refused
2022-05-26 21:42:52: python.d ERROR: redis: localipv6: [Errno 107] Transport endpoint is not connected
2022-05-26 21:42:52: python.d INFO: redis: localipv6: check() => [FAILED]
2022-05-26 21:42:52: python.d ERROR: rethinkdbs: local: "rethinkdb" module is needed to use rethinkdbs.py
2022-05-26 21:42:52: python.d INFO: rethinkdbs: local: check() => [FAILED]
2022-05-26 21:42:52: python.d ERROR: retroshare: localhost: Url: http://localhost:9090/api/v2/stats. Error: HTTPConnectionPool(host='localhost', port=9090): Max retries exceeded with url: /api/v2/stats (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb6326250>: Failed to establish a new connection: [Errno 111] Connection refused'))
2022-05-26 21:42:52: python.d ERROR: retroshare: localhost: _get_data() returned no data or type is not <dict>
2022-05-26 21:42:52: python.d INFO: retroshare: localhost: check() => [FAILED]
2022-05-26 21:42:52: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]
2022-05-26 21:42:52: python.d ERROR: smartd_log: smartd_log: check() unhandled exception: [Errno 2] No such file or directory: '/var/log/smartd'
2022-05-26 21:42:52: python.d INFO: smartd_log: smartd_log: check() => [FAILED]
2022-05-26 21:42:52: python.d ERROR: spigotmc: spigotmc: Error connecting.
2022-05-26 21:42:52: python.d ERROR: spigotmc: spigotmc: ConnectionRefusedError(111, 'Connection refused')
2022-05-26 21:42:52: python.d INFO: spigotmc: spigotmc: check() => [FAILED]
2022-05-26 21:42:52: python.d ERROR: springboot: local: Url: http://localhost:8080/metrics. Error: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /metrics (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb68ad330>: Failed to establish a new connection: [Errno 111] Connection refused'))
2022-05-26 21:42:52: python.d ERROR: springboot: local: _get_data() returned no data or type is not <dict>
2022-05-26 21:42:52: python.d INFO: springboot: local: check() => [FAILED]
2022-05-26 21:42:52: python.d ERROR: springboot: local_ip: Url: http://127.0.0.1:8080/metrics. Error: HTTPConnectionPool(host='127.0.0.1', port=8080): Max retries exceeded with url: /metrics (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb5ac6290>: Failed to establish a new connection: [Errno 111] Connection refused'))
2022-05-26 21:42:52: python.d ERROR: springboot: local_ip: _get_data() returned no data or type is not <dict>
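
To make it easier to see which modules actually had check() run, here is a minimal sketch that summarizes the check() results from log lines in the format shown above. The function name summarize_checks and the regular expression are my own illustration, not part of Netdata, and the pattern is only assumed to match the INFO lines as they appear in this log.

```python
import re
from collections import defaultdict

# Matches INFO lines in the format seen in the log above, e.g.:
# 2022-05-26 21:42:52: python.d INFO: redis: localipv4: check() => [FAILED]
CHECK_RE = re.compile(
    r"python\.d INFO: (?P<module>[^:]+): (?P<job>[^:]+): "
    r"check\(\) => \[(?P<status>OK|FAILED)\]"
)


def summarize_checks(lines):
    """Return {module: {job: last_status}} for every check() result found.

    Hypothetical helper for reading the log; not a Netdata API.
    """
    summary = defaultdict(dict)
    for line in lines:
        match = CHECK_RE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the last run wins.
            summary[match["module"]][match["job"]] = match["status"]
    return dict(summary)


# Sample lines copied from the log excerpt above.
sample = [
    "2022-05-26 21:42:52: python.d INFO: redis: localipv4: check() => [FAILED]",
    "2022-05-26 21:42:52: python.d ERROR: redis: localipv6: [Errno 107] Transport endpoint is not connected",
    "2022-05-26 21:42:52: python.d INFO: sagemcom: sagemcom_fyber: check() => [OK]",
]
print(summarize_checks(sample))
```

To run it against the real log, pass an open file object for /var/log/netdata/error.log instead of the sample list (reading that file may require root).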

Any help appreciated.

Cool! Love to see people building collectors.

I wonder if it could be something asyncio related. Do you have a way to try it without asyncio, to see whether that makes a difference?

I'm not sure, though, as it is strange that the debug run seems to work.
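
As a rough idea of what I mean, here is a minimal sketch of wrapping an async call so the collector's data function is plain blocking code with its own event loop. I haven't checked how the module actually drives asyncio, so treat this as a guess: fetch_power_levels is a hypothetical stand-in for the router query, get_data_blocking is a made-up name, and the worker-thread part only assumes that a job may end up running outside the main thread.

```python
import asyncio
import threading


async def fetch_power_levels():
    """Hypothetical stand-in for the async router query."""
    await asyncio.sleep(0)
    # Sample values taken from the debug output earlier in the thread.
    return {'o': -18210, 't': 2972}


def get_data_blocking():
    """Run the coroutine on a dedicated event loop and return its result.

    Creating the loop explicitly avoids depending on whatever loop
    (if any) exists in the thread that happens to call this function.
    """
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fetch_power_levels())
    finally:
        loop.close()


def run_in_worker_thread():
    """Call get_data_blocking() from a non-main thread, as a quick check."""
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(get_data_blocking())
    )
    worker.start()
    worker.join()
    return result


print(get_data_blocking())
print(run_in_worker_thread())
```

If the second call behaves differently from the first in your real code, that would point towards event loop handling rather than the router query itself.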

@ilyam8 anything obvious jump out to you?

PS: once we get it working, if it could be useful for others, it would be cool to submit it as a PR to either netdata/netdata or netdata/community.

Many thanks! I’m continuing the conversation on GitHub.