Netdata Community

Writing a custom Python plugin

Hi,

I have been developing a custom plugin to collect monitoring data from one of our applications into Netdata.

The conceptual issue is that the number of dimensions in some of the charts is dynamic. The examples and plugins that are based on the framework classes (e.g. SimpleService, URLService, etc.) all appear to have a static chart and dimension definition, output once when the plugin starts.

The page “External Plugin API” states that any program that can write to standard output can be a module (External plugins overview · Netdata Agent | Learn Netdata). Great, so I did just that, following an old example from a GitHub thread (Struggling writing custom plugins · Issue #206 · netdata/netdata · GitHub) and the general meta-language algorithm given in the documentation.

Good news: it works, and I get the charts exactly as I want them.
Bad news: when I put my plugin in /usr/libexec/netdata/python.d, none of the other Python plugins run, as if my plugin were hijacking the internal scheduling.

Questions:

  1. Is there any signal/character/command that I need to write to standard output to tell Netdata “I am done for this iteration”?
  2. What is the proper way to force re-output of CHART/DIMENSION when using the framework classes such as SimpleService?

Many thanks for your help,
Quentin

Hi, let’s start with the bad thing.

  • python.d.plugin debug mode

Debug mode is nice and makes it easy to spot the problem; very handy during development.

# as netdata user

# debug specific module
./python.d.plugin debug trace module_name

# debug all modules
./python.d.plugin debug
  • error.log

If something doesn’t work, error.log is a good place to check.

python.d.plugin doesn’t work?

grep python.d.plugin error.log
grep my_python_module error.log

Hi,

Thank you for your kind reply.

Yes, I forgot to mention that I did run in debug mode. It runs fine; no errors show up, and the expected set of CHART/DIMENSION/BEGIN/SET/END commands is issued. The resulting charts are actually perfect in every way, except that I can’t seem to run anything besides my own plugin.

When debugging all modules, it correctly shows the debug info for reading the configuration file and “built XX job config”. Then, when it comes to the custom plugin, it just prints the module’s stdout. I paste here a cropped version that shows the idea (to avoid hundreds of lines ;)).
I have left in two modules for which I have an explicit configuration: postgres and web_log. The log progresses nicely in alphabetical order through the modules, until it reaches mine:

ubuntu@monitor:/usr/libexec/netdata/python.d$ sudo -u netdata ./…/plugins.d/python.d.plugin debug
2021-02-03 23:10:41: python.d INFO: plugin[main] : using python v3
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : looking for 'python.d.conf' in ['/etc/netdata', '/usr/lib/netdata/conf.d']
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : loading '/usr/lib/netdata/conf.d/python.d.conf'
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : '/usr/lib/netdata/conf.d/python.d.conf' is loaded
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : looking for 'pythond-jobs-statuses.json' in /var/lib/netdata
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : loading '/var/lib/netdata/pythond-jobs-statuses.json'
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : '/var/lib/netdata/pythond-jobs-statuses.json' is loaded
[…]
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : [postgres] looking for 'postgres.conf' in ['/etc/netdata/python.d', '/usr/lib/netdata/conf.d/python.d']
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : [postgres] loading '/etc/netdata/python.d/postgres.conf'
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : [postgres] '/etc/netdata/python.d/postgres.conf' is loaded
2021-02-03 23:10:41: python.d INFO: plugin[main] : [postgres] built 1 job(s) configs
2021-02-03 23:10:41: python.d DEBUG: plugin[main] : postgres[tcp] was previously active, applying recovering settings
[…]
2021-02-03 23:10:42: python.d DEBUG: plugin[main] : [web_log] looking for 'web_log.conf' in ['/etc/netdata/python.d', '/usr/lib/netdata/conf.d/python.d']
2021-02-03 23:10:42: python.d DEBUG: plugin[main] : [web_log] loading '/usr/lib/netdata/conf.d/python.d/web_log.conf'
2021-02-03 23:10:42: python.d DEBUG: plugin[main] : [web_log] '/usr/lib/netdata/conf.d/python.d/web_log.conf' is loaded
2021-02-03 23:10:42: python.d INFO: plugin[main] : [web_log] built 13 job(s) configs
[…]
CHART wmspanel_nimble.incoming_streams_RTMP '' 'Nimble incoming stream status (RTMP)' bool RTMP wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'hls/23_test-mmc_lqubtp9_vps252383' 'hls/23_test-mmc_lqubtp9_vps252383' absolute 1 1 ''
[…]
BEGIN wmspanel_nimble.incoming_streams_RTMP 0
SET "hls/23_test-mmc_lqubtp9_vps252383" = 1
END

and then waits for the next update (1 minute) to reprint the next set of values.

If I run debug on an official module, e.g. postgres, it behaves similarly, but afterwards there is a

python.d DEBUG: postgres[tcp] : update => [OK] (elapsed time: 28, failed retries in a row: 0)

once the iteration is done, and I don’t understand how to trigger that.

Many thanks for your feedback,
Quentin

The log progresses nicely in alphabetical order through the modules, until it reaches mine

let’s see your module debug output

./python.d.plugin debug trace wmspanel # or wmspanel_nimble, not sure what the name is

Sure, here is the log for two full iterations:

q@monitor:/usr/libexec/netdata/plugins.d$ sudo -u netdata ./python.d.plugin debug trace wmspanel_nimble
2021-02-04 13:31:13: python.d INFO: plugin[main] : using python v3
2021-02-04 13:31:13: python.d DEBUG: plugin[main] : looking for 'python.d.conf' in ['/etc/netdata', '/usr/lib/netdata/conf.d']
2021-02-04 13:31:13: python.d DEBUG: plugin[main] : loading '/usr/lib/netdata/conf.d/python.d.conf'
2021-02-04 13:31:13: python.d DEBUG: plugin[main] : '/usr/lib/netdata/conf.d/python.d.conf' is loaded
2021-02-04 13:31:13: python.d DEBUG: plugin[main] : looking for 'pythond-jobs-statuses.json' in /var/lib/netdata
2021-02-04 13:31:13: python.d DEBUG: plugin[main] : loading '/var/lib/netdata/pythond-jobs-statuses.json'
2021-02-04 13:31:13: python.d DEBUG: plugin[main] : '/var/lib/netdata/pythond-jobs-statuses.json' is loaded
CHART wmspanel_nimble.incoming_streams_MPEGTS '' 'Nimble incoming stream status (MPEGTS)' bool MPEGTS wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'stream_0' 'stream_0' absolute 1 1 ''
DIMENSION 'stream_1' 'stream_1' absolute 1 1 ''
DIMENSION 'stream_10' 'stream_10' absolute 1 1 ''
DIMENSION 'stream_11' 'stream_11' absolute 1 1 ''
DIMENSION 'stream_12' 'stream_12' absolute 1 1 ''
DIMENSION 'stream_13' 'stream_13' absolute 1 1 ''
DIMENSION 'stream_14' 'stream_14' absolute 1 1 ''
DIMENSION 'stream_15' 'stream_15' absolute 1 1 ''
DIMENSION 'stream_16' 'stream_16' absolute 1 1 ''
DIMENSION 'stream_17' 'stream_17' absolute 1 1 ''
DIMENSION 'stream_19' 'stream_19' absolute 1 1 ''
DIMENSION 'stream_2' 'stream_2' absolute 1 1 ''
DIMENSION 'stream_20' 'stream_20' absolute 1 1 ''
DIMENSION 'stream_21' 'stream_21' absolute 1 1 ''
DIMENSION 'stream_22' 'stream_22' absolute 1 1 ''
DIMENSION 'stream_23' 'stream_23' absolute 1 1 ''
DIMENSION 'stream_24' 'stream_24' absolute 1 1 ''
DIMENSION 'stream_25' 'stream_25' absolute 1 1 ''
DIMENSION 'stream_26' 'stream_26' absolute 1 1 ''
DIMENSION 'stream_27' 'stream_27' absolute 1 1 ''
DIMENSION 'stream_28' 'stream_28' absolute 1 1 ''
DIMENSION 'stream_3' 'stream_3' absolute 1 1 ''
DIMENSION 'stream_5' 'stream_5' absolute 1 1 ''
DIMENSION 'stream_8' 'stream_8' absolute 1 1 ''
DIMENSION 'stream_9' 'stream_9' absolute 1 1 ''
CHART wmspanel_nimble.incoming_streams_RTMP '' 'Nimble incoming stream status (RTMP)' bool RTMP wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'stream_18' 'stream_18' absolute 1 1 ''
DIMENSION 'stream_4' 'stream_4' absolute 1 1 ''
DIMENSION 'stream_7' 'stream_7' absolute 1 1 ''
CHART wmspanel_nimble.incoming_streams_ICECAST '' 'Nimble incoming stream status (ICECAST)' bool ICECAST wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'stream_6' 'stream_6' absolute 1 1 ''
BEGIN wmspanel_nimble.incoming_streams_MPEGTS 0
SET "stream_0" = 1
SET "stream_1" = 1
SET "stream_10" = 1
SET "stream_11" = 1
SET "stream_12" = 1
SET "stream_13" = 0
SET "stream_14" = 0
SET "stream_15" = 0
SET "stream_16" = 1
SET "stream_17" = 1
SET "stream_19" = 1
SET "stream_2" = 1
SET "stream_20" = 1
SET "stream_21" = 1
SET "stream_22" = 1
SET "stream_23" = 1
SET "stream_24" = 1
SET "stream_25" = 1
SET "stream_26" = 1
SET "stream_27" = 1
SET "stream_28" = 1
SET "stream_3" = 1
SET "stream_5" = 1
SET "stream_8" = 1
SET "stream_9" = 1
END
BEGIN wmspanel_nimble.incoming_streams_RTMP 0
SET "stream_18" = 1
SET "stream_4" = 1
SET "stream_7" = 1
END
BEGIN wmspanel_nimble.incoming_streams_ICECAST 0
SET "stream_6" = 1
END

Then a 1-minute wait for the next collection (the log stays the same, nothing moves), then:

CHART wmspanel_nimble.incoming_streams_MPEGTS '' 'Nimble incoming stream status (MPEGTS)' bool MPEGTS wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'stream_0' 'stream_0' absolute 1 1 ''
DIMENSION 'stream_1' 'stream_1' absolute 1 1 ''
DIMENSION 'stream_10' 'stream_10' absolute 1 1 ''
DIMENSION 'stream_11' 'stream_11' absolute 1 1 ''
DIMENSION 'stream_12' 'stream_12' absolute 1 1 ''
DIMENSION 'stream_13' 'stream_13' absolute 1 1 ''
DIMENSION 'stream_14' 'stream_14' absolute 1 1 ''
DIMENSION 'stream_15' 'stream_15' absolute 1 1 ''
DIMENSION 'stream_16' 'stream_16' absolute 1 1 ''
DIMENSION 'stream_17' 'stream_17' absolute 1 1 ''
DIMENSION 'stream_19' 'stream_19' absolute 1 1 ''
DIMENSION 'stream_2' 'stream_2' absolute 1 1 ''
DIMENSION 'stream_20' 'stream_20' absolute 1 1 ''
DIMENSION 'stream_21' 'stream_21' absolute 1 1 ''
DIMENSION 'stream_22' 'stream_22' absolute 1 1 ''
DIMENSION 'stream_23' 'stream_23' absolute 1 1 ''
DIMENSION 'stream_24' 'stream_24' absolute 1 1 ''
DIMENSION 'stream_25' 'stream_25' absolute 1 1 ''
DIMENSION 'stream_26' 'stream_26' absolute 1 1 ''
DIMENSION 'stream_27' 'stream_27' absolute 1 1 ''
DIMENSION 'stream_28' 'stream_28' absolute 1 1 ''
DIMENSION 'stream_3' 'stream_3' absolute 1 1 ''
DIMENSION 'stream_5' 'stream_5' absolute 1 1 ''
DIMENSION 'stream_8' 'stream_8' absolute 1 1 ''
DIMENSION 'stream_9' 'stream_9' absolute 1 1 ''
CHART wmspanel_nimble.incoming_streams_RTMP '' 'Nimble incoming stream status (RTMP)' bool RTMP wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'stream_18' 'stream_18' absolute 1 1 ''
DIMENSION 'stream_4' 'stream_4' absolute 1 1 ''
DIMENSION 'stream_7' 'stream_7' absolute 1 1 ''
CHART wmspanel_nimble.incoming_streams_ICECAST '' 'Nimble incoming stream status (ICECAST)' bool ICECAST wmspanel_nimble_incoming_stream stacked 10000 60 '' 'python.d.plugin' 'nimble'
DIMENSION 'stream_6' 'stream_6' absolute 1 1 ''
BEGIN wmspanel_nimble.incoming_streams_MPEGTS 60059
SET "stream_0" = 1
SET "stream_1" = 1
SET "stream_10" = 1
SET "stream_11" = 1
SET "stream_12" = 1
SET "stream_13" = 0
SET "stream_14" = 0
SET "stream_15" = 0
SET "stream_16" = 1
SET "stream_17" = 1
SET "stream_19" = 1
SET "stream_2" = 1
SET "stream_20" = 1
SET "stream_21" = 1
SET "stream_22" = 1
SET "stream_23" = 1
SET "stream_24" = 1
SET "stream_25" = 1
SET "stream_26" = 1
SET "stream_27" = 1
SET "stream_28" = 1
SET "stream_3" = 1
SET "stream_5" = 1
SET "stream_8" = 1
SET "stream_9" = 1
END
BEGIN wmspanel_nimble.incoming_streams_RTMP 60059
SET "stream_18" = 1
SET "stream_4" = 1
SET "stream_7" = 1
END
BEGIN wmspanel_nimble.incoming_streams_ICECAST 60059
SET "stream_6" = 1
END
^C2021-02-04 13:32:18: python.d INFO: plugin[main] : exiting from main…

CTRL+C’d

(EDIT: copy/paste error)

It is a debug-level message, only seen if you add the debug CLI option.


and then waits for the next update (1 minute) to reprint the next set of values.

Unlikely, but… is it possible you added some sleep function? :thinking:

I am not sure what you mean…?

Yes, there is a call to time.sleep(), as recommended by the pseudo-code at External plugins overview · Netdata Agent | Learn Netdata (bottom).

The plugin code looks like this:

import time

def main_loop():
    # internal defaults for the command line arguments
    update_every = 60  # seconds
    update_every *= 1000  # work in milliseconds
    get_millis = lambda: int(round(time.time() * 1000))

    count = 0
    last_run = next_run = now = get_millis()
    while True:
        if next_run <= now:
            count += 1
            now = get_millis()
            while next_run <= now:
                next_run += update_every
            dt = now - last_run
            last_run = now
            # on the first iteration, don't set dt, allowing netdata to align itself
            if count == 1:
                dt = 0

            status = nimble_get_incoming_streams()
            netdata_generate_charts(status)
            netdata_generate_datapoints(status, dt)

        # sleep for a fraction of update_every, then recheck; keeping the
        # sleep outside the `if` avoids busy-looping between iterations
        time.sleep(update_every / 1000 / 10)
        now = get_millis()

That is the problem: with the framework you only need to implement the data collection logic; python.d.plugin executes the _get_data method every update_every seconds.
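For anyone landing here later, a minimal framework-style module looks roughly like this (a sketch, not the official template; the chart names and values are invented, and a tiny stand-in base class is included so the snippet is self-contained outside the netdata source tree):

```python
try:
    # inside the netdata tree, the real framework class is available
    from bases.FrameworkServices.SimpleService import SimpleService
except ImportError:
    # stand-in so this sketch runs standalone; the real class does much more
    class SimpleService(object):
        def __init__(self, configuration=None, name=None):
            self.order = []
            self.definitions = {}

ORDER = ['incoming_streams']

CHARTS = {
    'incoming_streams': {
        # options: name, title, units, family, context, charttype
        'options': [None, 'Incoming stream status', 'bool', 'streams',
                    'example.incoming_streams', 'stacked'],
        'lines': [
            ['stream_0', 'stream_0', 'absolute'],
        ],
    },
}

class Service(SimpleService):
    def __init__(self, configuration=None, name=None):
        SimpleService.__init__(self, configuration=configuration, name=name)
        self.order = ORDER
        self.definitions = CHARTS

    def get_data(self):
        # only the collection logic lives here; python.d.plugin calls this
        # every update_every seconds and emits BEGIN/SET/END itself
        return {'stream_0': 1}
```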

Yes… to go back to the beginning of the thread, I attempted to approach things this way because the number of dimensions in my charts is dynamic (e.g. when a new stream shows up).

The example plugin implementations use a static structure for storing charts/dimensions, which is only transmitted to Netdata once at plugin startup.

If you know how to force the refresh of dimensions, I’d be happy to move my collector to a child of the SimpleService class :slight_smile:

At the same time, I do believe I followed the documentation to the letter, so the documentation is a bit misleading…

Hi all,

I come back to resolve my own issue and share the solution, in case anyone else has the same need of:

  1. Having a dynamic number of charts
  2. Having dynamic dimensions inside those charts.

The way I got it to work is to implement a child of SimpleService.
Then, in the get_data() function:

  1. Redefine self.order = [] and self.definitions = { }.
  2. Clear self.charts: you need to remake the call that is done (for you) in the SimpleService constructor, i.e. self.charts = Charts(job_name=self.actual_name, priority=configuration["priority"], cleanup=configuration["chart_cleanup"], get_update_every=self.get_update_every, module_name=self.module_name). You will need to have deep-copied the configuration OrderedDict that Netdata gave you in your own class constructor.
  3. Fill in your new self.order and self.definitions, based on the templates, dynamically constructed from the data that you have gathered.
  4. Explicitly call SimpleService.create(self) from within get_data(). This will regenerate the chart content and transmit the commands to the backend.
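Steps 1 and 3 (rebuilding order/definitions from the collected data) can be sketched as a plain function; the shape of `status` and the chart names below are hypothetical:

```python
def build_charts(status):
    """Rebuild order/definitions dynamically from collected data.

    `status` is assumed to map protocol -> list of stream ids, e.g.
    {'RTMP': ['stream_4', 'stream_7'], 'ICECAST': ['stream_6']}.
    """
    order = []
    definitions = {}
    for protocol in sorted(status):
        chart_id = 'incoming_streams_{0}'.format(protocol)
        order.append(chart_id)
        definitions[chart_id] = {
            # options: name, title, units, family, context, charttype
            'options': [None,
                        'Nimble incoming stream status ({0})'.format(protocol),
                        'bool', protocol,
                        'wmspanel_nimble_incoming_stream', 'stacked'],
            # one dimension per currently known stream
            'lines': [[s, s, 'absolute'] for s in sorted(status[protocol])],
        }
    return order, definitions
```

In get_data() you would then assign these to self.order / self.definitions, rebuild self.charts as in step 2, and call SimpleService.create(self) as in step 4.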

If any netdata team member is watching this thread, I would like to kindly point out that:

  1. the claim that “Any program that can print a few values to its standard output can become a Netdata external plugin” is a bit misleading, as the plugin scheduling basically makes it impossible to just output data and have it work alongside everything else (on this page → External plugins overview · Netdata Agent | Learn Netdata )
  2. As I have already stated in another thread, there is next to no documentation in the source code… I don’t know if you have an internal codebase with comments that you strip out for release, or if you just remember everything, but it makes it unnecessarily complex to figure out the inner workings…

Many thanks
Quentin


@Skykeeper

This is invaluable feedback, I can’t thank you enough! We are actually working on a Python collector guide, and your feedback will probably be incorporated into it, so you are actively contributing to making this easier for everyone :v:.

If you are interested, I would love your feedback ahead of the guide’s release. Pinging our documentation guru @joel for extra visibility.

Many thanks and I will come back here :slight_smile:

@OdysLam

I am happy to help if you think my contribution can be worthwhile.
Do not hesitate to ping me or email through the registered address.

Cheers

Just chiming in in case others find it useful: I think the alarms collector also has a sort of example of dynamically creating charts and/or dimensions on the fly as needed,

using the add_dimension() and del_dimension() methods of self.charts.

For adding charts dynamically, I have also had a need to do this recently, and there is an example of how I got it to work here: netdata/aggregator.chart.py at aggregator-collector · andrewm4894/netdata · GitHub

Basically, use the add_chart() method as needed.

I’m not sure if that’s quite the same as the use case here, but it felt close enough to share in case anyone finds it useful in the future.
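The add_dimension() pattern can be sketched like this; the chart class below is a tiny stand-in for the real self.charts entry, included only so the snippet is self-contained:

```python
class FakeChart(object):
    """Tiny stand-in for a python.d chart object (illustration only)."""
    def __init__(self):
        self.dimension_ids = set()

    def __contains__(self, dim_id):
        return dim_id in self.dimension_ids

    def add_dimension(self, dimension):
        # the real framework re-emits the CHART/DIMENSION lines for the new dim
        self.dimension_ids.add(dimension[0])

def collect_and_grow(chart, data):
    # announce any dimension we have not seen before, then return the data
    for dim_id in sorted(data):
        if dim_id not in chart:
            chart.add_dimension([dim_id, dim_id, 'absolute'])
    return data
```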