Need help on creating basic custom alarm.

Environment

Ubuntu 18.04 / netdata v1.33.0-50-g49d5e73c8

Problem/Question

I created a custom python.d monitoring script to monitor the status of UFW.

# -*- coding: utf-8 -*-
# Description: Check UFW status using Python

from bases.FrameworkServices.ExecutableService import ExecutableService
from bases.collection import find_binary

priority = 90000

ORDER = [
    'Status',
]

CHARTS = {
    'Status': {
        'options': ['UFW Status', '0 disable, 1 enable', None, None, None, 'line'],
        'lines': [
            ['status']
        ]
    }
}

SUDO = 'sudo'
UFWSTATUS = 'ufw'


class Service(ExecutableService):
    def __init__(self, configuration=None, name=None):
        ExecutableService.__init__(self, configuration=configuration, name=name)
        self.order = ORDER
        self.definitions = CHARTS
        self.num_lines = 1
        self.lower = 0
        self.upper = 2

    @staticmethod
    def check():
        return True

    def get_data(self):
        data = dict()
        if 'Status' not in self.charts['Status']:
            self.charts['Status'].add_dimension(['Status'])
        sudo_binary = find_binary(SUDO)
        ufwstatus_binary = find_binary(UFWSTATUS)

        # Run "sudo ufw status"; the first output line is expected to be
        # "Status: active" or "Status: inactive".
        command = [sudo_binary, ufwstatus_binary, 'status']
        allowed = self._get_raw_data(command=command)
        if not allowed:
            # The command failed or printed nothing, so report no data.
            return None
        if allowed[0].strip() == 'Status: inactive':
            data['status'] = 0
        else:
            data['status'] = 1

        return data

Basically, this creates a chart with a single value, 0 or 1, depending on the status of UFW.
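
For reference, the 0/1 mapping can be tried outside Netdata. This is a minimal sketch, assuming the first line of the `ufw status` output is either "Status: active" or "Status: inactive"; the function name parse_ufw_status is hypothetical and is not part of the module or of Netdata.

```python
def parse_ufw_status(lines):
    """Return 0 if UFW reports inactive, 1 otherwise, None if there is no output."""
    if not lines:
        return None
    first_line = lines[0].strip()
    return 0 if first_line == 'Status: inactive' else 1


# Hypothetical sample outputs, given as lists of lines.
print(parse_ufw_status(['Status: inactive\n']))
print(parse_ufw_status(['Status: active\n', '\n', 'To  Action  From\n']))
```

Anything other than an exact "Status: inactive" first line is reported as 1, which matches the else branch in get_data().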

I would like to create an alarm that is raised when the status is 0, but I can’t work out how to do it from the documentation.
I looked for an example in the existing health configuration files but didn’t find one.
Does anyone have an example I could refer to?

Thanks for the help.

DeWaRs

Hi, @DeWaRs1206.

Which documentation are you referring to? Did you check the Health Configuration Reference? See Example 2 - disk space; it is close to what you need.

If you still have problems and the alarm doesn’t work, share the alarm you created and we will help you spot the problem.

@ilyam8 thanks a lot for the answer. I was indeed checking the wrong documentation: Configure health alarms | Learn Netdata

This seems to work as expected.

alarm: ufw_status
on: ufw.UFW_Status
calc: $status
every: 1m
warn: $this = 0
crit: $this = 0
repeat: warning 120s critical 10s

I will read a bit more to see what could be improved.

Thanks again for your help.

@ilyam8 I have some weird behaviour with my alarm. My alarm is the following:

alarm: ufw_status
on: ufw.UFW_Status
calc: $status
every: 1m
warn: $this = 0
crit: $this = 0
repeat: critical 120s

My status is 0 in my chart:

I receive an initial alarm as expected, but one minute later, I have another alarm saying everything is fixed, which is wrong:

This is how the alarm appears in the Netdata dashboard:

Any idea what I’m missing?

Thanks again for your help.

EDIT:

I think my misunderstanding is about the “every” option. The documentation says “Sets the update frequency of this alarm.” but I’m not sure what this means. While my UFW status is still inactive, the alarm is “cleared” at the next occurrence of that interval, which seems weird to me…
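
To make the expectation concrete: if “every” means the expressions are re-evaluated at each interval against the current value, a status that stays at 0 should stay critical. The sketch below is a toy model of that expectation only, not Netdata’s implementation; the function name evaluate_alarm and the sample values are hypothetical.

```python
def evaluate_alarm(value):
    """Toy model: crit is checked first, then warn; otherwise the alarm is clear."""
    crit = value == 0  # mirrors "crit: $this = 0"
    warn = value == 0  # mirrors "warn: $this = 0"
    if crit:
        return 'CRITICAL'
    if warn:
        return 'WARNING'
    return 'CLEAR'


# Hypothetical values of $status, one per "every" interval (1m apart).
samples = [0, 0, 0, 1]
states = [evaluate_alarm(v) for v in samples]
print(states)
```

Under this model a CLEAR state only appears once the value becomes 1, which is why a “cleared” notification while UFW is still inactive looks unexpected.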