Scheduling

Hello.
I have two servers that are backed up at set times. Netdata sends problem reports to Telegram; httpcheck triggers them because the monitored site goes down during the backup. How can I reconfigure Netdata so that messages are not sent during a backup and its normal state is restored at the end of each backup?
Thanks.

Hi @maxz - Welcome!

I believe you can use the health management API to turn off alarms before the backup and then turn them back on after the backup.

You could schedule this as part of the backup process.
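As a rough sketch of that idea (not an official recipe), a wrapper could disable health checks, run the backup, and then reset the health state. It assumes the DISABLE ALL and RESET commands of the health management API, an agent on 127.0.0.1:19999, and a placeholder token; the function names and the NETDATA_URL and NETDATA_TOKEN variables are made up for illustration.

```shell
#!/bin/bash
# Assumed defaults: adjust the URL and token to your own agent.
NETDATA_URL="${NETDATA_URL:-http://127.0.0.1:19999}"
NETDATA_TOKEN="${NETDATA_TOKEN:-MyToken}"

# Send one command to the health management API.
# -G plus --data-urlencode lets curl encode the space in "DISABLE ALL".
health_cmd() {
    curl -sS -G "${NETDATA_URL}/api/v1/manage/health" \
        -H "X-Auth-Token: ${NETDATA_TOKEN}" \
        --data-urlencode "cmd=$1"
}

# Disable all health checks, run the given command, then reset.
# Returns the exit status of the wrapped command.
run_with_alarms_disabled() {
    health_cmd "DISABLE ALL" || return 1
    "$@"
    local status=$?
    # Re-enable health checks even if the wrapped command failed.
    health_cmd "RESET"
    return "$status"
}
```

The backup job would then be started as run_with_alarms_disabled followed by the usual backup command, for example from cron.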

Thank you very much, andrewm4894.

I applied this, but messages continue to come to Telegram:

#!/bin/bash
TIMESTAMP=$(date +%Y-%m-%d_%H-%M-%S)
echo "$TIMESTAMP"
curl "http://127.0.0.1:19999/api/v1/manage/health?cmd=DISABLE&context=KAT" -H "X-Auth-Token:MyToken"

This is the message in the log file:

2023-01-11_21-58-01
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    78  100    78    0     0  15600      0 --:--:-- --:--:-- --:--:-- 15600
Health checks disabled for alarms matching the selectors
Alarm selector added

What could be the problem?

root@1-0-3-0:/opt/netdata/etc/netdata# curl "http://127.0.0.1:19999/api/v1/manage/health?cmd=LIST" -H "X-Auth-Token: MyToken"
{
    "all": false,
    "type": "DISABLE",
    "silencers": [
        {
            "context": "KAT"
        }
    ]
}

Hi @maxz ,

Is there a reason why you're providing context=KAT? You want to disable all alarms, right?

If so, try the command without providing any context. The context selector is only for disabling alarms on charts linked to that context:

  • context : Chart context, as shown on the dashboard. These will match the on entry of a configured template.

When you disable all your LIST command should show

Regards,
Hugo

Hi,Hugo. Big thanks.
The httpcheck.conf file describes two servers that are backed up at different times. I have to turn off alarms at different times, and only the alarms for a specific server.

Got it, thanks for the clarification.

The context won't allow that, because contexts are internal Netdata names; for httpcheck, for example, the context is something like httpcheck.response_time or httpcheck.response_length. Please check the screenshot below:

The other options you have are to define silencers for:

  • chart: Chart ids/names, as shown on the dashboard. These will match the on entry of a configured alarm.
  • hosts : The hostnames that will need to match.

Checking the httpcheck config, we don't have a host attribute anywhere, so this is probably for other collectors, but you could try it.

My best guess would be chart, since you can use the chart name as it appears on the dashboard, e.g. httpcheck_Bangalore_Demo_Site__CloudFlare_.request_status from the screenshot above. You can confirm yours on the dashboard or with a request to https://127.0.0.1:/api/v1/charts.
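To find the exact chart names, something like the following might help. It is only a sketch: it assumes the agent listens on http://127.0.0.1:19999 and that the httpcheck chart ids start with httpcheck_ as in the example name above. The function names and the NETDATA_URL variable are made up for illustration.

```shell
#!/bin/bash
# Assumed default: adjust to your own agent address.
NETDATA_URL="${NETDATA_URL:-http://127.0.0.1:19999}"

# Read the /api/v1/charts JSON on stdin and print every unique quoted
# string that starts with "httpcheck_" (the assumed chart id prefix).
extract_httpcheck_charts() {
    grep -o '"httpcheck_[^"]*"' | tr -d '"' | sort -u
}

# Fetch the chart list from the agent and filter it.
list_httpcheck_charts() {
    curl -sS "${NETDATA_URL}/api/v1/charts" | extract_httpcheck_charts
}
```

Calling list_httpcheck_charts on the monitored node should print one candidate chart name per line, which can then be compared with what the dashboard shows.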

Correction: I was looking at the definition of an alert in httpcheck.conf, and we have something like:

template: httpcheck_web_service_timeouts
 families: *
       on: httpcheck.status
    class: Latency
     type: Web Server
component: HTTP endpoint
   lookup: average -5m unaligned percentage of timeout
    every: 10s
    units: %
     warn: $this >= 10 AND $this < 40
     crit: $this >= 40
    delay: down 5m multiplier 1.5 max 1h
     info: percentage of timed-out HTTP requests to ${label:url} in the last 5 minutes
       to: webmaster

So since chart is the name that matches the on line of the configuration, it doesn't have the server-specific part in the name…

I think we’ll need some help from someone on the Agent team, maybe @Manolis_Vasilakis you could help here?

Cheers,
Hugo

I tried to solve this problem but couldn't. Can anyone please share the solution?

Hi! So, assuming you have 2 httpcheck web sites to monitor:

Check the alerts that are running on netdata by checking http://127.0.0.1:19999/api/v1/alarms?all

The http check alerts should be listed there (there are a few, e.g. timeouts, status, etc for each site you’re monitoring).

You can disable all the alerts that are active on a specific chart using:

curl 'http://127.0.0.1:19999/api/v1/manage/health?cmd=DISABLE&chart=httpcheck_cool_website.request_status' -H 'X-Auth-Token: MyToken'

You can get the chart value from the api call above. Does this work in your case @maxz ?
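For two sites backed up at different times, a per-chart wrapper along these lines might do it. This is a sketch under assumptions: the agent address and token are placeholders, the function names and the NETDATA_URL and NETDATA_TOKEN variables are invented for the example, and the chart name is whatever the alarms API reports for your site.

```shell
#!/bin/bash
# Assumed defaults: adjust the URL and token to your own agent.
NETDATA_URL="${NETDATA_URL:-http://127.0.0.1:19999}"
NETDATA_TOKEN="${NETDATA_TOKEN:-MyToken}"

# Call the health management API with key=value pairs, e.g.
#   health_api "cmd=DISABLE" "chart=some_chart"
# curl URL-encodes each value and joins the pairs with "&".
health_api() {
    local args=()
    local pair
    for pair in "$@"; do
        args+=(--data-urlencode "$pair")
    done
    curl -sS -G "${NETDATA_URL}/api/v1/manage/health" \
        -H "X-Auth-Token: ${NETDATA_TOKEN}" "${args[@]}"
}

# Disable alerts on one chart, run the backup command, then reset.
# Usage: backup_with_chart_disabled CHART COMMAND [ARGS...]
backup_with_chart_disabled() {
    local chart=$1
    shift
    health_api "cmd=DISABLE" "chart=${chart}" || return 1
    "$@"
    local status=$?
    # RESET restores the default health state once the backup ends.
    health_api "cmd=RESET"
    return "$status"
}
```

Each backup job would pass its own chart, e.g. httpcheck_cool_website.request_status, followed by the backup command. As far as I understand it, RESET clears every silencer, so if the two backups could overlap, the first one to finish would re-enable alerts for the other site as well.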

@TradeLabelSoftware Is it something similar you’re trying to do?
