Alert configuration httpcheck timeout

I am using httpcheck to verify that my websites are up and running.
I set this up for several websites in go.d/httpcheck.conf.
This works very well.

The problem is with the timeout alerts defined in health.d/httpcheck.conf.
The response times are good and I expect no alerts to be raised.
However, they are often raised and then recovered.

Looking in the alert definition:

 template: httpcheck_web_service_timeouts
       on: httpcheck.status
    class: Latency
     type: Web Server
component: HTTP endpoint
   lookup: average -5m unaligned percentage of timeout
    every: 10s
    units: %
     warn: $this >= 10 AND $this < 40
     crit: $this >= 40
    delay: down 5m multiplier 1.5 max 1h
  summary: HTTP check for ${label:url} timeouts
     info: Percentage of timed-out HTTP requests to ${label:url} in the last 5 minutes
       to: webmaster

This alert seems to calculate average timeouts, so response times are related to each other…
I would like to be alerted only when the timeout specified in go.d/httpcheck.conf is reached several times, say 3, within the last 2 minutes or so.

How should I change the settings in go.d/httpcheck.conf and the alert definition in health.d/httpcheck.conf?

Hi, @JB_Walton. The response times don’t matter. Your alert says: “if the httpcheck status was ‘timeout’ for 10% or more of the last 5 minutes, raise a warning.”
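
To make the arithmetic concrete, here is a minimal Python sketch of the percentage-of-timeout idea behind the alert quoted above. It is not Netdata's implementation: the function names, the one-check-per-second sampling interval, and the count-based variant are assumptions for illustration only, and the alert's delay line is not modelled.

```python
# Simplified model of "percentage of timeout over the last 5 minutes".
# Assumption: one status sample per check; Netdata's real lookup works on
# stored chart data, so treat this only as an illustration of the arithmetic.

def timeout_percentage(samples):
    """Return the share of samples (0-100) whose status is 'timeout'."""
    if not samples:
        return 0.0
    timeouts = sum(1 for s in samples if s == "timeout")
    return 100.0 * timeouts / len(samples)


def alert_level(pct):
    """Map a percentage to the warn/crit thresholds from the alert definition."""
    if pct >= 40:
        return "CRITICAL"
    if pct >= 10:
        return "WARNING"
    return "CLEAR"


def count_rule(samples, threshold=3):
    """Hypothetical count-based rule: at least `threshold` timeouts in the window."""
    return sum(1 for s in samples if s == "timeout") >= threshold


# Assumed 5-minute window with one check per second: 300 samples.
window = ["success"] * 270 + ["timeout"] * 30
pct = timeout_percentage(window)
print(pct, alert_level(pct), count_rule(window))
```

In this model, 30 timed-out samples out of 300 is exactly 10%, which is enough for a warning regardless of how fast the successful responses were.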

Thanks,

I am missing something I guess.

You say that response times don’t matter.
Only timeout is checked.

What defines “timeout”?

The httpcheck.status metric is the status of HTTP check. Possible values are:

  • success: got response and the response status code is in configured status_accepted (200 by default).
  • bad_status: got response but the response status code is not in configured status_accepted (not 200 by default).
  • bad_content: got response and the response status code is in status_accepted but unexpected body content (when using response_match).
  • bad_header: got response and response status code is in status_accepted but unexpected headers (when using headers_match).
  • timeout: no response within configured timeout (1 second by default).
  • no_connection: connection refused.
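
The status list above can be read as a decision procedure. The following Python sketch mirrors it under stated assumptions: the function `classify` and its parameters are hypothetical, and the order in which the conditions are tested is a guess rather than the collector's actual code.

```python
# Hypothetical classifier mirroring the status descriptions above.
# Assumption: connection problems are checked first, then the status code,
# then body content, then headers.

def classify(status_code=None, timed_out=False, refused=False,
             status_accepted=(200,), body_matches=True, headers_match=True):
    """Return one of the httpcheck.status values for a single check."""
    if timed_out:
        return "timeout"          # no response within the configured timeout
    if refused:
        return "no_connection"    # connection refused
    if status_code not in status_accepted:
        return "bad_status"       # response code not in status_accepted
    if not body_matches:
        return "bad_content"      # body did not match response_match
    if not headers_match:
        return "bad_header"       # headers did not match the header rules
    return "success"


print(classify(status_code=200))
print(classify(timed_out=True))
print(classify(status_code=503))
```

Note that a slow but completed response still classifies as success here; only a check that exceeds the configured timeout counts as timeout.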

Thanks,

I think I understand it.
I have set the timeout in go.d/httpcheck.conf.

In the Netdata httpcheck.status chart I can see the timeouts occurring.
Now I just need to find out from my server logs what is causing those timeouts.
I could not find it quickly in my Debian /var/log files, so I will have to dig into it.

Added a description of the statuses to the docs in 18351.