When deploying NetData automatically, we can't run edit-config on each host whenever we need to modify a stock configuration. Instead, we write a file such as
/etc/netdata/health.d/my_httpcheck.conf to override a single definition from the stock configuration.
This no longer seems to work. Not sure since when, but with the latest NetData agent those files are simply ignored.
Has this been removed deliberately?
What’s the recommended approach instead to modify a single component in a config file? We don’t want to copy all of the configuration files to make modifications, we only want to modify individual components.
Hi, @jurgenhaas. Overriding the alert configuration the way you did should work. I just tried it:
cd health.d/ && mv entropy.conf my_entropy.conf
@@ -15,5 +15,5 @@ component: Cryptography
warn: $this < (($status >= $WARNING) ? (200) : (100))
delay: down 1h multiplier 1.5 max 2h
- info: minimum number of entries in the random numbers pool in the last 5 minutes
+ info: MY minimum number of entries in the random numbers pool in the last 5 minutes
systemctl restart netdata
The changes have been applied.
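For automated deployments, the same steps can be scripted. The following is only a sketch: `make_override` is a hypothetical helper, not part of Netdata, and the directory paths in the example call are assumptions that differ between install types.

```shell
#!/bin/sh
# Hypothetical helper (not a Netdata tool): copy one stock health file into
# the user health directory under a "my_" name and patch a single line.
make_override() {
    stock_dir=$1   # directory holding the stock health files
    user_dir=$2    # directory holding user overrides
    name=$3        # file name, e.g. entropy.conf
    sed_expr=$4    # sed expression describing the one change

    mkdir -p "$user_dir" || return 1
    # The stock file is left untouched; only the copy is modified.
    sed "$sed_expr" "$stock_dir/$name" > "$user_dir/my_$name"
}

# Example call (commented out; both paths are assumptions):
# make_override /usr/lib/netdata/conf.d/health.d /etc/netdata/health.d \
#     entropy.conf 's/info: minimum/info: MY minimum/' &&
#     systemctl restart netdata
```

Passing the directories as arguments keeps the helper usable on hosts where Netdata lives under /opt/netdata.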
@jurgenhaas possible problems:
Invalid syntax. Can you run grep "Health configuration" error.log right after the restart?
/etc/netdata/health.d/my_httpcheck.conf is not readable by
health configuration directory. To check:
$ curl -s "localhost:19999/netdata.conf" | grep "health configuration directory"
# stock health configuration directory = /opt/netdata/usr/lib/netdata/conf.d/health.d
# health configuration directory = /opt/netdata/etc/netdata/health.d
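On many hosts this check can be automated. A minimal sketch, assuming the netdata.conf output has the same "key = value" shape as shown above; `health_dir_from_conf` is a made-up name for illustration.

```shell
#!/bin/sh
# Read netdata.conf text on stdin and print the user (non-stock) health
# configuration directory. Lines starting with "stock" do not match.
health_dir_from_conf() {
    sed -n 's/^[[:space:]#]*health configuration directory[[:space:]]*=[[:space:]]*//p'
}

# Usage against a running agent (commented out; needs a local Netdata):
# dir=$(curl -s "localhost:19999/netdata.conf" | health_dir_from_conf)
# ls -l "$dir/my_httpcheck.conf"
```

The `ls -l` output then shows whether the override file sits in the directory the agent actually reads, and with which owner and permissions.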
Thanks @ilyam8, this does indeed seem to work. Not sure why we were under the impression it didn't. Something was pretty strange, but following your instructions above we've rebuilt everything and the behavior now looks OK.
Only one caveat we’re seeing in the error.log:
netdata ERROR : MAIN : Health configuration template 'httpcheck_web_service_slow' already exists for host 'mweb1'. (errno 2, No such file or directory)
This message appears twice after each restart. The config file looks like this and seems to work:
type: Web Server
component: HTTP endpoint
lookup: average -3m unaligned of time
warn: ($this > ($httpcheck_1h_web_service_response_time * 1) )
crit: ($this > ($httpcheck_1h_web_service_response_time * 2) )
delay: down 1m multiplier 1.5 max 1h
info: MY average HTTP response time over the last 3 minutes, compared to the average over the last hour
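The "already exists" message suggests that the same template name is defined in more than one loaded file. To see which names are affected, a rough sketch like the following can help. It assumes the usual `template:` / `alarm:` line format, and `duplicate_health_names` is a hypothetical helper name.

```shell
#!/bin/sh
# Print every alarm/template name that is defined more than once across the
# given health directories (for example the stock and the user directory).
duplicate_health_names() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        cat "$dir"/*.conf 2>/dev/null
    done |
    awk -F: '/^[ \t]*(template|alarm)[ \t]*:/ {
        name = $2
        gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding whitespace
        print name
    }' | sort | uniq -d
}

# Example call (commented out; directories taken from the output above):
# duplicate_health_names /opt/netdata/usr/lib/netdata/conf.d/health.d \
#     /opt/netdata/etc/netdata/health.d
```

A name listed here is defined at least twice, which is consistent with the log line, although it does not by itself show which definition the agent ends up using.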
@jurgenhaas changed the severity level to info in 12873.