We are very excited to beta launch our new “Anomaly Advisor” feature for early adopters in the Netdata community. The Anomaly Advisor builds on the recent ML capabilities we have added to the Netdata Agent in order to easily surface potentially anomalous charts and metrics.
What is the “Anomaly Advisor”?
The Anomaly Advisor adds a new “Anomalies” tab to Netdata Cloud. There you can quickly scan for periods of time with an elevated number of anomalous metrics, then highlight a time window of interest to surface a list of metrics sorted by how anomalous they were during that window.
Here is a quick sneak peek video of the feature, and here is a slightly more extended one where we run a little chaos engineering attack on some nodes and see how it plays out in the Anomaly Advisor.
Getting Started
To enable the Anomaly Advisor you must first enable ML on your nodes via a small config change in netdata.conf. Once the anomaly detection models have trained on the agent (with default settings this takes a couple of hours until enough data has been seen to train the models) you will then be able to enable the Anomaly Advisor feature in Netdata Cloud.
1. Enable ML on Netdata Agent
To enable ML on your Netdata Agent you just need to edit the [ml] section in your netdata.conf to look something like below.
Once done, restart Netdata with a command like sudo systemctl restart netdata for the config changes to take effect. You can find more info on restarting Netdata here.
At a minimum you just need to set enabled = yes to enable ML with default params. More details can be found in the Netdata Agent ML docs.
[ml]
enabled = yes
# maximum num samples to train = 14400
# minimum num samples to train = 3600
# train every = 3600
# num samples to diff = 1
# num samples to smooth = 3
# num samples to lag = 5
# maximum number of k-means iterations = 1000
# dimension anomaly score threshold = 0.99
# host anomaly rate threshold = 0.01000
# minimum window size = 30.00000
# maximum window size = 600.00000
# idle window size = 30.00000
# window minimum anomaly rate = 0.25000
# anomaly event min dimension rate threshold = 0.05000
# hosts to skip from training = !*
# charts to skip from training = !* netdata.*
Note: follow this guide if you are unfamiliar with making configuration changes in Netdata.
2. Enable Anomaly Advisor in Netdata Cloud
To enable the Anomaly Advisor feature in Netdata Cloud itself you just need to set an anomaly_advisor feature flag to true in your browser.
Here is a short video showing how to do this.
While on Netdata Cloud, press F12 in your browser to open the developer tools. Open the “Application” tab and, under the “Local Storage” section for https://app.netdata.cloud, add a new key & value pair of anomaly_advisor & true. Once you refresh the page you should see the new “Anomalies” tab.
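If you prefer the console to clicking through the “Application” tab, the same key & value pair can be written through the browser's standard Web Storage API. The sketch below is only an illustration of that idea: the helper name enableAnomalyAdvisor and the KeyValueStorage interface are hypothetical and are not part of Netdata Cloud. Only the anomaly_advisor key and the true value come from the steps above.

```typescript
// Minimal slice of the Web Storage API that this sketch relies on.
// window.localStorage satisfies this shape in any modern browser.
interface KeyValueStorage {
  setItem(key: string, value: string): void;
  getItem(key: string): string | null;
}

// Hypothetical helper (not a Netdata API): writes the feature flag
// described above and reports whether it was stored.
function enableAnomalyAdvisor(storage: KeyValueStorage): boolean {
  // Local Storage only holds strings, so the flag is the string "true".
  storage.setItem("anomaly_advisor", "true");
  return storage.getItem("anomaly_advisor") === "true";
}

// In a browser, while on https://app.netdata.cloud, you would call:
//   enableAnomalyAdvisor(window.localStorage);
// The plain JavaScript equivalent, which can be pasted straight into the
// developer tools console, is:
//   localStorage.setItem("anomaly_advisor", "true");
```

As with the manual steps, refresh the page afterwards so the new tab can appear.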
Notes
- You can see a detailed list of notes relating to the anomaly detection capabilities of the Netdata Agent here.
- If you would like to learn in more detail how the Netdata Agent anomaly detection works please check out the Netdata Agent ML docs.
- The default configuration requires at least 3600 seconds (1 hour) of data and will (re)train every 3600 seconds. So after you enable ML on your node, it should take around 2 hours for the first set of models to be trained and anomaly rates to become available for use by the Anomaly Advisor in Netdata Cloud.
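The "around 2 hours" figure in the last note can be reproduced with a little arithmetic. The sketch below is a rough upper bound rather than a description of the agent's exact scheduling: it assumes one sample is collected per second (the agent's usual default collection interval) and that, in the worst case, a full "train every" period passes after enough data has been collected. The type and function names are made up for illustration.

```typescript
// Hypothetical container for the [ml] settings that affect the wait.
interface MlTrainingConfig {
  minSamplesToTrain: number;  // "minimum num samples to train"
  trainEverySeconds: number;  // "train every"
  updateEverySeconds: number; // assumed collection interval (1 second)
}

// Worst case: collect the minimum number of samples, then wait up to one
// full training period before the first models are produced.
function worstCaseFirstModelSeconds(cfg: MlTrainingConfig): number {
  const secondsToCollect = cfg.minSamplesToTrain * cfg.updateEverySeconds;
  return secondsToCollect + cfg.trainEverySeconds;
}

const defaults: MlTrainingConfig = {
  minSamplesToTrain: 3600,
  trainEverySeconds: 3600,
  updateEverySeconds: 1,
};

const hours = worstCaseFirstModelSeconds(defaults) / 3600;
console.log(`First models within roughly ${hours} hours`); // prints "First models within roughly 2 hours"
```

Lowering "minimum num samples to train" or "train every" in the [ml] section would shrink this estimate accordingly, at the cost of training on less data.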
Feedback
We’d love to hear any feedback you have on this thread. This feature is still very much in beta and so may be subject to change. We would love the Netdata community to help us shape this feature more and contribute to its further development in the coming months.