Metrics Correlation



  • It is time-consuming to go through thousands of metrics and hundreds of charts to manually identify which metrics are correlated.

    [Figure: Correlated Metrics]

    For a specific time window in which an anomaly occurred, it should be possible to automatically find the subset of charts that show an anomaly in that same window.
    Let’s say a software issue resulted in high CPU and disk load for thirty seconds during an outage. Selecting that time window on the CPU load chart gives users the option to ask Netdata to find correlated charts. Netdata can then display the disk load chart automatically, while hiding charts that are not relevant.
    This lets you quickly narrow down the metrics affected by a specific issue, saving a lot of time when debugging software or infrastructure problems.
    We also want users to be able to share the correlations they find with their team via a link.
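
    To make the idea of searching a selected window more concrete, here is a minimal sketch in Python. It is not Netdata's implementation: it assumes a simple scoring rule (how many baseline standard deviations the window mean has moved), and the function names, chart names, threshold, and sample data are all made up for illustration.

```python
from statistics import mean, pstdev


def window_shift_score(baseline, window):
    """Return how many baseline standard deviations the window mean moved."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # Flat baseline: any change at all counts as a large shift.
        return 0.0 if mean(window) == mu else float("inf")
    return abs(mean(window) - mu) / sigma


def find_correlated_charts(charts, start, end, threshold=3.0):
    """Rank charts whose values shifted inside the selected window.

    charts: dict mapping a (hypothetical) chart name to a list of samples,
            assumed here to be one sample per second.
    start, end: sample indices delimiting the selected window [start, end).
    threshold: assumed cut-off; charts scoring below it are hidden.
    """
    results = []
    for name, samples in charts.items():
        baseline = samples[:start]  # everything before the selected window
        window = samples[start:end]
        if len(baseline) < 2 or not window:
            continue  # not enough data to compare
        score = window_shift_score(baseline, window)
        if score >= threshold:
            results.append((name, score))
    # Strongest shift first.
    return sorted(results, key=lambda item: item[1], reverse=True)


# Made-up data: CPU and disk load spike together for 30 seconds,
# while network traffic keeps its usual pattern.
charts = {
    "cpu_load": [10, 12, 11, 9] * 15 + [90] * 30,
    "disk_load": [5, 6, 5, 4] * 15 + [80] * 30,
    "network": [20, 21, 19, 20] * 15 + [20, 21, 19, 20, 20, 21] * 5,
}

for name, score in find_correlated_charts(charts, start=60, end=90):
    print(name, round(score, 1))
```

    With this data, only the CPU and disk charts pass the threshold, so the network chart would be hidden. A real implementation would likely want a more robust comparison than a mean shift, for example a two-sample statistical test between the baseline and the window.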


  • Staff

    Working on this at the moment. Happy to chat about it in more detail with anyone interested; just reply here.


  • Staff

    Thanks @andrewm4894 for keeping this up to date!