Generic Prometheus/OpenMetrics collector

We recently deployed a generic Prometheus collector that works seamlessly with any application that makes its metrics available in the Prometheus/OpenMetrics exposition format, including support for Windows 10 via windows_exporter. Netdata will autodetect over 600 Prometheus endpoints and instantly generate charts with all the exposed metrics, meaningfully visualized. It will be part of the v1.24 stable release, but some of you will have already seen it in the nightly builds.

You can also configure the collector with the names and URLs of additional Prometheus endpoints. Netdata then generates charts for all the metrics they expose, at the same high-granularity, per-second frequency you expect, all in real time.
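For readers who have not looked at the exposition format itself, the sketch below shows roughly what a collector has to read from an endpoint. It is a simplified illustration, not Netdata's implementation: the function name parse_exposition, the regular expressions, and the sample metrics are all assumptions made for this example, and it ignores details such as escape sequences inside label values and optional timestamps.

```python
import re

# Simplified patterns (assumptions for this sketch, not a full parser):
# a metric name, an optional {label="value",...} block, then a numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Hypothetical endpoint output, made up for illustration.
EXAMPLE = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
process_open_fds 12
"""

if __name__ == "__main__":
    for name, labels, value in parse_exposition(EXAMPLE):
        print(name, labels, value)
```

Each distinct combination of metric name and label values is one time series, which is why a single endpoint can produce a large number of chart dimensions.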

Why does it matter for Netdata?

Netdata is designed to be the best tool possible for collecting every single metric available from any system or application, then presenting those metrics in a way that makes them understandable and actionable. That goal means Netdata must, by definition, be open, interoperable, and extensible so that it can work with the entirety of modern infrastructure.

This new enhancement is exciting because it radically extends the number of metrics available to Netdata out of the box, but also because it enables Netdata to support an evolving standard that will allow us to continue to fit into any technology stack possible, now and in the future, with no limitations on the number, kind, or frequency of metrics collected.

Limitations and future direction

This first attempt at visualizing Prometheus metrics with zero configuration has a few limitations that we are aware of.

Metrics with the same name and many different label key-value pairs can potentially have a very high cardinality, that is, too many time series to visualize in a single chart. We debated automatically splitting such time series into multiple charts based on the cardinality of the detected label key-value pairs, but we decided that using the wrong labels to organize information is worse than not choosing any labels at all. So we opted to release this first version with a temporary solution that splits the time series arbitrarily, and we are working on configuration options that will let users define meaningful groupings. For example, for Prometheus metric X, group the time series into charts based on labels Y and Z.
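To make the "group metric X by labels Y and Z" idea concrete, here is a minimal sketch of what such a grouping could look like. It is only an illustration of the concept: the function group_into_charts, its parameters, and the sample data are hypothetical and do not describe the configuration options or the implementation Netdata will ship.

```python
from collections import defaultdict


def group_into_charts(samples, metric, group_by):
    """Group the time series of one metric into charts keyed by label values.

    samples  : iterable of (name, labels_dict, value) tuples
    metric   : the metric name to chart (the "X" in the text)
    group_by : label names that define a chart (the "Y and Z" in the text)
    """
    charts = defaultdict(dict)
    for name, labels, value in samples:
        if name != metric:
            continue
        # One chart per distinct combination of the chosen label values.
        chart_key = tuple(labels.get(key, "") for key in group_by)
        # The remaining labels identify the dimension (line) inside the chart.
        dimension = ",".join(
            f"{k}={v}" for k, v in sorted(labels.items()) if k not in group_by
        ) or name
        charts[chart_key][dimension] = value
    return dict(charts)


# Hypothetical samples, made up for illustration.
SAMPLES = [
    ("http_requests_total", {"service": "api", "method": "get", "code": "200"}, 1027.0),
    ("http_requests_total", {"service": "api", "method": "get", "code": "500"}, 4.0),
    ("http_requests_total", {"service": "api", "method": "post", "code": "200"}, 3.0),
    ("http_requests_total", {"service": "web", "method": "get", "code": "200"}, 88.0),
]

if __name__ == "__main__":
    charts = group_into_charts(SAMPLES, "http_requests_total", ["service", "method"])
    for key, dimensions in charts.items():
        print(key, dimensions)
```

With service and method as the grouping labels, the four series above fall into three charts, and the code label distinguishes the dimensions within each chart. Choosing different grouping labels changes which series share a chart, which is why we want that choice to be in the user's hands.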

Zero-configuration autodetection currently works for services running on the same host as the Netdata Agent. You can configure additional endpoints as with any Netdata collector. We will be extending our service discovery capabilities so that we can discover as many OpenMetrics endpoints as possible in Docker and Kubernetes.

When you use Netdata’s built-in, long-term storage (dbengine), the memory usage is currently directly related to the number of dimensions collected and stored in the database. We are working to significantly reduce that memory footprint, so we can collect arbitrary numbers of metrics, and store them for arbitrarily long periods of time, with limited memory requirements.

We need your help!

We are very proud of the direction we are taking with service discovery and automated OpenMetrics, but we need your feedback to improve. Let’s exchange ideas about the new collector here!