Monitoring Ethereum and Bitcoin Full Nodes with Netdata

Hey everyone,

Over the last few days, I have been doing some experiments around monitoring blockchain-related applications with Netdata, which I think is a natural fit.

In case you are not familiar with the space, here is an awesome primer on the main concepts behind blockchain:

Why Netdata + Blockchain

Full-Node operators

First of all, there are roughly 2 ways that a full-node operator can make money. The first is to be a miner as well, which means they invest value into the system and boost its security. In return, they get the system’s reward.

Roughly, there are 2 types of value that a miner can invest. In PoW systems, the miner invests computational power, which comes at a high cost in chips and electricity. In PoS systems, the miner (strictly speaking, a validator) invests capital in the form of the system’s native cryptocurrency.

In both systems, they are incentivized to play nicely: if they don’t, they lose part of the value they have invested. In PoW that is wasted computation, and in PoS it is part of the staked capital.

The other group of full-node operators are those who offer their full node as a service: they receive transactions, verify them and broadcast them to the network for miners to pick up and bundle into blocks.

In the case of an outage, both kinds of operator lose money. The miner loses because their node stops supplying the system with security (so the system punishes them). The full-node operator loses a particular time window for a particular transaction.

That second cost matters most in trading, where the right transaction must be submitted in time to be included and turn a profit (e.g. arbitrage). Ofc, if they offer ‘Full-node as a service’ (e.g. Infura), they also lose clients.

In both cases, we have very high stakes, where people need to monitor their systems and ensure a smooth 24/7 uptime at optimal performance. You can’t be a miner and suddenly lose 50% of your computational capacity because some GPU stopped working.

Why Netdata

I think that the Netdata Agent fits naturally in this context, as it offers per-second granularity for every system metric. Moreover, it does all the legwork for the user (auto-detection, auto-configuration, default alerts, chart creation).

Think about it. We have a large group of people, with very high financial stakes, who do not have the time, expertise and/or interest to set up complex monitoring solutions for their setups.

They just want something that works™️

We see that a considerable number of miners are not big companies that can afford to hire DevOps people to monitor and maintain their systems, but a lot of smaller operations.

These operations can’t afford to hire such specialized people.

Yet, they have the exact same needs.

So, this thread is actually a CTA for blockchain operators in the #Ethereum and #Bitcoin space.

I want to speak to you, so that I can better understand your needs and enrich Netdata with application-specific metrics.

That’s right. All the things that I mentioned above, not only for your system but for the full-node software itself.

Right now, we can gather all the metrics from both go-ethereum and OpenEthereum full nodes, using our Prometheus endpoint collector.

This is not ideal.
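To make that concrete, here is a rough Python sketch of what a generic scrape of a Prometheus-style text endpoint gives you: a flat bag of name/value pairs, with no grouping and no notion of which numbers matter. The sample text and the metric names in it are made up for illustration, and `parse_prometheus_text` is a hypothetical helper, not part of Netdata or of any node client.

```python
def parse_prometheus_text(text):
    """Parse Prometheus text exposition lines into a flat {series: value} dict.

    Minimal sketch: comment lines (# HELP / # TYPE) are skipped, labels are
    kept as part of the series key, and an optional trailing timestamp is ignored.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: everything up to the last "}" is the key.
            series, _, rest = line.rpartition("}")
            series += "}"
        else:
            parts = line.split(None, 1)
            if len(parts) < 2:
                continue
            series, rest = parts
        fields = rest.split()
        if not fields:
            continue
        try:
            samples[series] = float(fields[0])
        except ValueError:
            continue  # not a numeric sample, ignore it
    return samples


# Illustrative sample only; real endpoints expose many more series.
SAMPLE = """\
# TYPE chain_head_block gauge
chain_head_block 12965000
# TYPE p2p_peers gauge
p2p_peers 25
rpc_duration_all{quantile="0.5"} 0.12 1625000000000
"""

if __name__ == "__main__":
    for series, value in parse_prometheus_text(SAMPLE).items():
        print(series, value)
```

Everything comes out at the same level, so it is up to the user to work out which series belong on the same chart and which ones deserve an alert.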

What we really want is our own collector, which will gather the metrics that we want and organize them into meaningful charts. Then, based on these charts, we will create sane default alarms.
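As a very rough illustration of the idea (a sketch only, not Netdata’s collector API), the extra work a dedicated collector does could look like the Python below: a fixed layout that maps selected metrics to charts, plus a default alarm on top. The chart names, metric names and the threshold are all assumptions picked for the example; choosing the real ones is exactly where domain expertise is needed.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    metric: str        # metric the alarm watches
    warn_below: float  # raise a warning when the value drops below this


# Hypothetical layout: which metrics end up on which chart.
CHART_LAYOUT = {
    "peers": ["p2p_peers"],
    "chain": ["chain_head_block", "chain_head_header"],
}

# Hypothetical default: warn when the node has very few peers.
DEFAULT_ALARMS = [Alarm(metric="p2p_peers", warn_below=5)]


def build_charts(samples, layout=CHART_LAYOUT):
    """Group a flat {metric: value} dict into {chart: {dimension: value}}."""
    charts = {}
    for chart, metrics in layout.items():
        dims = {m: samples[m] for m in metrics if m in samples}
        if dims:  # skip charts with no data at all
            charts[chart] = dims
    return charts


def raised_alarms(samples, alarms=DEFAULT_ALARMS):
    """Return the metrics whose current value is below the alarm threshold."""
    return [
        a.metric
        for a in alarms
        if a.metric in samples and samples[a.metric] < a.warn_below
    ]


if __name__ == "__main__":
    samples = {"p2p_peers": 3, "chain_head_block": 100, "unrelated_metric": 1}
    print(build_charts(samples))
    print(raised_alarms(samples))
```

Metrics that are not in the layout are simply dropped, which is the point: the collector decides what is worth showing instead of leaving that to the user.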

BTW, there is an FAQ about collectors:

This is where I need you.

Netdata is an OSS agent that democratizes monitoring, for everyone, for free. In order to do that, I need your domain expertise.

Which metrics matter, and why? How should they be presented to the user? What grouping should we do?

If you are interested, leave a comment below and let’s bring zero-effort monitoring to the blockchain world! Ideally, we will implement these collectors in Golang, since we already have packages to gather data from Prometheus endpoints.


With the following PRs, Netdata will auto-detect the most widely used Ethereum full nodes!

Check the video below. A couple of highlights:

  • eBPF monitoring for every application
  • Minimal overhead: 2,500 metrics collected while consuming less than 1% of CPU
  • Application-specific metrics thanks to our Prometheus endpoint collector

Follow the discussion for creating an Ethereum Collector in the following thread:


Exactly! Looking forward to assisting.


@DK.mpls are you running an Ethereum node?

I would love your input regarding the metrics that geth is making available. I shared the metrics in the topic that I linked above.

If you like, drop me a message in our Discord server and let’s continue the discussion there!

Hey everyone, a quick update on this.

EthCC was a blast and I compiled a full guide on how to monitor an Ethereum Node and extend it!

We go into both system monitoring and how Geth affects the underlying system.

Feedback is more than welcome!

Netdata’s definitely a solid pick for keeping things running smoothly, especially when downtime can cost you big. I’ve been in a similar spot, juggling system monitoring and making sure everything’s ticking along while also trying to stay on top of all the compliance stuff like MiCA and FATF. It can get overwhelming, but I’ve found that mixing good monitoring tools with something that handles crypto compliance on the side really saves you a ton of headaches.