Netdata Community

CPU traps / segfault

A few days ago our central Netdata metrics collector started causing CPU traps. It runs as a Docker container.

The host syslog error message is:

Jun 9 06:29:59 netdata01 kernel: [156866.969743] traps: WEB_SERVER[stat[24373] general protection ip:7fe2b2ff22d9 sp:7fe2b03c8878 error:0 in ld-musl-x86_64.so.1[7fe2b2feb000+47000]
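The trap line already contains enough information to locate the faulting instruction inside musl's dynamic loader: the offset into the library is the `ip` minus the mapping base shown in brackets. A sketch of the calculation (the `addr2line` path is the usual Alpine location and is an assumption on my part, as is `addr2line` being available in the image):

```shell
# Offset of the faulting instruction within ld-musl-x86_64.so.1:
# ip (7fe2b2ff22d9) minus the mapping base (7fe2b2feb000).
printf '0x%x\n' $(( 0x7fe2b2ff22d9 - 0x7fe2b2feb000 ))   # prints 0x72d9

# With binutils available inside the container, that offset can be
# resolved to a symbol in the loader (path assumes an Alpine image):
#   addr2line -e /lib/ld-musl-x86_64.so.1 0x72d9
```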

The error occurs approximately every 7 minutes, sometimes as often as every minute, and sometimes with as much as 12 minutes between crashes.

I have tried versions 1.26.0, 1.30.1, and 1.31.0, and each produces exactly the same error.

When running Netdata as a native systemd service directly on the Ubuntu host, the syslog error message is:

Jun 10 08:53:54 netdata01 kernel: [62290.509176] WEB_SERVER[stat[23533]: segfault at 55570589ba34 ip 000055570589ba34 sp 00007f9ca5486290 error 15
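The trailing `error 15` in that line is the x86 page-fault error code, a bit field (bit meanings per the Intel SDM); note also that `ip` equals the fault address, so the process appears to have jumped to a bad location. A quick decode sketch:

```shell
# Decode the x86 page-fault error code from a kernel segfault line.
# For err=15 this prints the first four descriptions.
err=15
for bit_desc in "1:page present (protection violation)" \
                "2:write access" \
                "4:user mode" \
                "8:reserved bit set" \
                "16:instruction fetch"; do
  bit=${bit_desc%%:*}
  if [ $(( err & bit )) -ne 0 ]; then
    echo "${bit_desc#*:}"
  fi
done
```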

In both cases no crash dump information is available in apport, just messages like this:

ERROR: apport (pid 26217) Wed Jun 9 06:29:59 2021: host pid 24140 crashed in a container without apport support
ERROR: apport (pid 24822) Thu Jun 10 08:53:54 2021: host pid 23366 crashed in a separate mount namespace, ignoring
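Those apport messages mean no core file is being captured for the crashes. One way to get a dump anyway is to check what the current core handler is and, if apport is in the way, point `kernel.core_pattern` at a plain file path. A sketch (the `/var/crash` path and the `docker run` invocation are illustrative assumptions, not tested against this setup):

```shell
# A leading '|' in core_pattern means crashes are piped to a helper
# program (apport on Ubuntu) instead of written straight to disk:
pattern=$(cat /proc/sys/kernel/core_pattern)
case "$pattern" in
  \|*) echo "cores piped to: ${pattern#\|}" ;;
  *)   echo "cores written to: $pattern" ;;
esac

# Workaround sketch: bypass apport and let the container dump core
# (core_pattern is host-wide, so it also applies inside Docker):
#   echo '/var/crash/core.%e.%p' | sudo tee /proc/sys/kernel/core_pattern
#   docker run --ulimit core=-1 ... netdata/netdata
```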

The Ubuntu host is a VM running on VMware. I have migrated the VM to different physical hardware, and the errors persist regardless of the underlying hardware.

What can I do to identify the cause?
Any suggestions for a workaround?


Hi @sdo and welcome back!

I have contacted our engineers and we will look into this shortly. It does sound concerning, and we are sorry you are experiencing it.

We will get it resolved :muscle:

Hi @sdo, thanks for reporting this!

I have a couple of questions, mostly meant to help me understand a little better what your setup looks like:

  • Which installation method did you use to install Netdata?
  • What is the output of cat /etc/lsb-release /etc/os-release and netdata -W buildinfo?
  • Is it possible to provide the contents of netdata’s error.log?
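To gather that in one go, something like the following should work (the log path is the default for native installs and may differ for static or Docker installs; for the container, prefix the commands with `docker exec <container-name>`):

```shell
# Distribution details:
cat /etc/os-release

# Netdata build details and recent errors (paths assume a
# default native install):
#   netdata -W buildinfo
#   tail -n 200 /var/log/netdata/error.log
```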

Finally, a similar issue was opened on GitHub a while ago. It might be worthwhile to follow the steps mentioned in that issue to rule out this one being a duplicate.
