Netdata 1.45.3 on Ubuntu 20.04: kernel segfault in libpthread

I have many Ubuntu 20.04 servers running Netdata. Version 1.44.3 has had no issues, but some systems (all physical servers) are having problems with 1.45.3. The netdata service tries to start, fails, and tries to restart, then keeps looping, with the kernel log below showing a segfault in libpthread.

kernel: [4448124.523030] netdata[3777917]: segfault at 18 ip 00007fa35a072fc4 sp 00007ffd4d864ae8 error 4 in libpthread-2.31.so[7fa35a06e000+11000]
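
For anyone triaging a similar line, here is a minimal Python sketch that decodes it. It assumes the x86 segfault message format shown above, where the numbers (including the error code) are hexadecimal and the bracketed pair is the start and size of the mapped region. The function name `decode_segfault` and the dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Sample line copied from the kernel log above (timestamp prefix omitted).
LINE = ("netdata[3777917]: segfault at 18 ip 00007fa35a072fc4 "
        "sp 00007ffd4d864ae8 error 4 in libpthread-2.31.so[7fa35a06e000+11000]")

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Return the decoded fields of a segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    info = {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        # x86 page-fault error code bits: 0 = page present, 1 = write, 2 = user mode.
        "page_present": bool(err & 0x1),
        "write_access": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
        "object": m["obj"],
        "offset_in_mapping": None,
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped region,
        # which is not necessarily the offset inside the file on disk.
        info["offset_in_mapping"] = int(m["ip"], 16) - int(m["base"], 16)
    return info


info = decode_segfault(LINE)
print(info["object"], hex(info["fault_address"]), hex(info["offset_in_mapping"]))
```

For the line above this reports a user-mode read of a non-present page at address 0x18, which is the usual pattern for a field access through a NULL pointer, although that reading is an inference rather than something the log states.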

I tried a fresh install of 1.45.3, both from the Ubuntu repo and with the Netdata kickstart script, but no luck. I had to roll the Ubuntu 20.04 systems back to 1.44.3, and all nodes are running fine on the older Netdata version.

Nothing in the 1.45 changelog stood out as a likely cause of this behavior.

netdata.conf:

[global]
	error log = syslog
	access log = none

[web]
	# all defaults

[plugins]
	# all defaults

[health]
	enabled = yes
	run at least every seconds = 15

[registry]
	enabled = no
	# all defaults

[statsd]
	enabled = yes
	max private charts allowed = 5000
	max private charts hard limit = 5000
	private charts history = 60
	update every (flushInterval) = 15

Any help/thoughts are appreciated!

Could you please open a bug report for this?

If possible, could you also check whether there is anything in the logs that could point to a possible issue?
Note: Some time back our logging changed from files to the systemd journal by default. In case you need details, see Netdata Logging | Learn Netdata.

Thanks for the reminder on the logging change with the upgrade. This is from the journal.

Apr 25 00:04:39  netdata[2559809]: SPAWN: Ran out of protocol buffer space.
Apr 25 00:04:39  netdata[2559809]: time=2024-04-25T00:04:39.891-05:00 comm=netdata source=daemon level=alert errno="4, Interrupted system call" tid=2559809 thread=netdata msg="Assertion `SPAWN_PROT_EXEC_CMD == header->opcode' failed"
Apr 25 00:04:39  netdata[2559809]: /usr/sbin/netdata(+0x4a10b2)[0x563192d760b2]
Apr 25 00:04:39  netdata[2559809]: /usr/sbin/netdata(+0x38f318)[0x563192c64318]
Apr 25 00:04:39  netdata[2559809]: /lib/x86_64-linux-gnu/libuv.so.1(+0x1abb1)[0x7f3e150aebb1]
Apr 25 00:04:39  netdata[2559809]: /lib/x86_64-linux-gnu/libuv.so.1(+0x1b6e8)[0x7f3e150af6e8]
Apr 25 00:04:39  netdata[2559809]: /lib/x86_64-linux-gnu/libuv.so.1(uv__io_poll+0x360)[0x7f3e150b4b90]
Apr 25 00:04:39  netdata[2559809]: /lib/x86_64-linux-gnu/libuv.so.1(uv_run+0x11c)[0x7f3e150a485c]
Apr 25 00:04:39  netdata[2559809]: /usr/sbin/netdata(+0x38f657)[0x563192c64657]
Apr 25 00:04:39  netdata[2559809]: /usr/sbin/netdata(+0x799fb)[0x56319294e9fb]
Apr 25 00:04:39  netdata[2559809]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f3e1495f083]
Apr 25 00:04:39  netdata[2559809]: /usr/sbin/netdata(+0x7dcfe)[0x563192952cfe]
Apr 25 00:04:39  netdata[2559809]: time=2024-04-25T00:04:39.891-05:00 comm=netdata source=daemon level=info tid=2559809 thread=netdata msg="NETDATA SHUTDOWN: initializing shutdown with code 1..."
Apr 25 00:04:39  netdata[2566342]: sh: 1: Syntax error: Unterminated quoted string
Apr 25 00:04:39  netdata[2559799]: time=2024-04-25T00:04:39.893-05:00 comm=netdata source=daemon level=info tid=2559808 thread=DAEMON_SPAWN msg="EOF found in spawn pipe."
Apr 25 00:04:39  netdata[2559799]: time=2024-04-25T00:04:39.897-05:00 comm=netdata source=daemon level=alert tid=2559808 thread=DAEMON_SPAWN msg="Assertion `ret == 0' failed" 
Apr 25 00:04:39  netdata[2559799]: /usr/sbin/netdata(+0x4a10b2)[0x55e5c0aa10b2]
Apr 25 00:04:39  netdata[2559799]: /usr/sbin/netdata(+0x3900a3)[0x55e5c09900a3]
Apr 25 00:04:39  netdata[2559799]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f0332adf609]
Apr 25 00:04:39  netdata[2559799]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f0332169353]
Apr 25 00:04:39  netdata[2559799]: time=2024-04-25T00:04:39.898-05:00 comm=netdata source=daemon level=info tid=2559808 thread=DAEMON_SPAWN msg="NETDATA SHUTDOWN: initializing shutdown with code 1..."
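
Most of the netdata frames above carry only an offset and no symbol name. As a rough sketch, the Python below pulls the binary path, symbol (if any), and offset out of each frame so the offsets can be looked up later, for example with `addr2line -e /usr/sbin/netdata <offset>`, assuming matching debug symbols are installed. The helper name `parse_frames` is hypothetical, and the sample holds only two lines copied from the journal output.

```python
import re

# Matches backtrace frames such as:
#   /usr/sbin/netdata(+0x4a10b2)[0x563192d760b2]
#   /lib/x86_64-linux-gnu/libuv.so.1(uv__io_poll+0x360)[0x7f3e150b4b90]
FRAME_RE = re.compile(
    r"(?P<path>/[^\s(]+)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<address>0x[0-9a-f]+)\]"
)


def parse_frames(journal_text):
    """Return (path, symbol or None, offset) for each backtrace frame found."""
    frames = []
    for line in journal_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m["path"], m["symbol"] or None, int(m["offset"], 16)))
    return frames


# Two frames copied from the journal output above.
SAMPLE = """\
Apr 25 00:04:39  netdata[2559809]: /usr/sbin/netdata(+0x4a10b2)[0x563192d760b2]
Apr 25 00:04:39  netdata[2559809]: /lib/x86_64-linux-gnu/libuv.so.1(uv__io_poll+0x360)[0x7f3e150b4b90]
"""

for path, symbol, offset in parse_frames(SAMPLE):
    print(path, symbol or "?", hex(offset))
```

Lines that are not frames (the assertion messages, the shutdown notices) are skipped, so the whole journal excerpt can be passed in unchanged.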

Cool, there could be something there. Could you please open a bug report for this on GitHub?

Yup, Bug Report opened.
