Problem with process getting killed

Problem/Question

I just upgraded the Netdata agent to the latest stable version, and it seems a process is regularly being killed and started again. In the Portainer log I see a sequence like the following, which repeats constantly:

time=2023-12-23T23:56:53.527+00:00 comm=cgroup-network source=collector level=error tid=1912 thread=cgroup-network msg=“child pid 1913 exited with code 1.”

time=2023-12-23T23:56:53.527+00:00 comm=cgroup-network source=collector level=error tid=1912 thread=cgroup-network msg=“Cannot find a cgroup PID from cgroup ‘/host/sys/fs/cgroup/system.slice/docker-b2cf84fe7aa34ef1e7d39f13d75d59f99bbfbbc593c4688fba241683a100afbb.scope’”

time=2023-12-23T23:56:53.528+00:00 comm=netdata source=daemon level=info tid=1 thread=netdata msg=“SIGNAL: reap_child(1912) exited with code: 1”

time=2023-12-23T23:56:53.528+00:00 comm=netdata source=daemon level=error tid=229 thread=P[cgroups] msg=“child pid 1912 exited with code 1.”

time=2023-12-23T23:56:56.359+00:00 comm=netdata source=health level=info tid=171 thread=HEALTH msg_id=9ce0cb58ab8b44df82c4bf1ad9ee22de node=manager-br1 instance=cgroup_b2cf84fe7aa3.mem_usage context=cgroup.mem_usage code=0 alert_id=1686213108 alert_unique_id=1686412878 alert_event_id=2 alert_transition_id=7524a2f93eea42f985c8ebf2e3332c21 alert_config=2c4cd05b53debd86d06634990fa04df5 alert=cgroup_ram_in_use alert_class=Utilization alert_component=Memory alert_type=Cgroups alert_recipient=silent alert_duration=0 alert_value=18.1612015 alert_value_old=null alert_status=CLEAR alert_value_old=UNINITIALIZED alert_units=% alert_summary=“Cgroup b2cf84fe7aa3 memory utilization” alert_info=“Cgroup b2cf84fe7aa3 memory utilization” alert_notification_timestamp=2023-12-23T23:56:56+00:00 msg=“ALERT ‘cgroup_ram_in_use’ of instance ‘cgroup_b2cf84fe7aa3.mem_usage’ on node ‘manager-br1’, transitioned from UNINITIALIZED to CLEAR”

time=2023-12-23T23:56:58.228+00:00 comm=netdata source=access level=debug tid=165 thread=ACLKSYNC msg=“ACLK RES [7c0a9c55-c687-488e-8cd8-56af6be3e421 (manager-br1)]: ALERTS SENT from 190917 to 190917”

time=2023-12-23T23:57:03.407+00:00 comm=cgroup-name.sh source=collector level=info tid=1937 thread=cgroup-name request="‘/usr/libexec/netdata/plugins.d/cgroup-name.sh’ ‘/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“Running API command: curl "/var/run/docker.sock/containers/6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987/json"”

curl: (3) URL using bad/illegal format or missing URL

time=2023-12-23T23:57:03.472+00:00 comm=cgroup-name.sh source=collector level=warning tid=1942 thread=cgroup-name request="‘/usr/libexec/netdata/plugins.d/cgroup-name.sh’ ‘/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“cannot find the name of docker container ‘6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987’”

time=2023-12-23T23:57:03.492+00:00 comm=cgroup-name.sh source=collector level=info tid=1943 thread=cgroup-name request="‘/usr/libexec/netdata/plugins.d/cgroup-name.sh’ ‘/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“cgroup ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ is called ‘6c3cf61a3632’, labels ‘’”

time=2023-12-23T23:57:03.494+00:00 comm=netdata source=daemon level=info tid=1 thread=netdata msg=“SIGNAL: reap_child(1931) exited with code: 2”

time=2023-12-23T23:57:03.494+00:00 comm=netdata source=daemon level=error tid=229 thread=P[cgroups] msg=“child pid 1931 exited with code 2.”

time=2023-12-23T23:57:04.403+00:00 comm=cgroup-name.sh source=collector level=info tid=1950 thread=cgroup-name request="‘/usr/libexec/netdata/plugins.d/cgroup-name.sh’ ‘/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“Running API command: curl "/var/run/docker.sock/containers/6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987/json"”

curl: (3) URL using bad/illegal format or missing URL

time=2023-12-23T23:57:04.526+00:00 comm=cgroup-name.sh source=collector level=warning tid=1955 thread=cgroup-name request="‘/usr/libexec/netdata/plugins.d/cgroup-name.sh’ ‘/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“cannot find the name of docker container ‘6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987’”

time=2023-12-23T23:57:04.546+00:00 comm=cgroup-name.sh source=collector level=info tid=1956 thread=cgroup-name request="‘/usr/libexec/netdata/plugins.d/cgroup-name.sh’ ‘/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“cgroup ‘system.slice_docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ is called ‘6c3cf61a3632’, labels ‘’”

time=2023-12-23T23:57:04.548+00:00 comm=netdata source=daemon level=info tid=1 thread=netdata msg=“SIGNAL: reap_child(1944) exited with code: 2”

time=2023-12-23T23:57:04.549+00:00 comm=netdata source=daemon level=error tid=229 thread=P[cgroups] msg=“child pid 1944 exited with code 2.”

time=2023-12-23T23:57:04.553+00:00 comm=cgroup-network source=collector level=info tid=1957 thread=cgroup-network msg=“Using host prefix directory ‘/host’”

time=2023-12-23T23:57:04.554+00:00 comm=cgroup-network source=collector level=info errno=“2, No such file or directory” tid=1957 thread=cgroup-network msg=“running: exec /usr/libexec/netdata/plugins.d/cgroup-network-helper.sh --cgroup ‘/host/sys/fs/cgroup/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’”

time=2023-12-23T23:57:04.585+00:00 comm=cgroup-network-helper.sh source=collector level=info tid=1963 thread=cgroup-network-helper request="‘/usr/libexec/netdata/plugins.d/cgroup-network-helper.sh’ ‘--cgroup’ ‘/host/sys/fs/cgroup/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’ " msg=“searching for network interfaces of cgroup ‘/host/sys/fs/cgroup/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’”

time=2023-12-23T23:57:04.619+00:00 comm=cgroup-network source=collector level=error tid=1957 thread=cgroup-network msg=“child pid 1958 exited with code 1.”

time=2023-12-23T23:57:04.620+00:00 comm=cgroup-network source=collector level=error tid=1957 thread=cgroup-network msg=“Cannot find a cgroup PID from cgroup ‘/host/sys/fs/cgroup/system.slice/docker-6c3cf61a363236353ca30695782d8951d63cde64d97a5919ed4b4ee481289987.scope’”

time=2023-12-23T23:57:04.621+00:00 comm=netdata source=daemon level=info tid=1 thread=netdata msg=“SIGNAL: reap_child(1957) exited with code: 1”

time=2023-12-23T23:57:04.621+00:00 comm=netdata source=daemon level=error tid=229 thread=P[cgroups] msg=“child pid 1957 exited with code 1.”

time=2023-12-23T23:57:06.716+00:00 comm=netdata source=health level=debug tid=171 thread=HEALTH msg_id=9ce0cb58ab8b44df82c4bf1ad9ee22de node=manager-br1 instance=cgroup_3ef0fb75e95d.mem_usage context=cgroup.mem_usage code=0 alert_id=1686213098 alert_unique_id=1686412881 alert_event_id=3 alert_transition_id=b2b61ec5675a479e845a22562be78ae5 alert_config=2c4cd05b53debd86d06634990fa04df5 alert=cgroup_ram_in_use alert_class=Utilization alert_component=Memory alert_type=Cgroups alert_recipient=silent alert_duration=65 alert_value=null alert_value_old=16.2445068 alert_status=REMOVED alert_value_old=CLEAR alert_units=% alert_summary=“Cgroup 3ef0fb75e95d memory utilization” alert_info=“Cgroup 3ef0fb75e95d memory utilization” alert_notification_timestamp=2023-12-23T23:57:06+00:00 msg=“ALERT ‘cgroup_ram_in_use’ of instance ‘cgroup_3ef0fb75e95d.mem_usage’ on node ‘manager-br1’, transitioned from CLEAR to REMOVED”

time=2023-12-23T23:57:06.826+00:00 comm=netdata source=health level=debug tid=171 thread=HEALTH msg_id=9ce0cb58ab8b44df82c4bf1ad9ee22de node=manager-br1 instance=cgroup_3ef0fb75e95d.cpu_limit context=cgroup.cpu_limit code=0 alert_id=1686213099 alert_unique_id=1686412882 alert_event_id=2 alert_transition_id=2a52b2f45f7543dbb70fc4eb83a8dd4d alert_config=7cc90e62c81141816a8cf8cc97a8dbe1 alert=cgroup_10min_cpu_usage alert_class=Utilization alert_component=CPU alert_type=Cgroups alert_recipient=silent alert_duration=65 alert_value=null alert_value_old=null alert_status=REMOVED alert_value_old=UNINITIALIZED alert_units=% alert_summary=“Cgroup 3ef0fb75e95d CPU utilization” alert_info=“Cgroup 3ef0fb75e95d average CPU utilization over the last 10 minutes” alert_notification_timestamp=2023-12-23T23:57:06+00:00 msg=“ALERT ‘cgroup_10min_cpu_usage’ of instance ‘cgroup_3ef0fb75e95d.cpu_limit’ on node ‘manager-br1’, transitioned from UNINITIALIZED to REMOVED”

time=2023-12-23T23:57:06.935+00:00 comm=netdata source=health level=info tid=171 thread=HEALTH msg_id=9ce0cb58ab8b44df82c4bf1ad9ee22de node=manager-br1 instance=cgroup_6c3cf61a3632.mem_usage context=cgroup.mem_usage code=0 alert_id=1686213110 alert_unique_id=1686412883 alert_event_id=2 alert_transition_id=7a48340a03894c0c86513230c3e6f23c alert_config=2c4cd05b53debd86d06634990fa04df5 alert=cgroup_ram_in_use alert_class=Utilization alert_component=Memory alert_type=Cgroups alert_recipient=silent alert_duration=0 alert_value=18.1051254 alert_value_old=null alert_status=CLEAR alert_value_old=UNINITIALIZED alert_units=% alert_summary=“Cgroup 6c3cf61a3632 memory utilization” alert_info=“Cgroup 6c3cf61a3632 memory utilization” alert_notification_timestamp=2023-12-23T23:57:06+00:00 msg=“ALERT ‘cgroup_ram_in_use’ of instance ‘cgroup_6c3cf61a3632.mem_usage’ on node ‘manager-br1’, transitioned from UNINITIALIZED to CLEAR”

time=2023-12-23T23:57:08.257+00:00 comm=netdata source=access level=debug tid=165 thread=ACLKSYNC msg=“ACLK RES [7c0a9c55-c687-488e-8cd8-56af6be3e421 (manager-br1)]: ALERTS SENT from 190918 to 190920”

Hi,

Could you please open a bug report? It would give the team more details.

Regards,
Hugo

Alright, I just did that.

Thanks, just linking the issue to this discussion.