docker_unhealthy_containers

Tasos_Katsoulas · November 3, 2021, 9:35pm

Containers | Docker

Docker is an open source containerization platform. It enables developers to package applications into containers: standardized executable components that combine application source code with the operating system (OS) libraries and dependencies required to run that code in any environment.

Sometimes the application inside a container crashes while the container itself keeps running. To detect those events, container runtimes (CR) and orchestrators run health checks against endpoints inside the container. A container that the CR marks as unhealthy is malfunctioning and should be stopped. The creator of the container defines those health checks with the HEALTHCHECK instruction. ¹
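As a rough sketch of how such a check can be attached at start time, the docker run health flags below mirror the HEALTHCHECK instruction. The helper name, the container and image names, and the probed URL are all hypothetical, and curl is assumed to be present inside the image.

```shell
# Start a container with a health check defined on the command line.
# $1: container name, $2: image (both hypothetical, supplied by the caller).
# The probe assumes the app serves http://localhost:8080/health and that
# curl exists inside the image.
start_with_healthcheck() {
  docker run -d --name "$1" \
    --health-cmd "curl -fsS http://localhost:8080/health || exit 1" \
    --health-interval 30s \
    --health-timeout 5s \
    --health-retries 3 \
    "$2"
}
# Usage: start_with_healthcheck my-app my-app:latest
```

With these values, Docker marks the container as unhealthy after three consecutive failed probes.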

The Netdata Agent monitors the average number of unhealthy Docker containers over the last 10 seconds. This alert indicates that some containers are failing their health checks.

This alert is raised to warning when at least one container in your Docker engine is unhealthy.

References and sources

HEALTHCHECK instruction in Docker docs

Troubleshooting section

Inspect and restart the UNHEALTHY container

Check all the containers in the system.
```
root@netdata # docker ps -a
```
Find the NAME of the container that is marked as UNHEALTHY.
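On a host with many containers, the list can be narrowed with the health filter of docker ps. A small sketch, wrapped in a helper function whose name is made up here:

```shell
# Print the name and status of every container whose health check is failing.
list_unhealthy() {
  docker ps -a --filter "health=unhealthy" --format '{{.Names}}: {{.Status}}'
}
# Usage: list_unhealthy
```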
Check the logs of this container to get some insight into what’s going wrong.
```
root@netdata # docker logs <UNHEALTHY_CONTAINER>
```
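The output of the failed health probes themselves is kept separately from the application logs. Assuming the container defines a health check, the following sketch prints its recorded health state as JSON (the function name is hypothetical):

```shell
# Show the health status, failing streak and recent probe output of a container.
# $1: container name or ID.
health_details() {
  docker inspect --format '{{json .State.Health}}' "$1"
}
# Usage: health_details <UNHEALTHY_CONTAINER>
```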
In many cases, your app’s logs may not appear in the Docker log collector. A simple workaround is to redirect your app’s logs to stderr; use it deliberately. Another workaround is to have the app write its logs directly to /proc/self/fd/2.
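One way the second workaround might be applied, assuming the application writes to a fixed log file: replace that file with a symlink to /proc/self/fd/2 while the image is built, so that writes to the file end up on the stderr of the process that opens it. The function name and the path in the usage line are hypothetical.

```shell
# Replace an application's log file with a symlink to stderr.
# /proc/self/fd/2 resolves to the stderr of whichever process opens the file.
# $1: path of the log file the application writes to.
link_log_to_stderr() {
  log_file="$1"
  mkdir -p "$(dirname "$log_file")"
  ln -sf /proc/self/fd/2 "$log_file"
}
# Usage (hypothetical path): link_log_to_stderr /var/log/my-app/error.log
```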
Restart the container and see if this fixes the problem.
```
root@netdata # docker restart <UNHEALTHY_CONTAINER>
```
If you receive this alert often, you may need to investigate further why it occurs.

Juanra · February 6, 2023, 4:55pm

Hi guys,

Sorry if this question is documented somewhere, but where can I see which container is in this status?

It has not been easy for me to find it in the notification I receive:

ilyam8 · February 6, 2023, 9:01pm

Hi, @Juanra. Unfortunately, Netdata does not collect the health status of individual containers, so it is not possible to know which container is unhealthy from the Netdata user interface (and alarms).

IGG · March 30, 2023, 4:40pm

Found the command for it: ./edit-config health.d/dockerd.conf. It is confusing that we edit docker.conf to configure the docker plugin, but dockerd.conf to configure the docker health alert.

IGG · March 30, 2023, 4:40pm

How do you modify this alert so it is not too sensitive? Can it be averaged over a more extended period of time or change the threshold value?

ilyam8 · April 11, 2023, 4:07pm

Per-container health metrics are implemented in "docker: add per-container stats" by ilyam8 (Pull Request #1148, netdata/go.d.plugin, GitHub).

shalak · July 10, 2023, 9:06am

How do we use this feature?

ilyam8 · July 10, 2023, 3:45pm

It is enabled by default and available in both the latest stable and nightly releases. Netdata creates a “Docker container health status” chart for every Docker container, plus an alarm for unhealthy containers.

spupuz · July 31, 2023, 3:35pm

Can I exclude one or more specific containers from sending unhealthy alerts? I have one with a VPN that sometimes restarts when it is not connected. How can I exclude it?

andrewm4894 · July 31, 2023, 3:40pm

I think it would be easy enough to define a silencing rule in Netdata Cloud for the alert instance of that specific container (see Cloud alert notifications | Learn Netdata).

@ilyam8 do you know if there is another way to essentially disable it for some subset of containers, maybe via the on or chart_labels lines?

spupuz · July 31, 2023, 3:45pm

What if I had configured alerts for Telegram at the node level?

ilyam8 · August 1, 2023, 8:40pm

chart_labels maybe

Yes, we have it in our documentation - chart_labels description with examples.

chart labels: container_name=!NameToExclude *

spupuz · August 4, 2023, 2:46pm

Cool, which file should I edit?

andrewm4894 · August 9, 2023, 4:49pm

Add the chart labels line to this file, I think.

/health.d/docker.conf

netdata/netdata/blob/master/health/health.d/docker.conf

```
 template: docker_container_unhealthy
       on: docker.container_health_status
    class: Errors
     type: Containers
component: Docker
    units: status
    every: 10s
   lookup: average -10s of unhealthy
     warn: $this > 0
     info: ${label:container_name} docker container health status is unhealthy
       to: sysadmin
```
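A sketch of how the edited alert might end up looking, written out by a small shell helper so it can be tried against a scratch file first. The helper name is hypothetical, and the assumption here is that the result is saved as the user copy of health.d/docker.conf (the file that edit-config opens); the alert body is the stock template quoted above with the chart labels line from the earlier reply appended.

```shell
# Write a copy of the stock alert with one container excluded.
# $1: destination file, $2: container name to exclude (both supplied by the caller).
write_docker_alert_override() {
  dest="$1"
  excluded="$2"
  # Quoted heredoc: $this and ${label:container_name} must stay literal.
  cat > "$dest" <<'EOF'
 template: docker_container_unhealthy
       on: docker.container_health_status
    class: Errors
     type: Containers
component: Docker
    units: status
    every: 10s
   lookup: average -10s of unhealthy
     warn: $this > 0
     info: ${label:container_name} docker container health status is unhealthy
       to: sysadmin
EOF
  # Pattern from the reply above: skip this container, match all others.
  printf 'chart labels: container_name=!%s *\n' "$excluded" >> "$dest"
}
# Usage (scratch file, hypothetical container name):
#   write_docker_alert_override /tmp/docker.conf my-vpn
```

After the real file is changed, the health configuration has to be reloaded (for example by restarting the Netdata Agent) before the exclusion takes effect.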
