Docker is an open source containerization platform. It enables developers to package applications into containers: standardized executable components that combine application source code with the operating system (OS) libraries and dependencies required to run that code in any environment.
Sometimes the application inside a running container has crashed. To detect those events, container runtimes (CR) and orchestrators perform health checks against endpoints inside the functional units of the container. A container marked as unhealthy by the CR is malfunctioning and should be stopped. These health checks are defined by the creator of the container with the HEALTHCHECK instruction. 1
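Besides the HEALTHCHECK instruction in a Dockerfile, a health check can also be supplied when the container is started, through the --health-* options of docker run. The following is a minimal sketch under stated assumptions: the helper name start_with_healthcheck, the container name, the image name and the probe command are all hypothetical placeholders, and the interval, timeout and retry values are arbitrary examples.

```shell
# Hypothetical helper: start a detached container with a run-time health check.
# $1 = container name, $2 = image, $3 = probe command run inside the container.
start_with_healthcheck() {
  docker run -d --name "$1" \
    --health-cmd "$3" \
    --health-interval 30s \
    --health-timeout 5s \
    --health-retries 3 \
    "$2"
}

# Example call (assumes the placeholder image ships curl and serves HTTP on port 80):
# start_with_healthcheck web my-image 'curl -fsS http://localhost/ || exit 1'
```

The probe command must exit with 0 for the container to be reported as healthy; after the configured number of consecutive failures the container is marked unhealthy.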
The Netdata Agent monitors the average number of unhealthy Docker containers over the last 10 seconds. This alert indicates that some containers are not running due to failed health checks.
This alert is raised to warning when at least one container is unhealthy in your Docker engine.
Find the NAME of the container that is marked as UNHEALTHY.
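One way to find the name is to filter the docker ps output by health status. This is a small sketch; the function name list_unhealthy_containers is a hypothetical wrapper, while the health filter and the Go-template format are standard docker ps options.

```shell
# Hypothetical wrapper: print the names of running containers
# whose health status is currently "unhealthy", one per line.
list_unhealthy_containers() {
  docker ps --filter "health=unhealthy" --format '{{.Names}}'
}

# Example call:
# list_unhealthy_containers
```

An empty result means no running container is reporting an unhealthy status at that moment.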
Check the logs of this container to get some insight into what is going wrong.
root@netdata # docker logs <UNHEALTHY_CONTAINER>
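The application logs do not always show why a health check failed. Docker also keeps the recent probe results in the container state, which docker inspect can print. A minimal sketch, where the function name show_health_log is a hypothetical wrapper:

```shell
# Hypothetical wrapper: print the health state of a container as JSON.
# The object holds the status, the failing streak and a log of recent
# probe runs with their exit codes and output.
# $1 = container name or ID.
show_health_log() {
  docker inspect --format '{{json .State.Health}}' "$1"
}

# Example call:
# show_health_log <UNHEALTHY_CONTAINER>
```

For a container that has no health check defined, this prints null.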
In many cases, your app's logs may not appear in the Docker log collector. A simple workaround is to redirect your app's logs to stderr. Apply this workaround deliberately. Another workaround is to redirect any log output directly to /proc/self/fd/2.
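If the application can only write to a log file, one way to apply the second workaround is to replace that file with a symbolic link to /proc/self/fd/2 before the application starts, for example in the image build or the entrypoint script. This is a sketch under assumptions: the function name link_log_to_stderr and the log path in the example are hypothetical, and it assumes the application opens the path for writing without requiring a regular file.

```shell
# Hypothetical helper: make a log file path point at the writing
# process's stderr, so its content ends up in "docker logs".
# $1 = path of the log file the application writes to.
link_log_to_stderr() {
  ln -sf /proc/self/fd/2 "$1"
}

# Example call (the path is a placeholder):
# link_log_to_stderr /var/log/myapp/error.log
```

The link replaces any existing file at that path, so earlier log content stored there is no longer reachable through it.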
Restart the container and see if this fixes the problem.
root@netdata # docker restart <UNHEALTHY_CONTAINER>
If you receive this alert often, you may need to investigate further why this event occurs.
Hi, @Juanra. Unfortunately, Netdata does not collect the health status of individual containers, so it is not possible to know which container is unhealthy from the Netdata user interface (and alarms).
Found the command for it: ./edit-config health.d/dockerd.conf. It is confusing that we edit docker.conf to configure the docker plugin, but dockerd.conf to configure the docker health alert.
It is enabled by default and available in both the latest stable and nightly releases. Netdata creates a “Docker container health status” chart for every Docker container, plus an alarm for unhealthy containers.
Can I exclude one or more specific containers from sending unhealthy alerts? I have one with a VPN that sometimes restarts if it is not connected. How can I exclude it?
I think it would be easy enough to define a silencing rule in Netdata Cloud for the alert instance of that specific container (see Cloud alert notifications | Learn Netdata).
@ilyam8 do you know if there is another way to essentially disable it for some subset of containers, via the on or chart_labels maybe?