I just switched to Netdata, and I’m absolutely loving it. I’m having an odd issue though, and I’m not quite sure how to go about fixing it.
I have ten nodes attached to Netdata Cloud. Everything works pretty well, but from time to time I’m getting web_log_1m_unmatched errors on the five nodes that make up our CDN. This happens seemingly at random, and has thus far happened on all five of those nodes, although rarely at the same time. The description on the error is “percentage of unparsed log lines over the last minute.” Usually, I’d assume that this is because my logs are in a custom format, or are throwing errors that can’t be read. But the five servers it’s happening on are all functionally identical to one another (and have exactly the same logging format), so I can’t see why it would happen on one while the rest are working correctly.
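In case it helps anyone reproduce this, here is a rough sketch of how the offending lines could be pulled out of an access log on one of the affected nodes. It is not Netdata’s own parser. It assumes the logs use NGINX’s default “combined” format, and the regex, function names, and sample lines are all made up for illustration, so a custom log_format would need a different pattern.

```python
import re

# Assumption: NGINX's default "combined" log format. This regex is a stand-in
# for illustration only and is not what Netdata itself uses to parse logs.
COMBINED = re.compile(
    r'^(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def unmatched_lines(lines):
    """Return the non-empty lines that do not fit the assumed format."""
    stripped = (line.rstrip("\n") for line in lines)
    return [line for line in stripped if line and not COMBINED.match(line)]


def unmatched_percentage(lines):
    """Percentage of non-empty lines that fail to match (0.0 for no input)."""
    non_empty = [line for line in lines if line.strip()]
    if not non_empty:
        return 0.0
    return 100.0 * len(unmatched_lines(non_empty)) / len(non_empty)


if __name__ == "__main__":
    # Hypothetical sample data: one well-formed line, one that cannot be parsed.
    sample = [
        '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"',
        "not a log line",
    ]
    for line in unmatched_lines(sample):
        print("unmatched:", line)
    print("unmatched percentage:", unmatched_percentage(sample))
```

Running something like this over the same time window on a node that raised the alert and on one that did not might show whether a particular kind of request produces the unparsed lines.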
I’m not sure if this is related, but I’ve also noticed that, again seemingly at random, some of those same five nodes don’t show up when I navigate to System Overview → web log → responses. For example: all five of those nodes are running the same version of NGINX (v1.17.3), and all five have the same logging system, but, despite all being connected correctly to Netdata, I get all five of them in
But I only get three of them in
Moreover, if I navigate to the System Overview pages of the two nodes that are missing from web_log_nginx.status_code_class_5xx_responses, there are no graphs at all for that class.
Final piece of information that may or may not be relevant: When I restart the Netdata agent with systemctl restart netdata, it hangs for about a minute before restarting. I don’t know if that’s expected behavior or not.