Entropy Alarm outdated

Greg_Munro · March 17, 2021, 8:37pm

I have an entropy alarm that is over 850 hrs old. I fixed this ages ago. How long does it take until that alarm goes away and I never have to worry about it again?

Is there not some way in Netdata Cloud to acknowledge an alarm and clear it?

OdysLam · March 18, 2021, 10:58am

Hey @Greg_Munro and welcome to our forums!

This is a known bug and we are working towards fixing it. TBH we are currently reworking a lot of functionality on the backend, which is why it has taken time.

Incident management (silencing alarms and acknowledging incidents) is on our roadmap, but I can’t share any timeline for the implementation. Perhaps the @netdata-product team can share more details.

If you have any other questions, please feel free to share. We are here to help!

Manos_Saratsis · March 18, 2021, 11:45am

@Greg_Munro Thanks for using Netdata. We are aware of this issue and we will address it. There is an open bug on GitHub.

Grboy · February 27, 2022, 1:13pm

@Manos_Saratsis Is there any update? I currently have an alarm stuck for 25 days.

sashwathn · March 2, 2022, 4:50pm

@Grboy: Do you still see this issue? Can you tell us which Agent version you are running?

Grboy · March 2, 2022, 5:16pm

Yes, I still have this issue. The alarm got stuck 28 days ago. The Netdata agent version is v1.33.1.

sashwathn · March 7, 2022, 1:42pm

Thank you for the information. Can you confirm if you have protobuf enabled on your agent and have migrated to the new architecture?

Can you please send us the output of this command:
/usr/sbin/netdatacli aclk-state

Grboy · March 9, 2022, 8:57am

"Can you confirm if you have protobuf enabled on your agent and have migrated to the new architecture?"

How can I check that?
Btw, I have sent the command output to your DM.

sashwathn · March 9, 2022, 9:42am

You are indeed on the new architecture, and this is most likely a case of some old alerts getting stuck during the migration.
I will ask my team to look at this. Can you also confirm whether the same alert is seen on your agent at localhost:19999 (or at whichever host and port you have configured for your agent)?
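
For anyone who wants to script that check, here is a minimal sketch. It assumes the agent's `/api/v1/alarms` endpoint is reachable on the default localhost:19999 and that the response contains an `alarms` object whose entries carry `name` and `status` fields; that shape is an assumption and may differ between agent versions. The function names and the hand-copied list of Cloud alert names are hypothetical, used for illustration only.

```python
import json
from urllib.request import urlopen

# Assumption: default agent address; change it if your agent listens elsewhere.
AGENT_URL = "http://localhost:19999/api/v1/alarms"


def fetch_agent_alarms(url=AGENT_URL, timeout=5):
    """Download the alarms the agent currently reports, as a parsed JSON dict."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def raised_alarm_names(payload):
    """Return the names of alarms the agent reports as WARNING or CRITICAL."""
    alarms = payload.get("alarms", {})  # assumed response shape
    return {
        info.get("name", key)
        for key, info in alarms.items()
        if info.get("status") in ("WARNING", "CRITICAL")
    }


def only_in_cloud(cloud_names, payload):
    """Alert names copied by hand from Cloud that the agent does not raise."""
    return sorted(set(cloud_names) - raised_alarm_names(payload))
```

Calling `only_in_cloud(["lowest_entropy"], fetch_agent_alarms())` would then list the alerts that Cloud still shows but the agent no longer raises, which are the candidates for being stuck.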

sashwathn · March 9, 2022, 9:56am

We are fixing these stuck-alert issues in various forms. This issue is a related one and should be fixed soon: [BUG] Stuck alerts in Netdata Cloud for agents that are no longer live. · Issue #217 · netdata/netdata-cloud · GitHub
Thanks for your feedback!

Grboy · March 10, 2022, 3:49pm

These alerts are not visible on my localhost:19999 agent

sashwathn · March 10, 2022, 4:03pm

Thanks again for the confirmation. This issue is being worked on and should be solved soon.

github.com/netdata/netdata-cloud

[BUG] Stuck alerts in Netdata Cloud for agents that are no longer live.

opened 09:46AM - 10 Dec 21 UTC

closed 02:18PM - 10 Mar 22 UTC

dimko

alerts-team

Stuck alarms happen when an agent/node has an alarm raised on its behalf and the agent/node that raised it goes offline.

![image](https://user-images.githubusercontent.com/19268853/145553160-66a1d335-40b1-4909-8e0e-dfee418c100f.png)

This mostly affects agents with a parent/child setup, e.g. the Parent has the alert configuration for a child agent, so it is the one raising the alert for that child. If the Parent dies, the alert for the child is not cleared.

- [X] https://github.com/netdata/cloud-schemas/pull/91
- [X] https://github.com/netdata/cloud-alarm-streaming-service/pull/22
- [X] https://github.com/netdata/cloud-alarm-processor-service/pull/34
- [X] https://github.com/netdata/cloud-nodes-service/pull/345
- [X] https://github.com/netdata/cloud-alarm-notification-sender-service/pull/30
- [X] https://github.com/netdata/cloud-alarm-processor-service/pull/35
- [X] https://github.com/netdata/product/issues/2653
- [X] https://github.com/netdata/netdata/pull/11960
- [X] https://github.com/netdata/cloud-alarm-streaming-service/pull/23
- [X] https://github.com/netdata/netdata/pull/11965
- [X] https://github.com/netdata/netdata/pull/12021
- [X] https://github.com/netdata/cloud-alarm-processor-service/pull/46
- [X] https://github.com/netdata/cloud-alarm-processor-service/pull/47

DevNull · August 14, 2022, 12:18pm

Sorry to chime in on a closed issue, but I found this while trying to find a way to clear defunct alarms. I see there hasn’t been any activity in quite a while, but it appears that I am having the same issue. As far as the agents are concerned, the alarms don’t exist. Trying to click and follow them in the cloud interface leads to a page stating that the chart/metric doesn’t exist (although the metrics do exist and are continuing to update).

In my case, I have suddenly (over the past couple of weeks) been getting constant floods of anomaly alarms that almost immediately clear. The metric they alarm on varies: usually ip, ram, hardirq, etc.

The problem is that at one point while there were 60 or so alarms in the cloud, I was performing maintenance on all of the children (15 or so) as well as the parent. During this time, all the agents were updated and the servers were restarted.

The restarts were for resource increases and I think that briefly there was high RAM usage and Netdata may have been killed…unfortunately this may have also happened while it was temporarily unable to write to the disk as well. The reboots should’ve been clean and it appeared that way. Netdata seemed to start normally on the parent and children but ever since they came back up the cloud interface is still persisting the alerts (albeit with no actual values listed for the metric in the alerts tab list) while the children and parent nodes show no alarms and seem to be otherwise running normally…

Is there any news on this or way to clear the alarms out of the cloud interface? Restarting the agents doesn’t seem to do anything except change the “Triggered” time to when the agent was restarted.
