I am having an issue with Time/Clock ?
I am getting a lot of heartbeat clock: woke up 3863941 microseconds later than expected and the systems is crashing
Timedatectl and chronyd is ok and synced, what should I check ?
Feb 09 15:56:36 NetData ebpf.plugin[1966]: heartbeat clock: woke up 3886715 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 09 15:56:36 NetData apps.plugin[1967]: heartbeat clock: woke up 3863941 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Due to that i have disable netdata because causing a lot of CPU HIGH and systems is stuck
We are reworking with eBPF these days ([Feat]/[BUG]: eBPF memory improvement · Issue #14477 · netdata/netdata · GitHub), so let me do a question about your environment to understand the motive you are having these erros, because before 1.38 was released, I modified one of these plugins. It is possible they have a common issue, because they use the same librariies.
Which version of netdata you are running right now?
How many “PIDs” you have in your system when you run the command ?
I was updated to 1.38 but the issue was discovered back to 23/Nov with 1.37 version but still then I didn’t had CPU HOGs and slowdown. At the moment I reverted back with apt command to 1.37 latest night
root@NetData:/etc/netdata# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal
Even with 1.37 I am still having issues and i am forced to stop the service
EDIT : This is also happening with other module/plugin
Feb 09 17:16:35 NetData nfacct.plugin[4639]: heartbeat clock: woke up 2571663 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
I got an additional Kernel Panic to all this as below
eb 10 14:05:59 NetData apps.plugin[16039]: Using now_boottime_usec() for uptime (dt is 5 ms)
Feb 10 14:06:23 NetData nfacct.plugin[16043]: heartbeat clock: woke up 894116 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:06:23 NetData ebpf.plugin[16044]: heartbeat clock: woke up 831291 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:06:42 NetData nfacct.plugin[16043]: heartbeat clock: woke up 506099 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:06:42 NetData ebpf.plugin[16044]: heartbeat clock: woke up 1359574 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:06:59 NetData sudo[15785]: pam_unix(sudo:session): session closed for user root
Feb 10 14:07:23 NetData kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kworker/1:0:15107]
Feb 10 14:07:23 NetData kernel: Modules linked in: ufs msdos xfs wireguard ip6_udp_tunnel udp_tunnel nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nfnetlink_acct nfnetlink isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua binfmt_misc joydev input_leds serio_raw sch_fq_codel msr efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid bochs_drm drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm psmouse virtio_net net_failover failover virtio_blk floppy
Feb 10 14:07:23 NetData kernel: CPU: 1 PID: 15107 Comm: kworker/1:0 Not tainted 5.4.0-137-generic #154-Ubuntu
Feb 10 14:07:23 NetData kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Feb 10 14:07:23 NetData kernel: Workqueue: events drm_fb_helper_dirty_work [drm_kms_helper]
Feb 10 14:07:23 NetData kernel: RIP: 0010:smp_call_function_many+0x205/0x270
Feb 10 14:07:23 NetData kernel: Code: e8 a0 8d 93 00 3b 05 4e 82 b0 01 89 c7 0f 83 9b fe ff ff 48 63 c7 48 8b 0b 48 03 0c c5 80 e9 64 b9 8b 41 18 a8 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c8 89 cf 48 c7 c2 60 9f e4 b9 4c 89 fe
Feb 10 14:07:23 NetData kernel: RSP: 0018:ffffa9e680a77c70 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Feb 10 14:07:23 NetData kernel: RAX: 0000000000000003 RBX: ffff961a3eab0e00 RCX: ffff961a3ea37100
Feb 10 14:07:23 NetData kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Feb 10 14:07:23 NetData kernel: RBP: ffffa9e680a77cb0 R08: ffff961a3a434460 R09: 0000000000000000
Feb 10 14:07:23 NetData kernel: R10: ffff961a3a434888 R11: 0000000000000000 R12: ffffffffb82829b0
Feb 10 14:07:23 NetData kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000004
Feb 10 14:07:23 NetData kernel: FS: 0000000000000000(0000) GS:ffff961a3ea80000(0000) knlGS:0000000000000000
Feb 10 14:07:23 NetData kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 10 14:07:23 NetData kernel: CR2: 000000c000c48000 CR3: 000000017eb70000 CR4: 00000000000006e0
Feb 10 14:07:23 NetData kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 10 14:07:23 NetData kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb 10 14:07:23 NetData kernel: Call Trace:
Feb 10 14:07:23 NetData kernel: ? load_new_mm_cr3+0xf0/0xf0
Feb 10 14:07:23 NetData kernel: on_each_cpu+0x2d/0x60
Feb 10 14:07:23 NetData kernel: flush_tlb_kernel_range+0x38/0x90
Feb 10 14:07:23 NetData kernel: __purge_vmap_area_lazy+0x70/0x6d0
Feb 10 14:07:23 NetData kernel: free_vmap_area_noflush+0xe1/0xf0
Feb 10 14:07:23 NetData kernel: remove_vm_area+0x9a/0xb0
Feb 10 14:07:23 NetData kernel: iounmap+0x86/0xc0
Feb 10 14:07:23 NetData kernel: ttm_bo_kunmap+0xac/0xd0 [ttm]
Feb 10 14:07:23 NetData kernel: drm_gem_vram_object_vunmap+0x24/0x40 [drm_vram_helper]
Feb 10 14:07:23 NetData kernel: drm_gem_vunmap+0x28/0x50 [drm]
Feb 10 14:07:23 NetData kernel: drm_client_buffer_vunmap+0x1a/0x30 [drm]
Feb 10 14:07:23 NetData kernel: drm_fb_helper_dirty_work+0x18f/0x1b0 [drm_kms_helper]
Feb 10 14:07:23 NetData kernel: process_one_work+0x1eb/0x3b0
Feb 10 14:07:23 NetData kernel: worker_thread+0x4d/0x400
Feb 10 14:07:23 NetData kernel: kthread+0x104/0x140
Feb 10 14:07:23 NetData kernel: ? process_one_work+0x3b0/0x3b0
Feb 10 14:07:23 NetData kernel: ? kthread_park+0x90/0x90
Feb 10 14:07:23 NetData kernel: ret_from_fork+0x35/0x40
Feb 10 14:07:23 NetData ebpf.plugin[16044]: heartbeat clock: woke up 30679669 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:07:23 NetData nfacct.plugin[16043]: heartbeat clock: woke up 31708706 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:07:33 NetData nfacct.plugin[16043]: heartbeat clock: woke up 7131562 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:07:33 NetData ebpf.plugin[16044]: heartbeat clock: woke up 9105841 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:07:46 NetData ebpf.plugin[16044]: heartbeat clock: woke up 11209102 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
Feb 10 14:07:46 NetData apps.plugin[16039]: heartbeat clock: woke up 10356680 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).
This is the latest log from journalctl. At the moment I am not facing any stuck or slowdown. But I had to configure also chrony daemon , I am not sure if this might be issue running under Proxmox Hypervisor.
Feb 14 02:47:10 NetData nfacct.plugin[38314]: NFACCT process exiting
Feb 14 06:23:55 NetData ebpf.plugin[34168]: heartbeat clock: woke up 881445 microseconds later than expected (can be due to s
ystem load or the CLOCK_REALTIME set to the future).
Feb 14 06:23:55 NetData apps.plugin[34166]: heartbeat clock: woke up 762508 microseconds later than expected (can be due to s
ystem load or the CLOCK_REALTIME set to the future).
Feb 14 06:43:05 NetData systemd[1]: Stopping Real time performance monitoring...
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34211 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34214 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34209 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34217 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34215 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34213 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34208 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34210 finished
Feb 14 06:43:05 NetData ebpf.plugin[34168]: thread with task id 34212 finished
Feb 14 06:43:06 NetData apps.plugin[41906]: PROCFILE: Cannot open file '/etc/netdata/apps_groups.conf'
Feb 14 06:43:06 NetData apps.plugin[41906]: Cannot read process groups configuration file '/etc/netdata/apps_groups.conf'. Wi
ll try '/usr/lib/netdata/conf.d/apps_groups.conf'
Feb 14 06:43:06 NetData apps.plugin[41906]: Loaded config file '/usr/lib/netdata/conf.d/apps_groups.conf'
Feb 14 06:43:06 NetData apps.plugin[41906]: started on pid 41906
Feb 14 06:43:06 NetData apps.plugin[41906]: set name of thread 41911 to APPS_READER
Feb 14 06:43:07 NetData apps.plugin[41906]: Using now_boottime_usec() for uptime (dt is 2 ms)
Feb 14 06:43:11 NetData systemd[1]: netdata.service: Succeeded.
Feb 14 06:43:11 NetData systemd[1]: Stopped Real time performance monitoring.
Feb 14 06:43:11 NetData systemd[1]: Started Real time performance monitoring.
Feb 14 06:43:11 NetData netdata[41939]: CONFIG: cannot load cloud config '/var/lib/netdata/cloud.d/cloud.conf'. Running with
internal defaults.
Feb 14 06:43:11 NetData netdata[41939]: 2023-02-14 06:43:11: netdata INFO : MAIN : CONFIG: cannot load cloud config '/var/li
b/netdata/cloud.d/cloud.conf'. Running with internal defaults.
Feb 14 06:43:14 NetData ebpf.plugin[42226]: Does not have a configuration file inside `/etc/netdata/ebpf.d.conf. It will try
to load stock file.
Feb 14 06:43:14 NetData perf.plugin[42224]: no charts enabled - nothing to do.
Feb 14 06:43:14 NetData apps.plugin[42223]: PROCFILE: Cannot open file '/etc/netdata/apps_groups.conf'
Feb 14 06:43:14 NetData apps.plugin[42223]: Cannot read process groups configuration file '/etc/netdata/apps_groups.conf'. Wi
ll try '/usr/lib/netdata/conf.d/apps_groups.conf'
Feb 14 06:43:14 NetData apps.plugin[42223]: Loaded config file '/usr/lib/netdata/conf.d/apps_groups.conf'
Feb 14 06:43:14 NetData apps.plugin[42223]: started on pid 42223
Feb 14 06:43:14 NetData apps.plugin[42223]: set name of thread 42244 to APPS_READER
Feb 14 06:43:14 NetData ebpf.plugin[42226]: Cannot read process groups configuration file '/etc/netdata/apps_groups.conf'. Wi
ll try '/usr/lib/netdata/conf.d/apps_groups.conf'
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42264
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42264 to EBPF HARDIRQ
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42267
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42267 to EBPF SHM
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42262
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42262 to EBPF FD
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42265
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42265 to EBPF SOFTIRQ
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42266
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42266 to EBPF OOMKILL
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread with task id 42266 finished
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42259
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42259 to EBPF SYNC
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42260
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42260 to EBPF SWAP
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42261
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42261 to EBPF MOUNT
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42258
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42258 to EBPF CACHESTAT
Feb 14 06:43:14 NetData ebpf.plugin[42226]: thread created with task id 42257
Feb 14 06:43:14 NetData ebpf.plugin[42226]: set name of thread 42257 to EBPF PROCESS
Feb 14 06:43:15 NetData apps.plugin[42223]: Using now_boottime_usec() for uptime (dt is 6 ms)
Feb 14 06:44:15 NetData ebpf.plugin[42226]: heartbeat clock: woke up 18534517 microseconds later than expected (can be due to
system load or the CLOCK_REALTIME set to the future).
Feb 14 10:43:15 NetData nfacct.plugin[42230]: NFACCT process exiting
Do we have any clue why this happening? Does this happening due to Clock? or any misconfiguration with VM ?
1 eb 17 16:42:22 NetData ebpf.plugin[74534]: set name of thread 74578 to EBPF MOUNT
2 Feb 17 16:42:22 NetData ebpf.plugin[74534]: thread created with task id 74577
3 Feb 17 16:42:22 NetData ebpf.plugin[74534]: set name of thread 74577 to EBPF SWAP
4 Feb 17 16:42:23 NetData apps.plugin[74524]: Using now_boottime_usec() for uptime (dt is 4 ms)
5 Feb 17 16:43:07 NetData ebpf.plugin[74534]: heartbeat clock: woke up 1428778 microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future