Most of our machines are “standard” but not recognized by kickstarter as anything in particular. (Which is true. They’re all built from source.) And most of them are still arch=i686 rather than x86_64.
It took a little futzing but I didn’t really have any trouble getting netdata compiled. I messed with kickstarter but wound up cloning the repo and building it, which worked better.
The problem I have is that apps.plugin, in particular, doesn’t seem to do anything. It sits in its command loop but never reports any data.
I compiled it with “NETDATA_INTERNAL_CHECKS” defined and that does produce more output at startup, but I’m expecting a steady stream of stats on stdout, and there’s nothing.
If I run “apps.plugin 1” on its own, with none of the other daemons running, I get the output below. I’ve clipped the timestamp and “comm=apps.plugin source=collector” off each line. Note: This is from an i686 machine, but it behaved the same on x86_64.
level=info tid=3820 thread=MAIN msg="Loaded config file '/opt/netdata/etc/netdata/apps_groups.conf'"
level=info tid=3820 thread=MAIN msg="started on pid 3820"
level=debug tid=3820 thread=MAIN msg="ARAL: 'dict-items' element size 48 (requested 44 bytes), min elements per page 85 (requested 2), max elements per page 1365, max page size 65520 bytes (requested 65536) "
level=debug tid=3820 thread=MAIN msg="ARAL: 'dict-shared-items' element size 16 (requested 12 bytes), min elements per page 256 (requested 2), max elements per page 4096, max page size 65536 bytes (requested 65536) "
FUNCTION GLOBAL "processes" 10 "Detailed information on the currently running processes." "top" "members" 10
level=debug tid=3821 thread=APPS_READER msg="set name of thread 3821 to APPS_READER"
level=debug tid=3822 thread=APPS_WORK[1] msg="set name of thread 3822 to APPS_WORK[1]"
So the data header comes out, but that’s it. And then it just sits there forever. If I hit enter, I get this:
level=notice tid=3821 thread=APPS_READER msg="Received unknown command: (unset)"
so I know it’s alive.
I spent a couple of minutes in gdb, enough to see that it was getting into the heartbeat_next() call in libnetdata/clocks/clocks.c and, as far as I could tell, never returning from it. So some sort of clock problem? These are VMware guests, if that matters.
My next step was to instrument that and see where it’s getting stuck, but I thought I’d drop it in here before I burn a bunch more time on this.
(I think debugfs.plugin is doing the same thing but I haven’t dug into that one.)