We have just started using Netdata and have successfully set up a few Linux nodes, which works great, but we are having trouble getting a Windows Server to show up as a separate node. We understand that, to get this working, an existing Linux node has to act as a passthrough to the Cloud dashboard.
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0.0005083
go_gc_duration_seconds_sum 0.0010147
go_gc_duration_seconds_count 74
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 13
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.20.2"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 3.071136e+06
On one of our Linux servers we have edited the /etc/netdata/go.d/windows.conf file and added a job for this server.
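For reference, a job of this kind might look roughly like the sketch below. This is not our exact file: the job name, the vnode name and the URL are placeholders, and it assumes the Windows machine exposes its metrics on windows_exporter's default port 9182 and that a matching entry exists in vnodes.conf.

```yaml
# go.d/windows.conf (sketch; every value below is a placeholder)
jobs:
  - name: win_server1                       # hypothetical job name
    vnode: win_server1                      # assumed to match a hostname in vnodes.conf
    url: http://203.0.113.10:9182/metrics   # placeholder address of the Windows metrics endpoint
```

The vnode line is what should make the Windows machine appear as its own node rather than as extra charts on the Linux node.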
Welcome to the forums and thanks for raising this to us.
Looking at the sample configs and the troubleshooting debug output, everything seems OK.
Does your Linux Node appear on Netdata Cloud? Can you see the metrics from the Windows machine, even if this isn’t displayed as a node on Cloud?
Could you also look at the error.log of that Linux node and see if anything relevant is there? It should be under /opt/netdata/var/log/netdata.
Yes, the Linux node appears on Netdata Cloud. Yes, the metrics URL on the Windows machine returns readable output.
Looking in the error log, it all seems to be mostly informational. The most notable thing I can see is this repeated error: netdata ERROR : PD[nfacct] : PARSER: read failed: end of file (errno 22, Invalid argument).
error.log
2023-06-13 02:48:52: netdata ERROR : PD[nfacct] : PARSER: read failed: end of file (errno 22, Invalid argument)
2023-06-13 02:48:52: netdata INFO : PD[nfacct] : PLUGINSD: 'host:linuxserver', '/opt/netdata/usr/libexec/netdata/plugins.d/nfacct.plugin' (pid 7180) disconnected after 72010 successful data collections (ENDs).
2023-06-13 02:48:53: netdata INFO : PD[nfacct] : PLUGINSD: 'host:linuxserver' connected to '/opt/netdata/usr/libexec/netdata/plugins.d/nfacct.plugin' running on pid 17247
2023-06-13 02:49:38: netdata INFO : ANALYTICS : /opt/netdata/usr/libexec/netdata/plugins.d/anonymous-statistics.sh 'META' '-' '-'
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: creating new data and journal files in path /opt/netdata/var/cache/netdata/dbengine
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: created data file "/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000091.ndf".
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: created journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000091.njf".
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: journal file 90 is ready to be indexed
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: indexing file '/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000090.njfv2': extents 204, metrics 2598, pages 13056
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: recalculating tier 0 retention for 2611 metrics starting with datafile 54
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: migrated journal file '/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000090.njfv2', file size 779812
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: updating tier 0 metrics registry retention for 2611 metrics
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: deleting data file '/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000053.ndf'.
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: deleting data and journal files to maintain disk quota
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: deleted journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000053.njf".
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: deleted journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000053.njfv2".
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: deleted data file "/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000053.ndf".
2023-06-13 03:15:42: netdata INFO : LIBUV_WORKER : DBENGINE: reclaimed 7069816 bytes of disk space.
2023-06-13 03:19:48: netdata INFO : LIBUV_WORKER : METADATA: Checking dimensions starting after row 0
2023-06-13 03:19:48: netdata INFO : LIBUV_WORKER : METADATA: Checked 2642, deleted 0 -- will resume after row 0 in 3600 seconds
2023-06-13 04:19:49: netdata INFO : LIBUV_WORKER : METADATA: Checking dimensions starting after row 0
2023-06-13 04:19:49: netdata INFO : LIBUV_WORKER : METADATA: Checked 2642, deleted 0 -- will resume after row 0 in 3600 seconds
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: creating new data and journal files in path /opt/netdata/var/cache/netdata/dbengine
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: created data file "/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000092.ndf".
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: created journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000092.njf".
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: journal file 91 is ready to be indexed
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: indexing file '/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000091.njfv2': extents 210, metrics 2610, pages 13440
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: migrated journal file '/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000091.njfv2', file size 800308
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: recalculating tier 0 retention for 2597 metrics starting with datafile 55
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: updating tier 0 metrics registry retention for 2597 metrics
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: deleting data file '/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000054.ndf'.
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: deleting data and journal files to maintain disk quota
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: deleted journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000054.njf".
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: deleted journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000054.njfv2".
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: deleted data file "/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000054.ndf".
2023-06-13 04:45:29: netdata INFO : LIBUV_WORKER : DBENGINE: reclaimed 7054432 bytes of disk space.
2023-06-13 04:49:38: netdata INFO : ANALYTICS : /opt/netdata/usr/libexec/netdata/plugins.d/anonymous-statistics.sh 'META' '-' '-'
2023-06-13 05:19:50: netdata INFO : LIBUV_WORKER : METADATA: Checking dimensions starting after row 0
2023-06-13 05:19:50: netdata INFO : LIBUV_WORKER : METADATA: Checked 2642, deleted 0 -- will resume after row 0 in 3600 seconds
2023-06-13 05:52:01: netdata INFO : LIBUV_WORKER : DBENGINE: creating new data and journal files in path /opt/netdata/var/cache/netdata/dbengine-tier1
2023-06-13 05:52:01: netdata INFO : LIBUV_WORKER : DBENGINE: created data file "/opt/netdata/var/cache/netdata/dbengine-tier1/datafile-1-0000000016.ndf".
2023-06-13 05:52:01: netdata INFO : LIBUV_WORKER : DBENGINE: created journal file "/opt/netdata/var/cache/netdata/dbengine-tier1/journalfile-1-0000000016.njf".
2023-06-13 05:52:01: netdata INFO : LIBUV_WORKER : DBENGINE: journal file 15 is ready to be indexed
2023-06-13 05:52:01: netdata INFO : LIBUV_WORKER : DBENGINE: indexing file '/opt/netdata/var/cache/netdata/dbengine-tier1/journalfile-1-0000000015.njfv2': extents 324, metrics 2640, pages 20736
2023-06-13 05:52:01: netdata INFO : LIBUV_WORKER : DBENGINE: migrated journal file '/opt/netdata/var/cache/netdata/dbengine-tier1/journalfile-1-0000000015.njfv2', file size 1182604
2023-06-13 06:13:26: netdata INFO : LIBUV_WORKER : DBENGINE: creating new data and journal files in path /opt/netdata/var/cache/netdata/dbengine
2023-06-13 06:13:26: netdata INFO : LIBUV_WORKER : DBENGINE: created data file "/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000093.ndf".
2023-06-13 06:13:26: netdata INFO : LIBUV_WORKER : DBENGINE: created journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000093.njf".
2023-06-13 06:13:26: netdata INFO : LIBUV_WORKER : DBENGINE: journal file 92 is ready to be indexed
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: indexing file '/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000092.njfv2': extents 206, metrics 2607, pages 13184
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: recalculating tier 0 retention for 2595 metrics starting with datafile 56
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: migrated journal file '/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000092.njfv2', file size 786824
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: updating tier 0 metrics registry retention for 2595 metrics
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: deleting data file '/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000055.ndf'.
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: deleting data and journal files to maintain disk quota
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: deleted journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000055.njf".
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: deleted journal file "/opt/netdata/var/cache/netdata/dbengine/journalfile-1-0000000055.njfv2".
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: deleted data file "/opt/netdata/var/cache/netdata/dbengine/datafile-1-0000000055.ndf".
2023-06-13 06:13:27: netdata INFO : LIBUV_WORKER : DBENGINE: reclaimed 7013064 bytes of disk space.
2023-06-13 06:19:51: netdata INFO : LIBUV_WORKER : METADATA: Checking dimensions starting after row 0
2023-06-13 06:19:51: netdata INFO : LIBUV_WORKER : METADATA: Checked 2642, deleted 0 -- will resume after row 0 in 3600 seconds
2023-06-13 06:25:02: netdata INFO : MAIN : SIGNAL: Received SIGHUP. Reopening all log files...
2023-06-13 06:25:02: netdata INFO : MAIN : COMMAND: Reopening all log files.
You are right, the netdata user didn’t seem to have access to the netdata directory. I have run sudo chmod -R 777 /opt/netdata, which has allowed the netdata user to access it. After restarting the service, the node is now shown in the dashboard. If it’s not already there, I think this should be added to the documentation somewhere.
A separate problem: the graphs are blank on the Nodes screen.
777 is problematic because it makes everything readable and writable by any user. The correct way to create vnodes.conf is the same as for every config: using the edit-config script.
cd /opt/netdata/etc/netdata/
sudo ./edit-config vnodes/vnodes.conf
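As a rough sketch of what the file opened by that command could contain, assuming a single Windows machine: the hostname here is a placeholder and the guid is deliberately left unfilled, since it has to be a UUID you generate yourself (for example with uuidgen on the Linux node).

```yaml
# vnodes/vnodes.conf (sketch; values are placeholders)
- hostname: win_server1                      # hypothetical name, referenced by the vnode option of the collector job
  guid: "<UUID generated with uuidgen>"      # replace with a real, unique UUID
```

Restart the Netdata service after saving so the virtual node is picked up.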
Is it possible to have different metrics selected for different machines in one room, or do they currently have to be in separate rooms to achieve this? I’m guessing I might need to have a Windows room and a Linux room so I don’t have a bunch of blank graphs showing.
I understand your concerns with using 777. It was just an easy way to get it going for now. I will fix up later and use the edit-config script. Thanks for your help.
Yes. On the Nodes tab, the configuration of which metrics to show in that tabular view is defined per room, so if you want different metrics for different nodes you would need to split them across rooms (note: a node can be in more than one room, so you could have split rooms and a common one).
You also have custom dashboards available. These don’t have that tabular view, but you can add whatever charts make sense to you and combine different charts for different OSes or node characteristics.
I’ve had a go and it seems to work pretty well. There does seem to be a bug, though: when you create multiple rooms and simply edit the default metrics, the change replaces that same metric in all rooms. The workaround is to delete all the metrics in a new room and start fresh. Is this a known issue?
No, we haven’t seen that reported before, but I just reproduced it. Thank you for reporting it.
I will file a bug report to get it sorted out, but I’m glad you found a workaround.