Cannot connect node to cloud

Environment/Browser

Server: Ubuntu 21.04
Netdata: v1.31.0-245-nightly

Problem/Question

When I run the recommended command to connect my node to Netdata Cloud (shown below), I get the following error.

bash <(curl -Ss https://my-netdata.io/kickstart.sh) --claim-token <mytokenhere> --claim-url https://app.netdata.cloud

— Found existing install of Netdata under: / —
— Attempting to claim agent to https://app.netdata.cloud
Unable to communicate with Netdata daemon, querying config from disk instead.
Unable to communicate with Netdata daemon, querying config from disk instead.
/usr/sbin/netdata-claim.sh: line 171: /var/lib/netdata/registry/netdata.public.unique.id: Permission denied
Failed to write new machine GUID. Please make sure you have rights to write to /var/lib/netdata/registry/netdata.public.unique.id.

How can I make this command work?

What I expected to happen

My node would show up in my Netdata Cloud portal.

First check why netdata isn’t running. The existing installation that was detected should have resulted in a running netdata. After that happens, run the command again.

If the permissions failure persists, check the ownership and access rights of the registry directory against the user netdata is running as. Normally that user is netdata, and the registry directory is writable by it.
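One rough way to make that comparison is sketched below. The helper name check_owner is made up for illustration, the directory in the example comment is the one from the error message above, and the sketch assumes GNU stat (as shipped with Ubuntu).

```shell
# Hypothetical helper (not part of Netdata): report the owner and mode of a
# directory and succeed only if the owner matches the expected user name.
check_owner() {
    local dir="$1" user_name="$2"
    local owner mode
    # stat -c is the GNU coreutils form; %U is the owner name, %a the octal mode.
    owner=$(stat -c '%U' "$dir") || return 2
    mode=$(stat -c '%a' "$dir")
    echo "$dir: owner=$owner mode=$mode expected=$user_name"
    [ "$owner" = "$user_name" ]
}

# Example, using the path from the error message and the usual service user:
#   check_owner /var/lib/netdata/registry netdata
```

A non-zero exit status means the directory is owned by someone other than the expected user (or could not be read at all), which would be consistent with the "Permission denied" line above.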

Let us know how it goes!

There’s a bug right now in the kickstart script’s handling of claiming that causes it to not work reliably if the agent is not running. This should work (and the new kickstart script in https://github.com/netdata/netdata/pull/11493 does make it work correctly), it just doesn’t right now. The only real solution is to ensure the agent is running as suggested by @Christopher_Akritid1.

Thank you both for your answers! @Christopher_Akritid1, @Austin_Hemmelgarn

To give you some background, I’m not super knowledgeable with Linux. I have played around with these OS for years but I’ve never used them in a work environment. For this I apologise if I’m being stupid.

I have checked, and the Netdata service is running. On top of this, I can access the dashboard on its local port, and it seems to be working correctly.

When I checked the service status, it was running but showed some errors in the status:

netdata.service - Real time performance monitoring
     Loaded: loaded (/lib/systemd/system/netdata.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2021-09-13 03:00:19 UTC; 9h ago
    Process: 698 ExecStartPre=/bin/mkdir -p /var/cache/netdata (code=exited, status=0/SUCCESS)
    Process: 758 ExecStartPre=/bin/chown -R netdata:netdata /var/cache/netdata (code=exited, status=0/SUCCESS)
    Process: 775 ExecStartPre=/bin/mkdir -p /var/run/netdata (code=exited, status=0/SUCCESS)
    Process: 778 ExecStartPre=/bin/chown -R netdata:netdata /var/run/netdata (code=exited, status=0/SUCCESS)
   Main PID: 804 (netdata)
      Tasks: 56 (limit: 4553)
     Memory: 232.2M
     CGroup: /system.slice/netdata.service
             ├─ 804 /usr/sbin/netdata -P /var/run/netdata/netdata.pid -D
             ├─ 808 /usr/sbin/netdata --special-spawn-server
             ├─ 954 /usr/libexec/netdata/plugins.d/go.d.plugin 1
             ├─ 955 /usr/libexec/netdata/plugins.d/ebpf.plugin 1
             ├─ 963 /usr/libexec/netdata/plugins.d/apps.plugin 1
             └─3396 bash /usr/libexec/netdata/plugins.d/tc-qos-helper.sh 1

Sep 13 03:00:19 halogen netdata[804]: '
Sep 13 03:00:19 halogen netdata[804]: TIMEZONE: fixed as 'Etc/UTC'
Sep 13 03:00:19 halogen netdata[804]: 2021-09-13 03:00:19: netdata INFO  : MAIN : TIMEZONE: fixed as 'Etc/UTC'
Sep 13 03:00:19 halogen netdata[804]: SIGNAL: Not enabling reaper
Sep 13 03:00:19 halogen netdata[804]: 2021-09-13 03:00:19: netdata INFO  : MAIN : SIGNAL: Not enabling reaper
Sep 13 03:00:20 halogen ebpf.plugin[955]: Does not have a configuration file inside `/etc/netdata/ebpf.d.conf. It will >
Sep 13 03:00:20 halogen ebpf.plugin[955]: Name resolution is disabled, collector will not parser "hostnames" list.
Sep 13 03:00:20 halogen ebpf.plugin[955]: The network value of CIDR 127.0.0.1/8 was updated for 127.0.0.0 .
Sep 13 03:00:20 halogen ebpf.plugin[955]: PROCFILE: Cannot open file '/etc/netdata/apps_groups.conf'
Sep 13 03:00:20 halogen ebpf.plugin[955]: Cannot read process groups configuration file '/etc/netdata/apps_groups.conf'

I have never had this issue before, and I have previously installed Netdata on two other Linux hosts.
I’m really not sure what I have done wrong. I installed Netdata onto a fresh Ubuntu OS with no other configuration applied beforehand, other than updating apt.

What do you think I should do next?

The next step would be to try running the installed claiming script. The kickstart script is just supposed to find this and run it with the correct arguments, but you may have found a bug in the kickstart script itself here. The claiming script should be located at /usr/libexec/netdata/netdat6a-claim.sh in your case. Running that directly with the --claim-token and --claim-url options you tried previously should work, though you will probably have to run it as root (you can just do this with sudo). If it does, then you’ve found a bug in the kickstart script; if it doesn’t, then we’ll have to look further (but in that case it’s probably a bug in the claiming script).

Thanks for the advice!

So I ran the command as you said but it looks like that bash script doesn’t exist on my host. That would maybe explain the kickstart error. I have installed Netdata a couple of times now on 2 fresh installs so I don’t think this is down to me:

kiweezi@halogen:~$ sudo bash /usr/libexec/netdata/netdat6a-claim.sh ...
bash: /usr/libexec/netdata/netdat6a-claim.sh: No such file or directory
kiweezi@halogen:~$ cd /usr/libexec/netdata/
kiweezi@halogen:/usr/libexec/netdata$ ls
charts.d  netdata-uninstaller.sh  netdata-updater.sh  node.d  plugins.d  python.d
kiweezi@halogen:/usr/libexec/netdata$ ll
total 52
drwxr-xr-x 6 root root  4096 Sep 11 13:47 ./
drwxr-xr-x 6 root root  4096 Sep 11 13:47 ../
drwxr-xr-x 2 root root  4096 Sep 11 13:47 charts.d/
-rwxr-x--- 1 root root 12058 Sep 11 13:47 netdata-uninstaller.sh*
-rwxr-xr-x 1 root root 15018 Sep 11 13:47 netdata-updater.sh*
drwxr-xr-x 3 root root  4096 Sep 11 13:47 node.d/
drwxr-xr-x 3 root root  4096 Sep 11 13:47 plugins.d/
drwxr-xr-x 3 root root  4096 Sep 11 13:47 python.d/

Also I’m not sure if you meant to put the ‘6’ in netdat6a-claim but I tried without that too, with the same error as a result.

Do you think it would be an idea to download that script and put it in this directory, then run it?

No, we need to find the bug with kickstart.sh and fix it.

@kiweezi Apologies, I gave you the wrong path. It should instead be /usr/bin/netdata-claim.sh.

Hey, no worries. Thank you for helping either way!
I ran the command with the new path you specified, but unfortunately this also produced the same error:

kiweezi@halogen:~$ sudo bash /usr/bin/netdata-claim.sh ...
bash: /usr/bin/netdata-claim.sh: No such file or directory

Okay, do you need anything else from me?

Shall I await an update from you that this is fixed? Or is there a GitHub issue related to this that I can tune in to?

Either way, thank you for all your help @Christopher_Akritid1, @Austin_Hemmelgarn. I really have enjoyed the Netdata project, thank you for your contributions and keep up the great work!

And of course, when I go to give you the right path, I mistype it. It should, in fact, be /usr/sbin/netdata-claim.sh (I missed one letter in my earlier response). Sorry about the delays because of this.
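If in doubt, a small loop over the directories mentioned in this thread can confirm where the script actually lives. This is only a sketch: find_script is a hypothetical helper, not something Netdata ships, and the directory list in the example comment is just the three locations tried above.

```shell
# Hypothetical helper: print the full path of the first directory that
# contains a regular file called NAME, or fail if none of them does.
find_script() {
    local name="$1"
    shift
    local dir
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example, using the three locations mentioned in this thread:
#   find_script netdata-claim.sh /usr/sbin /usr/bin /usr/libexec/netdata
```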

Not to worry. I have tried this and it seems the script does exist, however the arguments I’m passing are not recognised:

kiweezi@halogen:~$ sudo bash /usr/sbin/netdata-claim.sh --claim-token <token> --claim-rooms <room> --claim-url https://app.netdata.cloud
Unknown argument --claim-token
kiweezi@halogen:~$ sudo bash /usr/sbin/netdata-claim.sh -token <token> -rooms <room> -url https://app.netdata.cloud
Unknown argument -token

Ah, it’s -token=, etc. (each option takes its value after an ‘=’, with no space). See netdata-claim.sh.in on the master branch of the netdata/netdata repository on GitHub.
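For reference, the working form looks roughly like the sketch below. build_claim_args is a hypothetical helper that only assembles the arguments in the -name=value form, using the three options tried earlier in this thread; the values in angle brackets are placeholders, not real values.

```shell
# Hypothetical helper: print the claiming arguments, one per line, in the
# -name=value form that the claiming script expects.
build_claim_args() {
    local token="$1" rooms="$2" url="$3"
    printf '%s\n' "-token=${token}" "-rooms=${rooms}" "-url=${url}"
}

# Dry run, to see what would be passed:
#   build_claim_args "<token>" "<room>" "https://app.netdata.cloud"
#
# The equivalent direct call, run as root as discussed above:
#   sudo bash /usr/sbin/netdata-claim.sh -token=<token> -rooms=<room> -url=https://app.netdata.cloud
```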

Ah, thank you. I did have a look in there but missed the ‘=’. Whoops!
Well, that worked and connected the node successfully, so thank you very much as this solves my issue!

Please let me know if you would like me to provide anything else to help you investigate the bug in the kickstart script.

Thank you very much @Christopher_Akritid1 and @Austin_Hemmelgarn for all your help!