Node was successfully claimed but doesn't appear in netdata.cloud

Problem/Question

The claim process ends successfully, but the node doesn’t show up in netdata.cloud.

Relevant docs you followed/actions you took to solve the issue

I tried the process described at Connect Agent to Cloud | Learn Netdata several times:

  • Remove cloud.d
  • Restart netdata
  • Reclaim with command

netdata-claim.sh -token="MYTOKEN" -rooms="MYROOM-UUID" -url="https://app.netdata.cloud" -id="093d1c50-a73b-412f-b268-e1f0ee1477fa"

The ID 093d1c50-a73b-412f-b268-e1f0ee1477fa was generated with $(uuidgen).

The output of the command:

Token: ****************
              Base URL: https://app.netdata.cloud
              Id: MYID
              Rooms: MYROOM
              Hostname: MYHOSTNAME
              Proxy:
              Netdata user: netdata
              Generating private/public key for the first time.
              Extracting public key from private key.
              writing RSA key
              Node was successfully claimed

Environment/Browser/Agent’s version etc

lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.2 LTS
Release:	22.04
Codename:	jammy
netdata -v
netdata v1.41.0
netdata -W buildinfo
Packaging:
    Netdata Version ____________________________________________ : v1.41.0
    Installation Type __________________________________________ : binpkg-deb
    Package Architecture _______________________________________ : x86_64
    Package Distro _____________________________________________ :
    Configure Options __________________________________________ :  '--build=x86_64-linux-gnu' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--with-user=netdata' '--with-math' '--with-zlib' '--with-webdir=/var/lib/netdata/www' '--disable-dependency-tracking' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -ffile-prefix-map=/usr/src/netdata=. -fstack-protector-strong -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' 'CXXFLAGS=-g -O2 -ffile-prefix-map=/usr/src/netdata=. -fstack-protector-strong -Wformat -Werror=format-security'
Default Directories:
    User Configurations ________________________________________ : /etc/netdata
    Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
    Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
    Permanent Databases ________________________________________ : /var/lib/netdata
    Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
    Static Web Files ___________________________________________ : /var/lib/netdata/www
    Log Files __________________________________________________ : /var/log/netdata
    Lock Files _________________________________________________ : /var/lib/netdata/lock
    Home _______________________________________________________ : /var/lib/netdata
Operating System:
    Kernel _____________________________________________________ : Linux
    Kernel Version _____________________________________________ : 5.15.0-78-generic
    Operating System ___________________________________________ : Ubuntu
    Operating System ID ________________________________________ : ubuntu
    Operating System ID Like ___________________________________ : debian
    Operating System Version ___________________________________ : 22.04.2 LTS (Jammy Jellyfish)
    Operating System Version ID ________________________________ : none
    Detection __________________________________________________ : /etc/os-release
Hardware:
    CPU Cores __________________________________________________ : 12
    CPU Frequency ______________________________________________ : 3600000000
    CPU Architecture ___________________________________________ : 33548394496
    RAM Bytes __________________________________________________ : 960207962112
    Disk Capacity ______________________________________________ : x86_64
    Virtualization Technology __________________________________ : none
    Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
    Container __________________________________________________ : none
    Container Detection ________________________________________ : systemd-detect-virt
    Container Orchestrator _____________________________________ : none
    Container Operating System _________________________________ : none
    Container Operating System ID ______________________________ : none
    Container Operating System ID Like _________________________ : none
    Container Operating System Version _________________________ : none
    Container Operating System Version ID ______________________ : none
    Container Operating System Detection _______________________ : none
Features:
    Built For __________________________________________________ : Linux
    Netdata Cloud ______________________________________________ : YES
    Health (trigger alerts and send notifications) _____________ : YES
    Streaming (stream metrics to parent Netdata servers) _______ : YES
    Replication (fill the gaps of parent Netdata servers) ______ : YES
    Streaming and Replication Compression ______________________ : YES (lz4)
    Contexts (index all active and archived metrics) ___________ : YES
    Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
    Machine Learning ___________________________________________ : YES
Database Engines:
    dbengine ___________________________________________________ : YES
    alloc ______________________________________________________ : YES
    ram ________________________________________________________ : YES
    map ________________________________________________________ : YES
    save _______________________________________________________ : YES
    none _______________________________________________________ : YES
Connectivity Capabilities:
    ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
    static (Netdata internal web server) _______________________ : YES
    h2o (web server) ___________________________________________ : YES
    WebRTC (experimental) ______________________________________ : NO
    Native HTTPS (TLS Support) _________________________________ : YES
    TLS Host Verification ______________________________________ : YES
Libraries:
    LZ4 (extremely fast lossless compression algorithm) ________ : YES
    zlib (lossless data-compression library) ___________________ : YES
    Judy (high-performance dynamic arrays and hashtables) ______ : YES (bundled)
    dlib (robust machine learning toolkit) _____________________ : YES (bundled)
    protobuf (platform-neutral data serialization protocol) ____ : YES (system)
    OpenSSL (cryptography) _____________________________________ : YES
    libdatachannel (stand-alone WebRTC data channels) __________ : NO
    JSON-C (lightweight JSON manipulation) _____________________ : YES
    libcap (Linux capabilities system operations) ______________ : NO
    libcrypto (cryptographic functions) ________________________ : YES
    libm (mathematical functions) ______________________________ : YES
    jemalloc ___________________________________________________ : NO
    TCMalloc ___________________________________________________ : NO
Plugins:
    apps (monitor processes) ___________________________________ : YES
    cgroups (monitor containers and VMs) _______________________ : YES
    cgroup-network (associate interfaces to CGROUPS) ___________ : YES
    proc (monitor Linux systems) _______________________________ : YES
    tc (monitor Linux network QoS) _____________________________ : YES
    diskspace (monitor Linux mount points) _____________________ : YES
    freebsd (monitor FreeBSD systems) __________________________ : NO
    macos (monitor MacOS systems) ______________________________ : NO
    statsd (collect custom application metrics) ________________ : YES
    timex (check system clock synchronization) _________________ : YES
    idlejitter (check system latency and jitter) _______________ : YES
    bash (support shell data collection jobs - charts.d) _______ : YES
    debugfs (kernel debugging metrics) _________________________ : YES
    cups (monitor printers and print jobs) _____________________ : YES
    ebpf (monitor system calls) ________________________________ : YES
    freeipmi (monitor enterprise server H/W) ___________________ : YES
    nfacct (gather netfilter accounting) _______________________ : YES
    perf (collect kernel performance events) ___________________ : YES
    slabinfo (monitor kernel object caching) ___________________ : YES
    Xen ________________________________________________________ : NO
    Xen VBD Error Tracking _____________________________________ : NO
Exporters:
    AWS Kinesis ________________________________________________ : NO
    GCP PubSub _________________________________________________ : NO
    MongoDB ____________________________________________________ : NO
    Prometheus (OpenMetrics) Exporter __________________________ : YES
    Prometheus Remote Write ____________________________________ : YES
    Graphite ___________________________________________________ : YES
    Graphite HTTP / HTTPS ______________________________________ : YES
    JSON _______________________________________________________ : YES
    JSON HTTP / HTTPS __________________________________________ : YES
    OpenTSDB ___________________________________________________ : YES
    OpenTSDB HTTP / HTTPS ______________________________________ : YES
    All Metrics API ____________________________________________ : YES
    Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
    Trace All Netdata Allocations (with charts) ________________ : NO
    Developer Mode (more runtime checks, slower) _______________ : NO
netdatacli aclk-state
ACLK Available: Yes
ACLK Version: 2
Protocols Supported: Protobuf
Protocol Used: Protobuf
MQTT Version: 5
Claimed: Yes
Claimed Id: 29f2385e-d2a3-4697-92e4-e4572930908b
Cloud URL: https://app.netdata.cloud
Online: Yes
Reconnect count: 0
Banned By Cloud: No
Last Connection Time: 2023-07-29 01:01:28
Last Connection Time + 3 PUBACKs received: 2023-07-29 01:01:28
Received Cloud MQTT Messages: 4
MQTT Messages Confirmed by Remote Broker (PUBACKs): 102

> Node Instance for mGUID: "7cf92c8c-2b42-11ee-ab0b-514e578540af" hostname "MYHOSTNAME"
	Claimed ID: 29f2385e-d2a3-4697-92e4-e4572930908b
	Node ID: 99e204dd-22b5-4c58-8436-eeffd36c8fd0
	Streaming Hops: 0
	Relationship: self
	Alert Streaming Status:
		Updates: 1
		Pending Min Seq ID: 0
		Pending Max Seq ID: 0
		Last Submitted Seq ID: 89
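
When repeating this check across several servers, the `netdatacli aclk-state` output above can be reduced to the few fields that matter for claiming. The following is a minimal sketch, assuming the plain "Key: Value" layout shown above; the function names `parse_aclk_state` and `summarize` are hypothetical helpers, not part of Netdata.

```python
def parse_aclk_state(text):
    """Parse the top-level 'Key: Value' lines of `netdatacli aclk-state` output."""
    state = {}
    for line in text.splitlines():
        # The per-node section starts with '>'; stop before it.
        if line.startswith(">"):
            break
        key, sep, value = line.partition(":")
        if sep and key.strip():
            state[key.strip()] = value.strip()
    return state


def summarize(state):
    """Return a list of problems; an empty list means the link looks healthy."""
    problems = []
    if state.get("Claimed") != "Yes":
        problems.append("agent is not claimed")
    if state.get("Online") != "Yes":
        problems.append("ACLK is not online")
    if state.get("Banned By Cloud") == "Yes":
        problems.append("agent is banned by Cloud")
    return problems


# Sample copied from the output above (shortened).
sample = """ACLK Available: Yes
Claimed: Yes
Claimed Id: 29f2385e-d2a3-4697-92e4-e4572930908b
Cloud URL: https://app.netdata.cloud
Online: Yes
Banned By Cloud: No
"""
state = parse_aclk_state(sample)
print(state["Claimed Id"], summarize(state))
```

The reported Claimed Id can then be compared with the `-id` value passed to the claim script. In the output above the two differ (29f2385e-… versus 093d1c50-…), which may simply reflect separate claim attempts.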

What I expected to happen

The claimed node appears in netdata.cloud

I found some interesting logs in /var/log/netdata/error.log:

2023-07-29 01:30:31: netdata ERROR : ACLK_MAIN : RRDCONTEXT: received version hash 0 for host 'MYHOST', does not match our version hash 34857214837. Sending snapshot of all contexts.

Full log of the claim process:

2023-07-29 01:30:28: netdata INFO  : DAEMON_COMMAND : EOF found in command pipe.
2023-07-29 01:30:28: netdata INFO  : DAEMON_COMMAND : EOF found in command pipe.
2023-07-29 01:30:28: netdata INFO  : DAEMON_COMMAND : EOF found in command pipe.
2023-07-29 01:30:28: netdata INFO  : UV_WORKER[7] : write-config cloud|global|enabled|yes
2023-07-29 01:30:28: netdata INFO  : UV_WORKER[7] : write-config conf_file=cloud section=global key=enabled value=yes
2023-07-29 01:30:28: netdata INFO  : DAEMON_COMMAND : EOF found in command pipe.
2023-07-29 01:30:28: netdata INFO  : UV_WORKER[43] : write-config cloud|global|cloud base url|https://app.netdata.cloud
2023-07-29 01:30:28: netdata INFO  : UV_WORKER[43] : write-config conf_file=cloud section=global key=cloud base url value=https://app.netdata.cloud
2023-07-29 01:30:28: netdata INFO  : DAEMON_COMMAND : EOF found in command pipe.
2023-07-29 01:30:28: netdata INFO  : UV_WORKER[22] : COMMAND: Reloading Agent Claiming configuration.
2023-07-29 01:30:28: netdata INFO  : UV_WORKER[22] : File '/var/lib/netdata/cloud.d/claimed_id' was found. Setting state to AGENT_CLAIMED.
2023-07-29 01:30:29: netdata INFO  : ACLK_MAIN : Wait before attempting to reconnect in 0.000 seconds
2023-07-29 01:30:29: netdata INFO  : ACLK_MAIN : Attempting connection now
2023-07-29 01:30:29: netdata INFO  : ACLK_STATS : thread created with task id 2954853
2023-07-29 01:30:29: netdata INFO  : ACLK_STATS : set name of thread 2954853 to ACLK_STATS
2023-07-29 01:30:29: netdata INFO  : ACLK_MAIN : HTTPS "GET" request to "app.netdata.cloud" finished with HTTP code: 200
2023-07-29 01:30:29: netdata INFO  : ACLK_MAIN : Getting Cloud /env successful
2023-07-29 01:30:29: netdata INFO  : ACLK_MAIN : New ACLK protobuf protocol negotiated successfully (/env response).
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : HTTPS "GET" request to "api.netdata.cloud" finished with HTTP code: 200
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : ACLK_OTP Got Challenge from Cloud
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : HTTPS "POST" request to "api.netdata.cloud" finished with HTTP code: 201
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : ACLK_OTP Got Password from Cloud
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : [mqtt_wss] I: Going to connect using internal MQTT 5 implementation
2023-07-29 01:30:30: netdata ERROR : ACLK_MAIN : [mqtt_wss] W: client_id provided is longer than 23 bytes, server might not allow that [MQTT-3.1.3-5]
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : [mqtt_wss] I: ws_client: Websocket Connection Accepted By Server
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : [mqtt_wss] I: mqtt_client: MQTT server limits message size to 5242880
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : [mqtt_wss] I: mqtt_client: MQTT Connection Accepted By Server
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : ACLK connection successfully established
2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : Starting 4 query threads.
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[0] : thread created with task id 2954861
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[1] : thread created with task id 2954862
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[1] : set name of thread 2954862 to ACLK_QRY[1]
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[2] : thread created with task id 2954863
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[2] : set name of thread 2954863 to ACLK_QRY[2]
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[3] : thread created with task id 2954864
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[3] : set name of thread 2954864 to ACLK_QRY[3]
2023-07-29 01:30:30: netdata INFO  : ACLK_QRY[0] : set name of thread 2954861 to ACLK_QRY[0]
2023-07-29 01:30:31: netdata INFO  : ACLK_MAIN : Queuing registration for host=c3e0067e-0587-11ee-b69b-917da1596d8c, hops=0
2023-07-29 01:30:31: netdata ERROR : ACLK_MAIN : RRDCONTEXT: received version hash 0 for host 'app1', does not match our version hash 34857214837. Sending snapshot of all contexts.
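
To pick the ERROR entries out of a long error.log like the one above, a small filter can help. This is a rough sketch that assumes the line layout seen in this log (`timestamp: netdata LEVEL : THREAD : message`); the regular expression and the `errors_by_thread` helper are illustrative and not taken from Netdata.

```python
import re

# Matches lines such as:
# 2023-07-29 01:30:30: netdata ERROR : ACLK_MAIN : some message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): netdata "
    r"(?P<level>\w+)\s*: (?P<thread>[^:]+?) : (?P<msg>.*)$"
)


def errors_by_thread(lines):
    """Group the messages of ERROR-level lines by the thread that logged them."""
    grouped = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") == "ERROR":
            grouped.setdefault(match.group("thread"), []).append(match.group("msg"))
    return grouped


# Three lines copied from the log above.
sample_log = [
    '2023-07-29 01:30:30: netdata INFO  : ACLK_MAIN : [mqtt_wss] I: Going to connect using internal MQTT 5 implementation',
    '2023-07-29 01:30:30: netdata ERROR : ACLK_MAIN : [mqtt_wss] W: client_id provided is longer than 23 bytes, server might not allow that [MQTT-3.1.3-5]',
    "2023-07-29 01:30:31: netdata ERROR : ACLK_MAIN : RRDCONTEXT: received version hash 0 for host 'app1', does not match our version hash 34857214837. Sending snapshot of all contexts.",
]
errors = errors_by_thread(sample_log)
for thread, messages in errors.items():
    print(thread, len(messages))
```

On a real system the lines would come from /var/log/netdata/error.log, for example via `open(path).read().splitlines()`.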

Hi @mickaelperrin

From the agent logs, it appears that the claim procedure and the connection to Cloud are working.

Could you please check access.log? It should contain some more indication of the cloud connection and the commands exchanged.

Does the node not appear at all on the Nodes tab, or does it appear as offline?

Hey @Manolis_Vasilakis, thanks for trying to help.

The new server doesn’t appear at all in netdata.cloud.

Here are the logs written in access.log during the claim process:

2023-08-03 09:01:00: ACLK CONNECTED
2023-08-03 09:01:00: ACLK REQ [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: STREAM CONTEXTS ENABLED
2023-08-03 09:01:00: ACLK REQ [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: STREAM ALERTS ENABLED (RESET REQUESTED)
2023-08-03 09:01:01: ACLK DISCONNECTED
2023-08-03 09:01:07: 1: 3995779 '[172.17.0.4]:41878' 'CONNECTED'
2023-08-03 09:01:07: 1: 3995779 '[172.17.0.4]:41878' 'DISCONNECTED'
2023-08-03 09:01:07: 1: 3995779 '[172.17.0.4]:41878' 'DATA' (sent/all = 6305/86112 bytes -93%, prep/sent/total = 14.37/0.51/14.88 ms) 200 '/api/v1/allmetrics?format=prometheus&server=app1'
2023-08-03 09:01:07: ACLK CONNECTED
2023-08-03 09:01:08: ACLK REQ [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: STREAM CONTEXTS ENABLED
2023-08-03 09:01:08: ACLK REQ [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: STREAM ALERTS ENABLED (RESET REQUESTED)
2023-08-03 09:01:08: ACLK RES [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: NODE INFO SENT for guid [c3e0067e-0587-11ee-b69b-917da1596d8c] (parent)
2023-08-03 09:01:39: ACLK RES [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: NODE COLLECTORS SENT
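
To see how long each cloud connection lasted, the `ACLK CONNECTED` / `ACLK DISCONNECTED` markers in access.log can be paired into sessions. This is a minimal sketch, assuming the marker lines look exactly like those above; `aclk_sessions` is a hypothetical helper name.

```python
import re
from datetime import datetime

EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): ACLK (CONNECTED|DISCONNECTED)$"
)


def aclk_sessions(lines):
    """Return (connected_at, disconnected_at) pairs; the end is None if still open."""
    sessions = []
    start = None
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # ignore web requests and ACLK REQ/RES lines
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if match.group(2) == "CONNECTED":
            if start is not None:
                # Previous session had no explicit disconnect marker.
                sessions.append((start, None))
            start = stamp
        elif start is not None:
            sessions.append((start, stamp))
            start = None
    if start is not None:
        sessions.append((start, None))
    return sessions


# Lines copied from the access.log excerpt above.
sample_access = [
    "2023-08-03 09:01:00: ACLK CONNECTED",
    "2023-08-03 09:01:00: ACLK REQ [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: STREAM CONTEXTS ENABLED",
    "2023-08-03 09:01:01: ACLK DISCONNECTED",
    "2023-08-03 09:01:07: ACLK CONNECTED",
    "2023-08-03 09:01:39: ACLK RES [6935f1b8-dc7b-462a-94ad-af69130fea7c (app1)]: NODE COLLECTORS SENT",
]
sessions = aclk_sessions(sample_access)
for begin, end in sessions:
    print(begin, "->", end if end else "still connected")
```

For the excerpt above this yields two sessions: the first ends one second after it starts, and the second has no disconnect marker within the excerpt.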

Hope it helps

Note that the real install/claim procedure is done through Ansible, using the official Ansible playbook for Netdata. I run into this problem on every new server I deploy, so basically I can’t get any new server claimed on netdata.cloud.

For debugging purposes, I also ran the procedure manually on the server, but it made no difference.

Some updates on the issue.

I completely removed Netdata and gave up on installing it with Ansible (which used the netdata package published on https://packagecloud.io).

I switched to the method described in the documentation (using kickstart.sh).

I successfully claimed the node on netdata.cloud.

But now I am facing an issue when trying to move the netdata/cache folder to /home/netdata/cache, as described in “Can't start Netdata with /home/netdata as cache directory”.