Netdata Community

Automate Netdata Cloud node claim on instance deploy

Currently we install and set up Netdata with Ansible, and we’re trying to figure out a way to automate the Netdata claim process from the instance being deployed.

It seems the token value changes when collecting the claim information from netdata cloud.

Is there a token that can be re-used to automate the claiming process?

@Youssef-Slassi said in Automate Netdata Cloud node claim on instance deploy:

PS : I’ve tried deploying with the same token, it doesn’t work when you claim 2 nodes with the same …

Claiming multiple agents with a single token should be possible, so you may have stumbled upon a bug. Can you provide more information so we can investigate why it didn’t work? What error did you get the second time?

Also, if you can share your account email or space name with us at cloud-team@netdata.cloud, we can take a look on our end.

@shrabok_cninc as @Harris-Bouzopoulos said the token does not expire. However, it is possible for new tokens to be generated and thus:

I did notice in netdata cloud the token shown in the UI changes over time (and assumed they were time based). Is there a reason the token isn’t the same token (like an api or account specific token)?

This is possible, but it is something we should fix. Don’t worry, previous tokens are still valid.

@shrabok_cninc the claiming script does have a token that doesn’t expire and can be used for claiming multiple nodes in the same space.

After the claiming is completed, the agent will be assigned a unique id by which we can reference it in the future.

My point in my previous post was that this claiming script must be executed (with the same token and rooms) in every agent, on every instance, that you want to claim in the cloud. It mustn’t be executed in a base image from which you will spawn your instances hoping that they will have an already claimed agent.

Feel free to add any additional questions that you might have. We will do our best to answer in a timely manner!

Thanks for using Netdata Cloud!

Hey shrabok_cninc - If I’m not mistaken, passing a key as a parameter would be possible, but I don’t know what the implications of doing so would be. It will need some investigation, but it is definitely something we’ll keep in mind!

Hey everyone,

I think that @shrabok_cninc mentioned that the token the claiming command requires changes over time, thus the user can’t use the same ansible playbook for too long.

Can we verify or deny this?

@shrabok_cninc Soon you will be able to install an agent and claim it in a single step. For now, you can copy the claiming command (Manage Space/Nodes) for your space and add it to your Ansible playbook after the Netdata Agent installation is finished. Please make sure that the host system has internet access via HTTPS.
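
For anyone doing this from a deploy script rather than a playbook, here is a minimal Python sketch of that post-install claim step. It rests on a few assumptions: that `netdata-claim.sh` is on the PATH once the agent is installed, and that the `-token`, `-rooms` and `-url` flags match the command shown under Manage Space/Nodes. Please copy the exact command from your own space rather than trusting this sketch. `build_claim_command` and `claim_node` are hypothetical helper names, not part of Netdata.

```python
import subprocess
from typing import List, Sequence

# Assumption: the claim script is on PATH after the agent install.
CLAIM_SCRIPT = "netdata-claim.sh"
# Assumption: default Netdata Cloud URL; use the one from your claiming command.
CLOUD_URL = "https://app.netdata.cloud"


def build_claim_command(token: str, rooms: Sequence[str],
                        url: str = CLOUD_URL) -> List[str]:
    """Build the claim command as an argument list (no shell quoting needed)."""
    if not token:
        raise ValueError("a claim token is required")
    command = [CLAIM_SCRIPT, f"-token={token}"]
    if rooms:
        # Room ids are passed as one comma-separated value.
        command.append("-rooms=" + ",".join(rooms))
    command.append(f"-url={url}")
    return command


def claim_node(token: str, rooms: Sequence[str]) -> None:
    """Run the claim command on this host; raises CalledProcessError on failure."""
    subprocess.run(build_claim_command(token, rooms), check=True)
```

Since the same token and rooms can be reused, `claim_node(token, rooms)` can be called with identical arguments on every new instance, once the agent is installed and the host can reach the cloud over HTTPS.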

For unclaiming you will have to wait a couple of months due to conflicting priorities. Sorry for this, and thanks for using Netdata Cloud!

This is great feedback @shrabok_cninc,

I have pinged the engineering and product team. We will get some more ideas shortly, thanks again!

If you have any other questions, please do ask!

Hi all, thanks for the clarity and responses.
Ideally, if each space had an API key generated, we could pass that key to the claim script as an argument.
The script would then ask Netdata Cloud to generate the token on demand for the new instance.
This would also work for the requested unclaim/deregister feature, which could use the same API key to deregister the instance.

I wonder if a Netdata Cloud API for claiming a node could be a solution here?

Hi guys,

As Nikos described, every agent is assigned a unique id after the claiming process. Thus, even though you can automate the installation process, you would have to claim each agent individually.
Doing otherwise will result in inconsistent behaviour, where all of the agents appear as unreachable.

For example, if you have a base image from which you spawn your VMs/containers you can have the agent installed there, but not claimed. Claiming should be part of the bootstrapping process.
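
To make the base image point concrete, here is a small Python sketch of a guard you could run at image build time and again at first boot. It assumes the agent keeps its claim state in `/var/lib/netdata/cloud.d/claimed_id`; please verify that path on your own install, since it is an assumption here. `is_claimed` and `assert_image_unclaimed` are hypothetical helper names.

```python
from pathlib import Path

# Assumption: directory where the agent stores its claim state.
DEFAULT_CLOUD_DIR = Path("/var/lib/netdata/cloud.d")


def is_claimed(cloud_dir: Path = DEFAULT_CLOUD_DIR) -> bool:
    """Return True if a non-empty claimed_id file exists in cloud_dir."""
    claimed_id = cloud_dir / "claimed_id"
    return claimed_id.is_file() and claimed_id.read_text().strip() != ""


def assert_image_unclaimed(cloud_dir: Path = DEFAULT_CLOUD_DIR) -> None:
    """Fail an image build if claim state would be baked into the image."""
    if is_claimed(cloud_dir):
        raise RuntimeError(
            f"{cloud_dir} already holds a claimed id; every instance spawned "
            "from this image would share it"
        )
```

The idea is to call `assert_image_unclaimed()` as the last step of the image build, and to run the claiming command during bootstrapping only when `is_claimed()` returns False, so a reboot does not try to claim the same instance twice.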

Hey @shrabok_cninc @Youssef Slassi

I believe that each claiming process creates a unique claimID that belongs to each machine, and that we have to go through the process for each machine individually.

I will try and learn more about the deep details of the process and see if it’s possible to automate it even further in some use cases in the future :slight_smile:

Hey,
Your post is just what I’ve been looking for over the past 48 hours. The only difference is that we’re planning to deploy directly from sh scripts, so can we do that in any way? I mean, is there an API to generate tokens, or something similar?
PS : I’ve tried deploying with the same token, it doesn’t work when you claim 2 nodes with the same …
Thank you for responding ^^

Hey @shrabok_cninc,

I was not aware of this design choice; let me bring some engineers here to verify and discuss the implications further.

This is great feedback, thank you for taking the time to write it up :slight_smile:

Hi @OdysLam,

Thank you for the response. I did notice in netdata cloud the token shown in the UI changes over time (and assumed they were time based). Is there a reason the token isn’t the same token (like an api or account specific token)?

Hey @shrabok_cninc,

Welcome to our forums :slight_smile: The token is the same for claiming any number of nodes, and we have our own ansible quickstart example to show for it :v:
Here: https://github.com/netdata/community/tree/main/netdata-agent-deployment/ansible-quickstart

As this example has been created in the community repository, where we host sample applications, we would love to see you making improvements or contributing a different example of using Ansible and Netdata.

Regarding unclaiming, we are working on this and it should be available early next year. For now, please just remove the unavailable nodes from the War Rooms you use.

Thank you for your patience!

Additional note: ideally, the ability to unclaim/clean up hosts from Netdata Cloud would also be beneficial. In cases such as autoscaling groups, where hosts are ephemeral, we would want instances to be added to and removed from Netdata Cloud without intervention as the ASG scales out and in.