AmazonLinux2 Installation -- is there an RPM?

When I installed netdata on my AmazonLinux2 EC2 instance:

wget -O /tmp/netdata-kickstart.sh https://my-netdata.io/kickstart.sh && sh /tmp/netdata-kickstart.sh --claim-token xxxxxxxxxxxxxxxxxxx --claim-url https://app.netdata.cloud

It evidently does not have a native package for installation and seemed to just extract files beneath /opt/netdata:

[rpalmer@ip-xxx-xx-xx-xxx_qa ~]$ rpm -qa | grep netdata
[rpalmer@ip-xxx-xx-xx-xxx_qa ~]$
[rpalmer@ip-xxx-xx-xx-xxx_qa ~]$ ls -l /opt/netdata
total 0
drwxrwxr-x 3 netdata netdata 145 Nov 14 00:23 bin
drwxrwxr-x 3 netdata netdata  32 Nov 14 00:23 etc
drwxr-xr-x 3 netdata netdata  18 Nov 14 00:20 include
drwxr-xr-x 3 netdata netdata  58 Nov 14 00:20 lib
lrwxrwxrwx 1 netdata netdata  11 Nov 14 04:01 netdata-configs -> etc/netdata
lrwxrwxrwx 1 netdata netdata  15 Nov 14 04:01 netdata-dbs -> var/lib/netdata
lrwxrwxrwx 1 netdata netdata  15 Nov 14 04:01 netdata-logs -> var/log/netdata
lrwxrwxrwx 1 netdata netdata  17 Nov 14 04:01 netdata-metrics -> var/cache/netdata
lrwxrwxrwx 1 netdata netdata  19 Nov 14 04:01 netdata-plugins -> usr/libexec/netdata
lrwxrwxrwx 1 netdata netdata  21 Nov 14 04:01 netdata-web-files -> usr/share/netdata/web
lrwxrwxrwx 1 netdata netdata   3 Nov 14 04:01 sbin -> bin
drwxrwxr-x 7 netdata netdata  66 Nov 14 00:23 share
drwxrwxr-x 2 netdata netdata 216 Nov 14 00:23 system
drwxrwxr-x 5 netdata netdata  81 Nov 14 04:01 usr
drwxrwxr-x 6 netdata netdata  52 Nov 14 00:23 var
[rpalmer@ip-xxx-xx-xx-xxx_qa ~]$

My installation is kinda working, but there are errors on systemctl status and the journalctl report complains that /usr/sbin/netdata is missing:

[root@ip-xxx-xx-xx-xxx_qa ~]# systemctl status netdata
● netdata.service - Real time performance monitoring
   Loaded: loaded (/usr/lib/systemd/system/netdata.service; enabled; vendor preset: disabled)
   Active: inactive (dead) (Result: exit-code) since Fri 2022-11-11 14:33:48 UTC; 2min 38s ago
  Process: 32682 ExecStart=/usr/sbin/netdata -D $EXTRA_OPTS (code=exited, status=203/EXEC)
 Main PID: 32682 (code=exited, status=203/EXEC)

Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Unit netdata.service entered failed state.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: netdata.service failed.
Nov 11 14:33:48 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Stopped Real time performance monitoring.
[root@ip-xxx-xx-xx-xxx_qa ~]#


Nov 11 14:32:34 ip-xxx-xx-xx-xxx.us-east-2.compute.internal sudo[31943]:  rpalmer : TTY=pts/0 ; PWD=/home/rpalmer/tmp ; USER=root ; COMMAND=/bin/su -
Nov 11 14:32:34 ip-xxx-xx-xx-xxx.us-east-2.compute.internal sudo[31943]: pam_unix(sudo:session): session opened for user root by rpalmer(uid=0)
Nov 11 14:32:34 ip-xxx-xx-xx-xxx.us-east-2.compute.internal su[31944]: (to root) rpalmer on pts/0
Nov 11 14:32:34 ip-xxx-xx-xx-xxx.us-east-2.compute.internal su[31944]: pam_unix(su-l:session): session opened for user root by rpalmer(uid=0)
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal groupadd[32333]: group added to /etc/group: name=netdata, GID=991
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal groupadd[32333]: group added to /etc/gshadow: name=netdata
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal groupadd[32333]: new group: name=netdata, GID=991
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal useradd[32340]: new user: name=netdata, UID=995, GID=991, home=/opt/netdata, shell=/usr/sbin/nologin
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal usermod[32351]: add 'netdata' to group 'docker'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal usermod[32351]: add 'netdata' to shadow group 'docker'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal usermod[32369]: add 'netdata' to group 'adm'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal usermod[32369]: add 'netdata' to shadow group 'adm'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal usermod[32388]: add 'netdata' to group 'nobody'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal usermod[32388]: add 'netdata' to shadow group 'nobody'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Reloading.
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:13] Unknown lvalue 'CacheDirectory' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:14] Unknown lvalue 'StateDirectory' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:15] Unknown lvalue 'LogsDirectory' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:17] Unknown lvalue 'StateDirectoryMode' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:18] Unknown lvalue 'CacheDirectoryMode' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:19] Unknown lvalue 'LogsDirectoryMode' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Reloading.
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:13] Unknown lvalue 'CacheDirectory' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:14] Unknown lvalue 'StateDirectory' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:15] Unknown lvalue 'LogsDirectory' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:17] Unknown lvalue 'StateDirectoryMode' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:18] Unknown lvalue 'CacheDirectoryMode' in section 'Service'
Nov 11 14:33:08 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: [/usr/lib/systemd/system/netdata.service:19] Unknown lvalue 'LogsDirectoryMode' in section 'Service'
Nov 11 14:33:13 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Started Real time performance monitoring.
-- Subject: Unit netdata.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit netdata.service has finished starting up.
--
-- The start-up result is done.
Nov 11 14:33:13 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[32607]: Failed at step EXEC spawning /usr/sbin/netdata: No such file or directory
-- Subject: Process /usr/sbin/netdata could not be executed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The process /usr/sbin/netdata could not be executed and failed.
--
-- The error number returned by this process is 2.
Nov 11 14:33:13 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: netdata.service: main process exited, code=exited, status=203/EXEC
Nov 11 14:33:13 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Unit netdata.service entered failed state.
Nov 11 14:33:13 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: netdata.service failed.
Nov 11 14:33:35 ip-xxx-xx-xx-xxx.us-east-2.compute.internal dhclient[4504]: XMT: Solicit on eth0, interval 118240ms.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: netdata.service holdoff time over, scheduling restart.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Stopped Real time performance monitoring.
-- Subject: Unit netdata.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit netdata.service has finished shutting down.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Started Real time performance monitoring.
-- Subject: Unit netdata.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit netdata.service has finished starting up.
--
-- The start-up result is done.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[32682]: Failed at step EXEC spawning /usr/sbin/netdata: No such file or directory
-- Subject: Process /usr/sbin/netdata could not be executed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The process /usr/sbin/netdata could not be executed and failed.
--
-- The error number returned by this process is 2.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: netdata.service: main process exited, code=exited, status=203/EXEC
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Unit netdata.service entered failed state.
Nov 11 14:33:43 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: netdata.service failed.
Nov 11 14:33:48 ip-xxx-xx-xx-xxx.us-east-2.compute.internal systemd[1]: Stopped Real time performance monitoring.
-- Subject: Unit netdata.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit netdata.service has finished shutting down.
Nov 11 14:33:53 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [32749]: CONFIG: cannot load user config '/opt/netdata/etc/netdata/netdata.conf'. Will try the stock version.
Nov 11 14:33:53 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [32749]: CONFIG: cannot load stock config '/opt/netdata/usr/lib/netdata/conf.d/netdata.conf'. Running with internal defaults.
Nov 11 14:33:53 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [32749]: CONFIG: cannot load cloud config '/opt/netdata/var/lib/netdata/cloud.d/cloud.conf'. Running with internal defaults.
Nov 11 14:33:53 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [32749]: Found 0 legacy dbengines, setting multidb diskspace to 256MB
Nov 11 14:33:53 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [32749]: Created file '/opt/netdata/var/lib/netdata/dbengine_multihost_size' to store the computed value
Nov 11 14:33:54 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [438]: Does not have a configuration file inside `/opt/netdata/etc/netdata/ebpf.d.conf. It will try to load stock file.
Nov 11 14:33:54 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [438]: Your environment does not have BTF file /sys/kernel/btf//vmlinux. The plugin will work with 'legacy' code.
Nov 11 14:33:54 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [438]: Name resolution is disabled, collector will not parser "hostnames" list.
Nov 11 14:33:54 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [438]: The network value of CIDR 127.0.0.1/8 was updated for 127.0.0.0 .
Nov 11 14:33:54 ip-xxx-xx-xx-xxx.us-east-2.compute.internal [438]: Cannot read process groups configuration file '/opt/netdata/etc/netdata/apps_groups.conf'. Will try '/opt/netdata/usr/lib/netdata/conf.d/apps_groups.conf'
Nov 11 14:35:34 ip-xxx-xx-xx-xxx.us-east-2.compute.internal dhclient[4504]: XMT: Solicit on eth0, interval 108950ms.

Do you not have an AmazonLinux2 native rpm install?

Is there a better way for me to install and run Netdata?

-RP

Hi @rpalmer-koverly and thank you for your post!

I am not familiar with the repositories of AmazonLinux2 (although Netdata should have prioritised the installation of a native package over a static one if one were available, so I will assume there isn’t), but you could try following the instructions here on how to “Add repositories on an Amazon Linux instance”:

Note

To enable the EPEL repository on Amazon Linux 2, use the following command:

[ec2-user ~]$ sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

For information on enabling the EPEL repository on other distributions, such as Red Hat and CentOS, see the EPEL documentation at Extra Packages for Enterprise Linux (EPEL) :: Fedora Docs

You can see here that the latest stable version of Netdata is available on EPEL 7, 8 and 9. Please make sure to uninstall the static binary installation first.

Having said that, you are right that there is something odd with your current setup. Was this the first time you ran the kickstart script?

I’ve installed and uninstalled a couple of times now on this instance.

I did have the epel-7 repo enabled:

[rpalmer@ip-172-31-xx-xxx_qa ~]$ yum repolist
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
227 packages excluded due to repository priority protections
repo id                                                 repo name                                                                   status
!amzn2-core/2/x86_64                                    Amazon Linux 2 core repository                                                  29,345
!amzn2extra-docker/2/x86_64                             Amazon Extras repo for docker                                                       72
!epel/x86_64                                            Extra Packages for Enterprise Linux 7 - x86_64                              13,511+227
repolist: 42,928
[rpalmer@ip-172-31-xx-xxx_qa ~]$

[rpalmer@ip-172-31-xx-xxx_qa ~]$ yum --disablerepo="*" --enablerepo="epel" list available | grep netdata
netdata.x86_64                             1.36.1-1.el7                     epel
netdata-conf.noarch                        1.36.1-1.el7                     epel
netdata-data.noarch                        1.36.1-1.el7                     epel
netdata-freeipmi.x86_64                    1.36.1-1.el7                     epel
[rpalmer@ip-172-31-xx-xxx_qa ~]$

Looking at your get_system_info function, the kickstart script doesn’t support AmazonLinux:

supported_compat_names="debian ubuntu centos fedora opensuse ol arch"

So even though I have the epel repo enabled and have an /etc/os-release file:

[root@ip-172-31-xx-xxx_qa ~]# cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
[root@ip-172-31-xx-xxx_qa ~]#

it just used the tar.gz install:

--- Using /tmp/netdata-kickstart-CTKT30xq93 as a temporary directory. ---
 --- Checking for existing installations of Netdata... ---
 --- No existing installations of netdata found, assuming this is a fresh install. ---
 --- Attempting to install using native packages... ---
 WARNING  We do not provide native packages for amzn.

 WARNING  Could not install native binary packages, falling back to alternative installation method.

 --- Attempting to install using static build... ---
[/tmp/netdata-kickstart-CTKT30xq93]# curl --fail -q -sSL --connect-timeout 10 --retry 3 --output /tmp/netdata-kickstart-CTKT30xq93/netdata-x86_64-latest.gz.run https://storage.googleapis.com/netdata-nightlies/netdata-x86_64-latest.gz.run
 OK

Strange that the installer wouldn’t detect that repo configuration and simply do a yum install netdata. Doesn’t seem like it would be too hard to update the script to support AmazonLinux, it’s pretty centos7-like.
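For illustration, here is a minimal sketch of how a script could fall back to ID_LIKE when ID itself is not in a supported list. This is not the actual kickstart.sh logic; the function name distro_matches and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch only: match a distro against a supported list by checking ID first
# and then each entry of ID_LIKE. distro_matches is a hypothetical helper,
# not a function from kickstart.sh.

# distro_matches OS_RELEASE_FILE "name1 name2 ..."
# Prints the first supported name that matches and returns 0; returns 1 otherwise.
distro_matches() {
    os_release_file=$1
    supported=$2

    # Source the file in a subshell so its variables do not leak into the caller.
    candidates=$(
        unset ID ID_LIKE
        . "$os_release_file"
        printf '%s %s\n' "${ID:-}" "${ID_LIKE:-}"
    )

    for candidate in $candidates; do
        for name in $supported; do
            if [ "$candidate" = "$name" ]; then
                printf '%s\n' "$name"
                return 0
            fi
        done
    done
    return 1
}

# Example call (path and list taken from this thread):
#   distro_matches /etc/os-release "debian ubuntu centos fedora opensuse ol arch"
```

With the os-release contents shown above, ID is "amzn" (no match) and the first ID_LIKE entry is "centos", so a check like this would treat the system as centos-compatible. Whether that is safe for package selection is a separate question.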

Are there any guides on how to do a yum install of netdata and then configure it?

-RP

OK, did a little playing. Here’s what I found.

I created a fresh AmazonLinux 2 EC2 instance and configured the EPEL repository on it (all commands run with root privileges):

amazon-linux-extras install epel -y

At this point, all I should need to do is run “yum install netdata” but it fails out of the box because it requires nodejs 16 and nodejs 16 has a dependency on libuv-1.43.0 which is not available through EPEL or the AmazonLinux 2 repos. The latest available is 1.36.1.
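To confirm a version gap like this before forcing anything, a small comparison helper can be used. This is a sketch that assumes GNU sort (for the -V option); version_ge is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: compare two dotted version strings using GNU sort -V.

# version_ge A B: succeeds (exit 0) when version A is greater than or equal to B.
version_ge() {
    # The smaller of the two versions sorts first; A >= B exactly when that is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use against the installed libuv package:
#   version_ge "$(rpm -q --qf '%{VERSION}' libuv)" 1.43.0 || echo "libuv is older than 1.43.0"
```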

To work around this, you must first get nodejs installed and that installation must be forced to not pay attention to the version of libuv. So install libuv so the wrong (1.36.1) version of the dependency exists and then use rpm --nodeps to force install nodejs:

yum install libuv -y
rpm -Uvh --nodeps $(repoquery --location nodejs)

At this point I installed the other dependencies that seemed to be needed by netdata:

yum install brotli libbsd libretls netcat nodejs-libs openssl11 protobuf protobuf-c snappy -y

and was then able to install netdata. I suspect I could have skipped the previous step and gone straight to the netdata install since nodejs was already installed:

yum install netdata netdata-conf netdata-data -y

I then enabled and started netdata and then proceeded to claim my new node using the --claim-only option of the kickstart script:

systemctl enable netdata
systemctl start netdata
wget -O /tmp/netdata-kickstart.sh https://my-netdata.io/kickstart.sh && sh /tmp/netdata-kickstart.sh --claim-only --claim-token 4xvgbyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxOD9pME --claim-url https://app.netdata.cloud

This new node was then visible on the netdata cloud interface.

I tested running some docker containers:

amazon-linux-extras install docker
systemctl enable docker
systemctl start docker
docker pull wordpress && docker run --name wordpress -p 443:80 wordpress &
docker pull nginx && docker run --name nginx -p 80:80 nginx &
docker run --name nginx-8081 -p 8081:80 nginx &

But while my running containers were visible on the local port 19999 interface, they are again not visible by default on the cloud interface.

Can you suggest a config file tweak to make them show up on the cloud interface?

It would be nice if your default settings picked them up for the cloud interface.

-RP

Glad to hear you got it sorted!

I was investigating earlier today whether we could (fairly easily) add support in the kickstart.sh script for Amazon Linux 2, but it’s not as straightforward as I thought it’d be (for various reasons). That said, it’s on the roadmap and currently expected some time in the first half of 2023 (after we have migrated off Package Cloud).

Also, as a general point, we don’t officially support third-party packages, so in that sense, it’s better to let kickstart.sh install static builds in this particular case (even though that’s not ideal either). We haven’t tested static builds on Amazon Linux 2 specifically, but I don’t see why they wouldn’t work (I also can’t understand what went wrong with your original systemd problem).

Let me get back to you about the containers not showing issue.

OK, it seems to be a recent change. Can you please check cgroups->cpu and select group by instance? Your container resources should show up there.

Ah, thanks, found the containers there under cgroups, see Docker containers running on EC2 not visible in cloud.

In terms of systemd, I just spun up a fresh AmazonLinux 2 EC2 instance and ran the kickstart installer on it.

The systemd file that got installed is:

[ec2-user@ip-172-31-xx-xxx ~]$ cat /lib/systemd/system/netdata.service
# SPDX-License-Identifier: GPL-3.0-or-later
[Unit]
Description=Real time performance monitoring

# append here other services you want netdata to wait for them to start
After=network.target httpd.service squid.service nfs-server.service mysqld.service mysql.service named.service postfix.service chronyd.service

[Service]
Type=simple
User=netdata
Group=netdata
RuntimeDirectory=netdata
CacheDirectory=netdata
StateDirectory=netdata
LogsDirectory=netdata
RuntimeDirectoryMode=0775
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=2750
EnvironmentFile=-/etc/default/netdata
ExecStart=/usr/sbin/netdata -D $EXTRA_OPTS

# saving a big db on slow disks may need some time
TimeoutStopSec=150

# restart netdata if it crashes
Restart=on-failure
RestartSec=30

# Valid policies: other (the system default) | batch | idle | fifo | rr
# To give netdata the max priority, set CPUSchedulingPolicy=rr and CPUSchedulingPriority=99
CPUSchedulingPolicy=batch

# This sets the scheduling priority (for policies: rr and fifo).
# Priority gets values 1 (lowest) to 99 (highest).
#CPUSchedulingPriority=1

# For scheduling policy 'other' and 'batch', this sets the lowest niceness of netdata (-20 highest to 19 lowest).
Nice=0

[Install]
WantedBy=multi-user.target
[ec2-user@ip-172-31-xx-xxx ~]$

As you can see the paths in the file are incorrect. The EnvironmentFile and ExecStart files don’t exist:

[ec2-user@ip-172-31-xx-xxx ~]$ ls -l /etc/default/netdata
ls: cannot access /etc/default/netdata: No such file or directory
[ec2-user@ip-172-31-xx-xxx ~]$
[ec2-user@ip-172-31-xx-xxx ~]$ ls -l /usr/sbin/netdata
ls: cannot access /usr/sbin/netdata: No such file or directory
[ec2-user@ip-172-31-xx-xxx ~]$

since all the netdata files were installed under /opt/netdata.
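The same check can be scripted. The sketch below pulls the binary path out of a unit file's ExecStart= line and reports whether it exists; the helper names unit_exec_path and check_unit_exec are made up for this example.

```shell
#!/bin/sh
# Sketch only: verify that the binary named in a unit file's ExecStart= exists.

# unit_exec_path UNIT_FILE: print the binary from the first ExecStart= line,
# skipping any systemd prefix characters such as "-" or "@".
unit_exec_path() {
    sed -n 's/^ExecStart=[-@+!:]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# check_unit_exec UNIT_FILE: report whether that binary is present and executable.
check_unit_exec() {
    path=$(unit_exec_path "$1")
    if [ -x "$path" ]; then
        echo "ok: $path"
    else
        echo "missing: $path"
        return 1
    fi
}

# Example use:
#   check_unit_exec /lib/systemd/system/netdata.service
```

On the instance described here this would be expected to report /usr/sbin/netdata as missing, which matches the status=203/EXEC failure in the journal.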

The correct paths should have /opt/netdata prepended:

[ec2-user@ip-172-31-xx-xxx ~]$ ls -l /opt/netdata/usr/sbin/netdata
-rwxr-xr-x 1 netdata netdata 146 Nov 21 00:22 /opt/netdata/usr/sbin/netdata
[ec2-user@ip-172-31-xx-xxx ~]$

/etc/default/netdata does not exist but there is an /opt/netdata/etc/netdata containing configuration files:

[ec2-user@ip-172-31-xx-xxx ~]$ ls -l /opt/netdata/etc/netdata
total 32
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 charts.d
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 custom-plugins.d
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 ebpf.d
-rwxr-xr-x 1 netdata netdata  2069 Nov 21 00:22 edit-config
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 go.d
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 health.d
-rw-r--r-- 1 root    root    25824 Nov 21 15:10 netdata.conf
lrwxrwxrwx 1 netdata netdata    28 Nov 21 15:09 orig -> ../../usr/lib/netdata/conf.d
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 python.d
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 ssl
drwxr-xr-x 2 netdata netdata     6 Nov 21 00:22 statsd.d
[ec2-user@ip-172-31-xx-xxx ~]$

So the bug seems to be that the installer does not craft a systemd file to match where it un-tar’ed the binary and thus systemctl commands do not work and netdata won’t come up after reboot.
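One possible workaround, sketched here and not tested as part of this thread, is a systemd drop-in that overrides ExecStart so the packaged unit file stays untouched. The helper name write_netdata_dropin is hypothetical, and /opt/netdata/bin/netdata is assumed to be the right binary because it is the path the kickstart output starts.

```shell
#!/bin/sh
# Sketch only: write a systemd drop-in that points ExecStart at a different binary.

# write_netdata_dropin DROPIN_DIR BINARY
write_netdata_dropin() {
    dropin_dir=$1
    binary=$2
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<EOF
[Service]
# An empty ExecStart= clears the value inherited from the installed unit file.
ExecStart=
ExecStart=$binary -D \$EXTRA_OPTS
EOF
}

# Intended use (as root), followed by a reload and restart:
#   write_netdata_dropin /etc/systemd/system/netdata.service.d /opt/netdata/bin/netdata
#   systemctl daemon-reload && systemctl restart netdata
```

A drop-in would not address the "Unknown lvalue" warnings for CacheDirectory, StateDirectory and LogsDirectory; those directives appear to be unsupported by the systemd version on this instance.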

Interestingly, the kickstart installer does try to start it from the correct location of the binary, but it then reports that it cannot load the conf file (which does exist at that location). This is immediately followed by a chmod that sets the permissions on the netdata.conf file:

[/opt/netdata]# /opt/netdata/bin/netdata
2022-11-21 15:10:37: netdata INFO  : MAIN : CONFIG: cannot load user config '/opt/netdata/etc/netdata/netdata.conf'. Will try the stock version.
2022-11-21 15:10:37: netdata INFO  : MAIN : CONFIG: cannot load stock config '/opt/netdata/usr/lib/netdata/conf.d/netdata.conf'. Running with internal defaults.
2022-11-21 15:10:37: netdata INFO  : MAIN : CONFIG: cannot load cloud config '/opt/netdata/var/lib/netdata/cloud.d/cloud.conf'. Running with internal defaults.
2022-11-21 15:10:37: netdata INFO  : MAIN : Found 0 legacy dbengines, setting multidb diskspace to 256MB
2022-11-21 15:10:37: netdata INFO  : MAIN : Created file '/opt/netdata/var/lib/netdata/dbengine_multihost_size' to store the computed value
 OK  ''

Downloading default configuration from netdata...
[/opt/netdata]# curl -sSL --connect-timeout 10 --retry 3 http://localhost:19999/netdata.conf
 OK  ''

[/opt/netdata]# mv /opt/netdata/etc/netdata/netdata.conf.new /opt/netdata/etc/netdata/netdata.conf
 OK  ''

 OK  New configuration saved for you to edit at /opt/netdata/etc/netdata/netdata.conf


  ^
  |.-.   .-.   .-.   .-.   .-.   .  netdata  .-.   .-.   .-.   .-.   .-.   .-
  |   '-'   '-'   '-'   '-'   '-'               '-'   '-'   '-'   '-'   '-'
  +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->

[/opt/netdata]# chmod 0644 /opt/netdata/etc/netdata/netdata.conf
 OK

which makes it look like the chmod is out of order and should be done earlier so that the perms allow the startup command to read the conf file.

Don’t know if all this is AmazonLinux 2 specific behavior or if it’s applicable to other non-native package installations as well.

-RP

@Dim-P I think I have the easy way to install on AmazonLinux 2.

First, run the following command to create a link:

ln -s /opt/netdata/usr/sbin/netdata  /usr/sbin/netdata

Then just run the normal kickstart wget command shown in the Add Nodes dialog box and the systemd startup will work. Easy-peasy.

For extra credit:

ln -s /opt/netdata/var/log/netdata  /var/log/netdata
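If these links are created from a script that may run more than once, a guarded version avoids a surprise: when the link path already exists as a directory (for example /var/log/netdata left over from an earlier rpm install), a plain ln -s places the new link inside that directory instead of failing. A minimal sketch, with link_if_absent as a made-up helper name:

```shell
#!/bin/sh
# Sketch only: create a symlink unless something already exists at the link path.

# link_if_absent TARGET LINK: create LINK -> TARGET, or leave an existing LINK alone.
link_if_absent() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "skip: $link already exists"
        return 0
    fi
    ln -s "$target" "$link" && echo "linked: $link -> $target"
}

# Example use (as root), mirroring the two links above:
#   link_if_absent /opt/netdata/usr/sbin/netdata /usr/sbin/netdata
#   link_if_absent /opt/netdata/var/log/netdata /var/log/netdata
```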

I prefer the results of the kickstart install over the EPEL rpm-based install: the kickstart install shows the actual names of running containers instead of just the container ID, which is less than useful.

-RP