Borked installation and inability to re-install/un-install

Hello,

I tried installing the agent on my client’s Ubuntu 20.04 LTS server using wget -O /tmp/netdata-kickstart.sh https://my-netdata.io/kickstart.sh && sh /tmp/netdata-kickstart.sh. The process never completed, and I had to terminate it with Control + C.

I have since tried re-installing the agent using wget -O /tmp/netdata-kickstart.sh https://my-netdata.io/kickstart.sh && sh /tmp/netdata-kickstart.sh --reinstall-clean. The process starts, but it hangs partway through the reinstallation, and I have to force quit it with Control + C.

I have also tried uninstalling Netdata using the uninstall script, but the following errors are displayed:

  • FATAL: netdata-uninstaller.sh: FAILED TO UNINSTALL NETDATA: Failed to completely remove Netdata from this system. FAILED
  • Command “sudo /tmp/netdata-kickstart-Mh6ltrS8oT/netdata-uninstaller.sh --yes” failed with exit code 1

When I navigate to the /etc/netdata directory, only the netdata-updater.conf and netdata.conf files are present.

There is no .environment file in the /etc/netdata directory.

The /var/lib/netdata directory contains dbengine_multihost_size etc god-jobs-statuses.json lock netdata.api.key pythond-jobs-statuses.json registry.

The /var/cache/netdata directory contains .netdata_bash_sleep_timer_fifo context-meta.db-shm dbengine dbengine-tier2 netdata-meta.db netdata-meta.db-wal context-meta.db context-meta.db-wal dbengine-tier1 ml.db netdata-meta.db-shm

I have tried manually uninstalling the Netdata agent, but had no success.

Results from sudo find / -d -name netdata command:

  • /opt/netdata/bin/srv/netdata
  • /opt/netdata/bin/netdata
  • /opt/netdata/usr/share/netdata
  • /opt/netdata/usr/libexec/netdata
  • /opt/netdata/usr/lib/netdata/system/initd/init.d/netdata
  • /opt/netdata/usr/lib/netdata/system/lsb/init.d/netdata
  • /opt/netdata/usr/lib/netdata/system/freebsd/rc.d/netdata
  • /opt/netdata/usr/lib/netdata/system/logrotate/netdata
  • /opt/netdata/usr/lib/netdata/system/openrc/conf.d/netdata
  • /opt/netdata/usr/lib/netdata/system/openrc/init.d/netdata
  • /opt/netdata/usr/lib/netdata
  • /opt/netdata/etc/netdata
  • /opt/netdata/var/cache/netdata
  • /opt/netdata/var/log/netdata
  • /opt/netdata/var/lib/netdata
  • /opt/netdata
  • /usr/libexec/netdata
  • /etc/init.d/netdata
  • /etc/netdata
  • /etc/logrotate.d/netdata
  • /etc/default/netdata
  • /var/cache/netdata
  • /var/log/netdata
  • /var/lib/netdata

Any suggestion on how to fix these issues?

Cheers,

Hi, @mikehermary.

“The process never completed and I need to terminate the process using control + c.”

The output would be very helpful. Without it, it is not possible to identify the problem.
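One way to capture that output is to tee it into a file while the installer runs. This is only a sketch: the helper name run_logged and the log path /tmp/netdata-kickstart.log are made up for illustration and are not part of the Netdata installer.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND and writes its stdout and stderr both to the terminal and to
# LOGFILE. The helper name and the log path below are illustrative only.
run_logged() {
    log_file=$1
    shift
    # 2>&1 merges stderr into stdout so error messages reach the log as well.
    # Note: the pipeline's exit status is tee's, not COMMAND's.
    "$@" 2>&1 | tee "$log_file"
}

# Example (commented out so nothing is installed by accident):
# run_logged /tmp/netdata-kickstart.log sh /tmp/netdata-kickstart.sh
```

If the installer hangs and is interrupted with Control + C, whatever was printed up to that point should already be in the log file and can be pasted here.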

To completely remove everything:

# Check whether an instance is still running; if it is, stop it.
systemctl status netdata
systemctl stop netdata
# List any installed Netdata packages, then purge whichever ones are listed.
dpkg -l | grep netdata
apt-get purge "LIST OF PACKAGES FROM PREV COMMAND IF ANY"
rm -rf /var/log/netdata
rm -rf /var/lib/netdata
rm -rf /var/cache/netdata
rm -rf /etc/netdata
rm /etc/cron.daily/netdata-updater
userdel -r netdata

Everything should now be removed; try re-installing.
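Before re-installing, it may be worth confirming that nothing was left behind. The sketch below only reads the filesystem and prints any of the locations mentioned in this thread that still exist. The function name list_netdata_leftovers and its optional ROOT argument are hypothetical, and the path list is taken from the find output and the rm commands above, so it may not be exhaustive.

```shell
#!/bin/sh
# list_netdata_leftovers [ROOT]
# Prints each Netdata path from this thread that still exists under ROOT
# (default: the real filesystem root). Nothing is deleted.
list_netdata_leftovers() {
    root=${1:-}
    for path in \
        /opt/netdata \
        /usr/libexec/netdata \
        /etc/netdata \
        /etc/init.d/netdata \
        /etc/logrotate.d/netdata \
        /etc/default/netdata \
        /etc/cron.daily/netdata-updater \
        /var/cache/netdata \
        /var/log/netdata \
        /var/lib/netdata
    do
        # -e matches files and directories; -L also catches dangling symlinks.
        if [ -e "$root$path" ] || [ -L "$root$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Example: list_netdata_leftovers    # no output means none of these remain
```

The ROOT argument exists only so the function can be tried against a scratch directory first.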

Hello,

I completed the purge and removed all of those directories, plus /opt/netdata, manually.

After that, I ran wget -O /tmp/netdata-kickstart.sh https://my-netdata.io/kickstart.sh && sh /tmp/netdata-kickstart.sh --stable-channel --claim-token UVjjkf10xi0DLoI74_8PqRvHoyi27rKS0ERdfgVf0NRhf-9S0cvi1aDF34hcyxwaPxyZWr2iz-dZkZR40zZg_P9_inVO0_0BjcvCXY1GtwKypcrGVIjqmcgj3qVrb0Xu_1t2g2Q --claim-rooms b8955d1b-94f6-4ddd-b042-b576dcd609d8 --claim-url https://app.netdata.cloud.

The installation process completed, but with errors. I have included them below.

dpkg: unrecoverable fatal error, aborting:
 unknown system user 'netdata' in statoverride file; the system user got removed before the override, which is most probably a packaging bug, to recover you can remove the override manually with dpkg-statoverride
E: Sub-process /usr/bin/dpkg returned an error code (2)

Failed to install repository configuration package.

Could not install native binary packages, falling back to alternative installation method.

Failed to start netdata.service: Unit netdata.service is masked.

Netdata service still not started, attempting another forced restart by running '/opt/netdata/bin/netdata '

sudo systemctl status netdata
netdata.service
  Loaded: masked (Reason: Unit netdata.service is masked.)
  Active: inactive (dead)

Cheers,

Hi @mikehermary

Could you check /var/lib/dpkg/statoverride and, if there is a netdata entry in it, remove that entry manually and retry?
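As a rough sketch of that check: each line of the statoverride file has the form "owner group mode path", so the entries that refer to the netdata user or group can be picked out with awk. The function name netdata_overrides is made up for illustration, and the sketch assumes the overridden paths contain no spaces.

```shell
#!/bin/sh
# netdata_overrides [FILE]
# Prints the path of every override in FILE (default:
# /var/lib/dpkg/statoverride) whose owner or group is "netdata".
# Assumes one "owner group mode path" entry per line and no spaces in paths.
netdata_overrides() {
    override_file=${1:-/var/lib/dpkg/statoverride}
    awk '$1 == "netdata" || $2 == "netdata" { print $4 }' "$override_file"
}

# Example (needs root; commented out so nothing changes by accident):
# for path in $(netdata_overrides); do
#     dpkg-statoverride --remove "$path"
# done
```

Removing the entries through dpkg-statoverride --remove, as the dpkg error message itself suggests, avoids editing the file by hand.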

Hello,

I have removed the Netdata-related entries from that file and tried a fresh installation. It failed again, and the installer seems to hang partway through. The same thing sometimes happens when I update packages on the server: I usually cancel the update with Control + C and try again, and the second attempt usually works without issues.

Does this mean something is buggered up on the server? I know this is outside of the scope of Netdata, but any suggestions are greatly appreciated.

Cheers,

Hey!

Not sure. I don’t have much experience with package management on Ubuntu, so I’m afraid I can’t help much there.

When you say it hangs during the process, what exactly do you mean? Can you provide some output?

Thanks.

Hello,

When I run the command sudo apt update on the server, the process begins and the percentage meter starts to rise, but then stops. Most times I have to force quit the process by using Control + C. Other times, it eventually continues and finds the updates.

I have done some reading, and I am wondering whether the repositories used on the server are not functioning correctly. None of my other servers, which are set up similarly, experience these issues.

I know this is out of scope for Netdata support, but any tips are greatly appreciated.

Cheers,
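
One way to test the repository theory is to list the repository URLs the server is configured to use and time a plain HTTP request to each. The sketch below covers the listing part. The function name list_repo_urls is hypothetical, and it assumes the one-line "deb ..." source format that Ubuntu 20.04 uses in /etc/apt/sources.list.

```shell
#!/bin/sh
# list_repo_urls FILE...
# Prints the unique repository URLs found on active "deb" lines of the given
# one-line-style APT source files. Commented lines and deb-src lines are skipped.
list_repo_urls() {
    awk '
        $1 == "deb" {
            # Skip an optional [option=value ...] block before the URL.
            i = 2
            if ($i ~ /^\[/) {
                while (i <= NF && $i !~ /\]$/) i++
                i++
            }
            if (i <= NF) print $i
        }
    ' "$@" | sort -u
}

# Example: time a request to each repository to spot a slow or unreachable one.
# for url in $(list_repo_urls /etc/apt/sources.list /etc/apt/sources.list.d/*.list); do
#     printf '%s ' "$url"
#     curl -sS -o /dev/null --max-time 15 -w '%{http_code} %{time_total}s\n' "$url"
# done
```

If one URL consistently times out or is much slower than the rest, that repository is a likely suspect. If they all respond quickly and apt still stalls, running sudo apt-get -o Acquire::ForceIPv4=true update may be worth a try. That is a guess at a common cause, not a diagnosis of this server.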