Hello,
I have upgraded netdata on my Ubuntu 22.04 server to version 1.45.2.
Since then, netdata refuses to start.
In the logs, I can see:
Apr 10 10:36:07 qa netdata[1559585]: Cannot chown file '/home/netdata/dbengine-tier1/journalfile-1-0000000540.njf' to 116:123
Apr 10 10:36:07 qa netdata[1559585]: Cannot chown file '/home/netdata/dbengine-tier1/journalfile-1-0000000544.njf' to 116:123
Apr 10 10:36:07 qa netdata[1559585]: Cannot chown file '/home/netdata/dbengine-tier1/journalfile-1-0000000553.njf' to 116:123
Apr 10 10:36:07 qa netdata[1559585]: Cannot chown file '/home/netdata/dbengine-tier1/journalfile-1-0000000547.njf' to 116:123
Apr 10 10:36:07 qa netdata[1559585]: Cannot chown file '/home/netdata/dbengine-tier1/journalfile-1-0000000541.njf' to 116:123
I tried deleting the folder /home/netdata and resetting the owner, but the new logs are:
Apr 10 10:38:17 qa netdata[1561134]: Netdata agent version "v1.45.2" is starting
Apr 10 10:38:17 qa netdata[1561134]: IEEE754: system is using IEEE754 DOUBLE PRECISION values
Apr 10 10:38:17 qa netdata[1561134]: TIMEZONE: using the contents of /etc/timezone
Apr 10 10:38:17 qa netdata[1561134]: TIMEZONE: fixed as 'Etc/UTC'
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: next: initialize signals
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, initialize signals - next: initialize static threads
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, initialize static threads - next: initialize web server
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, initialize web server - next: initialize ML
Apr 10 10:38:17 qa netdata[1561134]: Failed to initialize database at /home/netdata/ml.db, due to "unable to open database file"
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 3 ms, initialize ML - next: initialize h2o server
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, initialize h2o server - next: set resource limits
Apr 10 10:38:17 qa netdata[1561134]: resources control: allowed file descriptors: soft = 1024, max = 524288
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, set resource limits - next: become daemon
Apr 10 10:38:17 qa netdata[1561134]: Out-Of-Memory (OOM) score is already set to the wanted value 0
Apr 10 10:38:17 qa netdata[1561134]: Adjusted netdata scheduling policy to batch (3), with priority 0.
Apr 10 10:38:17 qa netdata[1561134]: Running with process scheduling policy 'batch', nice level 19
Apr 10 10:38:17 qa netdata[1561134]: Cannot chown directory '/home/netdata' to 116:123
Apr 10 10:38:17 qa netdata[1561134]: Cannot chown file '/home/netdata/.netdata_bash_sleep_timer_fifo' to 116:123
Apr 10 10:38:17 qa netdata[1561134]: Cannot chown directory '/var/lib/netdata/cloud.d' to 116:123
Apr 10 10:38:17 qa netdata[1561134]: netdata started on pid 1561134.
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 1 ms, become daemon - next: initialize threads after fork
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, initialize threads after fork - next: initialize registry
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, initialize registry - next: fork the spawn server
Apr 10 10:38:17 qa netdata[1561134]: Initializing spawn client.
Apr 10 10:38:17 qa netdata[1561134]: NETDATA STARTUP: in 0 ms, fork the spawn server - next: collecting system info
Apr 10 10:38:17 qa netdata[1561137]: Spawn server is up.
Apr 10 10:38:18 qa netdata[1561134]: NETDATA STARTUP: in 371 ms, collecting system info - next: initialize RRD structures
Apr 10 10:38:18 qa netdata[1561134]: Failed to initialize database at /home/netdata/netdata-meta.db, due to "unable to open database file"
Apr 10 10:38:18 qa netdata[1561134]: Failed to initialize SQLite
Apr 10 10:38:18 qa netdata[1561134]: NETDATA SHUTDOWN: initializing shutdown with code 1...
Apr 10 10:38:18 qa netdata[1561134]: /usr/sbin/netdata(+0x43d631)[0x55c341fcb631]
Apr 10 10:38:18 qa netdata[1561134]: /usr/sbin/netdata(+0x209ef0)[0x55c341d97ef0]
Apr 10 10:38:18 qa netdata[1561134]: /usr/sbin/netdata(+0x712d3)[0x55c341bff2d3]
Apr 10 10:38:18 qa netdata[1561134]: /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f2f2fca5d90]
Apr 10 10:38:18 qa netdata[1561134]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f2f2fca5e40]
Apr 10 10:38:18 qa netdata[1561134]: /usr/sbin/netdata(+0x743d5)[0x55c341c023d5]
Apr 10 10:38:18 qa netdata[1561134]: Shutdown process started
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): P[timex]
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [1/25] - 'create shutdown file' finished in 1654 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): P[idlejitter]
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [2/25] - 'dbengine exit mode' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): HEALTH
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): PLUGINSD
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): SERVICE
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [3/25] - 'close webrtc connections' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [4/25] - 'disable maintenance, new queries, new web requests, new streaming connections and aclk' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): STATSD_FLUSH
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [5/25] - 'stop maintenance thread' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [6/25] - 'stop exporters, health and web servers threads' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [7/25] - 'stop collectors and streaming threads' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): EXPORTING
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [8/25] - 'stop replication threads' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [9/25] - 'prepare metasync shutdown' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): WEB[1]
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [10/25] - 'disable ML detection and training threads' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): ACLK_MAIN
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [11/25] - 'stop context thread' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [12/25] - 'clear web client cache' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): RRDCONTEXT
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [13/25] - 'stop aclk threads' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: shutdown step: [14/25] - 'stop all remaining worker threads' finished in 0 milliseconds
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): REPLAY[1]
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): P[tc]
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): P[diskspace]
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): P[proc]
Apr 10 10:38:19 qa netdata[1561134]: EXIT: No thread running (marking as EXITED): P[cgroups]
Apr 10 10:38:19 qa netdata[1561134]: Waiting 15 threads to finish...
Apr 10 10:38:20 qa netdata[1561134]: All threads finished.
Apr 10 10:38:20 qa netdata[1561134]: No statements pending to finalize
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [15/25] - 'cancel main threads' finished in 100 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: EXIT: cannot unlink pidfile '/run/netdata/netdata.pid'.
Apr 10 10:38:20 qa netdata[1561134]: DIGEST-MD5 common mech free
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [16/25] - 'flush dbengine tiers' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [17/25] - 'stop collection for all hosts' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [18/25] - 'stop metasync threads' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [19/25] - 'wait for dbengine collectors to finish' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [20/25] - 'wait for dbengine main cache to finish flushing' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [21/25] - 'stop dbengine tiers' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [22/25] - 'close SQL databases' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [23/25] - 'remove pid file' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [24/25] - 'free openssl structures' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: shutdown step: [25/25] - 'remove incomplete shutdown file' finished in 0 milliseconds
Apr 10 10:38:20 qa netdata[1561134]: Shutdown process ended in 1755 milliseconds
Apr 10 10:38:20 qa netdata[1561137]: EOF found in spawn pipe.
Apr 10 10:38:20 qa netdata[1561137]: Shutting down spawn server event loop.
Apr 10 10:38:20 qa netdata[1561137]: Shutting down spawn server loop complete.
Apr 10 10:38:20 qa netdata[1561137]: DIGEST-MD5 common mech free
Apr 10 10:38:50 qa netdata[1561526]: time=2024-04-10T10:38:50.241+00:00 comm=netdata source=daemon level=info errno="2, No such file or directory" tid=1561526 thread=netdata msg="CONFIG: cannot load cloud config '/var/lib/netdata/cloud.d/cloud.conf'. Running with internal defaults."
Apr 10 10:38:50 qa netdata[1561526]: Netdata agent version "v1.45.2" is starting
Apr 10 10:38:50 qa netdata[1561526]: IEEE754: system is using IEEE754 DOUBLE PRECISION values
Apr 10 10:38:50 qa netdata[1561526]: TIMEZONE: using the contents of /etc/timezone
Apr 10 10:38:50 qa netdata[1561526]: TIMEZONE: fixed as 'Etc/UTC'
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: next: initialize signals
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, initialize signals - next: initialize static threads
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, initialize static threads - next: initialize web server
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, initialize web server - next: initialize ML
Apr 10 10:38:50 qa netdata[1561526]: Failed to initialize database at /home/netdata/ml.db, due to "unable to open database file"
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 3 ms, initialize ML - next: initialize h2o server
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, initialize h2o server - next: set resource limits
Apr 10 10:38:50 qa netdata[1561526]: resources control: allowed file descriptors: soft = 1024, max = 524288
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, set resource limits - next: become daemon
Apr 10 10:38:50 qa netdata[1561526]: Out-Of-Memory (OOM) score is already set to the wanted value 0
Apr 10 10:38:50 qa netdata[1561526]: Adjusted netdata scheduling policy to batch (3), with priority 0.
Apr 10 10:38:50 qa netdata[1561526]: Running with process scheduling policy 'batch', nice level 19
Apr 10 10:38:50 qa netdata[1561526]: Cannot chown directory '/home/netdata' to 116:123
Apr 10 10:38:50 qa netdata[1561526]: Cannot chown file '/home/netdata/.netdata_bash_sleep_timer_fifo' to 116:123
Apr 10 10:38:50 qa netdata[1561526]: Cannot chown directory '/var/lib/netdata/cloud.d' to 116:123
Apr 10 10:38:50 qa netdata[1561526]: netdata started on pid 1561526.
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 1 ms, become daemon - next: initialize threads after fork
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, initialize threads after fork - next: initialize registry
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, initialize registry - next: fork the spawn server
Apr 10 10:38:50 qa netdata[1561526]: Initializing spawn client.
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 0 ms, fork the spawn server - next: collecting system info
Apr 10 10:38:50 qa netdata[1561529]: Spawn server is up.
Apr 10 10:38:50 qa netdata[1561526]: NETDATA STARTUP: in 353 ms, collecting system info - next: initialize RRD structures
Apr 10 10:38:50 qa netdata[1561526]: Failed to initialize database at /home/netdata/netdata-meta.db, due to "unable to open database file"
Apr 10 10:38:50 qa netdata[1561526]: Failed to initialize SQLite
Apr 10 10:38:50 qa netdata[1561526]: NETDATA SHUTDOWN: initializing shutdown with code 1...
Apr 10 10:38:50 qa netdata[1561526]: /usr/sbin/netdata(+0x43d631)[0x55a7790a3631]
Apr 10 10:38:50 qa netdata[1561526]: /usr/sbin/netdata(+0x209ef0)[0x55a778e6fef0]
Apr 10 10:38:50 qa netdata[1561526]: /usr/sbin/netdata(+0x712d3)[0x55a778cd72d3]
Apr 10 10:38:50 qa netdata[1561526]: /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fec96a63d90]
Apr 10 10:38:50 qa netdata[1561526]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fec96a63e40]
Apr 10 10:38:50 qa netdata[1561526]: /usr/sbin/netdata(+0x743d5)[0x55a778cda3d5]
Apr 10 10:38:50 qa netdata[1561526]: Shutdown process started
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [1/25] - 'create shutdown file' finished in 1452 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): P[timex]
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [2/25] - 'dbengine exit mode' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): P[idlejitter]
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [3/25] - 'close webrtc connections' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): HEALTH
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [4/25] - 'disable maintenance, new queries, new web requests, new streaming connections and aclk' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): PLUGINSD
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [5/25] - 'stop maintenance thread' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): SERVICE
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [6/25] - 'stop exporters, health and web servers threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): STATSD_FLUSH
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [7/25] - 'stop collectors and streaming threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [8/25] - 'stop replication threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): EXPORTING
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [9/25] - 'prepare metasync shutdown' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): WEB[1]
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [10/25] - 'disable ML detection and training threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): ACLK_MAIN
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [11/25] - 'stop context thread' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): RRDCONTEXT
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [12/25] - 'clear web client cache' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [13/25] - 'stop aclk threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): REPLAY[1]
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): P[tc]
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [14/25] - 'stop all remaining worker threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): P[diskspace]
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): P[proc]
Apr 10 10:38:52 qa netdata[1561526]: EXIT: No thread running (marking as EXITED): P[cgroups]
Apr 10 10:38:52 qa netdata[1561526]: Waiting 15 threads to finish...
Apr 10 10:38:52 qa netdata[1561526]: All threads finished.
Apr 10 10:38:52 qa netdata[1561526]: No statements pending to finalize
Apr 10 10:38:52 qa netdata[1561526]: EXIT: cannot unlink pidfile '/run/netdata/netdata.pid'.
Apr 10 10:38:52 qa netdata[1561526]: DIGEST-MD5 common mech free
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [15/25] - 'cancel main threads' finished in 100 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [16/25] - 'flush dbengine tiers' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [17/25] - 'stop collection for all hosts' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [18/25] - 'stop metasync threads' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [19/25] - 'wait for dbengine collectors to finish' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [20/25] - 'wait for dbengine main cache to finish flushing' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [21/25] - 'stop dbengine tiers' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [22/25] - 'close SQL databases' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [23/25] - 'remove pid file' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [24/25] - 'free openssl structures' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: shutdown step: [25/25] - 'remove incomplete shutdown file' finished in 0 milliseconds
Apr 10 10:38:52 qa netdata[1561526]: Shutdown process ended in 1553 milliseconds
Apr 10 10:38:52 qa netdata[1561529]: EOF found in spawn pipe.
Apr 10 10:38:52 qa netdata[1561529]: Shutting down spawn server event loop.
Apr 10 10:38:52 qa netdata[1561529]: Shutting down spawn server loop complete.
Apr 10 10:38:52 qa netdata[1561529]: DIGEST-MD5 common mech free
When logged in as the netdata user, I can read and write files in /home/netdata:
root@qa:/home/netdata# sudo su -s /bin/bash netdata
netdata@qa:/home/netdata$ pwd
/home/netdata
netdata@qa:/home/netdata$ touch test
netdata@qa:/home/netdata$ ll
ll: command not found
netdata@qa:/home/netdata$ ls
test
netdata@qa:/home/netdata$ ls -al
total 8
drwxrwx--- 2 netdata netdata 4096 Apr 10 10:43 .
drwxr-xr-x 15 root root 4096 Apr 10 10:36 ..
prw-rw---- 1 netdata netdata 0 Dec 16 14:53 .netdata_bash_sleep_timer_fifo
-rw-rw---- 1 netdata netdata 0 Apr 10 10:43 test
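To compare the numeric owner the daemon asks for (116:123 in the logs) with what is actually on disk, a check along these lines could be used. This is only a sketch: the helper names chown_target and chown_path are made up for illustration, and stat -c assumes GNU coreutils, as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helpers: pull the target "uid:gid" and the path out of a
# "Cannot chown ..." line as it appears in the logs above.
chown_target() {
    printf '%s\n' "$1" | sed -n "s/.*Cannot chown [a-z]* '.*' to \([0-9]*:[0-9]*\)\$/\1/p"
}

chown_path() {
    printf '%s\n' "$1" | sed -n "s/.*Cannot chown [a-z]* '\(.*\)' to [0-9]*:[0-9]*\$/\1/p"
}

# One of the lines from the logs above.
line="Apr 10 10:38:17 qa netdata[1561134]: Cannot chown directory '/home/netdata' to 116:123"

path=$(chown_path "$line")
want=$(chown_target "$line")

# Numeric owner currently on disk; prints "missing" if the path does not exist.
have=$(stat -c '%u:%g' "$path" 2>/dev/null || echo "missing")

echo "path=$path wanted=$want actual=$have"
```

If "wanted" and "actual" already match, the chown failure is probably not about who owns the files, and the restriction would have to come from somewhere else.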
My netdata.conf file:
[global]
hostname = QA.redacted.IO_QA
enabled = yes
[web]
bind to = *
web files owner = root
web files group = netdata
[directories]
cache = /home/netdata
home = /home/netdata
Did I miss something in the upgrade process?
Thanks for your feedback.