Missing health-log.db.old file error log when starting up netdata
A quick question about an observed netdata log error during the service start up.
netdata reports this error log entry during start-up:
cannot open health file: /var/lib/netdata/health/health-log.db.old (errno 2, No such file or directory)
Netdata (v1.8.0) is running on a systemd (v241)-based Linux system (kernel 4.14.157). I would like to know whether it is correct behavior for netdata to load the rotated (.old) database file on a fresh system start-up. On a fresh start-up, a db file rotation has not necessarily been triggered under the default setting (i.e. “rotate log every lines = 2000”), nor does the system have any rotated db file to begin with.
Based on my understanding, a missing rotated db file should not be logged as an error. I would like to check with the community whether this should be the case or not.
Sorry for taking so long to respond! Can you please try to update netdata and report back if you still experience the same issue?
(Welcome to our community, we will get to the bottom of this)
/var/lib/netdata/health/health-log.db.old is used during log rotation, as you said. It is marked as an error because, when Netdata tries to open it, the operating system returns
I also understand your suggestion that this could be logged as information only, but that would hide the fact that Netdata hits an error when it tries to open a file that does not exist.
Finally, as @OdysLam suggested, please update your Netdata to receive the new features, and welcome to our community.
It is normal for the agent to try to read health-log.db.old. Switching the error to info is a bit tricky because of the way the rotation logic works.
The health-log.db will reach 2000 lines (default setting) and then it will
- be renamed to health-log.db.old
- be truncated to zero.
If the agent is restarted at that point and the “health-log.db.old” file cannot be read, there is a loss of information (regarding the alarms), and it should be an error.
A nearly full health-log.db combined with an inability to read health-log.db.old may not be as serious, but again there is a chance of info missing.
It will depend on the health-log.db having the full set of defined alarms recorded.
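To make the rotation steps above concrete, here is a minimal sketch in Python. It is not Netdata's code (the agent is written in C); the function name append_with_rotation, the rotate_every parameter, and the exact comparison used for the threshold are assumptions made for illustration only.

```python
import os


def append_with_rotation(log_path, entry, rotate_every=2000):
    """Append one entry, then rotate once the line count reaches the threshold.

    Hypothetical helper: mirrors the described steps (rename to .old, then
    start again from an empty file), not Netdata's actual implementation.
    """
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(entry.rstrip("\n") + "\n")

    with open(log_path, encoding="utf-8") as f:
        line_count = sum(1 for _ in f)

    if line_count >= rotate_every:  # assumption: rotate when the limit is reached
        os.replace(log_path, log_path + ".old")  # overwrites any previous .old
        open(log_path, "w", encoding="utf-8").close()  # new log starts at zero size
        return True
    return False
```

Right after a rotation the current file is empty, so everything recorded so far lives only in the .old file. That is the situation in which failing to read it loses alarm information.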
Thank you for following up on my question, and sorry for my late response. I would like to clarify my question:
Is it true that netdata always loads the rotated (.old) health database file during start-up, regardless of whether a rotation has happened? On a system that has no such rotated file to begin with, this error will be logged. In that case, is it a real error?
I would assume netdata has logic to track the state of rotated health database file and skips loading the file if no rotation has happened. Please correct me if I am wrong.
FYI, I have netdata v1.25.0 installed (i.e. using netdata-installer.sh) on my Ubuntu 18.04, and the same error log is captured when the service starts.
Hello @leungsk ,
Netdata always loads the health log when it starts, and it will rotate the logs when the number of lines in the alarm log reaches the parameter rotate log every lines. This parameter cannot have a value smaller than 100. If this limit is not reached, Netdata won’t rotate the logs.
How many lines do you have inside your
I have the default rotate log every lines = 2000, as stated in my initial inquiry. The file /var/lib/netdata/health/health-log.db has 55 lines as of now, so it has not crossed the rotation threshold yet.
IMO, netdata should have logic that knows the persisted state of the health database rotation, and should only load the rotated file as needed. However, I could still be missing something; input/clarification is appreciated.
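As a rough illustration of the alternative being asked about, the sketch below downgrades only the "file does not exist" case to an informational message and keeps every other open failure as an error. Note that this swaps in a simpler technique (inspecting errno) rather than persisting rotation state, and it is purely hypothetical: open_rotated_log and its return shape are invented for this example and are not part of Netdata.

```python
import errno
import os


def open_rotated_log(path):
    """Try to open a rotated log; return (file_or_None, level, message).

    Hypothetical helper: a missing file (ENOENT) is reported as "info",
    any other failure (e.g. permissions) as "error".
    """
    try:
        return open(path, encoding="utf-8"), None, None
    except OSError as exc:
        level = "info" if exc.errno == errno.ENOENT else "error"
        message = (
            f"cannot open health file: {path} "
            f"(errno {exc.errno}, {os.strerror(exc.errno)})"
        )
        return None, level, message
```

The trade-off raised earlier in the thread still applies: a rotated file that went missing right after a rotation would then be reported only as info, even though alarm history was lost.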
I had not worked with Health for a long time, so I had to study it again before answering, to be sure that I was not missing anything. I also removed my file /var/lib/netdata/health/health-log.db.old before running Netdata.
@leungsk the rule used to update and rotate log is in this line
And as you can see, data will be stored in the old file when the value is above the specified threshold.
When Netdata started for the first time after I removed the file, it raised the same error that you had, and this is expected: when Netdata reads the health log, it also looks for rotated files, and because it could not find one, it logged the error. This is why you are receiving errors. Netdata does not store internally whether the file exists or not; it simply tries to load it when it starts.
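The start-up behavior described above can be sketched as follows. This is a Python illustration, not the agent's C code; load_health_logs is a made-up name, and reading the .old file before the current one is an assumption. The point it shows is that both files are always attempted, with no stored state about whether a rotation ever happened.

```python
import os


def load_health_logs(log_path):
    """Attempt to read <log_path>.old and then <log_path>, unconditionally.

    Hypothetical helper: returns (entries, errors). Every file that cannot
    be opened produces one error message, including a .old file that never
    existed because no rotation has happened yet.
    """
    entries, errors = [], []
    for path in (log_path + ".old", log_path):
        try:
            with open(path, encoding="utf-8") as f:
                entries.extend(line.rstrip("\n") for line in f)
        except OSError as exc:
            errors.append(
                f"cannot open health file: {path} "
                f"(errno {exc.errno}, {os.strerror(exc.errno)})"
            )
    return entries, errors
```

With a 55-line health-log.db and no rotated file, a loader like this returns all 55 entries plus a single error for the missing .old file, which matches what was observed.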