Netdata Community

Freeipmi.plugin - internal system error

Hi,

I need help. On my CentOS 7 server, freeipmi.plugin doesn’t work and fails with this error:

  • freeipmi.plugin ERROR : MAIN : ipmi_monitoring_sensor_readings_by_record_id(): internal system error (errno 9, Bad file descriptor)
  • freeipmi.plugin FATAL : freeipmi.plugin : freeipmi.plugin: data collection failed. # : Success

Does anyone know how to solve this?

@Saruspete any thoughts?

The issue seems to come from freeipmi. The “internal system error” message maps to libipmimonitoring/ipmi_monitoring.c :: ipmi_monitoring_errmsgs

The “errno 9, Bad file descriptor” looks like a syscall that failed, most likely due to a hardware-specific implementation issue.

To get more details without recompiling freeipmi in debug mode, strace should help find the culprit. @Alex, can you run the following command and post the (lengthy) result, as a file or a gist if possible?

sudo strace -fyy -s 1024 ./freeipmi.plugin
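
Since the trace is long, it may be easier to write it to a file and pull out only the failing calls. A minimal sketch, assuming the trace is saved with strace's -o option; the path /tmp/freeipmi.strace and the helper name ebadf_calls are made up for illustration:

```shell
# Save the trace to a file first (run manually, as root, from the plugins directory):
#   strace -fyy -s 1024 -o /tmp/freeipmi.strace ./freeipmi.plugin

# Hypothetical helper: print every traced call that returned EBADF
# (errno 9, Bad file descriptor), prefixed with its line number in the log.
ebadf_calls() {
    grep -n 'EBADF' "$1"
}

# Usage: ebadf_calls /tmp/freeipmi.strace
```

The lines just before the first EBADF hit usually show which descriptor was closed or never opened, so that part of the log is the most useful one to post.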

Hi @Saruspete ,

On the server in question I can run freeipmi.plugin in debug mode without any problem. The error only appears when it is run by the Netdata daemon.

In debug mode, it works fine:

OK, so it’s working as expected when you’re running it as an interactive user, but not when running as a service, is that correct?

If you run the netdata daemon in the foreground (sbin/netdata -D) as the target user, does it also fail with the error?
I suspect it’s due to the sandboxing in the systemd service definition.

it’s working as expected when you’re running it with an interactive user, but not when running as a service

If it is due to CapabilityBoundingSet restrictions, I would try to reset it and see if that helps.

# as root
mkdir -p /etc/systemd/system/netdata.service.d
echo -e '[Service]\nCapabilityBoundingSet=~' | tee /etc/systemd/system/netdata.service.d/unset-capability-bounding-set.conf
systemctl daemon-reload
systemctl restart netdata.service
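
To check what the restarted daemon actually ended up with, the kernel exposes the bounding set of every process as the CapBnd hex mask in /proc/<pid>/status. A small sketch; the function name capbnd_of_pid is my own, and the usage line assumes the daemon process is called netdata:

```shell
# Print the capability bounding set (hex mask) of a process,
# as reported by the kernel in /proc/<pid>/status.
capbnd_of_pid() {
    awk '/^CapBnd:/ { print $2 }' "/proc/$1/status"
}

# Usage (as root): capbnd_of_pid "$(pidof -s netdata)"
# The mask can be translated into capability names with: capsh --decode=<mask>
```

If the override is in effect, the mask should be wider than the one seen before the drop-in file was added.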

I’m running the netdata daemon in the foreground and it doesn’t give an error, but it stays here:

It doesn’t send much output to stdout; as soon as the log files are created, all logging goes there.
So you should be able to connect to the interface and see if IPMI is displayed.

Can you please test what @ilyam8 proposed? It’s a clean override of the systemd definition of the service.

Thanks, the @ilyam8 solution worked.

@ilyam8 and @Saruspete
In the end, it only worked on some servers, not on all of them!
Some servers still show the same error.

Can you check that the override file is correctly used by systemd, with systemctl cat netdata.service?

Yes, the file is correctly used.

#/etc/systemd/system/netdata.service.d/startup.conf
[Unit]
Wants=fscrypt.service
After=
After=fscrypt.service
#/etc/systemd/system/netdata.service.d/unset-capability-bounding-set.conf
[Service]
CapabilityBoundingSet=~

@Alex same symptoms? It works when running in debug mode and doesn’t work when running as a service?

Yes @ilyam8, exactly the same symptoms.
On the servers that were OK, I changed CapabilityBoundingSet again: instead of CapabilityBoundingSet=~, it is now just CapabilityBoundingSet=CAP_SYS_RAWIO

On the servers that still do not work, I left it as CapabilityBoundingSet=~
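
Since CAP_SYS_RAWIO is the capability singled out here, one way to compare a working and a failing server is to test that bit in the bounding-set mask of the running process (the CapBnd value in /proc/<pid>/status). A sketch, assuming CAP_SYS_RAWIO is capability number 17 as defined in linux/capability.h; the function name is hypothetical and the usage lines assume the daemon process is called netdata:

```shell
# Succeeds if the CAP_SYS_RAWIO bit (capability number 17) is set in a
# hex capability mask such as the CapBnd value from /proc/<pid>/status.
has_cap_sys_rawio() {
    [ $(( (0x$1 >> 17) & 1 )) -eq 1 ]
}

# Usage (as root):
#   mask=$(awk '/^CapBnd:/ { print $2 }' "/proc/$(pidof -s netdata)/status")
#   has_cap_sys_rawio "$mask" && echo "CAP_SYS_RAWIO allowed" || echo "CAP_SYS_RAWIO dropped"
```

If the bit is set on both kinds of servers, the remaining failures probably have a cause other than the bounding set.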