Hey All!
I’ve been wrestling with this for about 6 hours now and still can’t get this GPU alert to work. I’m starting to think it might actually be impossible! ^^
- Using netdata v2.8.4, agent only, Ubuntu
- Having an Nvidia GPU and using the `nvidia_smi` collector. All charts are working correctly under Metrics.
- Created `gpu.conf` under `/etc/netdata/health.d` with `nvidia_smi.gpu_utilization`, as this was the context shown on the usage chart:

```
 alarm: gpu_usage
    on: nvidia_smi.gpu_utilization
lookup: average -1m
 units: %
 every: 1m
  warn: $this > 80
  crit: $this > 90
  info: GPU usage monitoring
```
- The above and dozens of other configurations did not work. The alert does not show up in the web UI or in `/api/v1/alarms?all`.
- No relevant info in the journal or in error.log.
- To narrow down the problem, I changed only `on: system.ram` in the conf above, and the alert appeared successfully. So I guess there must be a bigger problem with `nvidia_smi`?
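Since swapping in `system.ram` works, my next idea is to compare the `on:` value against the chart IDs the agent actually exposes, in case the real nvidia_smi chart ID differs from what the dashboard shows. A minimal sketch of that check (the sample IDs below are made up; on a live agent the real list comes from `http://localhost:19999/api/v1/charts`):

```python
import json

# Hypothetical, abbreviated sample of the charts API response; on a live
# agent, fetch the real thing with:  curl -s http://localhost:19999/api/v1/charts
sample = json.loads("""
{
  "charts": {
    "system.ram": {"id": "system.ram"},
    "nvidia_smi.gpu0_gpu_utilization": {"id": "nvidia_smi.gpu0_gpu_utilization"}
  }
}
""")

# The "on:" line of a health entry must match one of these chart IDs exactly;
# a per-GPU prefix or suffix in the real ID would explain why the alert
# never attaches to the chart.
nvidia_charts = [cid for cid in sample["charts"] if cid.startswith("nvidia_smi.")]
print(nvidia_charts)
```

If the printed IDs don't exactly equal `nvidia_smi.gpu_utilization`, that mismatch would be my prime suspect.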
Please help, I’m sooo stuck with this!