Netdata Community

Using nvidia-smi to monitor Quadro GPUs

Hi everybody,
I have just discovered netdata and I already think it is a great tool.

I installed it the recommended way using ‘curl’ and I have the latest stable branch on RHEL 7.8 with a Quadro P5200. nvidia-smi usually works like a charm, and a normal user can run it without elevated permissions.

From inside the /etc/netdata/ folder, I ran:

sudo ./edit-config python.d/nvidia_smi.conf

The conf file was created, and I set these values:

nvidia-hm:
	name: 		nvidia-smi
	update_every: 	1
	priority: 	60000
	penalty: 	no
	autodetection_retry:	0
	loop_mode: 	yes
	poll_seconds: 	1

After that I restarted the netdata service using:

sudo systemctl restart netdata

… but nothing Nvidia-related appears on the dashboard. I cannot wrap my head around it. I must have missed something, but I have not found the answer on GitHub or in the official documentation.

Would someone please help me?
Thanks in advance,

Alexandre

Thanks @rybue for providing another answer, you rock man :slight_smile:

OK! Thanks to the hint in the error log, I have the solution. It turns out the config file must not contain tab characters (it is YAML, which does not allow tabs for indentation)… my bad. An old habit from formatting code…
Huge thanks @Rybue for your precious help and your quick replies :slight_smile:
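
For anyone hitting the same ScannerError, a quick check for tabs in the indentation can save some time. Here is a minimal Python sketch of that check. It is not part of Netdata, the function name find_tab_indented_lines is made up for illustration, and it only looks at leading whitespace, so it is a rough first check rather than a YAML validator.

```python
def find_tab_indented_lines(text):
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove from the left.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(number)
    return hits
```

You could read /etc/netdata/python.d/nvidia_smi.conf into a string, pass it to the function, and replace the tabs on the reported lines with spaces before restarting the service.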

Well, I did all of it. Thanks for the path to the error logs. Among other messages, I get:

2020-09-14 10:35:22: python.d ERROR: plugin[main] : [nvidia_smi] error on loading '/etc/netdata/python.d/nvidia_smi.conf' : ScannerError()
2020-09-14 10:35:22: python.d INFO: plugin[main] : [nvidia_smi] has no job configs, skipping it
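
If the error log is long, it can help to pull out only the lines for one module. The sketch below is a rough illustration in Python. The line layout it expects (timestamp, plugin name, level, message) is an assumption based only on the two lines quoted above, and module_messages is a hypothetical helper name; lines in a different shape are simply skipped.

```python
import re

# Layout assumed from the two log lines quoted in this thread:
# "<date> <time>: <plugin> <LEVEL>: <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"(?P<plugin>\S+) (?P<level>[A-Z]+): "
    r"(?P<message>.*)$"
)


def module_messages(lines, module, levels=("ERROR",)):
    """Return (timestamp, level, message) for log lines that mention [module]."""
    tag = f"[{module}]"
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["level"] in levels and tag in match["message"]:
            found.append((match["timestamp"], match["level"], match["message"]))
    return found
```

Passing the lines of /var/log/netdata/error.log and the module name nvidia_smi would return only the ERROR entries for that module; add "INFO" to levels to also see the "has no job configs" message.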

Could you try simplifying the python.d/nvidia_smi.conf file so that it contains only these lines:

loop_mode: yes
poll_seconds: 1

Restart the service and check the dashboard.

Also, check the /var/log/netdata/error.log file for any errors related to nvidia.

Nope. Nothing called Nvidia-smi on the dashboard. I have a screenshot but I do not know how to add it to my answer…

On the dashboard, a new item called nvidia smi should appear. It should not be under Netdata Monitoring > python.d.

Hi @Rybue,
Thanks a lot for your quick reply, and I apologize for being late.
I ran:

sudo ./edit-config python.d.conf

and uncommented the nvidia_smi: yes line. I then restarted the service, but still nothing… On the dashboard I searched under Netdata Monitoring > python.d, but still no Nvidia GPU… Am I missing something?

@eidal, make sure to report back the results of the above comment, we want to make sure that you got it working!

Thanks @rybue for the suggestion! :pray:

Hi @eidal!

You should also enable the nvidia-smi python plugin in python.d.conf.
So, run:

sudo ./edit-config python.d.conf

and uncomment the nvidia_smi: yes line.
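
To double-check that the line is really uncommented after editing, a small script can read the file and report the setting. This is only a sketch in Python: it uses a simple line-based check instead of a real YAML parser, it assumes the entry has the form nvidia_smi: yes, and module_setting is a hypothetical helper name.

```python
def module_setting(conf_text, module):
    """Return the value of an uncommented "module: value" line, or None if absent."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines that are still commented out.
        if not stripped or stripped.startswith("#"):
            continue
        key, sep, value = stripped.partition(":")
        if sep and key.strip() == module:
            # Drop any trailing comment after the value.
            return value.split("#", 1)[0].strip()
    return None
```

Calling it with the contents of /etc/netdata/python.d.conf and "nvidia_smi" should return "yes" once the line is uncommented, and None while it is still commented out.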

Thanks all, it works great. Sorry for the delay, I was sure I had closed the subject last time. Thanks!