Netdata container fails to collect MySQL stats after MariaDB restart when connecting via UNIX socket

Problem/Question

Hi All,
My netdata container collects MySQL stats as expected from the MariaDB database on the host, as long as netdata is started while MariaDB is running. The problem is that when the database is restarted, netdata's MySQL data collection stops and does not resume. Running the MySQL collector in debug mode shows 'connection refused' in the logs:

[ ERROR ] mysql[local] mysql.go:93 error on pinging the mysql database [netdata:net01@@unix(/var/lib/mysql/mysql.sock)/]: dial unix /var/lib/mysql/mysql.sock: connect: connection refused

Thanks.

Environment

netdata v1.31.0
Red Hat Enterprise Linux release 8.4 (Ootpa)
MariaDB v10.4.21

What I expected to happen

I expect data collection to resume once MariaDB is back up, and the MySQL graphs to render as normal, with the collector reconnecting as below:

[ DEBUG ] mysql[local] mysql.go:97 connected using DSN [netdata:net01@@unix(/var/lib/mysql/mysql.sock)/]

As I recall, a netdata user needs to be created on the database; see "MySQL monitoring with Netdata" on Learn Netdata.

Had you done that, and was it working before?

@ilyam8 have you come across this before?

Thanks for the reply @andrewm4894,

Yes, the netdata user has been set up and works, as per the mysql.conf extract below:

# [ JOBS ]
jobs:
  # my.cnf
  - name: local
    my.cnf: '/etc/my.cnf'

  - name: local
    dsn: netdata:net01@@unix(/tmp/mysql.sock)/

The mysql collector works as expected, as per the extract below; however, none of the mysql charts get updated:

bash-5.0$ /usr/libexec/netdata/plugins.d/go.d.plugin -d -m mysql
[ DEBUG ] main[main] main.go:111 plugin: name=go.d, version=v0.28.2
[ DEBUG ] main[main] main.go:113 current user: name=netdata, uid=201
[ INFO ] main[main] agent.go:106 instance is started
[ INFO ] main[main] setup.go:39 loading config file
[ INFO ] main[main] setup.go:47 looking for ‘go.d.conf’ in [/etc/netdata /usr/lib/netdata/conf.d]
[ INFO ] main[main] setup.go:54 found ‘/usr/lib/netdata/conf.d/go.d.conf
[ INFO ] main[main] setup.go:61 config successfully loaded
[ INFO ] main[main] agent.go:110 using config: enabled ‘true’, default_run ‘true’, max_procs ‘0’
[ INFO ] main[main] setup.go:66 loading modules
[ INFO ] main[main] setup.go:85 enabled/registered modules: 1/59
[ INFO ] main[main] setup.go:90 building discovery config
[ INFO ] main[main] setup.go:116 looking for ‘mysql.conf’ in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
[ INFO ] main[main] setup.go:123 found ‘/etc/netdata/go.d/mysql.conf
[ INFO ] main[main] setup.go:128 dummy/read/watch paths: 0/1/0
[ INFO ] discovery[manager] manager.go:90 registered discoverers: [file discovery: [file reader]]
[ INFO ] discovery[manager] manager.go:95 instance is started
[ INFO ] build[manager] build.go:106 instance is started
[ INFO ] run[manager] run.go:30 instance is started
[ INFO ] discovery[file manager] discovery.go:71 instance is started
[ INFO ] discovery[file reader] read.go:39 instance is started
[ INFO ] discovery[file reader] read.go:40 instance is stopped
[ DEBUG ] build[manager] build.go:153 received config group (’/etc/netdata/go.d/mysql.conf’): 3 jobs (added: 3, removed: 0)
[ DEBUG ] build[manager] build.go:302 building mysql[local] job, config: map[provider:file reader source:/etc/netdata/go.d/mysql.conf autodetection_retry:0 module:mysql my.cnf:/etc/my.cnf name:local priority:70000 update_every:1]
[ ERROR ] mysql[local] mysql.go:81 key-value delimiter not found: !includedir /etc/my.cnf.d
[ ERROR ] mysql[local] job.go:152 init failed
[ DEBUG ] build[manager] build.go:302 building mysql[local] job, config: map[provider:file reader source:/etc/netdata/go.d/mysql.conf autodetection_retry:0 dsn:netdata:net01@@unix(/tmp/mysql.sock)/ module:mysql name:local priority:70000 update_every:1]
[ DEBUG ] mysql[local] mysql.go:97 connected using DSN [netdata:net01@@unix(/tmp/mysql.sock)/]
[ DEBUG ] mysql[local] collect_version.go:17 executing query: ‘SELECT VERSION()’
[ DEBUG ] mysql[local] collect_version.go:23 application version: 10.4.21-MariaDB
[ DEBUG ] mysql[local] collect_global_status.go:144 executing query: ‘SHOW GLOBAL STATUS’
[ DEBUG ] mysql[local] collect_global_vars.go:23 executing query: ‘SHOW GLOBAL VARIABLES WHERE Variable_name LIKE ‘max_connections’ OR Variable_name LIKE ‘table_open_cache’’
[ DEBUG ] mysql[local] collect_slave_status.go:30 executing query: ‘SHOW ALL SLAVES STATUS’
[ DEBUG ] mysql[local] collect_user_statistics.go:25 executing query: ‘SHOW USER_STATISTICS’
[ INFO ] mysql[local] job.go:160 check success
[ INFO ] build[manager] build.go:208 mysql[local] job is being served by another job, skipping it
[ INFO ] mysql[local] job.go:180 started, data collection interval 1s
[ DEBUG ] run[manager] run.go:41 tick 0
[ DEBUG ] mysql[local] collect_global_status.go:144 executing query: ‘SHOW GLOBAL STATUS’
[ DEBUG ] mysql[local] collect_global_vars.go:23 executing query: ‘SHOW GLOBAL VARIABLES WHERE Variable_name LIKE ‘max_connections’ OR Variable_name LIKE ‘table_open_cache’’
[ DEBUG ] mysql[local] collect_slave_status.go:30 executing query: ‘SHOW ALL SLAVES STATUS’
[ DEBUG ] mysql[local] collect_user_statistics.go:25 executing query: ‘SHOW USER_STATISTICS’
CHART ‘netdata.execution_time_of_mysql_local’ ‘’ ‘Execution Time for mysql_local’ ‘ms’ ‘go.d’ ‘netdata.go_plugin_execution_time’ ‘’ ‘145000’ ‘1’ ‘’ ‘go.d’ ‘mysql’
DIMENSION ‘time’ ‘’ ‘’ ‘1’ ‘1’ ‘’

CHART ‘mysql_local.net’ ‘’ ‘Bandwidth’ ‘kilobits/s’ ‘bandwidth’ ‘mysql.net’ ‘area’ ‘70000’ ‘1’ ‘’ ‘go.d’ ‘mysql’
DIMENSION ‘bytes_received’ ‘in’ ‘incremental’ ‘8’ ‘1000’ ‘’
DIMENSION ‘bytes_sent’ ‘out’ ‘incremental’ ‘-8’ ‘1000’ ‘’

BEGIN ‘mysql_local.net’
SET ‘bytes_received’ = 1014140552
SET ‘bytes_sent’ = 4910862481
END


Cheers

Hi, @John.

You have edited your post several times, so let’s see if I understand the problem:

  • MySQL is not running inside a container.
  • Netdata is running inside a container.
  • Netdata is able to collect MySQL data when it starts after MySQL.
  • Netdata is connected via a UNIX socket.
  • Netdata fails to handle MySQL restart (the charts don’t get updated).

Is that correct?

If yes, let’s do the following:

  • run “go.d.plugin” in the debug mode.
  • restart MySQL service.
  • at this point, the debug output should tell us the problem.

Hi @ilyam8,

Yes, the above is correct. Here are the last few lines of output from debug mode:

BEGIN ‘mysql_local.userstats_commands_maxscale’ 999832
SET ‘userstats_maxscale_select_commands’ = 1320
SET ‘userstats_maxscale_update_commands’ = 0
SET ‘userstats_maxscale_other_commands’ = 3
END

BEGIN ‘netdata.execution_time_of_mysql_local’ 999832
SET ‘time’ = 3
END

[ DEBUG ] run[manager] run.go:41 tick 4
[ DEBUG ] mysql[local] collect_global_status.go:144 executing query: ‘SHOW GLOBAL STATUS’
[mysql] 2021/10/15 02:30:03 packets.go:122: closing bad idle connection: EOF
[ INFO ] main[main] agent.go:191 received broken pipe signal (13). Terminating…
bash-5.0$

Cheers.

Thanks, @John. I see what is happening. A quick fix is to switch from Unix socket to TCP.
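
To make the socket-to-TCP switch concrete, here is a minimal Go sketch of the two DSN shapes, following the "user:password@protocol(address)/" pattern visible in the logs earlier in this thread. The helper name buildDSN, the password "secret", and the address 127.0.0.1:3306 are placeholders for illustration, not values from this setup; MariaDB also has to listen on a TCP address that is reachable from inside the container.

```go
package main

import "fmt"

// buildDSN formats a DSN in the "user:password@protocol(address)/" shape.
// Hypothetical helper for illustration; it is not part of go.d.plugin.
func buildDSN(user, password, protocol, address string) string {
	return fmt.Sprintf("%s:%s@%s(%s)/", user, password, protocol, address)
}

func main() {
	// UNIX socket DSN, the style used in the original job.
	fmt.Println(buildDSN("netdata", "secret", "unix", "/var/lib/mysql/mysql.sock"))

	// TCP DSN; the host and port are placeholders.
	fmt.Println(buildDSN("netdata", "secret", "tcp", "127.0.0.1:3306"))
}
```

The resulting TCP string would go into the job's dsn option in mysql.conf in place of the unix(...) one.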

@John, the PR with the fix is "agent: dont exit on SIGPIPE" by ilyam8 (Pull Request #630, netdata/go.d.plugin on GitHub).

Thanks @ilyam8,

The fix is working. FYI, logs and graphs below:

BEGIN ‘mysql_local.userstats_commands_root’ 1000166
SET ‘userstats_root_select_commands’ = 962
SET ‘userstats_root_update_commands’ = 4
SET ‘userstats_root_other_commands’ = 962
END

BEGIN ‘netdata.execution_time_of_mysql_local’ 1000166
SET ‘time’ = 5
END

[ DEBUG ] run[manager] run.go:41 tick 41
[ DEBUG ] mysql[local] collect_global_status.go:144 executing query: ‘SHOW GLOBAL STATUS’
[mysql] 2021/10/22 09:23:39 packets.go:123: closing bad idle connection: EOF
[ ERROR ] mysql[local] mysql.go:105 error on collecting global status: dial tcp 10.195.225.51:3306: connect: connection refused
[ DEBUG ] run[manager] run.go:41 tick 42
[ DEBUG ] mysql[local] collect_global_status.go:144 executing query: ‘SHOW GLOBAL STATUS’
[ ERROR ] mysql[local] mysql.go:105 error on collecting global status: dial tcp 10.195.225.51:3306: connect: connection refused
[ DEBUG ] run[manager] run.go:41 tick 43
[ DEBUG ] mysql[local] collect_global_status.go:144 executing query: ‘SHOW GLOBAL STATUS’
[ DEBUG ] run[manager] run.go:41 tick 44
[ DEBUG ] mysql[local] job.go:174 skip the tick due to previous run hasn’t been finished
[mysql] 2021/10/22 09:23:42 packets.go:37: read tcp 10.0.2.100:32920->10.195.225.51:3306: read: connection reset by peer
[ INFO ] main[main] agent.go:191 received broken pipe signal (13). Terminating…
bash-5.0$

Cheers,
John.