Hey guys,
I installed the Netdata agent (v1.37.1) as a container on both a Raspberry Pi 3 and a Raspberry Pi 4, each with 1 GB of RAM.
On the Pi 3, it runs without any issue and I can see the node online on the Cloud dashboard.
On the Pi 4, it does not work, and journalctl shows an Out-Of-Memory message. Is this a known issue with the Pi 4?
root@xxx:~# journalctl --since=yesterday | grep "OOM"
Mar 02 09:39:54 d2e2226 eb0ae54c9c7d[1715]: 2023-03-02 09:39:54: netdata INFO : MAIN : Out-Of-Memory (OOM) score is already set to the wanted value 0
While the cloud service is down, the device shows no issues, but when the cloud service comes back, we start to see errors like these:
netdata [ ERROR ] prometheus[pms5003_particulate_matter_sensor_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9662/metrics": dial tcp 127.0.0.1:9662: connect: connection refused
netdata [ ERROR ] prometheus[pms5003_particulate_matter_sensor_exporter_local] job.go:191 check failed
netdata [ ERROR ] prometheus[sap_nwrfc_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9663/metrics": dial tcp 127.0.0.1:9663: connect: connection refused
netdata [ ERROR ] prometheus[sap_nwrfc_exporter_local] job.go:191 check failed
netdata [ ERROR ] prometheus[linux_ha_clusterlabs_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9664/metrics": dial tcp 127.0.0.1:9664: connect: connection refused
The OOM log message doesn’t mean it’s out of memory; it’s a standard informational message that you can ignore.
The go.d prometheus collector log messages just say that there’s nothing listening on the ports that we scan to autodetect running services that expose their metrics in the Prometheus format. So unless you know you have services that do that, those aren’t relevant either.
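To make the container log easier to read, the autodetection noise can be filtered out first. This is a minimal sketch, assuming the log line format shown above. `filter_autodetect_noise` is a hypothetical helper name, not something Netdata ships:

```shell
# Hypothetical helper: drop go.d Prometheus autodetection failures from a log stream.
# Assumes the line format seen in this thread:
#   ... [ ERROR ] prometheus[<job>_local] prometheus.go:89 Get "...": ... connection refused
#   ... [ ERROR ] prometheus[<job>_local] job.go:191 check failed
filter_autodetect_noise() {
  grep -v -E 'prometheus\[[^]]*\] (prometheus\.go:[0-9]+ .*connection refused|job\.go:[0-9]+ check failed)'
}

# Example usage (the container name "netdata" is an assumption):
#   docker logs netdata 2>&1 | filter_autodetect_noise
```

Whatever remains after filtering is more likely to be related to the actual problem.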
So let’s start again. What does “doesn’t work” mean exactly? That the agent crashes? That it keeps restarting? That it’s running but you don’t see data in Netdata Cloud? Can you see its local dashboard on port 19999?
Thanks for your response!
The container keeps restarting all the time!
The last messages I can see in the logs are:
2023-03-02 14:49:02: python.d ERROR: monit[localhost] : Url: http://localhost:2812/_status?format=xml&level=full. Error: HTTPConnectionPool(host='localhost', port=2812): Max retries exceeded with url: /_status?format=xml&level=full (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xf6f8b820>: Failed to establish a new connection: [Errno 111] Connection refused'))
netdata 2023-03-02 14:49:02: python.d ERROR: monit[localhost] : _get_data() returned no data or type is not <dict>
netdata 2023-03-02 14:49:02: python.d INFO: plugin[main] : monit[localhost] : check failed
netdata 2023-03-02 14:49:02: python.d ERROR: nsd[local] : Can't locate "nsd-control stats_noreset" binary
netdata 2023-03-02 14:49:02: python.d INFO: plugin[main] : nsd[local] : check failed
netdata 2023-03-02 14:49:02: python.d ERROR: openldap[openldap] : 'python-ldap' package is needed
netdata 2023-03-02 14:49:02: python.d INFO: plugin[main] : openldap[openldap] : check failed
netdata 2023-03-02 14:49:02: python.d ERROR: oracledb[oracledb] : 'cx_Oracle' package is needed to use oracledb module
netdata 2023-03-02 14:49:02: python.d INFO: plugin[main] : oracledb[oracledb] : check failed
netdata 2023-03-02 14:49:02: python.d ERROR: postfix[local] : Can't locate "postqueue -p" binary
Unfortunately, we can’t see the local dashboard on port 19999 (which is expected, since the container is dead).
One thing I noticed: why is the device shown as Bare Metal rather than Linux, like the Pi 3?
@Christopher_Akritid1 did you have the chance to look into it?
I saw it today. The container getting killed with OOM means that you’ll need to either disable the dbengine and switch to another memory mode, or reduce the number of tiers and retention, or collect fewer metrics.
See what’s best for you; the easiest thing to try is the first option, switching to a different memory mode.
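For the first option, here is a minimal sketch of what the change might look like in `netdata.conf`. It assumes the `[db]` section and option names used by recent agent versions (older versions used `memory mode` under `[global]`). The retention value is only illustrative, so check it against the documentation for your version:

```ini
[db]
    # keep metrics in RAM only instead of using dbengine
    mode = ram
    # number of samples kept per metric in ram mode (illustrative value)
    retention = 3600
```

If you would rather keep dbengine, the second option would instead mean lowering `storage tiers` in the same section.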
I have an update here.
After some investigation I got it running: the host OS (balena) is 64-bit and the image architecture is arm64. The issue appears only when the host OS is 64-bit and the image architecture is arm. In that case, it does not work.
I thought an image with arch arm should work on an arm64 host OS, right?
In theory yes, but in practice not always. There are occasional subtle differences that can cause unexpected behavior in cases like this, though I must admit I’m not sure what exactly is going wrong here (usually if this type of thing fails, it will be due to an illegal instruction exception).
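A quick way to spot this combination before deploying is to compare the host's machine type with the image architecture. This is a minimal sketch: `arch_bits` and `arch_pair_status` are hypothetical helper names, and the list of architecture strings is an assumption that covers only the Raspberry Pi cases discussed here:

```shell
# Hypothetical helper: map an architecture string to its word size.
arch_bits() {
  case "$1" in
    aarch64|arm64) echo 64 ;;
    armv6l|armv7l|armhf|arm) echo 32 ;;
    *) echo unknown ;;
  esac
}

# Report whether an image would run natively or as a 32-bit userland on a 64-bit kernel.
arch_pair_status() {
  host_bits="$(arch_bits "$1")"
  image_bits="$(arch_bits "$2")"
  if [ "$host_bits" = unknown ] || [ "$image_bits" = unknown ]; then
    echo "unknown"
  elif [ "$host_bits" = "$image_bits" ]; then
    echo "native"
  elif [ "$host_bits" = 64 ] && [ "$image_bits" = 32 ]; then
    echo "32-bit image on 64-bit host"
  else
    echo "incompatible"
  fi
}

# Example usage (the image name is an assumption; adjust to the tag you deploy):
#   arch_pair_status "$(uname -m)" "$(docker image inspect --format '{{.Architecture}}' netdata/netdata)"
```

A result of "32-bit image on 64-bit host" marks the setup that fails in this thread. It does not explain why it fails.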
<netdata> [ INFO ] main[main] setup.go:123 looking for 'vsphere.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/vsphere.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'pika.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/pika.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'freeradius.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/freeradius.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'mongodb.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/mongodb.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'nginxvts.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/nginxvts.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'nginx.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/nginx.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'squidlog.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/squidlog.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'couchbase.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/couchbase.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'lighttpd.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/lighttpd.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'docker.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/docker.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'ping.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/ping.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'activemq.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/activemq.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'phpfpm.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/phpfpm.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'nvme.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/nvme.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'k8s_kubeproxy.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
<netdata> [ INFO ] main[main] setup.go:139 found '/usr/lib/netdata/conf.d/go.d/k8s_kubeproxy.conf
<netdata> [ INFO ] main[main] setup.go:123 looking for 'powerdns_recursor.conf' in [/etc/netdata/go.d /usr/lib/netdata/conf.d/go.d]
Warning: Suppressed 300 message(s) due to rate limiting
<netdata> [ ERROR ] prometheus[promacct_pcap-based_network_traffic_accounting_local] prometheus.go:89 Get "http://127.0.0.1:9112/metrics": dial tcp 127.0.0.1:9112: connect: connection refused
<netdata> [ ERROR ] prometheus[promacct_pcap-based_network_traffic_accounting_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[nginx_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9113/metrics": dial tcp 127.0.0.1:9113: connect: connection refused
<netdata> [ ERROR ] prometheus[nginx_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[elasticsearch_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9114/metrics": dial tcp 127.0.0.1:9114: connect: connection refused
<netdata> [ ERROR ] prometheus[elasticsearch_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[blackbox_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9115/metrics": dial tcp 127.0.0.1:9115: connect: connection refused
<netdata> [ ERROR ] prometheus[blackbox_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[snmp_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9116/metrics": dial tcp 127.0.0.1:9116: connect: connection refused
<netdata> [ ERROR ] prometheus[snmp_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[apache_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9117/metrics": dial tcp 127.0.0.1:9117: connect: connection refused
<netdata> [ ERROR ] prometheus[apache_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[jenkins_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9118/metrics": dial tcp 127.0.0.1:9118: connect: connection refused
<netdata> [ ERROR ] prometheus[jenkins_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[bind_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9119/metrics": dial tcp 127.0.0.1:9119: connect: connection refused
<netdata> [ ERROR ] prometheus[bind_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[powerdns_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9120/metrics": dial tcp 127.0.0.1:9120: connect: connection refused
<netdata> [ ERROR ] prometheus[powerdns_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[redis_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9121/metrics": dial tcp 127.0.0.1:9121: connect: connection refused
<netdata> [ ERROR ] prometheus[redis_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[influxdb_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9122/metrics": dial tcp 127.0.0.1:9122: connect: connection refused
<netdata> [ ERROR ] prometheus[influxdb_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[rethinkdb_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9123/metrics": dial tcp 127.0.0.1:9123: connect: connection refused
<netdata> [ ERROR ] prometheus[rethinkdb_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[freebsd_sysctl_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9124/metrics": dial tcp 127.0.0.1:9124: connect: connection refused
<netdata> [ ERROR ] prometheus[freebsd_sysctl_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[statsd_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9125/metrics": dial tcp 127.0.0.1:9125: connect: connection refused
<netdata> [ ERROR ] prometheus[statsd_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[new_relic_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9126/metrics": dial tcp 127.0.0.1:9126: connect: connection refused
<netdata> [ ERROR ] prometheus[new_relic_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[pgbouncer_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9127/metrics": dial tcp 127.0.0.1:9127: connect: connection refused
<netdata> [ ERROR ] prometheus[pgbouncer_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[ceph_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9128/metrics": dial tcp 127.0.0.1:9128: connect: connection refused
<netdata> [ ERROR ] prometheus[ceph_exporter_local] job.go:191 check failed
<netdata> 2023-03-09 07:24:15: python.d INFO: plugin[main] : using python v3
<netdata> [ ERROR ] prometheus[haproxy_log_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9129/metrics": dial tcp 127.0.0.1:9129: connect: connection refused
<netdata> [ ERROR ] prometheus[haproxy_log_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[unifi_poller_local] prometheus.go:89 Get "http://127.0.0.1:9130/metrics": dial tcp 127.0.0.1:9130: connect: connection refused
<netdata> [ ERROR ] prometheus[unifi_poller_local] job.go:191 check failed
<netdata> 2023-03-09 07:24:15: python.d INFO: plugin[main] : '/etc/netdata/python.d.conf' was not found
<netdata> [ ERROR ] prometheus[varnish_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9131/metrics": dial tcp 127.0.0.1:9131: connect: connection refused
<netdata> [ ERROR ] prometheus[varnish_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[airflow_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9132/metrics": dial tcp 127.0.0.1:9132: connect: connection refused
<netdata> [ ERROR ] prometheus[airflow_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[fritz_box_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9133/metrics": dial tcp 127.0.0.1:9133: connect: connection refused
<netdata> [ ERROR ] prometheus[fritz_box_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[zfs_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9134/metrics": dial tcp 127.0.0.1:9134: connect: connection refused
<netdata> [ ERROR ] prometheus[zfs_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[rtorrent_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9135/metrics": dial tcp 127.0.0.1:9135: connect: connection refused
<netdata> [ ERROR ] prometheus[rtorrent_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[collins_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9136/metrics": dial tcp 127.0.0.1:9136: connect: connection refused
<netdata> [ ERROR ] prometheus[collins_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[silicondust_hdhomerun_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9137/metrics": dial tcp 127.0.0.1:9137: connect: connection refused
<netdata> [ ERROR ] prometheus[silicondust_hdhomerun_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[heka_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9138/metrics": dial tcp 127.0.0.1:9138: connect: connection refused
<netdata> [ ERROR ] prometheus[heka_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[azure_sql_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9139/metrics": dial tcp 127.0.0.1:9139: connect: connection refused
<netdata> [ ERROR ] prometheus[azure_sql_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[mirth_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9140/metrics": dial tcp 127.0.0.1:9140: connect: connection refused
<netdata> [ ERROR ] prometheus[mirth_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[zookeeper_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9141/metrics": dial tcp 127.0.0.1:9141: connect: connection refused
<netdata> [ ERROR ] prometheus[zookeeper_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[big-ip_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9142/metrics": dial tcp 127.0.0.1:9142: connect: connection refused
<netdata> [ ERROR ] prometheus[big-ip_exporter_local] job.go:191 check failed
<netdata> 2023-03-09 07:24:15: python.d WARNING: plugin[main] : 'pythond-jobs-statuses.json' was not found
<netdata> [ ERROR ] prometheus[cloudmonitor_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9143/metrics": dial tcp 127.0.0.1:9143: connect: connection refused
<netdata> [ ERROR ] prometheus[cloudmonitor_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[aerospike_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9145/metrics": dial tcp 127.0.0.1:9145: connect: connection refused
<netdata> [ ERROR ] prometheus[aerospike_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[icecast_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9146/metrics": dial tcp 127.0.0.1:9146: connect: connection refused
<netdata> [ ERROR ] prometheus[icecast_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[nginx_request_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9147/metrics": dial tcp 127.0.0.1:9147: connect: connection refused
<netdata> [ ERROR ] prometheus[nginx_request_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[nats_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9148/metrics": dial tcp 127.0.0.1:9148: connect: connection refused
<netdata> [ ERROR ] prometheus[nats_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[passenger_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9149/metrics": dial tcp 127.0.0.1:9149: connect: connection refused
<netdata> [ ERROR ] prometheus[passenger_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[memcached_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9150/metrics": dial tcp 127.0.0.1:9150: connect: connection refused
<netdata> [ ERROR ] prometheus[memcached_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[varnish_request_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9151/metrics": dial tcp 127.0.0.1:9151: connect: connection refused
<netdata> [ ERROR ] prometheus[varnish_request_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[command_runner_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9152/metrics": dial tcp 127.0.0.1:9152: connect: connection refused
<netdata> [ ERROR ] prometheus[command_runner_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[coredns_local] prometheus.go:89 Get "http://127.0.0.1:9153/metrics": dial tcp 127.0.0.1:9153: connect: connection refused
<netdata> [ ERROR ] prometheus[coredns_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[postfix_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9154/metrics": dial tcp 127.0.0.1:9154: connect: connection refused
<netdata> [ ERROR ] prometheus[postfix_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[vsphere_graphite_local] prometheus.go:89 Get "http://127.0.0.1:9155/metrics": dial tcp 127.0.0.1:9155: connect: connection refused
<netdata> [ ERROR ] prometheus[vsphere_graphite_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[webdriver_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9156/metrics": dial tcp 127.0.0.1:9156: connect: connection refused
<netdata> [ ERROR ] prometheus[webdriver_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[ibm_mq_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9157/metrics": dial tcp 127.0.0.1:9157: connect: connection refused
<netdata> [ ERROR ] prometheus[ibm_mq_exporter_local] job.go:191 check failed
<netdata> 2023-03-09 07:24:15: python.d INFO: plugin[main] : [adaptec_raid] is disabled by default, skipping it
<netdata> [ ERROR ] prometheus[pingdom_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9158/metrics": dial tcp 127.0.0.1:9158: connect: connection refused
<netdata> [ ERROR ] prometheus[pingdom_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[apache_flink_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9160/metrics": dial tcp 127.0.0.1:9160: connect: connection refused
<netdata> [ ERROR ] prometheus[apache_flink_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[oracle_db_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9161/metrics": dial tcp 127.0.0.1:9161: connect: connection refused
<netdata> [ ERROR ] prometheus[oracle_db_exporter_local] job.go:191 check failed
Warning: Suppressed 150 message(s) due to rate limiting
<netdata> --- BEGIN TRACE ---
<netdata> [ ERROR ] prometheus[gitlab-pages_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9235/metrics": dial tcp 127.0.0.1:9235: connect: connection refused
<netdata> [ ERROR ] prometheus[gitlab-pages_exporter_local] job.go:191 check failed
<netdata> Error: Connection failure: Connection refused
<netdata> [ ERROR ] prometheus[gitlab_gitaly_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9236/metrics": dial tcp 127.0.0.1:9236: connect: connection refused
<netdata> [ ERROR ] prometheus[gitlab_gitaly_exporter_local] job.go:191 check failed
<netdata> --- END TRACE ---
<netdata> [ ERROR ] prometheus[sql_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9237/metrics": dial tcp 127.0.0.1:9237: connect: connection refused
<netdata> [ ERROR ] prometheus[sql_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[uwsgi_expoter_local] prometheus.go:89 Get "http://127.0.0.1:9238/metrics": dial tcp 127.0.0.1:9238: connect: connection refused
<netdata> [ ERROR ] prometheus[uwsgi_expoter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[surfboard_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9239/metrics": dial tcp 127.0.0.1:9239: connect: connection refused
<netdata> [ ERROR ] prometheus[surfboard_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[tinyproxy_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9240/metrics": dial tcp 127.0.0.1:9240: connect: connection refused
<netdata> [ ERROR ] prometheus[tinyproxy_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[arangodb_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9241/metrics": dial tcp 127.0.0.1:9241: connect: connection refused
<netdata> [ ERROR ] prometheus[arangodb_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[ceph_radosgw_usage_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9242/metrics": dial tcp 127.0.0.1:9242: connect: connection refused
<netdata> [ ERROR ] prometheus[ceph_radosgw_usage_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[chef_compliance_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9243/metrics": dial tcp 127.0.0.1:9243: connect: connection refused
<netdata> [ ERROR ] prometheus[chef_compliance_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[moby_container_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9244/metrics": dial tcp 127.0.0.1:9244: connect: connection refused
<netdata> [ ERROR ] prometheus[moby_container_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[naemon_nagios_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9245/metrics": dial tcp 127.0.0.1:9245: connect: connection refused
<netdata> [ ERROR ] prometheus[naemon_nagios_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[smartpi_local] prometheus.go:89 Get "http://127.0.0.1:9246/metrics": dial tcp 127.0.0.1:9246: connect: connection refused
<netdata> [ ERROR ] prometheus[smartpi_local] job.go:191 check failed
<netdata> 2023-03-09 07:24:15: charts.d: ERROR: nut: Cannot find UPSes - please set nut_ups='ups_name' in /nut.conf
<netdata> [ ERROR ] prometheus[sphinx_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9247/metrics": dial tcp 127.0.0.1:9247: connect: connection refused
<netdata> [ ERROR ] prometheus[sphinx_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[freebsd_gstat_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9248/metrics": dial tcp 127.0.0.1:9248: connect: connection refused
<netdata> [ ERROR ] prometheus[freebsd_gstat_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[apache_flink_metrics_reporter_local] prometheus.go:89 Get "http://127.0.0.1:9249/metrics": dial tcp 127.0.0.1:9249: connect: connection refused
<netdata> [ ERROR ] prometheus[apache_flink_metrics_reporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[opentsdb_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9250/metrics": dial tcp 127.0.0.1:9250: connect: connection refused
<netdata> [ ERROR ] prometheus[opentsdb_exporter_local] job.go:191 check failed
<netdata> 2023-03-09 07:24:15: charts.d: ERROR: nut: module's 'nut' check() function reports failure.
<netdata> [ ERROR ] prometheus[sensu_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9251/metrics": dial tcp 127.0.0.1:9251: connect: connection refused
<netdata> [ ERROR ] prometheus[sensu_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[gitlab_runner_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9252/metrics": dial tcp 127.0.0.1:9252: connect: connection refused
<netdata> [ ERROR ] prometheus[gitlab_runner_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[php-fpm_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9253/metrics": dial tcp 127.0.0.1:9253: connect: connection refused
<netdata> [ ERROR ] prometheus[php-fpm_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[kafka_burrow_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9254/metrics": dial tcp 127.0.0.1:9254: connect: connection refused
<netdata> [ ERROR ] prometheus[kafka_burrow_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[google_stackdriver_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9255/metrics": dial tcp 127.0.0.1:9255: connect: connection refused
<netdata> [ ERROR ] prometheus[google_stackdriver_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[td-agent_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9256/metrics": dial tcp 127.0.0.1:9256: connect: connection refused
<netdata> [ ERROR ] prometheus[td-agent_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[smart_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9257/metrics": dial tcp 127.0.0.1:9257: connect: connection refused
<netdata> [ ERROR ] prometheus[smart_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[hello_sense_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9258/metrics": dial tcp 127.0.0.1:9258: connect: connection refused
<netdata> [ ERROR ] prometheus[hello_sense_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[azure_resources_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9259/metrics": dial tcp 127.0.0.1:9259: connect: connection refused
<netdata> [ ERROR ] prometheus[azure_resources_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[buildkite_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9260/metrics": dial tcp 127.0.0.1:9260: connect: connection refused
<netdata> [ ERROR ] prometheus[buildkite_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[grafana_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9261/metrics": dial tcp 127.0.0.1:9261: connect: connection refused
<netdata> [ ERROR ] prometheus[grafana_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[bloomsky_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9262/metrics": dial tcp 127.0.0.1:9262: connect: connection refused
<netdata> [ ERROR ] prometheus[bloomsky_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[vmware_guest_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9263/metrics": dial tcp 127.0.0.1:9263: connect: connection refused
<netdata> [ ERROR ] prometheus[vmware_guest_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[nest_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9264/metrics": dial tcp 127.0.0.1:9264: connect: connection refused
<netdata> [ ERROR ] prometheus[nest_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[weather_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9265/metrics": dial tcp 127.0.0.1:9265: connect: connection refused
<netdata> [ ERROR ] prometheus[weather_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[openhab_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9266/metrics": dial tcp 127.0.0.1:9266: connect: connection refused
<netdata> [ ERROR ] prometheus[openhab_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[nagios_livestatus_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9267/metrics": dial tcp 127.0.0.1:9267: connect: connection refused
<netdata> [ ERROR ] prometheus[nagios_livestatus_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[cratedb_remote_remote_read_write_adapter_local] prometheus.go:89 Get "http://127.0.0.1:9268/metrics": dial tcp 127.0.0.1:9268: connect: connection refused
<netdata> [ ERROR ] prometheus[cratedb_remote_remote_read_write_adapter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[fluent-agent-lite_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9269/metrics": dial tcp 127.0.0.1:9269: connect: connection refused
<netdata> [ ERROR ] prometheus[fluent-agent-lite_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[jmeter_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9270/metrics": dial tcp 127.0.0.1:9270: connect: connection refused
<netdata> [ ERROR ] prometheus[jmeter_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[pagespeed_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9271/metrics": dial tcp 127.0.0.1:9271: connect: connection refused
<netdata> [ ERROR ] prometheus[pagespeed_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[vmware_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9272/metrics": dial tcp 127.0.0.1:9272: connect: connection refused
<netdata> [ ERROR ] prometheus[vmware_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[kubernetes_persistentvolume_disk_usage_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9274/metrics": dial tcp 127.0.0.1:9274: connect: connection refused
<netdata> [ ERROR ] prometheus[kubernetes_persistentvolume_disk_usage_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[nrpe_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9275/metrics": dial tcp 127.0.0.1:9275: connect: connection refused
<netdata> [ ERROR ] prometheus[nrpe_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[githubql_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9276/metrics": dial tcp 127.0.0.1:9276: connect: connection refused
<netdata> [ ERROR ] prometheus[githubql_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[azure_monitor_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9276/metrics": dial tcp 127.0.0.1:9276: connect: connection refused
<netdata> [ ERROR ] prometheus[azure_monitor_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[mongo_collection_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9277/metrics": dial tcp 127.0.0.1:9277: connect: connection refused
<netdata> [ ERROR ] prometheus[mongo_collection_exporter_local] job.go:191 check failed
<netdata> [ ERROR ] prometheus[crypto_miner_exporter_local] prometheus.go:89 Get "http://127.0.0.1:9278/metrics": dial tcp 127.0.0.1:9278: connect: connection refused
<netdata> [ ERROR ] prometheus[crypto_miner_exporter_local] job.go:191 check failed
I’ve attached the logs, in case you have any suggestions.
This log only contains messages from the new collectors log file, nothing from netdata itself (error.log).
@a-alyousef are you certain this is all of it?
@Austin_Hemmelgarn / @ilyam8 can you think of a reason this would be the container’s output?
We created an issue on GitHub with all the required info:
Opened 03:00PM - 09 Mar 23 UTC. Labels: bug, needs triage.
### Bug description
We have some Raspberry Pi 4 devices with arm64 processors that crash when using armv7 builds of Netdata. We need to use the armv7 build since we can't deploy mixed platforms to the fleet using balena (our IoT deployment solution), and the fleet contains both armv7 and arm64 devices.
The same version, but arm64 works fine.
I tried looking at the core files with gdb to get more insight, but since there are no debug symbols in the release, it was not very useful.
```
(gdb) bt full
#0 0x0078ff8e in ?? ()
No symbol table info available.
#1 0x0079215a in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
```
Then I found out that you also have a `netdata/netdata-debug` image that includes debug symbols, but with that version of the image the crash doesn't happen.
Any idea what else I can do to debug this?
### Expected behavior
Netdata runs without crashing.
### Steps to reproduce
1. Use the Netdata Docker image for arm32 on arm64 devices.
### Installation method
docker
### System info
```shell
# uname -a; grep -HvE "^#|URL" /etc/*release
Linux 9c5186f 5.10.95-v8 #1 SMP PREEMPT Thu Feb 17 11:43:01 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
/etc/os-release:ID="balena-os"
/etc/os-release:NAME="balenaOS"
/etc/os-release:VERSION="2023.1.0"
/etc/os-release:VERSION_ID="2023.1.0"
/etc/os-release:PRETTY_NAME="balenaOS 2023.1.0"
/etc/os-release:MACHINE="raspberrypi4-64"
/etc/os-release:META_BALENA_VERSION="2.107.40"
/etc/os-release:BALENA_BOARD_REV=""
/etc/os-release:META_BALENA_REV=""
/etc/os-release:SLUG="raspberrypi4-64"
```
### Netdata build info
```shell
# netdata -W buildinfo
Version: netdata v1.38.1
Configure options: '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--libexecdir=/usr/libexec' '--libdir=/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--without-bundled-protobuf' '--disable-dependency-tracking' '--disable-ebpf' 'CFLAGS=-Og -ggdb -pipe' 'LDFLAGS='
Install type: oci
Binary architecture: armv7l
Features:
dbengine: YES
Native HTTPS: YES
Netdata Cloud: YES
ACLK: YES
TLS Host Verification: YES
Machine Learning: YES
Stream Compression: YES
Libraries:
protobuf: YES (system)
jemalloc: NO
JSON-C: YES
libcap: NO
libcrypto: YES
libm: YES
tcalloc: NO
zlib: YES
Plugins:
apps: YES
cgroup Network Tracking: YES
CUPS: NO
EBPF: NO
IPMI: YES
NFACCT: NO
perf: YES
slabinfo: YES
Xen: NO
Xen VBD Error Tracking: NO
Exporters:
AWS Kinesis: NO
GCP PubSub: NO
MongoDB: YES
Prometheus Remote Write: YES
Debug/Developer Features:
Trace Allocations: NO
```
### Additional info
_No response_