Hi,
Maybe it is a bit late, but this may be informative for anyone else trying to solve this problem (it was for me).
I encountered the “node(s) had volume node affinity conflict” problem as well, although I was deploying on Azure rather than AWS.
The Netdata helm chart (chart version 3.7.30) creates three persistent volumes (PVs), but these PVs ended up in different availability zones within my cluster. For the “netdata-k8s-state” pod this is fine, since it uses only one PV and Kubernetes makes sure the pod is scheduled in the same zone as that PV.
The problem arises with the “netdata-parent” pod, since it uses two PVs (one for the database and one for the alarms). These PVs were created in different zones, so the pod cannot be scheduled anywhere that it can reach both of them. On Azure this happens when the PVs are backed by LRS disks:
“Volumes that use Azure managed LRS disks are not zone-redundant resources, those volumes cannot be attached across zones and must be co-located in the same zone as a given node hosting the target pod.”
(source: https://docs.microsoft.com/en-us/azure/aks/availability-zones)
If you are able to use ZRS disks, I guess the problem goes away, since those volumes can be attached to nodes in any zone (and to non-zone agent nodes). I wasn’t able to use ZRS disks, though.
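If you want to confirm this is what is happening in your cluster, you can look at the zone each PV was provisioned in. A minimal check, assuming the Azure disk provisioner put the usual zone label on your PVs (the PV names will of course differ):

# Show each PV together with the zone label set by the provisioner
kubectl get pv -L failure-domain.beta.kubernetes.io/zone

# Or inspect the node affinity of a single PV in detail
kubectl describe pv <pv-name>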
Solution
I created a new StorageClass by copying my Azure default StorageClass and adding the allowedTopologies
property (more information over here). This makes sure that any PV created with this StorageClass ends up in the zone you specify.
My new StorageClass looks like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: 'false'
  labels:
    kubernetes.io/cluster-service: 'true'
  name: default-single-zone
parameters:
  cachingmode: ReadOnly
  kind: Managed
  storageaccounttype: StandardSSD_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
allowedTopologies:
  - matchLabelExpressions:
      - key: failure-domain.beta.kubernetes.io/zone
        values:
          - westeurope-1
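Save this manifest to a file and apply it to the cluster; the filename below is just an example:

kubectl apply -f default-single-zone-storageclass.yaml
kubectl get storageclass default-single-zone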
When you deploy the Netdata helm chart, you can override the helm values so that the database and alarms volumes use the StorageClass you just created.
database:
  persistence: true
  ## Make sure to set the storageclass to your newly created storage class
  storageclass: 'default-single-zone'
  volumesize: 2Gi
alarms:
  persistence: true
  ## Make sure to set the storageclass to your newly created storage class
  storageclass: 'default-single-zone'
  volumesize: 1Gi
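With those values saved to a file, redeploying is a normal helm install/upgrade. The release name, repository alias and values filename below are assumptions, so adjust them to match your setup:

# Upgrade (or install) the Netdata release with the overridden values
helm upgrade --install netdata netdata/netdata -f values.yaml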
When the Netdata chart was redeployed, my database and alarms PVs were both created in the “westeurope-1” zone, so the “netdata-parent” pod could be scheduled in that same zone.
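As a final sanity check, the same PV listing as before should now show both volumes in the zone you specified:

kubectl get pv -L failure-domain.beta.kubernetes.io/zone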