As I understand it, a vm.swappiness value of 0
doesn’t mean swapping is disabled; it means “avoid swapping as much as possible”. The Linux kernel docs say:
This control is used to define how aggressive the kernel will swap
memory pages. Higher values will increase aggressiveness, lower values
decrease the amount of swap. A value of 0 instructs the kernel not to
initiate swap until the amount of free and file-backed pages is less
than the high water mark in a zone.
A kernel developer further explains that a value of 0 for vm.swappiness
is not recommended; use 1 instead if you want to avoid swapping as much as possible. That is a long, insightful article on RAM and swap that I haven’t finished reading yet, as I wanted to finish commenting here first.
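For anyone who wants to check what their own machine is set to, here is a minimal Python sketch. It assumes a Linux system exposing the setting at /proc/sys/vm/swappiness; the helper names `parse_swappiness` and `describe` are my own, and the descriptions only paraphrase the kernel docs and the article mentioned above.

```python
from pathlib import Path

# Standard procfs location of the sysctl on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text: str) -> int:
    """Convert the raw file content (e.g. '20\n') to an integer."""
    return int(text.strip())


def describe(value: int) -> str:
    """Rough paraphrase of what the value means (illustrative wording only)."""
    if value == 0:
        return ("swap is not initiated until free and file-backed pages "
                "fall below the high water mark in a zone")
    if value == 1:
        return "minimal swapping, the value suggested instead of 0"
    return "higher values make the kernel swap more aggressively"


if __name__ == "__main__":
    if SWAPPINESS_PATH.exists():
        current = parse_swappiness(SWAPPINESS_PATH.read_text())
        print(f"vm.swappiness = {current}: {describe(current)}")
    else:
        print("No /proc/sys/vm/swappiness here (not Linux?)")
```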
In the particular case that triggered the alarm, the machine has vm.swappiness
set to 20 and is running an rsync cron job. This reminds me of the proposal I made last summer for a veto configuration option. For example, I could edit the alarm and instruct it that, even if its conditions are met, it should not be raised if an rsync process is found running and the time is between 7:30am and 8:00am, when the script is expected to run. Has any progress been made internally, or was the idea discussed further? I see there is some progress with the ML engine, but it does not seem able to handle this case.
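To make the veto idea concrete, here is a rough Python sketch of the rule I have in mind. It is not an existing netdata feature or configuration syntax; `should_veto` and `running_process_names` are hypothetical names, and the process list is read from /proc/<pid>/comm, so it is Linux-only.

```python
from datetime import datetime, time
from pathlib import Path


def running_process_names(proc_root: str = "/proc") -> set:
    """Collect the command names of running processes from /proc/<pid>/comm."""
    names = set()
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            names.add(comm.read_text().strip())
        except OSError:
            # The process exited between listing and reading; ignore it.
            continue
    return names


def should_veto(now: time, running: set, process: str = "rsync",
                start: time = time(7, 30), end: time = time(8, 0)) -> bool:
    """Veto the alarm only if the process is running inside the time window."""
    return process in running and start <= now <= end


if __name__ == "__main__":
    veto = should_veto(datetime.now().time(), running_process_names())
    print("alarm suppressed" if veto else "alarm allowed")
```

The alarm would still evaluate its normal conditions; the veto would only be consulted once they are met, so a memory problem outside the rsync window is still reported.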
Also, recent Linux kernels implement memory pressure stall information (PSI), and I see netdata is already monitoring it. Perhaps the available/used memory alarm should also check memory PSI and not be raised unless the PSI metric confirms a real shortage of memory?
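As a sketch of how such a combined check could look, the Python below parses the kernel's /proc/pressure/memory output (the "some"/"full" lines with avg10, avg60, avg300 and total fields) and only confirms the alarm when pressure is also elevated. The function names and the 10% threshold on the "some avg60" value are assumptions I picked for illustration, not netdata's logic or a recommended value.

```python
from pathlib import Path

PSI_MEMORY_PATH = Path("/proc/pressure/memory")  # needs a kernel with PSI enabled


def parse_psi(text: str) -> dict:
    """Parse PSI output into {'some': {'avg10': ...}, 'full': {...}}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        result[kind] = {key: float(value)
                        for key, value in (f.split("=", 1) for f in fields)}
    return result


def alarm_confirmed(low_memory: bool, psi: dict,
                    some_avg60_threshold: float = 10.0) -> bool:
    """Raise only if memory looks low AND tasks are actually stalling on it.

    The threshold is an arbitrary illustrative value (percent of time stalled).
    """
    pressure = psi.get("some", {}).get("avg60", 0.0)
    return low_memory and pressure >= some_avg60_threshold


if __name__ == "__main__":
    if PSI_MEMORY_PATH.exists():
        psi = parse_psi(PSI_MEMORY_PATH.read_text())
        # low_memory=True stands in for the existing available/used memory check.
        print("raise alarm:", alarm_confirmed(True, psi))
    else:
        print("PSI is not available on this kernel")
```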