Read-only volumes in migrated clusters after upgrade

Update:
This issue is fixed in Kubernetes versions 1.19.12 and 1.20.8.
These versions are safe to upgrade to.

Because of a bug in upstream Kubernetes, clusters whose PVs were migrated from the in-tree volume plugin to the external CSI plugin, and that run one of the affected Kubernetes patch versions listed below, have their volumes remounted read-only after cluster changes.
If your cluster was migrated to the external cloud controller, this will likely affect you.

Background

A fix to the attach/detach controller in Kubernetes introduced a serious bug that affects migrated Persistent Volumes.

Every time the Kubernetes controller manager restarts, it compares the node.status.volumesAttached field with VolumeAttachments, as differences could indicate unfinished attach/detach operations.
For migrated volumes, it wrongly concludes that they are not actually attached, so it marks the attachment state as Uncertain and the volume gets detached from the node.
When the CSI controller attaches the volume again, the kernel remounts the device read-only, since the original device has disappeared.
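
To check a node for this state, you can look for read-only entries in /proc/mounts. The following is a minimal sketch, not an official tool: the function name read_only_mounts is hypothetical, and the default prefix assumes the kubelet uses its default root directory /var/lib/kubelet. Volumes that a pod deliberately mounts with readOnly will also show up, so treat the result as a hint only.

```python
from typing import List, NamedTuple


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]


def read_only_mounts(mounts_text: str, prefix: str = "/var/lib/kubelet") -> List[Mount]:
    """Return mounts below `prefix` whose mount options contain `ro`.

    `mounts_text` is the content of /proc/mounts, where every line has the
    fields: device, mount point, filesystem type, options, dump, pass.
    `prefix` is an assumption (the default kubelet root directory).
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        opts = options.split(",")
        if mount_point.startswith(prefix) and "ro" in opts:
            result.append(Mount(device, mount_point, fs_type, opts))
    return result
```

On a node, pass the file content to the function, for example read_only_mounts(open("/proc/mounts").read()), and review every mount it returns.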

On affected clusters, this happens every time the controller manager restarts.
A cluster upgrade to any affected version triggers this instantly, but it can also happen spontaneously.

Symptoms

The problem usually shows up in two ways:

  1. Errors in applications

    This usually surfaces as stateful pods, such as DBs, going into the CrashLoopBackOff state.
    This may also cascade to other applications, e.g. ones writing to the DB over the network.
    Not all applications cause the container to exit, though. Some tolerate it silently, even if all I/O operations fail.

  2. The FilesystemReadOnly condition on nodes becomes true
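
Both symptoms can be looked up from kubectl JSON output. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the condition type FilesystemReadOnly is taken from the text above, and the input is expected to be the output of "kubectl get nodes -o json" and "kubectl get pods --all-namespaces -o json".

```python
import json
from typing import List


def nodes_with_condition(nodes_json: str, condition_type: str = "FilesystemReadOnly") -> List[str]:
    """Return names of nodes where the given condition has status "True"."""
    affected = []
    for node in json.loads(nodes_json).get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == condition_type and c.get("status") == "True"
               for c in conditions):
            affected.append(node["metadata"]["name"])
    return affected


def crash_looping_pods(pods_json: str) -> List[str]:
    """Return "namespace/name" of pods with a container in CrashLoopBackOff."""
    result = []
    for pod in json.loads(pods_json).get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                meta = pod["metadata"]
                result.append(f'{meta["namespace"]}/{meta["name"]}')
                break  # one crashing container is enough to list the pod
    return result
```

A pod can be in CrashLoopBackOff for unrelated reasons, so check the result against the pods that actually mount migrated volumes.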

Affected K8s versions

The following Kubernetes versions are affected:

  • 1.18.16 <= x < 1.19.0
  • 1.19.8 <= x < 1.19.12
  • 1.20.3 <= x < 1.20.8

Note:

The bug is in the controller manager, so only the control plane version matters. The Kubernetes version installed on the nodes does not affect this issue.
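
The version ranges above can be turned into a small check. This is a minimal sketch: the names AFFECTED_RANGES, parse_version and is_affected are hypothetical, and the parser assumes a plain "major.minor.patch" version with an optional leading "v" and an optional "-" or "+" suffix.

```python
from typing import Tuple

# Half-open ranges [first affected, first safe), copied from the list above.
AFFECTED_RANGES = [
    ((1, 18, 16), (1, 19, 0)),
    ((1, 19, 8), (1, 19, 12)),
    ((1, 20, 3), (1, 20, 8)),
]


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse "v1.19.9" or "1.19.9" into (1, 19, 9), dropping any suffix."""
    core = version.lstrip("v").split("-")[0].split("+")[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_affected(version: str) -> bool:
    """Return True if the control plane version falls into an affected range."""
    v = parse_version(version)
    return any(low <= v < high for low, high in AFFECTED_RANGES)
```

For example, is_affected("1.19.9") returns True, while is_affected("1.19.7") and is_affected("1.19.12") both return False.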


Mitigation

If your cluster is running a version that's affected, you will have to switch to a safe version.
There are two ways to do this:

  • Upgrade to a safe version: 1.19.7 or 1.20.2 (not affected), or 1.19.12 or 1.20.8 (fixed)
  • If you don't want to upgrade, contact us; we can downgrade your cluster to a safe patch version for you

If volumes are already mounted read-only, the affected pods need to be recreated, preferably on a different node or with a short downtime in between.
This causes Kubernetes to remove and recreate the VolumeAttachment, and the CSI controller to detach and attach the volume correctly.
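
To find candidates for recreation, you can list the pods that mount a PersistentVolumeClaim. The following is a minimal sketch: the function name pods_using_pvcs is hypothetical, and the input is assumed to be the output of "kubectl get pods --all-namespaces -o json". It does not tell migrated volumes apart from others, so it only narrows down the list.

```python
import json
from typing import Dict, List


def pods_using_pvcs(pods_json: str) -> Dict[str, List[str]]:
    """Map "namespace/pod" to the names of the PVCs that the pod mounts."""
    result = {}
    for pod in json.loads(pods_json).get("items", []):
        claims = [
            volume["persistentVolumeClaim"]["claimName"]
            for volume in pod.get("spec", {}).get("volumes", [])
            if "persistentVolumeClaim" in volume
        ]
        if claims:
            meta = pod["metadata"]
            result[f'{meta["namespace"]}/{meta["name"]}'] = claims
    return result
```

Pods managed by a controller such as a StatefulSet or Deployment are recreated automatically after "kubectl delete pod"; standalone pods have to be created again by hand.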

Complete fix

Update:
This issue is fixed in Kubernetes versions 1.19.12 and 1.20.8.

References