Troubleshoot volumes

How does a Pod use a persistent volume?

Before a container can read from and write to a volume backed by network storage, several things need to happen:

  1. A controller creates a Pod (e.g. the StatefulSet controller in the kube-controller-manager). Initially, the Pod is in the Pending state.
  2. The kube-scheduler assigns the Pod to a Node.
  3. Kubernetes (the attach-detach controller) creates a VolumeAttachment to signal to the CSI-Controller that a volume needs to be made available to the Node.
  4. The external-attacher sidecar container of the CSI-Controller watches the VolumeAttachment and calls the ControllerPublishVolume RPC on the CSI-Controller.
  5. The CSI-Controller calls the Cloud-API to attach the volume to the Node.
  6. OpenStack attaches the disk to the server.
  7. Kubelet on the Node calls the NodePublishVolume RPC on the CSI-Node-Plugin.
  8. The CSI-Node-Plugin mounts the disk at the mountpoint where Kubelet manages the Pod's volumes and prepares the filesystem for use by the container (e.g. chown).
  9. Kubelet calls the container runtime through the CRI (e.g. containerd) to create the Pod's containers.
  10. The container runtime creates the container with a bind-mount to the desired mountpoint within the container's mount namespace.
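
To see how far this sequence has progressed for a given volume, it can help to look at the VolumeAttachment objects from steps 3 and 4. The following is a minimal sketch, not an official tool: the function name volume_attachments_for_pv is made up for illustration, and it assumes kubectl access to the cluster.

```shell
# List the VolumeAttachment objects that reference a given PV.
# Prints: <attachment name> <node> <attached true/false>
# Usage: volume_attachments_for_pv <pv name>
# (hypothetical helper name, not part of kubectl)
volume_attachments_for_pv() {
  kubectl get volumeattachment --no-headers \
    -o custom-columns='NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached' |
    awk -v pv="$1" '$2 == pv { print $1, $3, $4 }'
}
```

If no line is printed, Kubernetes has not yet requested an attachment (step 3). A line ending in false suggests that the attachment was requested but has not been completed yet (steps 4 to 6).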

Architecture

The Cinder CSI plugin has two components:

  1. The "Controller Plugin"
  2. The "Node Plugin"

The Node Plugin runs as a DaemonSet on all Nodes in the cluster and performs volume operations on the machine itself,
such as mounting.

The Controller Plugin is managed by SysEleven and does not run in the cluster. It performs volume operations that are independent of a machine,
like creating/deleting, attaching/detaching and resizing volumes.
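
Since the Node Plugin has to be present on every Node, one quick check is whether each Node has the CSI driver registered in its CSINode object. This is a sketch under assumptions: the default driver name cinder.csi.openstack.org is the upstream Cinder CSI name and may differ in your cluster, and nodes_missing_csi_driver is a hypothetical helper name.

```shell
# Print every Node on which the given CSI driver is not registered.
# Usage: nodes_missing_csi_driver [driver name]
# The default driver name is an assumption (upstream Cinder CSI default).
nodes_missing_csi_driver() {
  kubectl get csinode --no-headers \
    -o custom-columns='NODE:.metadata.name,DRIVERS:.spec.drivers[*].name' |
    awk -v driver="${1:-cinder.csi.openstack.org}" '
      { found = 0
        # custom-columns joins multiple driver names with commas
        n = split($2, names, ",")
        for (i = 1; i <= n; i++) if (names[i] == driver) found = 1
        if (!found) print $1 }'
}
```

An empty result means the driver is registered on all Nodes that have a CSINode object.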

Troubleshooting guide

Check events of a Pod and the PVC it uses

kubectl -n <namespace> describe pod <pod name>

This shows whether the steps that Kubelet takes to make the volume available succeed.

kubectl -n <namespace> describe pvc <pvc name>

This shows whether any resize operations are failing (e.g. because of insufficient quota).
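
When the describe output is long, the events of a single PVC can also be listed on their own. A minimal sketch, assuming the events have not expired yet; pvc_events is a hypothetical helper name.

```shell
# Show only the events recorded for one PVC, sorted by time.
# Usage: pvc_events <namespace> <pvc name>
# (hypothetical helper name, not part of kubectl)
pvc_events() {
  kubectl -n "$1" get events \
    --field-selector "involvedObject.kind=PersistentVolumeClaim,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}
```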

Check the OpenStack volume backing the PV

To get the PV:

PV=$(kubectl -n <namespace> get pvc <pvc name> -o jsonpath='{.spec.volumeName}')

To get the OpenStack volume ID:

VOLUME_ID=$(kubectl get pv "$PV" -o jsonpath='{.spec.csi.volumeHandle}')

To check the OpenStack volume (requires the OpenStack CLI):

openstack volume show "$VOLUME_ID"
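
The lookups above can be combined into one helper that also catches the case of a PVC that is not bound yet. This is a sketch; pvc_volume_id is a hypothetical name, and the PV is assumed to be provisioned by a CSI driver so that spec.csi.volumeHandle is set.

```shell
# Resolve a PVC to the ID of the OpenStack volume that backs it.
# Usage: pvc_volume_id <namespace> <pvc name>
# (hypothetical helper name; assumes a CSI-provisioned PV)
pvc_volume_id() {
  pv=$(kubectl -n "$1" get pvc "$2" -o jsonpath='{.spec.volumeName}') || return 1
  if [ -z "$pv" ]; then
    echo "PVC $2 is not bound to a PV yet" >&2
    return 1
  fi
  kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}'
}

# Example (requires the OpenStack CLI); shows only status and attachments:
# openstack volume show "$(pvc_volume_id <namespace> <pvc name>)" -c status -c attachments
```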

Possible issues

My Pod is stuck in the Init phase, and it says Kubelet is waiting for the volume to be attached

Step 6 above can only succeed if the volume is not currently attached to a different Node.

Check whether another Pod is using the same PVC.
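
One way to do this is to list all Pods in the namespace together with the PVCs they reference. A minimal sketch; pods_using_pvc is a hypothetical helper name.

```shell
# List the Pods in a namespace that reference a given PVC, with their Nodes.
# Prints: <pod name> <node name>
# Usage: pods_using_pvc <namespace> <pvc name>
# (hypothetical helper name, not part of kubectl)
pods_using_pvc() {
  kubectl -n "$1" get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\t"}{.spec.volumes[*].persistentVolumeClaim.claimName}{"\n"}{end}' |
    awk -F '\t' -v pvc="$2" '
      { # a Pod may reference several claims, separated by spaces
        n = split($3, claims, " ")
        for (i = 1; i <= n; i++) if (claims[i] == pvc) { print $1, $2; break } }'
}
```

If two Pods on different Nodes are listed, the second one has to wait until the first one is gone and the volume has been detached.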

You may only use a Deployment with a PVC if:

  1. The Deployment has only 1 replica
  2. It uses spec.strategy.type: Recreate

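A RollingUpdate won't work because Kubernetes does not delete the active Pod before the new one is Ready, and a new Pod on a different Node cannot become Ready while the volume is still attached to the old Pod's Node.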
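
For an existing Deployment, both conditions could be applied with a patch along the following lines. This is a sketch, not the only way to do it (editing the manifest works just as well); use_recreate_strategy is a hypothetical helper name.

```shell
# Switch an existing Deployment to one replica and the Recreate strategy.
# Usage: use_recreate_strategy <namespace> <deployment name>
# (hypothetical helper name, not part of kubectl)
use_recreate_strategy() {
  # rollingUpdate is set to null so that leftover rolling update
  # parameters do not conflict with the Recreate strategy type.
  kubectl -n "$1" patch deployment "$2" --type merge \
    -p '{"spec":{"replicas":1,"strategy":{"type":"Recreate","rollingUpdate":null}}}'
}
```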

MetaKube currently does not support RWX volumes.
This means there is currently no supported way to operate a Deployment that uses a PVC with high availability.
Using the Recreate strategy leads to downtime until the new Pod is created.
Your application will also not tolerate disruptions such as a failing Node.

To account for high availability, we recommend separating the storage layer from your application and scaling both horizontally.

This can usually be achieved with a distributed database or by using object storage.

I resized my PVC, but the container still doesn't have more space available

As with providing a volume to a newly created Pod, resizing a PVC goes through several steps before the Pod sees the desired change:

  1. The storage request of the PVC (spec.resources.requests.storage) is increased.
  2. The external-resizer sidecar of the CSI-Controller recognizes the change and calls the ControllerExpandVolume RPC on the CSI-Controller.
  3. The CSI-Controller calls the Cloud-API to resize the volume.
  4. The next time the volume is used, the CSI-Node-Plugin resizes the file system of the volume.

Step 3 will only succeed if the volume is currently not attached to a Node.
Kubernetes doesn't wait for the volume expansion to finish before creating a Pod that uses the volume.
Nor does it delete the Pod proactively if the requested resources of the PVC changed.
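
To see where a resize currently stands, it can help to compare the requested size of the PVC with the size it actually reports, along with its conditions. A minimal sketch; pvc_resize_status is a hypothetical helper name.

```shell
# Print requested size, actual size and conditions of a PVC.
# Usage: pvc_resize_status <namespace> <pvc name>
# (hypothetical helper name, not part of kubectl)
pvc_resize_status() {
  kubectl -n "$1" get pvc "$2" \
    -o jsonpath='{.spec.resources.requests.storage}{"\t"}{.status.capacity.storage}{"\t"}{.status.conditions[*].type}{"\n"}' |
    awk -F '\t' '{
      printf "requested=%s actual=%s conditions=%s\n", $1, $2, ($3 == "" ? "none" : $3)
    }'
}
```

A condition of Resizing indicates that the volume expansion itself is still pending or in progress. FileSystemResizePending indicates that the volume has been expanded and only the file system resize on the Node is outstanding.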

Sometimes a rollover of the StatefulSet gives the CSI-Controller enough time to resize the volume:

kubectl -n <namespace> rollout restart statefulset <statefulset name>

Kubernetes backs off with an exponentially increasing retry interval if the expansion fails.
This means Kubernetes will often recreate the Pod, schedule it and attach the volume before the CSI-Controller even gets a chance to resize the volume.

The most reliable way to ensure a volume gets resized is to delete the Pod repeatedly, e.g. with a busy bash loop:

while true; do kubectl -n <namespace> delete po <pod name>; done

Since Pods owned by a StatefulSet keep their names, there's no need to use a selector.
Meanwhile, check whether the capacity of the PV has changed and, if so, interrupt the loop.
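
The deletion and the capacity check can also be combined so that the loop ends on its own. The following is a sketch under assumptions: delete_until_resized is a hypothetical helper name, the optional pause between attempts is an addition to the busy loop described here, and the capacity must be given exactly as the PV reports it (e.g. 20Gi), otherwise the loop never ends.

```shell
# Delete a Pod repeatedly until the PV reports the wanted capacity.
# Usage: delete_until_resized <namespace> <pod name> <pv name> <capacity> [pause seconds]
# (hypothetical helper name; the pause defaults to 1 second)
delete_until_resized() {
  # Compare the PV capacity as a plain string, e.g. "20Gi".
  while [ "$(kubectl get pv "$3" -o jsonpath='{.spec.capacity.storage}')" != "$4" ]; do
    kubectl -n "$1" delete pod "$2"
    sleep "${5:-1}"
  done
  echo "PV $3 now has capacity $4"
}
```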

Other causes

You might have exhausted the storage quota of your SysEleven Stack project.
In this case, or if the issue persists, please contact SysEleven product support.
