This guide extends the Known Issues documentation for SysEleven OpenStack. The issue in MetaKube is closely related, but needs to be handled slightly differently.
Affected regions: CBK and DBL (region FES is not affected)
MetaKube worker nodes run as virtual machines on SysEleven OpenStack.
When a virtual machine establishes a TCP connection to a remote server, it uses a random TCP source port.
For return traffic to be allowed into a VM in OpenStack, the SDN (Software Defined Network) automatically creates a dynamic inbound security group rule that allows traffic to flow back to this random TCP port.
This dynamic rule will expire if the connection is idle for 60 seconds.
If the connection stays quiet for longer than that, any return traffic from the remote server will be dropped.
Follow these steps to avoid this issue:
Select one of the following OpenStack solutions that applies to your setup:
Either add a rule to your existing MetaKube security group that explicitly allows returning traffic.
Now the SDN has no need to create dynamic rules anymore. The Linux kernel option net.ipv4.ip_local_port_range configures the range from which the random source port will be picked when a virtual machine initiates a connection.
For example, setting this value to 32768 - 60999 and allowing all traffic incoming from the server to the client port range 32768 - 60999 will solve the issue.
To set net.ipv4.ip_local_port_range for your worker nodes, you need SSH access to the VM or a connection through a node-shell, and issue:
Or, if both virtual machines (the client application and the server) run in the same OpenStack project and region, add a security group to the server which allows ingress traffic to high ports such as 32768 - 60999 from another security group that the server's ports are members of.
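Whichever option is chosen, the security group rule only helps if the source ports the kernel actually picks lie inside the opened range. The following is a minimal sketch of such a check, meant to be run on a worker node. It assumes the example range 32768 - 60999 from above; the names `ALLOWED`, `parse_port_range` and `is_covered` are illustrative and not part of any MetaKube or OpenStack tooling.

```python
from pathlib import Path

# Assumption: the port range opened in the security group (example values from this guide).
ALLOWED = (32768, 60999)

# The kernel exposes net.ipv4.ip_local_port_range here as "low<whitespace>high".
PROC_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text):
    """Parse the content of ip_local_port_range into a (low, high) tuple."""
    low, high = (int(part) for part in text.split())
    return low, high


def is_covered(port_range, allowed=ALLOWED):
    """Return True if every possible source port lies inside the allowed range."""
    low, high = port_range
    return allowed[0] <= low and high <= allowed[1]


if __name__ == "__main__":
    if PROC_FILE.exists():
        current = parse_port_range(PROC_FILE.read_text())
        status = "covered" if is_covered(current) else "NOT covered"
        print(f"ip_local_port_range {current}: {status} by {ALLOWED}")
```

If the script reports that the range is not covered, either the kernel setting or the security group rule needs to be adjusted so that both match.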
Your application must support enabling TCP keepalives. Turn them on with a timeout value shorter than the 60-second idle expiry of the dynamic rule.
The following (or similar) sysctl parameters must be set in the network namespace of the container/Pod (not of the host):
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_time = 10
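To see why values like these fit under the 60-second expiry of the dynamic rule, the timing can be worked out in a few lines. This is a rough sketch based on the usual Linux keepalive behaviour (first probe after `tcp_keepalive_time` seconds of idleness, further probes every `tcp_keepalive_intvl` seconds, connection given up after `tcp_keepalive_probes` unanswered probes). The function name and the `SDN_IDLE_TIMEOUT` constant are illustrative.

```python
# Idle expiry of the dynamic security group rule, as described in this guide.
SDN_IDLE_TIMEOUT = 60  # seconds


def keepalive_timing(time_s, intvl_s, probes):
    """Return (longest gap between outgoing packets, approx. time until a dead peer is detected)."""
    # On an idle connection a probe goes out after time_s; if it goes
    # unanswered, the next one follows after intvl_s. The larger of the two
    # is a conservative bound for the idle gap.
    max_idle_gap = max(time_s, intvl_s)
    # A peer that never answers is given up on after all probes have gone unanswered.
    dead_after = time_s + intvl_s * probes
    return max_idle_gap, dead_after


gap, dead_after = keepalive_timing(time_s=10, intvl_s=10, probes=5)
print(f"longest idle gap: {gap}s, dead peer detected after about {dead_after}s")
print("rule stays alive" if gap < SDN_IDLE_TIMEOUT else "rule may expire")
```

What matters for the dynamic rule is the idle gap: as long as it stays well below the expiry time, the rule is refreshed before it can expire.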
Kubelet does not consider these settings "safe" and will not allow containers to run if they are set in the podSecurityContext section of your container.
Currently, MetaKube does not offer a way to change this.
A valid alternative is to set them in an initContainer in the same Pod as your application.
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: myapp-client
  name: myapp-client
spec:
  containers:
  - name: myapp-client
    image: busybox
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - command: [sh, -c]
    args:
    - |
      echo "10" > /proc/sys/net/ipv4/tcp_keepalive_intvl
      echo "5" > /proc/sys/net/ipv4/tcp_keepalive_probes
      echo "10" > /proc/sys/net/ipv4/tcp_keepalive_time
    image: alpine:3.16
    name: sysctl
    securityContext:
      runAsUser: 0
      runAsGroup: 0
      privileged: true
Verify the settings with:
kubectl exec -it myapp-client -- sysctl -a | grep keepalive
The keepalive parameters are now configured, but keepalives are not sent automatically on every TCP connection: the application must still request kernel keepalives when it opens the TCP socket. To take effect, the MetaKube solution steps must be used in combination.
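How an application requests keepalives depends on its language and libraries; many clients expose this as a configuration option. As a minimal sketch, assuming a Python application that opens its own TCP socket, the request could look like the following. The helper name `enable_keepalive` and the commented-out host name are hypothetical. The per-socket timing options are Linux-specific and optional: where they are not set, the kernel uses the sysctl values of the Pod's network namespace.

```python
import socket


def enable_keepalive(sock, idle=10, interval=10, probes=5):
    """Ask the kernel to send TCP keepalives on this socket."""
    # Without SO_KEEPALIVE the kernel never sends keepalive probes,
    # regardless of the sysctl settings.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket overrides of the sysctl defaults (available on Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
# sock.connect(("server.example.com", 5432))  # hypothetical remote server
```

The default values mirror the sysctl parameters used in this guide, so an idle connection sends a probe well before the 60-second expiry of the dynamic rule.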