Cilium
Versions Supported
- 1.15.x
- 1.14.x
Prerequisite
If you are using Bring Your Own Operating System (BYOOS), provision a Hardware Enablement (HWE) kernel or another kernel that supports eBPF modules.
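If you are unsure whether a custom image satisfies this requirement, the following sketch shows one way to inspect the kernel build configuration for eBPF support. It assumes the configuration file is exposed at /boot/config-<kernel-version>, which is typical for Ubuntu-based images.

  # Check whether the running kernel was built with eBPF support.
  # Assumes the kernel config is available at /boot/config-<kernel-version> (typical on Ubuntu).
  grep -E 'CONFIG_BPF=|CONFIG_BPF_SYSCALL=|CONFIG_BPF_JIT=' /boot/config-"$(uname -r)"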
Configure Cilium for Agent Mode Edge Clusters
For agent mode Edge clusters, add the following stages to the OS pack YAML file to enable Cilium.
stages:
  boot.before:
    - name: "Ensure CNI directory permissions on restart"
      if: "[ -d /opt/cni/bin ]"
      commands:
        - chown --recursive root:root /opt/cni/bin
  boot:
    - name: "Ensure CNI directory permissions on restart"
      if: "[ -d /opt/cni/bin ]"
      commands:
        - chown --recursive root:root /opt/cni/bin
  boot.after:
    - name: "Ensure CNI directory permissions on restart"
      if: "[ -d /opt/cni/bin ]"
      commands:
        - chown --recursive root:root /opt/cni/bin
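After the Edge host reboots, you can confirm that these stages took effect. This is a minimal check, assuming the default CNI binary directory /opt/cni/bin.

  # Verify that the CNI plugin binaries are owned by root:root after a reboot.
  stat --format '%U:%G %n' /opt/cni/bin/*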
Disable UDP Segmentation on VMXNET3 Edge Hosts for Clusters with Overlay Networks
Due to a known issue with VMware's VMXNET3 adapter, which is widely used in VMware-based environments such as vSphere, Cilium pods may experience network connectivity issues.
If you deploy an Edge host in a virtual machine environment that uses a VMXNET3 adapter and enable an overlay network for your cluster, add the following stage to the user-data file. Replace <interface-name> with the name of the network interface on your Edge host.
stages:
  initramfs:
    - name: "Disable UDP segmentation"
      commands:
        - ethtool --offload <interface-name> tx-udp_tnl-segmentation off
        - ethtool --offload <interface-name> tx-udp_tnl-csum-segmentation off
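After the host boots, you can verify that the offload settings were applied. The following is a sketch; replace <interface-name> with the same interface name used above.

  # Both UDP tunnel segmentation offloads should be reported as "off".
  ethtool --show-offload <interface-name> | grep tx-udp_tnl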
Troubleshooting
Scenario - I/O Timeout Error on VMware
If you are deploying a cluster to a VMware environment using the VXLAN tunnel protocol, you may encounter I/O timeout errors. This is due to a known bug in the VMXNET3 adapter that results in VXLAN traffic being dropped. You can learn more about this issue in Cilium's GitHub issue #21801.
You can work around the issue by using one of the two following methods:

- Option 1: Set a different tunnel protocol in the Cilium configuration. You can set the tunnel protocol to geneve.

  charts:
    cilium:
      tunnelProtocol: "geneve"

- Option 2: Modify the Operating System (OS) layer of your cluster profile to automatically disable UDP Segmentation Offloading (USO).

  kubeadmconfig:
    preKubeadmCommands:
      # Disable hardware segmentation offloading due to VMXNET3 issue
      - |
        install -m 0755 /dev/null /usr/lib/networkd-dispatcher/routable.d/10-disable-offloading
        cat <<EOF > /usr/lib/networkd-dispatcher/routable.d/10-disable-offloading
        #!/bin/sh
        ethtool -K eth0 tx-udp_tnl-segmentation off
        ethtool -K eth0 tx-udp_tnl-csum-segmentation off
        ethtool --offload eth0 rx off tx off
        EOF
        systemctl restart systemd-networkd
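After applying either option and redeploying the affected nodes, you can verify the result. This is a sketch under a few assumptions: eth0 is the affected interface, and you have kubectl access to the cluster.

  # Confirm that UDP tunnel segmentation offloads are off on the node (relevant to Option 2).
  ethtool --show-offload eth0 | grep tx-udp_tnl

  # Check the tunnel protocol Cilium is using (relevant to Option 1).
  # Key names vary across Cilium releases ("tunnel-protocol" in recent releases, "tunnel" in older ones).
  kubectl --namespace kube-system get configmap cilium-config --output yaml | grep tunnel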
Terraform
You can reference the Cilium pack in Terraform with the following data resource.
data "spectrocloud_registry" "public_registry" {
name = "Public Repo"
}
data "spectrocloud_pack" "cilium" {
name = "cni-cilium-oss"
version = "1.15.3"
registry_uid = data.spectrocloud_registry.public_registry.id
}