# Packs
The following are common scenarios that you may encounter when using Packs.
## Scenario - Pods with NamespaceLabels are Stuck on Deployment
When deploying a workload cluster with packs that declare `namespaceLabels`, the associated Pods never start if the cluster was deployed via self-hosted Palette or Palette VerteX, or if the `palette-agent` ConfigMap has `data.feature.workloads: disable`. This is because the necessary labels are not applied to the target namespace, so the namespace lacks the elevated privileges the Pods require, and the Kubernetes PodSecurity admission controller blocks the Pods.

To resolve this issue, force-apply the `privileged` PodSecurity labels directly to the namespace of the affected Pods.
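To check whether the workloads feature is disabled on the agent, you can search the cluster's ConfigMaps for the `feature.workloads` key. The snippet below is a sketch; it assumes `kubectl` access to the cluster, and the exact ConfigMap name and namespace may vary between installations.

```shell
# Search all ConfigMaps for the feature.workloads key and show nearby context.
kubectl get configmaps --all-namespaces --output yaml | grep --before-context=2 "feature.workloads"
```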
### Debug Steps
1. Log in to Palette.

2. From the left **Main Menu**, select **Clusters**. Choose the affected cluster.

3. On the cluster **Overview** tab, click the **Kubeconfig file** link to download the cluster's kubeconfig file.

4. Open a terminal session and set the `KUBECONFIG` environment variable to the path of the kubeconfig file.

   ```shell
   export KUBECONFIG=<path-to-kubeconfig-file>
   ```

5. Use `kubectl` to identify any Pods in the cluster that are not running. Note the namespace of the Pods associated with the pack that uses `namespaceLabels`.

   ```shell
   kubectl get pods --all-namespaces --field-selector status.phase!=Running
   ```

   Example output:

   ```shell
   NAME                                         READY   STATUS                       RESTARTS   AGE
   lb-metallb-helm-metallb-full-speaker-abcde   0/1     Pending                      0          3m
   lb-metallb-helm-metallb-full-speaker-fghij   0/1     CreateContainerConfigError   0          3m
   ```
6. Confirm the namespace is missing the `privileged` labels. Replace `<namespace>` with the namespace of the affected Pods.

   ```shell
   kubectl get namespace <namespace> --show-labels
   ```

   Example output:

   ```shell
   NAME             STATUS   AGE   LABELS
   metallb-system   Active   10m   kubernetes.io/metadata.name=metallb-system
   ```
7. Force-apply the `privileged` labels to the namespace.

   ```shell
   kubectl label namespace <namespace> \
     pod-security.kubernetes.io/enforce=privileged \
     pod-security.kubernetes.io/audit=privileged \
     pod-security.kubernetes.io/warn=privileged \
     --overwrite
   ```
8. Verify the labels are now present.

   ```shell
   kubectl get namespace <namespace> --show-labels
   ```

   Example output:

   ```shell
   NAME             STATUS   AGE   LABELS
   metallb-system   Active   12m   kubernetes.io/metadata.name=metallb-system,pod-security.kubernetes.io/enforce=privileged,pod-security.kubernetes.io/audit=privileged,pod-security.kubernetes.io/warn=privileged
   ```
9. Delete the stuck Pods so that their replacements pick up the new labels.

   ```shell
   kubectl delete pods --namespace <namespace> --all
   ```

10. Wait for the Pods to be redeployed and reach the `Running` state.

    ```shell
    kubectl get pods --namespace <namespace>
    ```

    Example output:

    ```shell
    NAME                                         READY   STATUS    RESTARTS   AGE
    lb-metallb-helm-metallb-full-speaker-abcde   1/1     Running   0          30s
    lb-metallb-helm-metallb-full-speaker-fghij   1/1     Running   0          30s
    ```
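If the stuck Pods are managed by a controller such as a DaemonSet, as in the MetalLB example above, restarting the owning workload is an alternative to deleting the Pods directly. The DaemonSet name below is a placeholder.

```shell
# Trigger a rolling restart so replacement Pods are created under the
# updated namespace labels. Replace <daemonset-name> accordingly.
kubectl rollout restart daemonset <daemonset-name> --namespace <namespace>
```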
## Scenario - Calico Fails to Start when IPv6 is Enabled
When deploying clusters with the Calico pack and IPv6 enabled, Calico fails to start on hosts running specific Linux kernel versions due to missing or incompatible kernel modules required for ip6tables MARK support. You can observe the following error in the Pod logs.

```shell
Failed to execute ip(6)tables-restore command error=exit status 2 errorOutput=... MARK: bad value for option \"--set-mark\", or out of range (0-4294967295)...
```
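To confirm whether your host is affected, you can try creating a MARK rule by hand. The following is a minimal sketch, assuming root access on the node; on an affected kernel, the second command fails with the same `--set-mark` error shown above.

```shell
# Create a scratch chain in the mangle table and try to add a MARK rule.
sudo ip6tables --table mangle --new-chain MARK-TEST
sudo ip6tables --table mangle --append MARK-TEST --jump MARK --set-mark 0x1 \
  && echo "ip6tables MARK support OK"

# Clean up the scratch chain.
sudo ip6tables --table mangle --flush MARK-TEST
sudo ip6tables --table mangle --delete-chain MARK-TEST
```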
There are several possible ways to troubleshoot this issue:

- Use a Container Network Interface (CNI) plugin other than Calico. This is the preferred approach if Calico is optional for you.
- Use an unaffected or fixed kernel version. This is the preferred approach if you need both Calico and IPv6.
- Disable IPv6 at the Calico pack level. This is the preferred approach if you need Calico but not IPv6.
- Disable IPv6 at the BYOS Edge OS pack level. We do not recommend using this approach on its own, as it may not fully resolve the issue. For completeness, pair it with disabling IPv6 at the Calico pack level.
- Disable IPv6 in user data for Edge deployments. We do not recommend using this approach on its own, as it may not fully resolve the issue. For completeness, pair it with disabling IPv6 at the Calico pack level.
### Debug Steps - Use an Unaffected or Fixed Kernel Version
1. Check your current kernel version using the command below.

   ```shell
   uname --kernel-release
   ```

2. Compare your version with the affected versions in the table below.

   | Branch         | Affected Versions      | Fixed Version |
   | -------------- | ---------------------- | ------------- |
   | 5.15.0 generic | 5.15.0-127, 5.15.0-128 | 5.15.0-130    |
   | 6.8.0 generic  | 6.8.0-57, 6.8.0-58     | 6.8.0-60      |
   | 6.8.0 cloud    | 6.8.0-1022             | 6.8.0-1027    |
3. If your current kernel version matches any affected version, update it to a fixed or unaffected version. The method for updating depends on your deployment environment.

   :::warning

   When updating the kernel version for Edge deployments, ensure that the `UPDATE_KERNEL` parameter value in the `.arg` file is `false`. This prevents Kairos from updating the kernel during runtime upgrades.

   :::
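For reference, the corresponding line in the `.arg` file should read as follows.

```shell
UPDATE_KERNEL=false
```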
#### Example - Pin Kernel Version in Kairos Base Image (Dockerfile.ubuntu)

Use this approach if you are building a Kairos image from `Dockerfile.ubuntu` and want to pin the kernel version.
1. Clone the Kairos GitHub repository and check out the required version.

   ```shell
   git clone https://github.com/kairos-io/kairos.git
   cd kairos
   git checkout v3.1.3
   ```
2. Customize the `images/Dockerfile.ubuntu` file. Remove the following lines.

   ```dockerfile
   RUN [ -z "$(ls -A /boot/vmlinuz*)" ] && apt-get install -y --no-install-recommends \
       linux-image-generic-hwe-24.04 || true
   RUN apt-get clean && rm -rf /var/lib/apt/lists/*
   ```

   Paste the following lines instead. In this example, the kernel version is set to `6.8.0-60-generic`. Replace it with the required version.

   ```dockerfile
   RUN [ -z "$(ls -A /boot/vmlinuz*)" ] && apt-get install --yes --no-install-recommends \
       linux-image-6.8.0-60-generic linux-modules-extra-6.8.0-60-generic || true
   RUN apt-get clean && rm -rf /var/lib/apt/lists/*
   ```
3. Issue the following command to generate a custom Kairos base image. For Trusted Boot (Unified Kernel Image) builds, replace `--BOOTLOADER=grub` with `--BOOTLOADER=systemd-boot`.

   ```shell
   ./earthly.sh +base-image \
     --FLAVOR=ubuntu \
     --FLAVOR_RELEASE=24.04 \
     --FAMILY=ubuntu \
     --MODEL=generic \
     --VARIANT=core \
     --BASE_IMAGE=ubuntu:24.04 \
     --BOOTLOADER=grub
   ```
4. Once the build is complete, tag the image for your registry and version.

   ```shell
   docker tag <local-image> <your-registry>/<your-kairos-image>:<your-version>
   ```

   Example:

   ```shell
   docker tag kairos/ubuntu-core-base:latest my-registry.io/kairos/kairos-base:6.8.0-60
   ```
5. Push the image to your registry.

   Example:

   ```shell
   docker push my-registry.io/kairos/kairos-base:6.8.0-60
   ```
6. Set the `BASE_IMAGE` value in the `.arg` file in the `CanvOS` directory to the image name.

   Example:

   ```shell
   BASE_IMAGE=my-registry.io/kairos/kairos-base:6.8.0-60
   ```
7. Build the custom provider image and use it for cluster deployment.

   :::info

   For more information on how to build provider images and ISO artifacts for Edge deployments and how to use them in your cluster setup, refer to Build Edge Artifacts.

   :::
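Before building the provider image, you can optionally confirm that the pinned kernel is present in the custom base image. A minimal sketch, using the example tag from the previous steps:

```shell
# List the kernels baked into the image; expect vmlinuz-6.8.0-60-generic.
docker run --rm my-registry.io/kairos/kairos-base:6.8.0-60 ls /boot
```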
#### Example - Pin Kernel Version with Full Boot Configuration (Dockerfile)

Use this approach if you are building a Kairos image from a `Dockerfile` and need full control over the kernel and boot configuration.
1. Customize the `Dockerfile` in the `CanvOS` directory. For example, add the command below to set a specific kernel version for Ubuntu. Replace `6.8.0-60-generic` with the required version.

   ```dockerfile
   ...
   ########################### Add any other image customizations here #######################
   # Install a specific kernel version, remove all other kernels, and rebuild the initrd
   RUN if [ "${OS_DISTRIBUTION}" = "ubuntu" ]; then \
         apt-get update && \
         apt-get install --yes "linux-image-6.8.0-60-generic" "linux-headers-6.8.0-60-generic" "linux-modules-6.8.0-60-generic" && \
         apt-get purge --yes $(dpkg-query --list | awk '/^ii\s+linux-(image|headers|modules)/ {print $2}' | grep --invert-match "6.8.0-60-generic") && \
         apt-get autoremove --yes && \
         rm --recursive --force /var/lib/apt/lists/* && \
         kernel=$(ls /boot/vmlinuz-* | grep "6.8.0-60-generic" | head --lines=1) && \
         ln --symbolic --force "${kernel#/boot/}" /boot/vmlinuz && \
         kernel=$(ls /lib/modules | grep "6.8.0-60-generic" | head --lines=1) && \
         dracut --force "/boot/initrd-${kernel}" "${kernel}" && \
         ln --symbolic --force "initrd-${kernel}" /boot/initrd && \
         depmod --all "${kernel}"; \
       fi
   ```

   :::info

   For more information on how to build provider images and ISO artifacts for Edge deployments and how to use them in your cluster setup, refer to Build Edge Artifacts. For details on `Dockerfile` usage in EdgeForge, refer to the advanced workflow.

   :::
2. Build the required image and use it for cluster deployment.
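For reference, in a typical CanvOS setup, the build is triggered with Earthly from the `CanvOS` directory. The target name below may vary across CanvOS versions, so treat it as a sketch.

```shell
# Build the Edge artifacts, including the provider image, from the customized Dockerfile.
./earthly.sh +build-all-images
```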
#### Example - Pin Kernel Version During MAAS Provisioning

Use this approach if you want to override the kernel during MAAS provisioning without rebuilding the OS image.
1. To pin the kernel version during host provisioning with MAAS, create or modify the appropriate file, depending on the image type you are deploying:

   - If you are using MAAS to deploy an official, unmodified Ubuntu image for Agent Mode clusters, create the `/var/lib/snap/maas/current/preseeds/curtin_userdata_ubuntu` file.
   - If you are using MAAS to deploy a custom OS image, modify the `/var/lib/snap/maas/current/preseeds/curtin_userdata_custom` file.

   In both cases, add the following contents to pin the kernel. Replace `6.8.0-60-generic` with the required version.

   ```text
   #cloud-config
   kernel:
     package: linux-image-6.8.0-60-generic
     flavor: hwe
   debconf_selections:
     maas: |
       {{for line in str(curtin_preseed).splitlines()}}
       {{line}}
       {{endfor}}
   late_commands:
     maas: [wget, '--no-proxy', {{node_disable_pxe_url|escape.json}}, '--post-data', {{node_disable_pxe_data|escape.json}}, '-O', '/dev/null']
     extra_modules: ["curtin", "in-target", "--", "apt", "install", "--yes", "linux-modules-extra-6.8.0-60-generic"]
   ```
2. Deploy the node through MAAS to apply the pinned kernel during installation. Refer to Create and Manage MAAS Clusters for details.
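Once the node is provisioned, you can verify that the pinned kernel took effect. A minimal check, assuming SSH access to the node:

```shell
# Expect the pinned version, for example 6.8.0-60-generic.
uname --kernel-release
```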
### Debug Steps - Disable IPv6 on the Calico Pack Level
1. Log in to Palette.

2. From the left **Main Menu**, select **Profiles**.

3. On the **Profiles** page, select the cluster profile that uses Calico as the network pack.

4. Click on the Calico pack to view the **Edit Pack** page.

5. In the pack's YAML file, uncomment the following parameter and set its value to `false`.

   ```yaml
   env:
     calicoNode:
       FELIX_IPV6SUPPORT: false
   ```
6. Click **Confirm Updates** after making the required changes.

7. Click **Save Changes** on the cluster profile page.

8. Deploy a new cluster using this profile, or update an existing cluster to apply the change.
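After the change is applied, you can verify that the Calico node agents start successfully. The namespace depends on how Calico is installed, commonly `kube-system` or `calico-system`, so the label-based query below is a sketch.

```shell
# All calico-node Pods should reach the Running state.
kubectl get pods --all-namespaces --selector k8s-app=calico-node
```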
### Debug Steps - Disable IPv6 on the BYOS Edge OS Pack Level
1. Log in to Palette.

2. From the left **Main Menu**, select **Profiles**.

3. On the **Profiles** page, select the cluster profile that uses the BYOS Edge OS pack.

4. Click on the BYOS pack to view the **Edit Pack** page.

5. In the pack's YAML file, add the following lines.

   ```yaml
   stages:
     boot:
       - name: disable-ipv6
         commands:
           - sysctl --write net.ipv6.conf.all.disable_ipv6=1
           - sysctl --write net.ipv6.conf.default.disable_ipv6=1
   ```
6. Click **Confirm Updates** after making the required changes.

7. Click **Save Changes** on the cluster profile page.

8. Deploy a new cluster using this profile.

   If the cluster is already operating and you need to update it, reboot the nodes. Establish an SSH connection to each node and use the following command to trigger a reboot.

   ```shell
   sudo reboot
   ```
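After the nodes reboot, you can verify that IPv6 is disabled on each node.

```shell
# Expect "net.ipv6.conf.all.disable_ipv6 = 1" if IPv6 is disabled.
sysctl net.ipv6.conf.all.disable_ipv6
```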
### Debug Steps - Disable IPv6 in User Data for Edge Deployment
1. Add the following lines to the `user-data` file.

   ```yaml
   stages:
     boot:
       - name: disable-ipv6
         commands:
           - sysctl --write net.ipv6.conf.all.disable_ipv6=1
           - sysctl --write net.ipv6.conf.default.disable_ipv6=1
   ```
2. If you do not have an ISO image, or the cluster is already operating, build a new ISO image and deploy (or redeploy) the cluster.

   If you already have an ISO image, but the cluster is not operating yet, create an ISO file containing the additional user data and apply the changes. Refer to Apply Site User Data for more information.
## Scenario - Control Plane Node Fails to Upgrade in Sequential MicroK8s Upgrades
In clusters that use MicroK8s as the Kubernetes distribution, there is a known issue with the `InPlaceUpgrade` strategy during sequential Kubernetes upgrades. For example, upgrading from version 1.25.x to version 1.26.x, and then to version 1.27.x, may cause the control plane node to fail to upgrade. Use the following steps to troubleshoot and resolve the issue.
### Debug Steps
1. Execute the first MicroK8s upgrade in your cluster. For example, upgrade from version 1.25.x to version 1.26.x.

2. Ensure you can access your cluster using kubectl. Refer to the Access Cluster with CLI guide for more information.

3. After the first upgrade is complete, issue the following command to delete the pod named `upgrade-pod`.

   ```shell
   kubectl delete pod upgrade-pod --namespace default
   ```

4. Once the pod is deleted, proceed to the next upgrade. For example, upgrade from version 1.26.x to version 1.27.x.

5. Within a few minutes, the control plane node will be upgraded correctly.
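To confirm the control plane node completed the upgrade, check the Kubernetes version reported by each node.

```shell
# The VERSION column should show the target Kubernetes version on every node.
kubectl get nodes --output wide
```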