The following are some architectural highlights of Amazon Web Services (AWS) clusters provisioned by Palette:
- Kubernetes nodes can be distributed across multiple availability zones (AZs) to achieve high availability (HA). For each AZ that you select, a public subnet and a private subnet are created.
- All control plane nodes and worker nodes are created within the private subnets, so there is no direct public access available.
- A Network Address Translation (NAT) Gateway is created in the public subnet of each AZ so that nodes in the private subnet can reach the internet and call other AWS services.
- An Internet Gateway (IG) is created for each Virtual Private Cloud (VPC) to allow Secure Shell Protocol (SSH) access to the bastion node for debugging purposes. SSH access to Kubernetes nodes is available only through the bastion node. Because the Amazon Elastic Compute Cloud (EC2) instances are created in a private subnet, the bastion node operates as a secure, single point of entry into the infrastructure and provides access to those instances. The bastion node can be accessed via SSH or Remote Desktop Protocol (RDP).
- The Kubernetes API Server endpoint is accessible through an Elastic Load Balancer (ELB), which load balances across all the control plane nodes.
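The per-AZ layout described above can be sketched as follows. This is an illustration only (the function and resource names are hypothetical, not a Palette API): one VPC, Internet Gateway, and API server load balancer per cluster, plus a public subnet, private subnet, and NAT Gateway per selected AZ.

```python
# Hypothetical sketch (illustrative names, not a Palette API): enumerate the
# networking resources described above for a set of selected availability zones.
def planned_resources(azs):
    """Return the per-cluster and per-AZ resources Palette provisions."""
    resources = ["vpc", "internet-gateway", "api-server-elb"]  # one of each per cluster
    for az in azs:
        resources += [
            f"public-subnet/{az}",   # hosts the NAT Gateway
            f"private-subnet/{az}",  # hosts control plane and worker nodes
            f"nat-gateway/{az}",     # outbound internet access for the private subnet
        ]
    return resources

# Three AZs yield 3 shared resources + 3 resources per AZ = 12 in total.
print(len(planned_resources(["us-east-1a", "us-east-1b", "us-east-1c"])))  # 12
```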
The following prerequisites must be met before deploying an Amazon Elastic Kubernetes Service (EKS) workload cluster:
- You need an active AWS cloud account with all the permissions listed below in the AWS Cloud Account Permissions section.
- Register your AWS cloud account in Palette, as described in the Creating an AWS Cloud Account section below.
- You should have an Infrastructure Cluster profile created in Palette for AWS.
- Palette creates compute, network, and storage resources on AWS during the provisioning of Kubernetes clusters. Ensure there is sufficient capacity in the preferred AWS region to create the following resources:
- vCPU
- VPC
- Elastic IP
- Internet Gateway
- Elastic Load Balancers
- NAT Gateway
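To compare against regional service quotas before provisioning, the capacity a new cluster consumes can be estimated from the architecture above. The sketch below is illustrative (not part of Palette); the counts follow the earlier description: one VPC, Internet Gateway, and ELB per cluster, one NAT Gateway and Elastic IP per AZ, and vCPUs for every node.

```python
# Illustrative capacity estimate for one new cluster, to check against quotas.
def required_capacity(num_azs, num_nodes, vcpus_per_node):
    return {
        "VPC": 1,
        "Internet Gateway": 1,
        "Elastic Load Balancers": 1,  # Kubernetes API server endpoint
        "NAT Gateway": num_azs,
        "Elastic IP": num_azs,        # one per NAT Gateway
        "vCPU": num_nodes * vcpus_per_node,
    }

# Example: 3 AZs, 3 control plane + 3 worker nodes on 4-vCPU instances.
print(required_capacity(3, 6, 4))
```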
The following four policies include all the required permissions for provisioning clusters through Palette:
Controller Policy
Last Update: May 25, 2022
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow",
     "Action": ["ec2:AllocateAddress", "ec2:AssociateRouteTable", "ec2:AttachInternetGateway", "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CreateInternetGateway", "ec2:CreateNatGateway", "ec2:CreateRoute", "ec2:ReplaceRoute", "ec2:CreateRouteTable",
                "ec2:CreateSecurityGroup", "ec2:CreateSubnet", "ec2:CreateTags", "ec2:CreateVpc", "ec2:ModifyVpcAttribute",
                "ec2:DeleteInternetGateway", "ec2:DeleteNatGateway", "ec2:DeleteNetworkInterface", "ec2:DeleteRouteTable",
                "ec2:DeleteSecurityGroup", "ec2:DeleteSubnet", "ec2:DeleteTags", "ec2:DeleteVpc", "ec2:DescribeAccountAttributes",
                "ec2:DescribeAddresses", "ec2:DescribeAvailabilityZones", "ec2:DescribeInstances", "ec2:DescribeInternetGateways",
                "ec2:DescribeImages", "ec2:DescribeNatGateways", "ec2:DescribeNetworkInterfaces", "ec2:DescribeNetworkInterfaceAttribute",
                "ec2:DescribeRouteTables", "ec2:DescribeSecurityGroups", "ec2:DescribeSubnets", "ec2:DescribeVpcs",
                "ec2:DescribeVpcAttribute", "ec2:DescribeVolumes", "ec2:DetachInternetGateway", "ec2:DisassociateRouteTable",
                "ec2:DisassociateAddress", "ec2:ModifyInstanceAttribute", "ec2:ModifyNetworkInterfaceAttribute",
                "ec2:ModifySubnetAttribute", "ec2:ReleaseAddress", "ec2:RevokeSecurityGroupIngress", "ec2:RunInstances",
                "ec2:TerminateInstances", "tag:GetResources", "elasticloadbalancing:AddTags", "elasticloadbalancing:CreateLoadBalancer",
                "elasticloadbalancing:ConfigureHealthCheck", "elasticloadbalancing:DeleteLoadBalancer",
                "elasticloadbalancing:DescribeLoadBalancers", "elasticloadbalancing:DescribeLoadBalancerAttributes",
                "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer", "elasticloadbalancing:DescribeTags",
                "elasticloadbalancing:ModifyLoadBalancerAttributes", "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
                "elasticloadbalancing:DeregisterInstancesFromLoadBalancer", "elasticloadbalancing:RemoveTags",
                "autoscaling:DescribeAutoScalingGroups", "autoscaling:DescribeInstanceRefreshes", "ec2:CreateLaunchTemplate",
                "ec2:CreateLaunchTemplateVersion", "ec2:DescribeLaunchTemplates", "ec2:DescribeLaunchTemplateVersions",
                "ec2:DeleteLaunchTemplate", "ec2:DeleteLaunchTemplateVersions"],
     "Resource": ["*"]},
    {"Effect": "Allow",
     "Action": ["autoscaling:CreateAutoScalingGroup", "autoscaling:UpdateAutoScalingGroup", "autoscaling:CreateOrUpdateTags",
                "autoscaling:StartInstanceRefresh", "autoscaling:DeleteAutoScalingGroup", "autoscaling:DeleteTags"],
     "Resource": ["arn:*:autoscaling:*:*:autoScalingGroup:*:autoScalingGroupName/*"]},
    {"Effect": "Allow", "Action": ["iam:CreateServiceLinkedRole"],
     "Resource": ["arn:*:iam::*:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"],
     "Condition": {"StringLike": {"iam:AWSServiceName": "autoscaling.amazonaws.com"}}},
    {"Effect": "Allow", "Action": ["iam:CreateServiceLinkedRole"],
     "Resource": ["arn:*:iam::*:role/aws-service-role/elasticloadbalancing.amazonaws.com/AWSServiceRoleForElasticLoadBalancing"],
     "Condition": {"StringLike": {"iam:AWSServiceName": "elasticloadbalancing.amazonaws.com"}}},
    {"Effect": "Allow", "Action": ["iam:CreateServiceLinkedRole"],
     "Resource": ["arn:*:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot"],
     "Condition": {"StringLike": {"iam:AWSServiceName": "spot.amazonaws.com"}}},
    {"Effect": "Allow", "Action": ["iam:PassRole"],
     "Resource": ["arn:*:iam::*:role/*.cluster-api-provider-aws.sigs.k8s.io"]},
    {"Effect": "Allow", "Action": ["secretsmanager:CreateSecret", "secretsmanager:DeleteSecret", "secretsmanager:TagResource"],
     "Resource": ["arn:*:secretsmanager:*:*:secret:aws.cluster.x-k8s.io/*"]},
    {"Effect": "Allow", "Action": ["ssm:GetParameter"],
     "Resource": ["arn:*:ssm:*:*:parameter/aws/service/eks/optimized-ami/*"]},
    {"Effect": "Allow", "Action": ["iam:CreateServiceLinkedRole"],
     "Resource": ["arn:*:iam::*:role/aws-service-role/eks.amazonaws.com/AWSServiceRoleForAmazonEKS"],
     "Condition": {"StringLike": {"iam:AWSServiceName": "eks.amazonaws.com"}}},
    {"Effect": "Allow", "Action": ["iam:CreateServiceLinkedRole"],
     "Resource": ["arn:*:iam::*:role/aws-service-role/eks-nodegroup.amazonaws.com/AWSServiceRoleForAmazonEKSNodegroup"],
     "Condition": {"StringLike": {"iam:AWSServiceName": "eks-nodegroup.amazonaws.com"}}},
    {"Effect": "Allow", "Action": ["iam:CreateServiceLinkedRole"],
     "Resource": ["arn:aws:iam::*:role/aws-service-role/eks-fargate-pods.amazonaws.com/AWSServiceRoleForAmazonEKSForFargate"],
     "Condition": {"StringLike": {"iam:AWSServiceName": "eks-fargate.amazonaws.com"}}},
    {"Effect": "Allow",
     "Action": ["iam:ListOpenIDConnectProviders", "iam:CreateOpenIDConnectProvider", "iam:AddClientIDToOpenIDConnectProvider",
                "iam:UpdateOpenIDConnectProviderThumbprint", "iam:DeleteOpenIDConnectProvider"],
     "Resource": ["*"]},
    {"Effect": "Allow",
     "Action": ["iam:GetRole", "iam:ListAttachedRolePolicies", "iam:DetachRolePolicy", "iam:DeleteRole", "iam:CreateRole",
                "iam:TagRole", "iam:AttachRolePolicy"],
     "Resource": ["arn:*:iam::*:role/*"]},
    {"Effect": "Allow", "Action": ["iam:GetPolicy"], "Resource": ["arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"]},
    {"Effect": "Allow",
     "Action": ["eks:DescribeCluster", "eks:ListClusters", "eks:CreateCluster", "eks:TagResource", "eks:UpdateClusterVersion",
                "eks:DeleteCluster", "eks:UpdateClusterConfig", "eks:UntagResource", "eks:UpdateNodegroupVersion",
                "eks:DescribeNodegroup", "eks:DeleteNodegroup", "eks:UpdateNodegroupConfig", "eks:CreateNodegroup"],
     "Resource": ["arn:*:eks:*:*:cluster/*", "arn:*:eks:*:*:nodegroup/*/*/*"]},
    {"Effect": "Allow", "Action": ["eks:AssociateIdentityProviderConfig", "eks:ListIdentityProviderConfigs"],
     "Resource": ["arn:aws:eks:*:*:cluster/*"]},
    {"Effect": "Allow", "Action": ["eks:DisassociateIdentityProviderConfig", "eks:DescribeIdentityProviderConfig"],
     "Resource": ["*"]},
    {"Effect": "Allow",
     "Action": ["eks:ListAddons", "eks:CreateAddon", "eks:DescribeAddonVersions", "eks:DescribeAddon", "eks:DeleteAddon",
                "eks:UpdateAddon", "eks:TagResource", "eks:DescribeFargateProfile", "eks:CreateFargateProfile",
                "eks:DeleteFargateProfile"],
     "Resource": ["*"]},
    {"Effect": "Allow", "Action": ["iam:PassRole"], "Resource": ["*"],
     "Condition": {"StringEquals": {"iam:PassedToService": "eks.amazonaws.com"}}}
  ]
}
```
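Before attaching a policy like the one above, it can be useful to audit which AWS services it grants access to. The sketch below is illustrative; the inline `sample` document is a tiny stand-in for the full Controller Policy, which you would load from your saved JSON file instead.

```python
import json

# Illustrative helper: list the AWS service prefixes a policy document touches.
def services_granted(policy):
    prefixes = set()
    for stmt in policy["Statement"]:
        for action in stmt["Action"]:
            prefixes.add(action.split(":", 1)[0])  # "ec2:RunInstances" -> "ec2"
    return sorted(prefixes)

# Tiny stand-in for the full Controller Policy above.
sample = json.loads(
    '{"Version": "2012-10-17", "Statement": ['
    '{"Effect": "Allow", "Action": ["ec2:RunInstances", "eks:CreateCluster"], "Resource": ["*"]},'
    '{"Effect": "Allow", "Action": ["iam:PassRole"], "Resource": ["*"]}]}'
)
print(services_granted(sample))  # ['ec2', 'eks', 'iam']
```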
The following steps need to be performed to provision a new AWS cluster:
- Provide the basic cluster information: Name, Description, and Tags. Tags on a cluster are propagated to the VMs deployed on the cloud/data center environments.
- Select the Cluster Profile created for the AWS cloud. The profile definition will be used as the cluster construction template.
- Review and override pack parameters as desired. By default, parameters for all packs are set to the values defined in the Cluster Profile.
- Provide the AWS cloud account and placement information:
Parameter | Description |
---|---|
Cloud Account | Select the desired cloud account. AWS cloud accounts with AWS credentials need to be preconfigured in project settings. |
Region | Choose the preferred AWS region where you would like the clusters to be provisioned. |
SSH Key Pair Name | Choose the desired SSH key pair. SSH key pairs need to be preconfigured on AWS for the desired regions. The selected key is inserted into the VMs provisioned. |
Static Placement | By default, Palette uses dynamic placement, wherein a new VPC with a public and private subnet is created to place cluster resources for every cluster. These resources are fully managed by Palette and deleted when the corresponding cluster is deleted. Turn on the Static Placement option to place resources into preexisting VPCs and subnets. |

If Static Placement is selected, the following placement information needs to be provided:

Parameter | Description |
---|---|
Virtual Network | Select the virtual network from the dropdown menu. |
Control Plane Subnet | Select the control plane network from the dropdown menu. |
Worker Network | Select the worker network from the dropdown menu. |
- Choose whether to update the worker pool in parallel, if required.
```
kubernetes.io/role/elb = 1
sigs.k8s.io/cluster-api-provider-aws/role = public
kubernetes.io/cluster/[ClusterName] = shared
sigs.k8s.io/cluster-api-provider-aws/cluster/[ClusterName] = owned
```
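The tag set above can be generated for a concrete cluster by substituting `[ClusterName]` with the actual cluster name. The helper below is an illustration only, not a Palette API:

```python
# Illustrative sketch: expand the [ClusterName] placeholder in the tag list above.
def cluster_tags(cluster_name):
    return {
        "kubernetes.io/role/elb": "1",
        "sigs.k8s.io/cluster-api-provider-aws/role": "public",
        f"kubernetes.io/cluster/{cluster_name}": "shared",
        f"sigs.k8s.io/cluster-api-provider-aws/cluster/{cluster_name}": "owned",
    }

print(cluster_tags("demo-cluster"))
```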
- Configure the master and worker node pools. A master and a worker node pool are configured by default.
- An optional label can be applied to a node pool during cluster creation. While configuring the node pools, add the label in a unique key: value format. For a running cluster, an existing label can be edited and new labels can be added.
- Enable or disable node pool Taints as desired. If Taints are enabled, the following parameters need to be provided:

Parameter | Description |
---|---|
Key | Custom key for the Taint. |
Value | Custom value for the Taint key. |
Effect | Choose the effect from the dropdown menu. There are three options: |

NoSchedule: A pod that cannot tolerate the node Taint should not be scheduled to the node.
PreferNoSchedule: The system will try to avoid placing a non-tolerant pod on the tainted node, but this is not guaranteed.
NoExecute: New pods will not be scheduled on the node, and existing pods on the node, if any, will be evicted if they do not tolerate the Taint.
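The three effects can be summarized with a small scheduling sketch. This is a simplification for illustration only; the real Kubernetes scheduler also matches toleration operators, values, and toleration seconds.

```python
# Simplified illustration of Kubernetes taint effects (not the real scheduler logic).
def schedule_decision(taint_effect, pod_tolerates):
    if pod_tolerates:
        return "schedule"
    if taint_effect == "NoSchedule":
        return "do not schedule"
    if taint_effect == "PreferNoSchedule":
        return "avoid scheduling if possible"
    if taint_effect == "NoExecute":
        return "do not schedule; evict existing pods"
    raise ValueError(f"unknown effect: {taint_effect}")

print(schedule_decision("NoExecute", pod_tolerates=False))  # do not schedule; evict existing pods
```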
Parameter | Description |
---|---|
Name | A descriptive name for the node pool. |
Size | Number of VMs to be provisioned for the node pool. For the master pool, this number can be 1, 3, or 5. |
Allow worker capability (master pool) | Select this option to allow workloads to be provisioned on master nodes. |
Instance type | Select the AWS instance type to be used for all nodes in the node pool. |
Rolling Update | There are two choices of Rolling Update: |
Expand First: Launches the new node first, then shuts down the old node. | |
Contract First: Shuts down the old node first, then launches the new node. | |
Availability Zones | Choose one or more availability zones. Palette provides fault tolerance to guard against failures like hardware failures, network failures, etc. by provisioning nodes across availability zones if multiple zones are selected. |
By default, worker pools are configured to use On-Demand instances. Optionally, to take advantage of discounted spot instance pricing, the On-Spot option can be selected. This option allows you to specify a maximum bid price for the nodes as a percentage of the On-Demand price. Palette tracks the current price for spot instances and launches nodes when the spot price falls within the specified range.
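The bidding behavior described above can be sketched as follows. This is an illustration only (actual spot pricing and launches are handled by AWS), and the On-Demand price in the example is a made-up figure:

```python
# Illustrative sketch of the max-bid calculation: the bid is a percentage of the
# On-Demand price, and a node launches only while the spot price is at or below it.
def should_launch(on_demand_price, bid_percent, current_spot_price):
    max_bid = on_demand_price * bid_percent / 100
    return current_spot_price <= max_bid

# Example with a hypothetical $0.0832/hr On-Demand price and an 80% bid cap
# (max bid = $0.06656/hr).
print(should_launch(0.0832, 80, 0.025))  # True
print(should_launch(0.0832, 80, 0.070))  # False
```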
- Review settings and deploy the cluster. Provisioning status with details of ongoing provisioning tasks is available to track progress.
The deletion of an AWS cluster results in the removal of all virtual machines and associated storage disks created for the cluster. The following tasks need to be performed to delete an AWS cluster:
- Select the cluster to be deleted from the Cluster View page and navigate to the Cluster Overview page.
- Invoke a delete action available on the page: Cluster > Settings > Cluster Settings > Delete Cluster.
- Click Confirm to delete.
The Cluster Status is updated to Deleting while cluster resources are being deleted. Provisioning status is updated with the ongoing progress of the delete operation. Once all resources are successfully deleted, the cluster status changes to Deleted and is removed from the list of clusters.
A cluster stuck in the Deleting state can be force deleted through the User Interface, but only if it has been stuck in that state for a minimum of 15 minutes. Palette enables cluster force delete from the Tenant Admin and Project Admin scopes.
- Log in to the Palette Management Console.
- Navigate to the Cluster Details page of the cluster stuck in deletion.
- If the deletion has been stuck for more than 15 minutes, click the Force Delete Cluster button from the Settings dropdown.
- If the Force Delete Cluster button is not enabled, wait for 15 minutes. The Settings dropdown will show the estimated time until the force delete button is auto-enabled.
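The 15-minute threshold described above can be sketched as follows. This is an illustration of the rule, not Palette's actual implementation:

```python
from datetime import datetime, timedelta

# Illustrative sketch of the 15-minute force-delete threshold described above.
FORCE_DELETE_THRESHOLD = timedelta(minutes=15)

def force_delete_available(deletion_started, now):
    """Return (enabled, remaining_wait) for the Force Delete Cluster button."""
    elapsed = now - deletion_started
    if elapsed >= FORCE_DELETE_THRESHOLD:
        return True, timedelta(0)
    return False, FORCE_DELETE_THRESHOLD - elapsed

start = datetime(2022, 5, 25, 12, 0)
enabled, wait = force_delete_available(start, datetime(2022, 5, 25, 12, 10))
print(enabled, wait)  # False 0:05:00
```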