The following are some architectural highlights of Kubernetes clusters provisioned by Palette on VMware:
- Kubernetes nodes can be distributed across multiple compute clusters, which serve as distinct fault domains.
- Support for static IP as well as DHCP.
- If using DHCP, Dynamic DNS is required.
- IP pool management for assigning blocks of IPs dedicated to clusters or projects.
- To facilitate communications between the Palette management platform and vCenter installed in the private Datacenter, set up a Private Cloud Gateway (PCG) within the environment.
- Private Cloud Gateway is Palette's on-premises component to enable support for isolated, private cloud or Datacenter environments. The Palette Gateway, once installed on-premises, registers itself with Palette's SaaS portal and enables secure communications between the SaaS portal and private cloud environment. The gateway enables installation and end-to-end lifecycle management of Kubernetes clusters in private cloud environments from Palette's SaaS portal.
The following prerequisites must be met before deploying a Kubernetes cluster in VMware:
- vSphere 6.7U3 or later (recommended).
- Configuration Requirements - A Resource Pool needs to be configured across the hosts, onto which the workload clusters will be provisioned. Every host in the Resource Pool will need access to shared storage, such as vSAN, to be able to make use of high-availability control planes. Network Time Protocol (NTP) must be configured on each of the ESXi hosts.
- You need an active vCenter account with all the permissions listed below in the VMware Cloud Account Permissions section.
- Install a Private Cloud Gateway for VMware as described in the Creating a VMware Cloud Gateway section. Installing the Private Cloud Gateway will automatically register a cloud account for VMware in Palette. You can register your additional VMware cloud accounts in Palette as described in the Creating a VMware Cloud account section.
- Subnet with egress access to the internet (direct or via proxy):
- For proxy: HTTP_PROXY, HTTPS_PROXY (both required).
- Outgoing internet connection on port 443 to api.spectrocloud.com.
- The Private cloud gateway IP requirements are:
- One (1) node - one (1) IP or three (3) nodes - three (3) IPs.
- One (1) Kubernetes control-plane VIP.
- One (1) additional Kubernetes control-plane IP.
- IPs for application workload services (e.g.: LoadBalancer services).
- A DNS to resolve public internet names (e.g.: api.spectrocloud.com).
- Shared Storage between vSphere hosts.
- A cluster profile created in Palette for VMware.
- Zone Tagging: vSphere tags that enable dynamic storage allocation for persistent storage (described in detail below).
Keep the following points in mind when creating the tags:
- A valid tag must consist of alphanumeric characters.
- The tag must start and end with an alphanumeric character.
- The regex used for validation is `(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?` (see the validation sketch after the example tags below).
Example Tags:
- MyValue
- my_value
- 12345
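As a quick sanity check before creating tags in vSphere, candidate values can be validated locally against the same regex. This is only a sketch; the tag values shown are illustrative:

```bash
# Validate candidate vSphere tag values against Palette's tag regex
# (anchored so the whole string must match).
TAG_REGEX='^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$'

for tag in MyValue my_value 12345 "-bad-start" "bad value!"; do
  if [[ "$tag" =~ $TAG_REGEX ]]; then
    echo "valid:   $tag"
  else
    echo "invalid: $tag"
  fi
done
```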
Zone tagging is required for dynamic storage allocation across fault domains when provisioning workloads that require persistent storage. This is required for installation of Palette Platform itself and also useful for workloads deployed in the tenant clusters if they have persistent storage needs. Use vSphere tags on Datacenters (k8s-region) and compute clusters (k8s-zone) to create distinct zones in your environment.
As an example, assume your vCenter environment includes three compute clusters, cluster-1, cluster-2, and cluster-3, that are part of Datacenter dc-1. You can tag them as follows:
vSphere Object | Tag Category | Tag Value |
---|---|---|
dc-1 | k8s-region | region1 |
cluster-1 | k8s-zone | az1 |
cluster-2 | k8s-zone | az2 |
cluster-3 | k8s-zone | az3 |
Note:
The exact values for the k8s-region and k8s-zone tags can be different from the ones described in the above example, as long as they are unique.
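For illustration only, the example tags above could be created and attached with the govc CLI (assuming govc is installed and GOVC_URL plus credentials are configured); the same can be done through the vSphere Client UI:

```bash
# Create the tag categories and tags, then attach them by inventory path.
govc tags.category.create k8s-region
govc tags.category.create k8s-zone

govc tags.create -c k8s-region region1
govc tags.create -c k8s-zone az1
govc tags.create -c k8s-zone az2
govc tags.create -c k8s-zone az3

# The Datacenter gets the region tag; each compute cluster gets a zone tag.
govc tags.attach region1 /dc-1
govc tags.attach az1 /dc-1/host/cluster-1
govc tags.attach az2 /dc-1/host/cluster-2
govc tags.attach az3 /dc-1/host/cluster-3
```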
The vSphere user account used in the various Palette tasks must have the minimum vSphere privileges required to perform the task. The Administrator role provides super-user access to all vSphere objects. For users without the Administrator role, one or more custom roles can be created based on the tasks being performed by the user.
| vSphere Object | Privileges |
| --- | --- |
| Cns | Searchable |
| Datastore | Browse datastore |
| Host | Configuration |
| | * Storage partition configuration |
| vSphere Tagging | Create vSphere Tag |
| | Edit vSphere Tag |
| Network | Assign network |
| Sessions | Validate session |
| Profile-driven storage | Profile-driven storage view |
| Storage views | View |
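For users without the Administrator role, the privileges in the table above can be bundled into a custom role either through the vSphere Client or with a CLI. The following is only an illustrative sketch using the govc CLI; the role name is a placeholder and the privilege IDs are the standard vSphere identifiers for the listed privileges, so verify them against your vCenter version before use:

```bash
# Hypothetical custom role carrying the privileges from the table above.
# Assumes GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD are set.
govc role.create spectro-root-role \
  Cns.Searchable \
  Datastore.Browse \
  Host.Config.Storage \
  InventoryService.Tagging.CreateTag \
  InventoryService.Tagging.EditTag \
  Network.Assign \
  Sessions.ValidateSession \
  StorageProfile.View \
  StorageViews.View
```

The same approach applies to the larger privilege set in the following table.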
| vSphere Object | Privileges |
| --- | --- |
| Cns | Searchable |
| Datastore | Allocate space |
| | Browse datastore |
| | Low level file operations |
| | Remove file |
| | Update virtual machine files |
| | Update virtual machine metadata |
| Folder | Create folder |
| | Delete folder |
| | Move folder |
| | Rename folder |
| Host | Local operations |
| | * Reconfigure virtual machine |
| vSphere Tagging | Assign or Unassign vSphere Tag |
| | Create vSphere Tag |
| | Delete vSphere Tag |
| | Edit vSphere Tag |
| Network | Assign network |
| Resource | Apply recommendation |
| | Assign virtual machine to resource pool |
| | Migrate powered off virtual machine |
| | Migrate powered on virtual machine |
| | Query vMotion |
| Sessions | Validate session |
| Profile-driven storage | Profile-driven storage view |
| Storage views | Configure service |
| | View |
| Tasks | Create task |
| | Update task |
| vApp | Export |
| | Import |
| | View OVF environment |
| | vApp application configuration |
| | vApp instance configuration |
| Virtual machines | Change Configuration |
| | * Acquire disk lease |
| | * Add existing disk |
| | * Add new disk |
| | * Add or remove device |
| | * Advanced configuration |
| | * Change CPU count |
| | * Change Memory |
| | * Change Settings |
| | * Change Swapfile placement |
| | * Change resource |
| | * Configure Host USB device |
| | * Configure Raw device |
| | * Configure managedBy |
| | * Display connection settings |
| | * Extend virtual disk |
| | * Modify device settings |
| | * Query Fault Tolerance compatibility |
| | * Query unowned files |
| | * Reload from path |
| | * Remove disk |
| | * Rename |
| | * Reset guest information |
| | * Set annotation |
| | * Toggle disk change tracking |
| | * Toggle fork parent |
| | * Upgrade virtual machine compatibility |
| | Edit Inventory |
| | * Create from existing |
| | * Create new |
| | * Move |
| | * Register |
| | * Remove |
| | * Unregister |
| | Guest operations |
| | * Guest operation alias modification |
| | * Guest operation alias query |
| | * Guest operation modifications |
| | * Guest operation program execution |
| | * Guest operation queries |
| | Interaction |
| | * Console interaction |
| | * Power off |
| | * Power on |
| | Provisioning |
| | * Allow disk access |
| | * Allow file access |
| | * Allow read-only disk access |
| | * Allow virtual machine download |
| | * Allow virtual machine files upload |
| | * Clone template |
| | * Clone virtual machine |
| | * Create template from virtual machine |
| | * Customize guest |
| | * Deploy template |
| | * Mark as template |
| | * Mark as virtual machine |
| | * Modify customization specification |
| | * Promote disks |
| | * Read customization specifications |
| | Service configuration |
| | * Allow notifications |
| | * Allow polling of global event notifications |
| | * Manage service configurations |
| | * Modify service configuration |
| | * Query service configurations |
| | * Read service configuration |
| | Snapshot management |
| | * Create snapshot |
| | * Remove snapshot |
| | * Rename snapshot |
| | * Revert to snapshot |
| | vSphere Replication |
| | * Configure replication |
| | * Manage replication |
| | * Monitor replication |
| vSAN | Cluster |
| | * ShallowRekey |
- Minimum capacity required for a Private Cloud Gateway:
- One (1) node - two (2) vCPU, 4 GB memory, 60 GB storage.
- Three (3) nodes - six (6) vCPU, 12 GB memory, 70 GB storage.
Setting up a cloud gateway involves:
- Initiating the install from the tenant portal
- Deploying the gateway installer VM in vSphere
- Launching the cloud gateway from the tenant portal
As a Tenant Administrator, navigate to the Private Cloud Gateway page under settings and open the dialogue to create a new Private Cloud Gateway.
Note the link to the Palette Gateway Installer OVA and the PIN displayed on the dialogue.
- Initiate deployment of a new OVF template by providing a link to the installer OVA as the URL.
- Proceed through the OVF deployment wizard by choosing the desired Name, Placement, Compute, Storage, and Network options.
- At the Customize Template step, specify Palette properties as follows:
Parameter | Value | Remarks |
---|---|---|
Installer Name | Desired Palette Gateway Name | The name will be used to identify the gateway instance. Typical environments may only require a single gateway to be deployed, however, multiple gateways might be required for managing clusters across multiple vCenters. Choose a name that can easily identify the environment that this gateway instance is being configured for. |
Console endpoint | URL to Palette management platform portal | https://console.spectrocloud.com by default |
Pairing Code | PIN displayed on the Palette management platform portal's 'Create a new gateway' dialogue. | |
SSH Public Key | Optional key, useful for troubleshooting purposes (Recommended) | Enables SSH access to the VM as the 'ubuntu' user. See the key-generation sketch after this table. |
Pod CIDR | Optional - IP range exclusive to pods | This range must not overlap with your network CIDR. |
Service cluster IP range | Optional - IP range in the CIDR format exclusive to the service clusters | This range must not overlap with either the pod CIDR or your network CIDR. |
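If you do not already have an SSH key pair for this purpose, one can be generated as shown below; this is only a sketch and the file name is an example. Paste the contents of the .pub file into the SSH Public Key field:

```bash
# Generate a key pair; "palette_pcg" is an example file name.
ssh-keygen -t ed25519 -f ~/.ssh/palette_pcg -C "palette-pcg"
# The public key to paste into the OVF customization form:
cat ~/.ssh/palette_pcg.pub
```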
The following additional properties need to be set only for a proxy environment. The proxy properties may or may not share the same value, but all three proxy settings (HTTP PROXY, HTTPS PROXY, and NO Proxy) are mandatory.
Parameter | Value | Remarks |
---|---|---|
HTTP PROXY | The endpoint for the HTTP proxy server | This setting will be propagated to all the nodes launched in the proxy network. e.g., http://USERNAME:PASSWORD@PROXYIP:PROXYPORT |
HTTPS PROXY | The endpoint for the HTTPS proxy server | This setting will be propagated to all the nodes launched in the proxy network. e.g., http://USERNAME:PASSWORD@PROXYIP:PROXYPORT |
NO Proxy | A comma-separated list of the vCenter server address, local network CIDR, hostnames, and domain names that should be excluded from proxying | This setting will be propagated to all the nodes to bypass the proxy server. E.g., vcenter.company.com, .company.org, 10.10.0.0/16 |
Certificate | The base64-encoded value of the proxy server's certificate OR the base64-encoded root and issuing certificate authority (CA) certificates used to sign the proxy server's certificate | Depending on how the certificate is encoded, an additional = character may appear at the tail end of the value. The following command can be used to encode the certificate properly: base64 -w0 \| sed "s/=$//" (see the example after this table) |
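For example, assuming the proxy's CA certificate is saved locally as proxy-ca.crt (a placeholder file name), the value for the Certificate field can be produced as follows:

```bash
# Base64-encode the certificate on a single line and strip a trailing "=".
base64 -w0 proxy-ca.crt | sed "s/=$//"
```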
- Finish the OVF deployment wizard and wait for the OVA to be imported and the virtual machine to be deployed.
- Power on the virtual machine.
- Close the Create New Gateway dialog box if it is still open or navigate to the Private Cloud Gateway page under settings in case you have navigated away or been logged out.
- Wait for a gateway widget to be displayed on the page and for the Configure option to become available. The IP address of the installer VM will be displayed on the gateway widget. This may take a few minutes after the virtual machine is powered on. If the installer fails to register with the Palette Management Platform portal within 10 minutes of the virtual machine being powered on in vSphere, this might be indicative of an error. Please follow the troubleshooting steps to identify and resolve the issue.
- Click on the Configure button to invoke the Palette Configuration dialogue. Provide vCenter credentials and proceed to the next configuration step.
- Choose the desired values for the Datacenter, Compute Cluster, Datastore, Network, Resource pool, and Folder. Optionally, provide one or more SSH Keys and/or NTP server addresses.
- Choose the IP Allocation Scheme - Static IP or DHCP. If static IP is selected, an option to create an IP pool is enabled. Proceed to create an IP pool by providing an IP range (start and end IP addresses) or a subnet. The IP addresses from this IP Pool will be assigned to the gateway cluster. By default, the IP Pool is available for use by other tenant clusters. This can be prevented by enabling the Restrict to a single cluster button. A detailed description of all the fields involved in the creation of an IP pool can be found here.
Click on Confirm to initiate provisioning of the gateway cluster. The status of the cluster on the UI should change to Provisioning and eventually to Running when the gateway cluster is fully provisioned. This process might take several minutes (typically 8 to 10 minutes). You can observe a detailed provisioning sequence on the Cluster Details page by clicking on the gateway widget on the UI. If provisioning of the gateway cluster runs into errors or gets stuck, relevant details can be found on the Summary tab or the Events tab of the cluster details page.
In certain cases where provisioning of the gateway cluster is stuck or failed due to invalid configuration, the process can be reset from the Cloud Gateway Widget on the UI.
Once the Gateway transitions to the Running state, it is fully provisioned and ready to bootstrap tenant cluster requests.
Power off the installer OVA which was initially imported at the start of this installation process.
The installer VM, when powered on, goes through a bootstrap process and registers itself with the Tenant Portal. This process typically takes five to ten minutes. Failure of the installer to register with the Tenant Portal, within this duration, might be indicative of a bootstrapping error.
SSH into the installer virtual machine using the username "ubuntu" and the key provided during OVA import and inspect the log file located at /var/log/cloud-init-output.log. This log file will contain error messages in the event there are failures with connecting to the Palette Management platform portal, authenticating, or downloading installation artifacts. A common cause for these errors is that the Palette Management platform console endpoint or the pairing code is typed incorrectly.
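For example (substitute the installer VM's IP address and the private key matching the public key provided during OVA import):

```bash
# Connect to the installer VM and inspect the bootstrap log.
ssh -i ~/.ssh/palette_pcg ubuntu@<installer-vm-ip>
less /var/log/cloud-init-output.log
```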
Ensure that the Tenant Portal console endpoint does not have a trailing slash. If these properties were incorrectly specified, power down and delete the installer VM and relaunch with the correct values.
Another potential issue is a lack of outgoing connectivity from the VM. The installer VM needs to have outbound connectivity directly or via a proxy. Adjust proxy settings (if applicable) to fix the connectivity or power down and delete the installer VM and relaunch in a network that enables outgoing connections.
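A quick way to verify outbound connectivity from the installer VM is sketched below; curl honors any proxy environment variables that are configured:

```bash
# Confirm the Palette SaaS endpoint is reachable on port 443.
curl -sSI --max-time 15 https://api.spectrocloud.com \
  && echo "Outbound connectivity to Palette OK"
```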
If the above steps do not resolve your issues, copy the following script to the installer VM and execute it to generate a logs archive. Open a support ticket and attach the logs archive to the ticket to allow the Palette Support team to troubleshoot and provide further guidance:
```bash
#!/bin/bash
DESTDIR="/tmp/"
CONTAINER_LOGS_DIR="/var/log/containers/"
CLOUD_INIT_OUTPUT_LOG="/var/log/cloud-init-output.log"
CLOUD_INIT_LOG="/var/log/cloud-init.log"
KERN_LOG="/var/log/kern.log"
KUBELET_LOG="/tmp/kubelet.log"
SYSLOGS="/var/log/syslog*"
FILENAME=spectro-logs-$(date +%-Y%-m%-d)-$(date +%-HH%-MM%-SS).tgz

journalctl -u kubelet > $KUBELET_LOG

tar --create --gzip -h --file=$DESTDIR$FILENAME $CONTAINER_LOGS_DIR $CLOUD_INIT_LOG $CLOUD_INIT_OUTPUT_LOG $KERN_LOG $KUBELET_LOG $SYSLOGS

retVal=$?
if [ $retVal -eq 1 ]; then
  echo "Error creating spectro logs package"
else
  echo "Successfully extracted spectro cloud logs: $DESTDIR$FILENAME"
fi
```
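For example, assuming the script is saved on the installer VM as collect-logs.sh (a placeholder name):

```bash
# Make the script executable and run it with sudo so it can read the
# container and system logs; the archive is written to /tmp/.
chmod +x collect-logs.sh
sudo ./collect-logs.sh
```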
An installation of the gateway cluster may run into errors or get stuck in the provisioning state for a variety of reasons, such as a lack of infrastructure resources, unavailable IP addresses, or an inability to perform an NTP sync.
While these are the most common issues, others might be related to the underlying VMware environment. The Cluster Details page, which can be accessed by clicking anywhere on the gateway widget, contains details of every orchestration step, including an indication of the current task being executed.
Any intermittent errors will be displayed on this page next to the relevant orchestration task. The Events tab on this page also provides a useful resource to look at the lower-level operations being performed for the various orchestration steps.
If you think that the orchestration is stuck or failed due to an invalid selection of infrastructure resources or an intermittent problem with the infrastructure, you may reset the gateway by clicking on the Reset button on the gateway widget. This will reset the gateway state to Pending allowing you to reconfigure the gateway and start provisioning of a new gateway cluster.
If the problem persists, please contact Palette support, via the Service Desk.
Palette maintains the OS image and all configurations for the cloud gateway. Periodically, the OS images, configurations, or other components need to be upgraded to resolve security or functionality issues. Palette releases such upgrades when required and communication about the same is presented in the form of an upgrade notification on the gateway.
Administrators should review the changes and apply them at a suitable time. Upgrading a cloud gateway does not result in any downtime for the Tenant Clusters. During the upgrade process, the provisioning of new clusters might be temporarily unavailable. New cluster requests are queued while the gateway is being upgraded and are processed as soon as the gateway upgrade is complete.
The following steps need to be performed to delete a cloud gateway:
- As a Tenant Administrator, navigate to the Private Cloud Gateway page under Settings.
- Invoke the Delete action on the cloud gateway instance that needs to be deleted.
- The system performs a validation to ensure there are no running tenant clusters associated with the gateway instance being deleted. If such instances are found, the system presents an error. Delete relevant running tenant clusters and retry the deletion of the cloud gateway.
- Delete the Gateway Virtual Machines from vSphere.
A cloud gateway can be set up as a 1-node or a 3-node cluster. For production environments, it is recommended that three (3) nodes are set up. A cloud gateway can be initially set up with one (1) node and resized to three (3) nodes at a later time. The following steps need to be performed to resize a 1-node cloud gateway cluster to a 3-node gateway cluster:
- As a Tenant Administrator, navigate to the Private Cloud Gateway page under Settings.
- Invoke the resize action for the relevant cloud gateway instance.
- Update the size from one (1) to three (3).
- The gateway upgrade begins shortly after the update. Two new nodes are created on vSphere and the gateway is upgraded to a 3-node cluster.
Palette supports DHCP as well as Static IP based allocation strategies for the VMs that are launched during cluster creation. IP Pools can be defined, using a range or a subnet. Administrators can define one or more IP pools linked to a private cloud gateway.
Clusters created using a private cloud gateway can select from the IP pools linked to the corresponding private cloud gateway. By default, IP Pools are shared across multiple clusters, but can optionally be restricted to a cluster.
The following is a description of various IP Pool properties:
Property | Description |
---|---|
Name | Descriptive name for the IP Pool. This name will be displayed for IP Pool selection when static IP is chosen as the IP allocation strategy |
Network Type | Select Range to provide a start and an end IP address. IPs within this range will become part of this pool. Alternatively, select Subnet to provide the IP range in CIDR format. |
Start | First IP address of a range-based IP Pool. E.g., 10.10.183.1 |
End | Last IP address of a range-based IP Pool. E.g., 10.10.183.100 |
Subnet | CIDR to allocate a set of IP addresses for a subnet-based IP Pool. E.g., 10.10.183.64/26 |
Subnet Prefix | Network subnet prefix. E.g., /18 |
Gateway | Network gateway. E.g., 10.128.1.1 |
Name server addresses | A comma-separated list of name servers. E.g., 8.8.8.8 |
Restrict to a Single Cluster | Select this option to reserve the pool for the first cluster that uses this pool. By default, IP pools can be shared across clusters. |
In addition to the default cloud account already associated with the private cloud gateway, new user cloud accounts can be created for the different vSphere users.
Property | Description |
---|---|
Account Name | Custom name for the cloud account |
Private cloud gateway | Reference to a running cloud gateway |
vCenter Server | IP or FQDN of the vCenter server |
Username | vCenter username |
Password | vCenter password |
The following steps need to be performed to provision a new VMware cluster:
- Provide the basic cluster information like Name, Description, and Tags. Tags are currently not propagated to the Virtual Machines (VMs) deployed on the cloud/Datacenter environments.
- Select a Cluster Profile created for the VMware environment. The profile definition will be used as the cluster construction template.
- Review and override Pack Parameters as desired. By default, parameters for all Packs are set with values defined in the Cluster Profile.
- Provide a vSphere Cloud account and placement information.
Parameter | Description |
---|---|
Cloud Account | Select the desired cloud account. VMware cloud accounts with credentials need to be preconfigured in the Project Settings section. An account is auto-created as part of the cloud gateway setup and is available for provisioning of Tenant Clusters if permitted by the administrator. |
Datacenter | The vSphere Datacenter where the cluster nodes will be launched. |
Deployment Folder | The vSphere VM Folder where the cluster nodes will be launched. |
Image Template Folder | The vSphere folder to which the Spectro templates are imported. |
SSH Keys (Optional) | Public key to configure remote SSH access to the nodes (User: spectro). |
NTP Server (Optional) | Set up time synchronization for all the running nodes. |
IP Allocation strategy | DHCP or Static IP. |
- Configure the master and worker node pools. Fill out the input fields in the Add node pool page. The following two tables explain the available input parameters for the master node pool and the worker node pool, respectively.
Parameter | Description |
---|---|
Name | A descriptive name for the node pool. |
Size | Number of VMs to be provisioned for the node pool. For the master pool, this number can be 1, 3, or 5. |
Allow worker capability | Select this option to allow workloads to be provisioned on master nodes. |
Labels | Add labels to the nodes in the pool to apply placement constraints to pods, such as marking a node as eligible to receive a particular workload. |
Taints | Apply taints to the nodes in the pool; only pods with matching tolerations are allowed (but not required) to be scheduled onto these nodes (see the kubectl sketch after the worker pool table below). |
Instance type | Select the compute instance type to be used for all nodes in the node pool. |
Availability Zones | Choose one or more availability zones. Palette provides fault tolerance to guard against hardware failures, network failures, etc., by provisioning nodes across availability zones if multiple zones are selected. |
Disk Size | Provide the required storage size |
Parameter | Description |
---|---|
Name | A descriptive name for the node pool. |
Enable Autoscaler | You can enable the autoscaler by toggling the Enable Autoscaler button. The autoscaler scales resources up and down between the defined minimum and maximum number of nodes to optimize resource utilization. Set the scaling limits with Minimum Size and Maximum Size; based on the workload, the number of nodes scales up from the minimum value to the maximum value and scales back down from the maximum value to the minimum value. |
Size | Number of VMs to be provisioned for the node pool. |
Rolling Update | Rolling update has two available options. Review the Update Parameter table below for more details. |
Labels | Add labels to the nodes in the pool to apply placement constraints to pods, such as marking a node as eligible to receive a particular workload. |
Taints | Apply taints to the nodes in the pool; only pods with matching tolerations are allowed (but not required) to be scheduled onto these nodes (see the kubectl sketch after this table). |
Instance type | Select the compute instance type to be used for all nodes in the node pool. |
Availability Zones | Choose one or more availability zones. Palette provides fault tolerance to guard against hardware failures, network failures, etc., by provisioning nodes across availability zones if multiple zones are selected. |
Disk Size | Provide the required storage size |
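For context, the node pool Labels and Taints settings correspond to standard Kubernetes node labels and taints. The sketch below shows the equivalent manual kubectl operations; the node name, label, and taint values are illustrative only:

```bash
# Label a node so pods using a matching nodeSelector can be placed on it.
kubectl label nodes <node-name> pool=worker-pool-1

# Taint a node so only pods with a matching toleration may be scheduled on it.
kubectl taint nodes <node-name> dedicated=worker-pool-1:NoSchedule

# A pod tolerating that taint (and selecting the label) would declare:
#   nodeSelector:
#     pool: worker-pool-1
#   tolerations:
#   - key: "dedicated"
#     operator: "Equal"
#     value: "worker-pool-1"
#     effect: "NoSchedule"
```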
- Review settings and deploy the cluster. Provisioning status with details of ongoing provisioning tasks is available to track progress.
The deletion of a VMware cluster results in the removal of all Virtual machines and associated storage disks created for the cluster. The following tasks need to be performed to delete a VMware cluster:
- Select the cluster to be deleted from the Cluster View page and navigate to the Cluster Overview page.
- Invoke the delete action available on the page: Cluster > Settings > Cluster Settings > Delete Cluster.
- Click Confirm to delete.
The Cluster Status is updated to Deleting while the Cluster Resources are being deleted. Provisioning status is updated with the ongoing progress of the delete operation. Once all resources are successfully deleted, the Cluster Status changes to Deleted and is removed from the list of Clusters.
A cluster stuck in the Deleting state can be force deleted by the user through the User Interface. Force deletion is available only if the cluster has been stuck in the deleting state for a minimum of 15 minutes. Palette enables cluster force delete from the Tenant Admin and Project Admin scope.
- Log in to the Palette Management Console.
- Navigate to the Cluster Details page of the cluster that is stuck in deletion.
- If the deletion status has been stuck for more than 15 minutes, click the Force Delete Cluster button from the Settings dropdown.
- If the Force Delete Cluster button is not enabled, wait for 15 minutes. The Settings dropdown shows the estimated time until the Force Delete button is enabled automatically.