Release Notes
This document provides information about the 3.6.2 release of the Ultima Enterprise product running on Baremetal, VMware vSphere, AWS, and GCP.
What’s New
The release has the following software versions:
Kubernetes 1.25.14
CRI-O version 1.25
Kubevirt version 0.58.0
Rocky OS 8.6
Resource Maximums
The release supports the following maximum number of resources (per node):
Resource (per node) | Maximum
---|---
Pods | 110
Volumes | 2048 (total) / 64 (active)*
Storage controllers | 64
Remote storage controllers | 64
The maximum number of volumes on a node is 2048. This number is equal to the sum of all mirrors of all volumes, snapshots, linked clones, etc. A snapshot is considered a volume.
The maximum number of snapshots per volume is 16.
The maximum number of linked-clone volumes on a snapshot is up to the maximum number of volumes on a node.
The number of volumes exposed to a host is 64 (active volumes), and the number of volumes a node can serve as a target is also 64. Therefore, at any time, a node can expose 64 volumes to the host while also serving 64 volumes as a target. For example, in a three-node cluster (the arithmetic behind these figures is shown after the list):
The number of simple volumes (single mirror): 6K
The number of 2-way mirrored volumes: 3K
The number of 3-way mirrored volumes: 2K
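These example figures follow from the per-node maximum of 2048 volume mirrors, assuming the listed values use 1K = 1024:
3 nodes x 2048 mirrors per node = 6144 mirrors in the cluster
Single-mirror volumes: 6144 / 1 = 6144 (about 6K)
2-way mirrored volumes: 6144 / 2 = 3072 (about 3K)
3-way mirrored volumes: 6144 / 3 = 2048 (2K)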
Release Requirements
Diamanti 3.6.2 uses Diamanti OS release rocky 8.6.0-44 on the following platforms:
Baremetal machine
AWS AMI
GCP VM image
VMware vSphere OVA package
The following machine types are supported as Diamanti cluster nodes:
AWS
Machine Type | vCPU | Memory
---|---|---
m5d.16xlarge | 64 | 256 GiB
i4i.16xlarge | 64 | 512 GiB
GCP
Machine Type | vCPU | Memory
---|---|---
n1-highmem-32 | 32 | 208 GB
n1-standard-32 | 32 | 120 GB
Supported Regions
This section lists the supported cloud regions for the Diamanti 3.6.2 release.
AWS
us-east-1
us-east-2
us-west-1
us-west-2
ca-central-1
eu-central-1
eu-west-1
eu-west-2
eu-west-3
eu-north-1
ap-south-1
ap-southeast-1
ap-southeast-2
ap-northeast-1
ap-northeast-2
ap-northeast-3
GCP
us-central1
us-east1
us-east4
us-west1
us-west2
us-west3
us-west4
europe-central2
europe-north1
europe-west1
europe-west2
europe-west3
europe-west4
europe-west6
asia-east1
asia-east2
asia-northeast1
asia-northeast2
asia-northeast3
asia-south1
asia-south2
asia-southeast1
asia-southeast2
australia-southeast1
australia-southeast2
northamerica-northeast1
northamerica-northeast2
southamerica-east1
Known Issues
This section lists the known issues for the Diamanti 3.6.2 release.
Summary: A pod with a volume may get stuck during termination.
Description: On pod termination, kubelet sometimes fails to unmount CSI volumes, and the following error appears in the kubelet log:
E0422 13:28:12.214727 1861 reconciler.go:193] operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume "persistentvolumemount-6wg4ywt2xwwgkyygj6tyo2xhrm" (UniqueName: "kubernetes.io/csi/rook-ceph.cephfs.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-97172dfa-9e15-11eb-8ecd-26ee97ae76aa") pod "dc3f11b7-d671-464c-a734-65e1760ed7df" (UID: "dc3f11b7-d671-464c-a734-65e1760ed7df") : UnmountVolume.NewUnmounter failed for volume "persistentvolumemount-6wg4ywt2xwwgkyygj6tyo2xhrm" (UniqueName: "kubernetes.io/csi/rook-ceph.cephfs.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-97172dfa-9e15-11eb-8ecd-26ee97ae76aa") pod "dc3f11b7-d671-464c-a734-65e1760ed7df" (UID: "dc3f11b7-d671-464c-a734-65e1760ed7df") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/dc3f11b7-d671-464c-a734-65e1760ed7df/volumes/kubernetes.io~csi/pvc-b27ae1f5-bfa6-4951-aebb-ff33280796df/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/dc3f11b7-d671-464c-a734-65e1760ed7df/volumes/kubernetes.io~csi/pvc-b27ae1f5-bfa6-4951-aebb-ff33280796df/vol_data.json]: open /var/lib/kubelet/pods/dc3f11b7-d671-464c-a734-65e1760ed7df/volumes/kubernetes.io~csi/pvc-b27ae1f5-bfa6-4951-aebb-ff33280796df/vol_data.json: no such file or directory
This is a known Kubernetes issue (https://github.com/kubernetes/kubernetes/issues/101378); the kubelet log shows "UnmountVolume.NewUnmounter failed for volume" (https://github.com/kubernetes/kubernetes/issues/101911). Once this bug is triggered, the pod is stuck in the Terminating state and the above error message appears continuously in the kubelet log.
Workaround: Delete the pod forcefully. This does not affect the volume, since the volume is no longer in use.
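For example, assuming the stuck pod is named my-app-0 in the default namespace (the name here is illustrative, not taken from the issue above), a forced deletion looks like this:
$ kubectl delete pod my-app-0 --grace-period=0 --force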
Summary: When a node is rebooted, or restarted after a failure, the Kubevirt VM may take longer to terminate before it comes back up.
Description: With the current behavior of Kubevirt, a VM does not move to another node unless there is a clean shutdown or reboot.
Workaround: In this case, the VM pod must be force-deleted.
Take the following steps to recover. First, check that the virtual machine pod is stuck in the Terminating state:
$ kubectl get pod
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
virt-launcher-centos-kbd95 1/1 Terminating 0 41m 172.46.0.10 static-n3 <none> 1/1
Run the following command to force delete the VM pod. After the pod is deleted forcefully, a new pod enters the ContainerCreating state.
$ kubectl delete pod virt-launcher-centos-kbd95 --force
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "virt-launcher-centos-kbd95" force deleted
$ kubectl get pod
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
virt-launcher-centos-zh7sw 0/1 ContainerCreating 0 6s <none> static-n1 <none> 1/1
During container creation, the volume attachment may stay in the creating state for up to 6 minutes until it is deleted by Kubernetes. Manually deleting the volume attachment speeds up pod startup.
Note
Delete the volume attachment only if the node is a failed node. In this case, static-n3 is the failed node and static-n1 is the new node on which the new VM pod is scheduled, so it is safe to delete the attachment.
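To identify the stale attachment, you can list the volume attachments and look for the one that still references the failed node (static-n3 in this example):
$ kubectl get volumeattachment
Then delete the stale attachment: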
$ kubectl delete volumeattachment csi-d5f636881867fd7be98e44eea6fe5ce5a0fe68fe4c9beef19da829128d17a892
On node static-n1, the pod enters the running state.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
virt-launcher-centos-zh7sw 1/1 Running 0 6m31s 172.46.8.7 static-n1 <none> 1/1
Summary: A kubevirt/virtio-container-disk image does not load by default in an air-gapped cluster.
Description: The kubevirt/virtio-container-disk image for virtio drivers is not loaded by default in an air-gapped cluster when a VM is created from an ISO file in Kubevirt.
Workaround: Manually load the kubevirt/virtio-container-disk image on all the nodes that have Kubevirt enabled.
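For example, on a machine with registry access you can save the image to a tarball, copy it to each Kubevirt-enabled node, and load it there. This sketch assumes podman is available on the nodes and that the image is pulled from quay.io; adjust the registry path and tag to match your environment:
$ podman pull quay.io/kubevirt/virtio-container-disk:latest
$ podman save -o virtio-container-disk.tar quay.io/kubevirt/virtio-container-disk:latest
# Copy virtio-container-disk.tar to each node that has Kubevirt enabled, then on each node:
$ podman load -i virtio-container-disk.tar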
Summary: Node shutdown is not supported; if a node is shut down, a plex becomes unusable.
Description: Because shutdowns are not supported, the mirror plex present on the shut-down node needs to be removed from the volume and then added again.
Workaround: A node shutdown may result in unusable plexes that need to be removed from the mirrored volume and drives that need to be formatted. A new plex can then be added back to maintain the number of plexes on the volume. For more information, see Recovery from Node Shutdown for AWS and Recovery from Node Shutdown for GCP.
Summary: Rebooting may cause a target plex of a mirrored volume to go out of sync.
Description: Target plexes of mirrored volumes go out of sync if one of the target nodes is rebooted, owing to inconsistencies with the other plexes of the same mirrored volume.
Workaround: Detach the out-of-sync plex, attach it back, and let it resynchronize.
This example shows how to find the plex that is out of sync and how to detach and reattach it. Identify the volume that is out of sync by describing the volume:
$ dctl volume describe test-vol8
Name                             : test-vol8
Size                             : 38.3GB
Encryption                       : false
Node                             : [ip-172-31-1-245.ec2.internal ip-172-31-3-166.ec2.internal ip-172-31-2-171.ec2.internal]
Label                            : diamanti.com/pod-name=default/test-vol8-attached-manually
Node Selector                    : mirror=true
Phase                            : Available
Status                           : Down
Attached-To                      : ip-172-31-3-166.ec2.internal
Device Path                      : /dev/nvme8n1
Age                              : 0d:2h:27m
Perf-Tier                        : best-effort
Mode                             : Filesystem
Fs-Type                          : ext4
Scheduled Plexes / Actual Plexes : 3/3
Plexes:
NAME           NODES                          STATE   CONDITION   OUT-OF-SYNC-AGE   RESYNC-PROGRESS   DELETE-PROGRESS
----           -----                          -----   ---------   ---------------   ---------------   ---------------
test-vol8.p0   ip-172-31-1-245.ec2.internal   Up      OutOfSync   0d:0h:13m
test-vol8.p1   ip-172-31-3-166.ec2.internal   Up      InSync
test-vol8.p2   ip-172-31-2-171.ec2.internal   Up      InSync
Detach the plex that is out of sync using the command below:
$ dctl volume plex-detach test-vol8 p0
Attach the plex back using the command below:
$ dctl volume plex-attach test-vol8 p0
Legal Notices
Publication Date: This document was published on September 26, 2023.
Copyright
Copyright © 2016-2023, Diamanti. All rights reserved.
Diamanti believes the information it furnishes to be accurate and reliable. However, Diamanti assumes no responsibility for the use of this information, nor any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, copyright, or other intellectual property right of Diamanti except as specifically described by applicable user licenses. Diamanti reserves the right to change specifications at any time without notice.
Trademarks
Diamanti and the Diamanti GUI are trademarks or service marks of Diamanti, in the U.S. and other countries, and may not be used without Diamanti’s express written consent. All other product and company names herein may be trademarks of their respective owners.