Another short update from me: surgery has been confirmed and booked for Tuesday morning. Katie is traveling up Monday and then it’s game on. Unfortunately the beard and most of the hair on my upper body have to go, but it will all grow back eventually.
It goes without saying, but thank you to everyone helping out at home. The distance is starting to hurt, but with surgery I’ll be one step closer to home next week.
Hopefully out of ICU by the end of the week and posting more boring updates. In the meantime keep the girls in your thoughts and I’ll catch you all on the other side.
(15/12/23) I’ve decided to post the text of a series of Instagram posts I’ve made to my private account detailing what I’ve been up to over the last 8 weeks while in hospital. The original posts and associated pictures can be found below; assuming I know you, feel free to send a follow request!
I’m still waiting on the various surgical teams to come to an agreement on my procedure, essentially two standalone operations interwoven into one. Antibiotics are still flowing but vastly reduced from previous weeks. I also finally have a PICC line, so the daily cannula and bloods torture is over for now and my arms can heal. Still hopeful for progress this week, then onto recovery and eventually home.
A quick post to highlight that I’m currently leading a weekly call aiming to continue the work of defining and then seeding special interest groups (SIGs) within the KubeVirt community.
Defining the high level aspects of SIGs within KubeVirt using k8s as a starting point
What is a SIG?
What should a SIG do?
How do we create and remove SIGs?
Populating kubevirt/community with an initial set of SIGs
Populating a single SIG and creating the required collateral (such as a charter etc) within kubevirt/community
Assigning code ownership using OWNERS files to the SIG across the core kubevirt/kubevirt project
There are also related topics, such as the need for a steering-committee-like group to oversee the process in the future, that are being discussed but might eventually be handled outside of the call.
All current and potential future contributors to the KubeVirt project are welcome on the call so please do feel free to join if you would like to help shape this future aspect of the project!
Please note however that the calls have now moved to a new time slot of Thursday @ 11:00 EDT / 14:00 UTC / 15:00 BST / 16:00 CEST after a number of last minute cancellations due to downstream conflicts.
Finally, the recording of our previous call from August 17th is available below if you would like to review our earlier discussions:
Please also feel free to file bugs or enhancements against https://github.com/kubevirt/kubevirt/issues (or the relevant sub-project) using the /area instancetype command to label these for review and triage!
As the name suggests this can be used to control memory overcommit as a percentage within an instance type.
```
# Creating an instance type with an overcommitPercent of 15%
$ kubectl apply -f - <<EOF
---
apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineInstancetype
metadata:
  name: overcommit
spec:
  cpu:
    guest: 1
  memory:
    guest: 128Mi
    overcommitPercent: 15
EOF

# Creating a simple VirtualMachine (that auto starts) using this VirtualMachineInstancetype
$ virtctl create vm --instancetype virtualmachineinstancetype/overcommit \
    --volume-containerdisk name:cirros,src:registry:5000/kubevirt/cirros-container-disk-demo:devel \
    --name cirros | kubectl apply -f -

# We can see that the VirtualMachineInstance is exposing `128Mi` of memory
# to the guest but is only requesting `114085072` or ~`108Mi`
$ kubectl get vmi/cirros -o json | jq .spec.domain.memory,.spec.domain.resources
{
  "guest": "128Mi"
}
{
  "requests": {
    "memory": "114085072"
  }
}
```
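For reference, 128Mi is 134217728 bytes, and removing 15% of that leaves roughly 114085069 bytes (~108Mi), which lines up with the request shown in the output above.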
The following hopefully self-explanatory preference attributes have also been introduced:
In addition to the above standalone preference attributes, a new Spec.Requirements attribute has been added. At present this can encapsulate the minimum CPU and memory requirements of the preference that need to be provided by the underlying VirtualMachine or associated VirtualMachine{ClusterInstancetype,Instancetype}.
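Below is a minimal sketch of what a preference carrying such requirements could look like; the exact field layout under spec.requirements is assumed from the description above rather than taken from the release notes:

```yaml
# Hedged sketch only: a preference requiring at least 2 vCPUs and 2Gi of guest
# memory from any instance type or VirtualMachine it is combined with.
apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: requires-2-cpus-2gi
spec:
  requirements:
    cpu:
      guest: 2
    memory:
      guest: 2Gi
```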
With the introduction of instancetype.kubevirt.io/v1beta1 the older instancetype.kubevirt.io/v1alpha{1,2} versions have been deprecated ahead of removal in a future release (likely KubeVirt >= v1.2.0).
As with the recent deprecation of kubevirt.io/v1alpha3, any users of these older instancetype.kubevirt.io/v1alpha{1,2} versions are recommended to use the kube-storage-version-migrator tool to migrate the stored version of these objects to instancetype.kubevirt.io/v1beta1. For operators of OKD/OCP environments this tool is provided through the cluster-kube-storage-version-migrator-operator.
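As a rough illustration, a StorageVersionMigration resource along the following lines could be used to trigger that migration; the spec here is assumed from the upstream kube-storage-version-migrator API and one such resource would be needed per instancetype.kubevirt.io CRD:

```yaml
# Hypothetical example: rewrite stored VirtualMachineClusterInstancetype objects
# so they are persisted at the current storage version; repeat for the other
# instancetype.kubevirt.io resources as required.
apiVersion: migration.k8s.io/v1alpha1
kind: StorageVersionMigration
metadata:
  name: virtualmachineclusterinstancetypes-v1beta1
spec:
  resource:
    group: instancetype.kubevirt.io
    version: v1beta1
    resource: virtualmachineclusterinstancetypes
```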
Work to migrate ControllerRevisions containing older instancetype.kubevirt.io/v1alpha{1,2} objects will be undertaken during the v1.1.0 release of KubeVirt and can be tracked below:
Implement a conversion strategy for instancetype.kubevirt.io/v1alpha{1,2} objects stored in ControllerRevisions to instancetype.kubevirt.io/v1beta1: https://github.com/kubevirt/kubevirt/issues/9909
The virtctl image-upload command has been extended with two new switches to label the resulting DataVolume or PVC with a default instance type and preference.
```
# Upload a CirrOS image using a DataVolume and label it with a default instance type and preference
$ virtctl image-upload dv cirros --size=1Gi \
    --default-instancetype n1.medium \
    --default-preference cirros \
    --force-bind \
    --image-path=./cirros-0.6.1-x86_64-disk.img

# Check that the resulting DV and PVC have been labelled correctly
$ kubectl get dv/cirros -o json | jq .metadata.labels
{
  "instancetype.kubevirt.io/default-instancetype": "n1.medium",
  "instancetype.kubevirt.io/default-preference": "cirros"
}

$ kubectl get pvc/cirros -o json | jq .metadata.labels
{
  "app": "containerized-data-importer",
  "app.kubernetes.io/component": "storage",
  "app.kubernetes.io/managed-by": "cdi-controller",
  "instancetype.kubevirt.io/default-instancetype": "n1.medium",
  "instancetype.kubevirt.io/default-preference": "cirros"
}
```
```
# Use virtctl to create a VirtualMachine manifest using the inferFromVolume option for the instance type and preference
$ virtctl create vm --volume-pvc=name:cirros,src:cirros \
    --infer-instancetype \
    --infer-preference \
    --name cirros | yq .
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  creationTimestamp: null
  name: cirros
spec:
  instancetype:
    inferFromVolume: cirros
  preference:
    inferFromVolume: cirros
  runStrategy: Always
  template:
    metadata:
      creationTimestamp: null
    spec:
      domain:
        devices: {}
        resources: {}
      terminationGracePeriodSeconds: 180
      volumes:
      - name: cirros
        persistentVolumeClaim:
          claimName: cirros
status: {}

# Pass the manifest to kubectl and then check that the resulting instance type and preference matchers have been expanded correctly
$ virtctl create vm --volume-pvc=name:cirros,src:cirros \
    --infer-instancetype \
    --infer-preference \
    --name cirros | kubectl apply -f -
virtualmachine.kubevirt.io/cirros created

$ kubectl get vms/cirros -o json | jq '.spec.instancetype,.spec.preference'
{
  "kind": "virtualmachineclusterinstancetype",
  "name": "n1.medium",
  "revisionName": "cirros-n1.medium-1cbceb96-2771-497b-a4b7-7cad6742b385-1"
}
{
  "kind": "virtualmachineclusterpreference",
  "name": "cirros",
  "revisionName": "cirros-cirros-efc9aeac-05df-4034-aa82-45817ca0b6dc-1"
}
```
A number of related core kubevirt.io and specific instancetype.kubevirt.io bugs were resolved as part of the upcoming KubeVirt v1.0.0 release; these include:
This was caused by validation logic shared between the VirtualMachine and VirtualMachineInstance webhooks without shared defaulting also being in place, resulting in perfectly valid VirtualMachines being rejected.
The fix allowed us to drop the need for modelling resource requests within instance types entirely and focus instead on the guest visible resources as per the original design.
This bug allowed a user to mutate the name of an {Instancetype,Preference}Matcher even when a revisionName was present. This would have no impact on the VirtualMachine or running VirtualMachineInstance until the revisionName was dropped, only then causing a copy of the newly referenced resource to be stashed in a ControllerRevision and used.
The fix was to reject such requests and ensure the name is only updated when the revisionName is empty.
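As a quick illustration of the new behaviour, an update along the following lines should now be rejected while a revisionName is still populated (the resource and instance type names are reused from the earlier examples):

```
# Hypothetical check: with spec.instancetype.revisionName still set, changing
# the matcher name should now be rejected by the VirtualMachine webhook.
$ kubectl patch vm/cirros --type merge \
    -p '{"spec":{"instancetype":{"name":"n1.large"}}}'
```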
There are likely more corner cases here so any and all feedback would be welcome to help us tackle these!
This bug caused very old VirtualMachineInstancetypeSpecRevision objects from instancetype.kubevirt.io/v1alpha1 to be captured in a ControllerRevision with a missing apiVersion attribute thanks to a known [kubernetes/client-go#541](https://github.com/kubernetes/client-go/issues/541) issue.
The fix was to ignore the apiVersion entirely when dealing with these older ControllerRevisions and handle conversion per object as we do for >= instancetype.kubevirt.io/v1alpha2.
If anything this bug highlights the need to deprecate and remove support for these older versions as soon as possible to lower the overhead in maintaining the API and CRDs.
With the introduction of the OvercommitPercent attribute in instancetype.kubevirt.io/v1beta1 we have introduced a new O (Overcommitted) instance type class. Initially this class sets OvercommitPercent to 50%:
```yaml
apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineClusterInstancetype
metadata:
  annotations:
    instancetype.kubevirt.io/class: Overcommitted
    instancetype.kubevirt.io/description: |-
      The O Series is based on the N Series, with the only difference
      being that memory is overcommitted.
      *O* is the abbreviation for "Overcommitted".
    instancetype.kubevirt.io/version: "1"
  labels:
    instancetype.kubevirt.io/vendor: kubevirt.io
    instancetype.kubevirt.io/common-instancetypes-version: v0.3.0
  name: o1.medium
spec:
  cpu:
    guest: 1
  memory:
    guest: 4Gi
    overcommitPercent: 50
```
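In practice this means a VirtualMachine using o1.medium exposes 4Gi to the guest while only requesting roughly 2Gi from the cluster. Assuming the class above is deployed, it can be referenced as usual, for example:

```
# Hypothetical usage; the volume is omitted for brevity and the class name is
# taken from the manifest above.
$ virtctl create vm --instancetype o1.medium --name overcommitted | kubectl apply -f -
```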
s/Neutral/Universal/g instance type class
The N (Neutral) instance type class has been renamed to U for Universal after several discussions about the future introduction of a new, network-focused N set of instance type classes. The latter is still being discussed, but if you have a specific use case you think we could cover in this family then please let me know!
Deprecation of legacy instance types
```
$ kubectl get virtualmachineclusterinstancetypes \
    -linstancetype.kubevirt.io/deprecated=true
selecting podman as container runtime
NAME                     AGE
highperformance.large    3h53m
highperformance.medium   3h53m
highperformance.small    3h53m
server.large             3h53m
server.medium            3h53m
server.micro             3h53m
server.small             3h53m
server.tiny              3h53m
```
Resource labels
The following resource labels have been added to each hyperscale
instance type to aid users searching for a type with specific resources:
instancetype.kubevirt.io/cpu
instancetype.kubevirt.io/memory
Additionally, the following optional boolean labels have been added to relevant instance types to help users looking for more specific resources and features:
instancetype.kubevirt.io/dedicatedCPUPlacement
instancetype.kubevirt.io/hugepages
instancetype.kubevirt.io/isolateEmulatorThread
instancetype.kubevirt.io/numa
instancetype.kubevirt.io/gpus
```
$ kubectl get virtualmachineclusterinstancetype \
    -linstancetype.kubevirt.io/hugepages=true
NAME          AGE
cx1.2xlarge   113s
cx1.4xlarge   113s
cx1.8xlarge   113s
cx1.large     113s
cx1.medium    113s
cx1.xlarge    113s
m1.2xlarge    113s
m1.4xlarge    113s
m1.8xlarge    113s
m1.large      113s
m1.xlarge     113s
```
```
$ kubectl get virtualmachineclusterinstancetype \
    -linstancetype.kubevirt.io/cpu=4
NAME         AGE
cx1.xlarge   3m8s
gn1.xlarge   3m8s
m1.xlarge    3m8s
n1.xlarge    3m8s
```
```
$ kubectl get virtualmachineclusterinstancetype \
    -linstancetype.kubevirt.io/cpu=4,instancetype.kubevirt.io/hugepages=true
NAME         AGE
cx1.xlarge   5m47s
m1.xlarge    5m47s
```
Version label
All released resources are now labelled with an instancetype.kubevirt.io/common-instancetypes-version label denoting the release the resource came from.
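This makes it straightforward to list everything shipped by a given release, for example (the label value here is only illustrative):

```
$ kubectl get virtualmachineclusterinstancetypes,virtualmachineclusterpreferences \
    -linstancetype.kubevirt.io/common-instancetypes-version=v0.3.0
```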
With instancetype.kubevirt.io/v1beta1 out the door it’s time to start planning v1. At the moment there isn’t a need for a v1beta2 but I’m always open to introducing that first if the need arises.
Deployment of common-instancetypes from virt-operator
This has been talked about for a long time and raised a few times in my blog posts, but the time has definitely come to look at it seriously with KubeVirt v1.1.0 and instancetype.kubevirt.io/v1.
A formal community design proposal (or an enhancement if I get my way) will be written up in the coming weeks setting out how we might be able to achieve this.
Migration of existing ControllerRevisions to the latest instancetype.kubevirt.io version
Again, this has long been talked about, but with a possible move to instancetype.kubevirt.io/v1 I really want to enable the removal of older versions such as v1alpha{1,2} and v1beta1.
Reducing the number of created ControllerRevisions
The final item I want to complete in the next release addresses another long-discussed shortcoming of the original implementation of the API. With the growing use of the API and CRDs I do want to address this in v1.1.0.
I had the pleasure of delivering a very short KubeVirt Summit 2023 presentation with my colleague Felix last week covering some of our work around instance types and virtctl over the last year. Please find the slides and a recording below.
There have been some small changes to the design. Notably, DataSources are now supported as a target of InferFromVolume, and the previously listed camel-case annotations are now hyphenated labels (a sketch of a decorated DataSource follows the list below):
instancetype.kubevirt.io/default-instancetype
instancetype.kubevirt.io/default-instancetype-kind (Defaults to VirtualMachineClusterInstancetype)
instancetype.kubevirt.io/default-preference
instancetype.kubevirt.io/default-preference-kind (Defaults to VirtualMachineClusterPreference)
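The following is a hedged sketch of a DataSource decorated with these labels, with the names and label values borrowed from the CDI example later in this post:

```yaml
# Hypothetical DataSource carrying the default instance type and preference
# labels so that a VirtualMachine can infer them via inferFromVolume.
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataSource
metadata:
  name: centos-stream8
  namespace: kubevirt-os-images
  labels:
    instancetype.kubevirt.io/default-instancetype: server.medium
    instancetype.kubevirt.io/default-preference: centos.8.stream
spec:
  source:
    pvc:
      name: centos-stream8
      namespace: kubevirt-os-images
```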
Demo
I’ve recorded a new demo below using an SSP operator development environment; the demo now covers the following:
The creation of decorated DataImportCrons by the SSP operator
The deployment of kubevirt/common-instancetypes by the SSP operator
The creation of decorated DataSources and PVCs by CDI
The ability for KubeVirt to infer the default instance type and preference from a DataSource for a VirtualMachine
The previously discussed annotations have been replaced by labels to allow users (such as the downstream OpenShift UI within Red Hat) to use server-side filtering to find suitably decorated resources within a given cluster.
Changes have also been made to the CDI project ensuring these labels are passed down when importing volumes into an environment using the DataImportCron resource. Any DataVolumes, DataSources or PVCs created by this process will have these labels copied over from the initial DataImportCron. The following example is from an environment where the SSP operator has deployed a labelled DataImportCron to CDI:
```
$ kubectl get all,pvc -A -l instancetype.kubevirt.io/default-preference
NAMESPACE            NAME                                        AGE
kubevirt-os-images   datasource.cdi.kubevirt.io/centos-stream8   31m

NAMESPACE            NAME                                                        AGE
kubevirt-os-images   dataimportcron.cdi.kubevirt.io/centos-stream8-image-cron    4m29s

NAMESPACE            NAME                                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
kubevirt-os-images   persistentvolumeclaim/centos-stream8-2f16c067b974    Bound    pvc-4be6ea30-9d7d-480a-828c-38fa2abc6597   10Gi       RWX            rook-ceph-block   4m19s

$ kubectl get persistentvolumeclaim/centos-stream8-2f16c067b974 -n kubevirt-os-images -o json | jq .metadata.labels
{
  "app": "containerized-data-importer",
  "app.kubernetes.io/component": "storage",
  "app.kubernetes.io/managed-by": "cdi-controller",
  "cdi.kubevirt.io/dataImportCron": "centos-stream8-image-cron",
  "instancetype.kubevirt.io/default-instancetype": "server.medium",
  "instancetype.kubevirt.io/default-preference": "centos.8.stream"
}
```
I plan on recording and posting an updated demo shortly.
PreferredStorageClassName
A new PreferredStorageClassName preference has been added:
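A minimal sketch of how this might be expressed follows; the placement of the attribute under the volumes section of the preference spec is assumed, and the storage class name is only illustrative:

```yaml
# Hedged example only: a preference expressing a preferred storage class for
# volumes; field placement is assumed rather than confirmed.
apiVersion: instancetype.kubevirt.io/v1alpha2
kind: VirtualMachinePreference
metadata:
  name: preferred-storage-class
spec:
  volumes:
    preferredStorageClassName: rook-ceph-block
```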
The introduction of resource requests and possible move to make the guest visible resource requests optional has prompted us to look at introducing yet another alpha version of the API:
The logic being that we can’t make part of the API optional without moving to a new version and we can’t move to v1beta1 while making changes to the API. This version should remain backwardly compatible with the older versions but work is still required to see if a conversion strategy is required for stored objects both in etcd and in ControllerRevisions.
<=v1alpha2 Deprecation
With the introduction of a new API version I also want to start looking into what it will take to deprecate our older versions while we are still in alpha:
This issue sets out the following tasks to be investigated:
- [ ] Introduce a new v1alpha3 version ahead of backwardly incompatible changes landing
- [ ] Deprecate v1alpha1 and v1alpha2 versions
- [ ] Implement a conversion strategy for stored objects from v1alpha1 and v1alpha2
- [ ] Implement a conversion strategy for objects stored in ControllerRevisions associated with existing VirtualMachines
This work could well be deferred until after v1beta1, but it’s still a useful mental exercise to plan out what will eventually be required.
Preference Resource Requirements
A while ago I quickly drafted an idea around expressing the resource requirements of a workload within VirtualMachinePreferenceSpec:
The PR is still pretty rough but the included demo text sets out what I’d like to eventually achieve with the feature. The general idea is to ensure that an instance type or raw VirtualMachine definition using a given preference provides the required resources to run a given workload correctly.