KubeVirt instancetype.kubevirt.io Update #6

Welcome to part #6 of this series following the development of instance types and preferences within KubeVirt!

It’s been over two years since the last update, during which the instance type and preference APIs have matured significantly. This update covers the major milestones achieved between KubeVirt v1.0.0 and v1.7.0.

Feedback

Feedback is always welcome through the upstream kubevirt-dev mailing list, the upstream Slack channel #kubevirt-dev, or directly to me via lyarwood at redhat dot com.

Please also feel free to file bugs or enhancements against https://github.com/kubevirt/kubevirt/issues using the /area instancetype command to label these for review and triage.

Major Milestones

Removal of instancetype.kubevirt.io/v1alpha{1,2}

As planned in the previous update, the deprecated v1alpha1 and v1alpha2 API versions have been removed. Users should ensure they’ve migrated to v1beta1 before upgrading to recent KubeVirt releases.

Deployment of common-instancetypes from virt-operator

A long-awaited feature has landed - virt-operator now deploys the common-instancetypes bundles directly. This eliminates the need for separate installation steps and ensures that a standard set of instance types and preferences is available out of the box.

The latest deployed version is v1.5.1, providing a comprehensive set of instance type classes and OS-specific preferences.
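
For example, a VirtualMachine can now reference the bundled cluster-wide resources immediately after installing KubeVirt. The following is a minimal sketch assuming the u1.medium instance type and fedora preference names shipped by the common-instancetypes bundles and a generic Fedora containerdisk; adjust the names to whatever your cluster provides:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: fedora-example
spec:
  # Cluster-wide kinds deployed by virt-operator from common-instancetypes
  instancetype:
    kind: VirtualMachineClusterInstancetype
    name: u1.medium
  preference:
    kind: VirtualMachineClusterPreference
    name: fedora
  runStrategy: Always
  template:
    spec:
      domain:
        devices:
          disks:
            - name: containerdisk
              # No bus is set so any preferred disk bus from the preference applies
              disk: {}
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest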

New Instance Type Features

IOThreads Support

Instance types can now configure IOThreads for improved storage performance:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineInstancetype
metadata:
  name: io-optimized
spec:
  cpu:
    guest: 4
  memory:
    guest: 8Gi
  ioThreadsPolicy: auto

CPU Hotplug Support via MaxSockets

The maxSockets field defines the maximum number of sockets available for CPU hotplug:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineInstancetype
metadata:
  name: scalable
spec:
  cpu:
    guest: 2
    maxSockets: 4
  memory:
    guest: 4Gi

Memory Hotplug Support via MaxGuest

Similarly, maxGuest enables memory hotplug:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineInstancetype
metadata:
  name: memory-scalable
spec:
  cpu:
    guest: 2
  memory:
    guest: 4Gi
    maxGuest: 16Gi

Scheduling Controls

Instance types now support nodeSelector and schedulerName for fine-grained scheduling control. The nodeSelector field accepts any node labels present in the cluster, whether defined by administrators or applied by other operators:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineInstancetype
metadata:
  name: gpu-workload
spec:
  cpu:
    guest: 8
  memory:
    guest: 16Gi
  nodeSelector:
    nvidia.com/gpu.present: "true"
  schedulerName: custom-scheduler

Required Annotations

Instance types can now specify required annotations to be applied to VirtualMachineInstances:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachineInstancetype
metadata:
  name: annotated
spec:
  cpu:
    guest: 2
  memory:
    guest: 4Gi
  annotations:
    example.com/required-annotation: "value"

New Preference Features

Enhanced CPU Topology Control

New CPU topology options provide greater flexibility:

  • Spread topology: Evenly distributes vCPUs across sockets and cores
  • Any topology: Allows the VirtualMachine to define its own topology while still using a preference
  • SpreadOptions: Fine-grained control over spreading via the across (SocketsCores, CoresThreads, SocketsCoresThreads) and ratio fields, as shown below

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: spread-example
spec:
  cpu:
    preferredCPUTopology: spread
    spreadOptions:
      across: SocketsCores
      ratio: 2

Note: The original preferCores, preferSockets, preferThreads constants have been deprecated in favor of shorter cores, sockets, threads names.

Architecture Support

Preferences can now specify a preferred architecture:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: arm64-optimized
spec:
  preferredArchitecture: arm64

Panic Device Support

Panic devices can now be configured through preferences for improved crash diagnostics.
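
As a rough sketch only, a preference enabling a panic device might look something like the following; note that the preferredPanicDeviceModel field name and the isa value are illustrative assumptions here rather than confirmed API, so please check the current API reference for the exact schema:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: panic-device
spec:
  devices:
    # Hypothetical field name and value, for illustration only
    preferredPanicDeviceModel: isa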

Firmware Preferences Update

PreferredEfi has been introduced to replace the deprecated PreferredUseEfi and PreferredUseSecureBoot fields, providing a more flexible firmware configuration mechanism.
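
For example, a preference requesting EFI with SecureBoot could now be expressed as follows - a minimal sketch assuming preferredEfi mirrors the EFI bootloader fields of the core VirtualMachine firmware API:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: efi-secureboot
spec:
  firmware:
    # Replaces the deprecated preferredUseEfi and preferredUseSecureBoot booleans
    preferredEfi:
      secureBoot: true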

Preferred Annotations

Similar to instance types, preferences can specify preferred annotations:

apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: windows
spec:
  annotations:
    example.com/os: "windows"

ControllerRevision Upgrades

Support for upgrading ControllerRevisions to newer API versions has been implemented, enabling seamless migration as the API evolves.

What’s Coming Next

The most significant upcoming change is the promotion of the API to v1, planned for KubeVirt v1.8.0. This milestone is tracked through VEP #17 and implemented in PR #16598.

Open instance type and preference issues and enhancements are tracked under the area/instancetype label: https://github.com/kubevirt/kubevirt/issues?q=is%3Aopen%20label%3Aarea%2Finstancetype

The instance type and preference APIs continue to evolve based on community feedback. Check the KubeVirt issue tracker for upcoming enhancements and contribute your ideas!

DevConf.cz 2024 presentation - Streamlining VM creation within KubeVirt

I had the pleasure of delivering the following talk in person this year at DevConf.cz in Brno.

There were some challenges with the audio, and my delivery was rusty at best, but the presentation is hopefully still understandable and useful for folks.

There were plenty of questions at the end of the talk and even more in the hallway/booth track outside the lecture hall. Feel free to comment on the post if you have more or reach out directly using my contact details on the opening slide.

Video

Slides

KubeVirtCI - How to deploy an env with CPUManager and multiple host NUMA nodes

With the introduction of PR #1174 and PR #1171, kubevirtci users can now deploy CPUManager- and NUMA-enabled environments with ease:

$ export \
 KUBEVIRT_PROVIDER=k8s-1.29 \
 KUBEVIRT_MEMORY_SIZE=$((16 * 1024))M \
 KUBEVIRT_HUGEPAGES_2M=$((4 * 1024)) \
 KUBEVIRT_CPU_MANAGER_POLICY=static \
 KUBEVIRT_NUM_NUMA_NODES=2 \
 KUBEVIRT_NUM_VCPU=16
$ cd kubevirt && make cluster-up && make cluster-sync
[..]
$ ./cluster-up/ssh.sh node01
[vagrant@node01 ~]$ sudo dnf install numactl -y && numactl --hardware
[..]
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 7996 MB
node 0 free: 1508 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 8017 MB
node 1 free: 1265 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 
[..]
$ ./cluster-up/kubectl.sh patch kv/kubevirt -n kubevirt --type merge \
  -p '{"spec":{"configuration":{"developerConfiguration":{"featureGates": ["CPUManager","NUMA"]}}}}'
[..]
$ ./cluster-up/kubectl.sh apply -f -<<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: example
spec:
  domain:
    cpu:
      cores: 9
      dedicatedCpuPlacement: true
      numa:
        guestMappingPassthrough: {}
    devices:
      disks:
        - disk:
            bus: virtio
          name: containerdisk
        - disk:
            bus: virtio
          name: cloudinitdisk
    resources:
      requests:
        memory: 1Gi
    memory:
      hugepages:
        pageSize: 2Mi
  volumes:
    - containerDisk:
        image: quay.io/containerdisks/fedora:39
      name: containerdisk
    - cloudInitNoCloud:
        userData: |
          #!/bin/sh
          mkdir -p  /home/fedora/.ssh
          curl https://github.com/lyarwood.keys > /home/fedora/.ssh/authorized_keys
          chown -R fedora: /home/fedora/.ssh
      name: cloudinitdisk
EOF
[..]
$ ./cluster-up/virtctl.sh ssh -lfedora example
[fedora@example ~]$ sudo dnf install numactl -y && numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 446 MB
node 0 free: 243 MB
node 1 cpus: 1 2 3 4 5 6 7 8
node 1 size: 499 MB
node 1 free: 109 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10
[..]
$ ./cluster-up/kubectl.sh exec pods/virt-launcher-example-h852s -- virsh vcpuinfo 1
[..]
VCPU:           0
CPU:            1
State:          running
CPU time:       16.1s
CPU Affinity:   -y--------------

VCPU:           1
CPU:            8
State:          running
CPU time:       2.3s
CPU Affinity:   --------y-------

VCPU:           2
CPU:            9
State:          running
CPU time:       4.6s
CPU Affinity:   ---------y------

VCPU:           3
CPU:            10
State:          running
CPU time:       2.5s
CPU Affinity:   ----------y-----

VCPU:           4
CPU:            11
State:          running
CPU time:       2.3s
CPU Affinity:   -----------y----

VCPU:           5
CPU:            12
State:          running
CPU time:       1.4s
CPU Affinity:   ------------y---

VCPU:           6
CPU:            13
State:          running
CPU time:       1.3s
CPU Affinity:   -------------y--

VCPU:           7
CPU:            14
State:          running
CPU time:       2.1s
CPU Affinity:   --------------y-

VCPU:           8
CPU:            15
State:          running
CPU time:       2.1s
CPU Affinity:   ---------------y

This has already allowed me to report bug #11749 regarding vCPUs being exposed as threads pinned to non-thread sibling pCPUs on hosts without SMT when using dedicatedCpuPlacement. It has also helped greatly with the design of SpreadOptions, which aims to allow instance type users to expose more realistic vCPU topologies to their workloads by extending the existing PreferSpread preferredCPUTopology option.

I’m also looking at breaking up the KUBEVIRT_NUM_VCPU env variable to better control the host CPU topology within a kubevirtci environment, but as yet I haven’t found the time to work on this ahead of the rewrite of vm.sh in Go via PR #1164.

Feel free to reach out if you have any other ideas for kubevirtci or issues with the above changes! Hopefully someone finds this work useful.

whereis lyarwood? - Back online!

After just over 4 months off I’m returning to work on Monday February 26th.

What follows is a brief overview of why I was offline in an attempt to avoid repeating myself and likely boring people with the story over the coming weeks.

tl;dr - If in doubt always seek multiple medical opinions!

As crudely documented at the time on my private Instagram account and later copied into this blog, I was admitted to hospital back in October after finally getting a second opinion on some symptoms I had been having for the prior ~6 months. These symptoms included fevers, uncontrollable shivering and exhaustion but had unfortunately been misdiagnosed as night sweats. I was about to see an Endocrinologist when my condition really deteriorated and my wife finally pushed me to seek the advice of a second GP.

Within a few minutes of seeing the GP a new diagnosis was suggested and I was quickly admitted to hospital. There we discovered that my ICD, first implanted 19 years earlier for primary prevention, had become completely infected with a ~40mm long mass (yum!) hanging onto the end of the wire within my heart. The eventual diagnosis would be cardiac device related infective endocarditis. I was extremely lucky that the mass hadn’t already detached, causing a fatal heart attack or stroke.

If that wasn’t enough, I also somehow managed to contract COVID within a few days of being in hospital and had to spend a considerable amount of time in isolation while my initial course of antibiotics was being given. This was definitely a low point, but the Coronary Care Unit and Lugg Ward teams at Hereford County Hospital were excellent throughout.

Once I was COVID negative I was transferred to St Bartholomew’s Hospital in London and had an emergency device extraction via open heart surgery on November 14th. While the surgery went well there were complications in the form of liquid (~600ml) building up around my heart and lungs that led to more surgery at the start of December. During this time I remained on a cocktail of IV antibiotics that finally stopped in the middle of December and after numerous blood cultures (including one false positive!) I had an S-ICD implanted on December 20th before being discharged home the next day.

I’ve spent the time since recovering at home with family as my sternum and numerous wounds slowly heal. This time has been extremely important and I can’t thank Red Hat and my management chain enough for their support with this extended sick leave. I’ve been able to focus on reconnecting with my wife and girls Rose (3.5) and Phoebe (9 months) after 9 extremely difficult weeks apart. I’ve also ventured back to London for checkups and to thank friends who helped me through the weeks away from home. I’ve got many many more people to thank in the coming months.

I now feel mentally ready to return to work but know it’s going to be a little while longer before I’m fully physically recovered.

Thanks to anyone who reached out during this time and I’ll catch you all upstream in the coming weeks!

Update 26/02/24

I’m back to work today, many thanks once again for all of the responses to this on LinkedIn and elsewhere, they are all greatly appreciated!

whereis lyarwood? - The 80 20 rule - Update #7

Unfortunately, while I’m feeling much better and basically back to normal, a set of blood cultures taken last Friday somehow managed to grow bacteria from the same family (Staphylococcaceae) as my original infection. Additional cultures taken on Saturday and Monday did not grow anything, leading to the assumption that the first set was somehow contaminated.

To ensure this really was the case another two sets of cultures are being taken today and should allow for the all clear to be given some time on Sunday evening.

This has obviously delayed my final S-ICD implant surgery but I’ve been assured that once the all clear is given the surgery should be able to take place on Monday or Tuesday at the latest. If anything does grow then I’m in for another 10 to 14 days of antibiotics before we try the process again.

I’m pretty gutted after assuming I’d be coming home this week but hopeful that things work out in the coming days and I can get home early next week instead.

Thanks as always to folks sending messages, visiting or helping out at home. We are almost there now.

whereis lyarwood? - Progress - Update #6

The last 7 days have been extremely difficult. Between being told by a nurse that my pain tolerance is too low (it isn’t, see below about my lung), having a drain and a bucket of blood attached to me for 4 days, and being left all weekend without my phone, laptop, tablet and glasses, it has honestly been the most trying week of my life.

The surgery to remove liquid from around my heart was successful (~750ml removed over 4 days) but we then discovered yet more liquid around my partially collapsed left lung. Thankfully the latter just required rest, exercise and time to clear and I was given the all clear yesterday for both.

This then allowed my antibiotics to finally be stopped as my infection/inflammation blood markers all came crashing down. It honestly felt weird not being hooked up to an IV for 3 hours while I attempted to sleep last night.

With the antibiotics stopped I could then be booked in for S-ICD implant surgery on Monday, the final thing required before I can go home.

All being well this means that I should be discharged sometime on Tuesday and finally able to go home to see my girls for the first time in almost 8 weeks.

To celebrate I’ve just ventured outside on my own and had a flat white for the first time since I was admitted.

As always I can’t thank everyone who made the effort to message, visit or help out back home enough. I can’t imagine what this would have been like without your support so thank you from the bottom of my now disinfected and hopefully healthy heart!

whereis lyarwood? - Bump in the road - Update #5

Morning all, unfortunately I’m heading back down to the basement at St Bart’s for more surgery today after liquid has built up around my heart.

This is a common complication from cardiac surgery and the procedure should in all likelihood be simple and over in about 60 minutes.

With this hopefully resolved I should then see my recovery continue after it plateaued over the last week or so, even with blood transfusions and the best efforts of the staff on the recovery ward.

Then it’s back to ensuring all of my inflammation and infection markers return to normal before stopping antibiotics and collecting blood cultures to ensure the infection is no more.

Once that’s done I’m then having an S-ICD implanted as an inpatient before leaving the hospital. This sits under the skin and doesn’t have a wire in the heart.

For anyone still following along it increasingly looks like I’m not going to be home for Christmas but given the recent discovery I’d rather return home fully fit than have to return here again in the new year.

Thank you once again to everyone who is chipping in at home with the girls, visiting me in London or just messaging (apologies if I don’t reply, often my head is just mush at the moment). We massively appreciate the support!
