KubeVirtCI - How to deploy an env with CPUManager and multiple host NUMA nodes

With the introduction of PR#1174 and PR#1171, kubevirtci users can now deploy CPUManager- and NUMA-enabled environments with ease:

$ export \
 KUBEVIRT_PROVIDER=k8s-1.29 \
 KUBEVIRT_MEMORY_SIZE=$((16 * 1024))M \
 KUBEVIRT_HUGEPAGES_2M=$((4 * 1024)) \
 KUBEVIRT_CPU_MANAGER_POLICY=static \
 KUBEVIRT_NUM_NUMA_NODES=2 \
 KUBEVIRT_NUM_VCPU=16
$ cd kubevirt && make cluster-up && make cluster-sync
[..]
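Once the cluster is up, the 2Mi hugepages requested above should appear as an allocatable resource on the node, for example:

$ ./cluster-up/kubectl.sh describe node node01 | grep hugepages-2Mi
[..]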
$ ./cluster-up/ssh.sh node01
[vagrant@node01 ~]$ sudo dnf install numactl -y && numactl --hardware
[..]
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 7996 MB
node 0 free: 1508 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 8017 MB
node 1 free: 1265 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 
[..]
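While still on the node, it’s also worth confirming that the kubelet picked up the static CPU manager policy. Assuming the default kubelet state directory, the policy is recorded in /var/lib/kubelet/cpu_manager_state and should report "policyName": "static":

[vagrant@node01 ~]$ sudo cat /var/lib/kubelet/cpu_manager_state
[..]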
$ ./cluster-up/kubectl.sh patch kv/kubevirt -n kubevirt --type merge \
  -p '{"spec":{"configuration":{"developerConfiguration":{"featureGates": ["CPUManager","NUMA"]}}}}'
[..]
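A quick way to confirm the feature gates have landed in the KubeVirt CR is to read them back with jsonpath:

$ ./cluster-up/kubectl.sh get kv/kubevirt -n kubevirt \
  -o jsonpath='{.spec.configuration.developerConfiguration.featureGates}'
[..]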
$ ./cluster-up/kubectl.sh apply -f -<<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: example
spec:
  domain:
    cpu:
      cores: 9
      dedicatedCpuPlacement: true
      numa:
        guestMappingPassthrough: {}
    devices:
      disks:
        - disk:
            bus: virtio
          name: containerdisk
        - disk:
            bus: virtio
          name: cloudinitdisk
    resources:
      requests:
        memory: 1Gi
    memory:
      hugepages:
        pageSize: 2Mi
  volumes:
    - containerDisk:
        image: quay.io/containerdisks/fedora:39
      name: containerdisk
    - cloudInitNoCloud:
        userData: |
          #!/bin/sh
          mkdir -p  /home/fedora/.ssh
          curl https://github.com/lyarwood.keys > /home/fedora/.ssh/authorized_keys
          chown -R fedora: /home/fedora/.ssh
      name: cloudinitdisk
EOF
[..]
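The VMI can take a little while to schedule and boot, so before connecting it may be worth waiting for it to report Ready:

$ ./cluster-up/kubectl.sh wait vmi/example --for=condition=Ready --timeout=5m
[..]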
$ ./cluster-up/virtctl.sh ssh -lfedora example
[fedora@example ~]$ sudo dnf install numactl -y && numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 446 MB
node 0 free: 243 MB
node 1 cpus: 1 2 3 4 5 6 7 8
node 1 size: 499 MB
node 1 free: 109 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10
[..]
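The virt-launcher pod name carries a generated suffix, so look it up before exec’ing into it. The kubevirt.io/domain label used here is an assumption and may differ between KubeVirt releases:

$ ./cluster-up/kubectl.sh get pods -l kubevirt.io/domain=example -o name
pod/virt-launcher-example-h852s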
$ ./cluster-up/kubectl.sh exec pods/virt-launcher-example-h852s -- virsh vcpuinfo 1
[..]
VCPU:           0
CPU:            1
State:          running
CPU time:       16.1s
CPU Affinity:   -y--------------

VCPU:           1
CPU:            8
State:          running
CPU time:       2.3s
CPU Affinity:   --------y-------

VCPU:           2
CPU:            9
State:          running
CPU time:       4.6s
CPU Affinity:   ---------y------

VCPU:           3
CPU:            10
State:          running
CPU time:       2.5s
CPU Affinity:   ----------y-----

VCPU:           4
CPU:            11
State:          running
CPU time:       2.3s
CPU Affinity:   -----------y----

VCPU:           5
CPU:            12
State:          running
CPU time:       1.4s
CPU Affinity:   ------------y---

VCPU:           6
CPU:            13
State:          running
CPU time:       1.3s
CPU Affinity:   -------------y--

VCPU:           7
CPU:            14
State:          running
CPU time:       2.1s
CPU Affinity:   --------------y-

VCPU:           8
CPU:            15
State:          running
CPU time:       2.1s
CPU Affinity:   ---------------y

This has already allowed me to report bug #11749 regarding vCPUs being exposed as threads pinned to non-thread-sibling pCPUs on hosts without SMT when using dedicatedCpuPlacement. It has also helped greatly with the design of SpreadOptions, which aims to allow instance type users to expose more realistic vCPU topologies to their workloads by extending the existing PreferSpread preferredCPUTopology option.
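For anyone curious about the latter, a rough sketch of a preference using PreferSpread is included below; this is only an illustration based on the instancetype.kubevirt.io/v1beta1 API at the time of writing and the exact fields may change as SpreadOptions lands:

$ ./cluster-up/kubectl.sh apply -f -<<EOF
apiVersion: instancetype.kubevirt.io/v1beta1
kind: VirtualMachinePreference
metadata:
  name: spread-example
spec:
  cpu:
    preferredCPUTopology: preferSpread
EOF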

I’m also looking at breaking up the KUBEVIRT_NUM_VCPU env variable to give finer control over the host CPU topology within a kubevirtci environment, but as yet I haven’t found the time to work on this ahead of the rewrite of vm.sh in Go via PR #1164.

Feel free to reach out if you have any other ideas for kubevirtci or issues with the above changes! Hopefully someone finds this work useful.
