How to use VFIO to assign a device to nested VM

M Castelino
Apr 11, 2019


  • Here the vfio-pci device is passed through to the L1 VM
  • The L1 VM is set up with kernel_irqchip=split
  • L0 exposes a virtual IOMMU (the intel-iommu device) to the L1 VM

The L1 VM is launched from L0 as follows:

qemu-system-x86_64 \
-machine q35,accel=kvm,kernel_irqchip=split \
-enable-kvm \
-bios OVMF.fd \
-smp sockets=1,cpus=4,cores=2 -cpu host \
-m 1024 \
-vga none -nographic \
-drive file="$IMAGE",if=virtio,aio=threads,format=raw \
-netdev user,id=mynet0,hostfwd=tcp::${VMN}0022-:22,hostfwd=tcp::${VMN}2375-:2375 \
-device virtio-net-pci,netdev=mynet0 \
-device virtio-rng-pci \
-monitor telnet:127.0.0.1:55555,server,nowait \
-debugcon file:debug.log -global isa-debugcon.iobase=0x402 "$@" \
-device intel-iommu,intremap=on,caching-mode=on \
-device vfio-pci,host=b3:00.0

Within the L1 VM, the kernel log shows the virtual IOMMU being detected:

root@clr-d8a5d96d9a844656bcab094780f420b2 ~ # dmesg | grep -e DMAR -e IOMMU
[ 0.000000] ACPI: DMAR 0x000000003E86C000 000048 (v01 BOCHS BXPCDMAR 00000001 BXPC 00000001)
[ 0.000000] DMAR: IOMMU enabled
[ 0.145746] DMAR: Host address width 39
[ 0.145747] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
[ 0.145769] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 12008c22260286 ecap f00f5a
[ 0.145776] DMAR: No RMRR found
[ 0.145776] DMAR: No ATSR found
[ 0.145825] DMAR: dmar0: Using Queued invalidation
[ 0.218192] DMAR: Setting RMRR:
[ 0.218193] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 0.219038] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 0.257194] DMAR: Intel(R) Virtualization Technology for Directed I/O

You will also see IOMMU groups set up within the L1 VM:

root@clr-d8a5d96d9a844656bcab094780f420b2 ~ # lspci -v -s 00:03.0
00:03.0 Serial controller: MosChip Semiconductor Technology Ltd. 4-Port PCIe Serial Adapter (prog-if 02 [16550])
Subsystem: Device a000:1000
Flags: bus master, fast devsel, latency 0, IRQ 23
I/O ports at 60e0 [size=8]
Memory at 90003000 (32-bit, non-prefetchable) [size=4K]
Memory at 90002000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [78] Power Management version 3
Kernel driver in use: serial
root@clr-d8a5d96d9a844656bcab094780f420b2 ~ # find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/5/devices/0000:00:1f.2
/sys/kernel/iommu_groups/5/devices/0000:00:1f.0
/sys/kernel/iommu_groups/5/devices/0000:00:1f.3
/sys/kernel/iommu_groups/3/devices/0000:00:03.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/4/devices/0000:00:04.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
root@clr-d8a5d96d9a844656bcab094780f420b2 ~ # readlink /sys/kernel/iommu_groups/3/devices/0000:00:03.0
../../../../devices/pci0000:00/0000:00:03.0
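
The find output above is unordered, which makes it harder to see which devices share a group. A minimal sketch that prints one "group: device" pair per line is shown here. The function name list_iommu_groups and the IOMMU_GROUPS_DIR variable are assumptions introduced for illustration; the variable only exists so the function can be pointed at a scratch directory and defaults to the real sysfs path.

```shell
#!/bin/sh
# List every IOMMU group together with the PCI devices it contains.
# IOMMU_GROUPS_DIR is a hypothetical override for testing; by default the
# real sysfs location used in the article is read.
list_iommu_groups() {
    root="${IOMMU_GROUPS_DIR:-/sys/kernel/iommu_groups}"
    for group in "$root"/*; do
        # Skip anything that is not a group directory (including an unmatched glob).
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            # Entries are symlinks into /sys/devices; accept dangling ones too.
            [ -e "$dev" ] || [ -L "$dev" ] || continue
            printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
        done
    done
}

list_iommu_groups
```

A device that is alone in its group, like 0000:00:03.0 in the listing above, can be handed to VFIO without also having to hand over its neighbours.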

The device we assigned through VFIO is now in its own IOMMU group and can in turn be assigned, using VFIO in L1, to an L2 VM.
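
The article does not show the rebinding step inside L1, so the following is only a sketch of one common way to do it, using the kernel's driver_override sysfs interface. The function name bind_to_vfio and the PCI_SYSFS variable are assumptions introduced here; PCI_SYSFS defaults to /sys/bus/pci and exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Rebind a PCI device to vfio-pci inside the L1 VM. Must run as root, and
# the vfio-pci module must already be loaded (modprobe vfio-pci).
# PCI_SYSFS is a hypothetical override for testing.
bind_to_vfio() {
    bdf="$1"                                   # full address, e.g. 0000:00:03.0
    pci="${PCI_SYSFS:-/sys/bus/pci}"
    dev="$pci/devices/$bdf"

    # Restrict the device so that only vfio-pci may claim it.
    echo vfio-pci > "$dev/driver_override" || return 1

    # Detach whatever driver currently owns the device, if any.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind" || return 1
    fi

    # Ask the PCI core to probe the device again; vfio-pci picks it up.
    echo "$bdf" > "$pci/drivers_probe"
}
```

Under these assumptions, running bind_to_vfio 0000:00:03.0 in L1 and then starting the L2 guest with -device vfio-pci,host=00:03.0 mirrors what L0 did with host=b3:00.0.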

The L1 VM is booted with IOMMU support by passing intel_iommu=on on its kernel command line.
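
If the DMAR lines or the IOMMU groups are missing in L1, a first thing to check is whether the parameter actually reached the kernel. A small sketch follows; the function name has_intel_iommu and the CMDLINE_FILE variable are assumptions for illustration, with CMDLINE_FILE defaulting to /proc/cmdline.

```shell
#!/bin/sh
# Succeed if intel_iommu=on is present on the kernel command line.
# CMDLINE_FILE is a hypothetical override for testing.
has_intel_iommu() {
    cmdline=$(cat "${CMDLINE_FILE:-/proc/cmdline}" 2>/dev/null)
    # Pad with spaces so only a whole argument matches.
    case " $cmdline " in
        *" intel_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_intel_iommu; then
    echo "intel_iommu=on is set"
else
    echo "intel_iommu=on is missing from the kernel command line"
fi
```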

Assigning virtio devices to vfio

If a virtio device is to be assigned through VFIO, it needs to be passed to QEMU with the following options:

-device virtio-net-pci,netdev=mynet0,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on

Documentation can be found at:
https://wiki.qemu.org/Features/VT-d#Command_Line_Example_2
Although the device is a virtio-net device, at the PCI level it is bound to the virtio-pci driver.
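
One way to confirm which PCI driver owns a device is to follow its driver symlink in sysfs. This is a sketch only: the function name pci_driver_of and the PCI_SYSFS variable are assumptions introduced here (PCI_SYSFS defaults to /sys/bus/pci), and the address passed in has to be looked up with lspci in your own L1 VM.

```shell
#!/bin/sh
# Print the name of the PCI driver bound to a device, or "none" if unbound.
# PCI_SYSFS is a hypothetical override for testing.
pci_driver_of() {
    dev="${PCI_SYSFS:-/sys/bus/pci}/devices/$1"
    if [ -L "$dev/driver" ]; then
        link=$(readlink "$dev/driver")
        # The symlink points at .../drivers/<name>; keep only the last component.
        echo "${link##*/}"
    else
        echo "none"
    fi
}
```

Called with the address of the virtio-net device, this is expected to print virtio-pci before rebinding, which is the driver that has to release the device before vfio-pci can claim it.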
