Setup for VM
If your node will serve virtual machines, you need to install and configure Libvirt and QEMU.
To allow Libvirt to use a GPU in pass-through mode, you have to make sure the NVIDIA driver is not installed or not in use by the system.
0. Check NVIDIA Driver Installation
You can check if the NVIDIA driver is loaded and used by the system using the following command:
lsmod | grep nvidia
# or check if driver is used by your GPU:
lspci -k | grep -EA3 'VGA|3D|Display'
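For convenience, here is a minimal sketch that combines both checks; it assumes an NVIDIA GPU and only reports whether the driver is loaded:
# report whether any nvidia* kernel module is currently loaded
if lsmod | grep -q '^nvidia'; then
    echo "NVIDIA kernel driver is loaded; remove it before enabling passthrough."
else
    echo "No NVIDIA kernel driver loaded."
fi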
1. Remove NVIDIA Driver (if installed)
If the NVIDIA driver is installed, you can remove it using the following commands:
sudo apt-get remove --purge '^nvidia-.*'
sudo apt autoremove
echo 'nouveau' | sudo tee -a /etc/modules
sudo rm /etc/X11/xorg.conf
If you've installed the driver using the RUN file, remove it using:
sudo /usr/bin/nvidia-uninstall
Remove leftover configuration files, if any:
sudo rm -rf /etc/X11/xorg.conf
sudo rm -rf /etc/modprobe.d/nvidia*.conf
sudo rm -rf /lib/modprobe.d/nvidia*.conf
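To confirm the packages are gone, the following check (assuming a Debian/Ubuntu system with dpkg) should print nothing:
dpkg -l | grep -i '^ii.*nvidia'   # lists any NVIDIA packages still installed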
Restart your machine to apply the changes.
After the reboot, verify that the driver is uninstalled. The following command should return nothing:
lsmod | grep nvidia
2. Install System Dependencies
Install additional packages required by the service.
sudo apt update
sudo apt install qemu-kvm libvirt-daemon-system genisoimage whois
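After installation, you can optionally verify that the libvirt daemon is running (these commands assume a systemd-based distribution):
sudo systemctl is-active libvirtd   # should print "active"
sudo virsh version                  # prints libvirt and hypervisor versions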
3. Configure Libvirt
The Rift system service runs as the root user, so you need to configure libvirt to run under root as well.
Edit /etc/libvirt/qemu.conf and modify the user and group settings:
user = "root"
group = "root"
Then restart the libvirtd service:
sudo systemctl restart libvirtd
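As a quick sanity check, you can confirm the new settings are in place and the daemon restarted cleanly:
grep -E '^(user|group)' /etc/libvirt/qemu.conf   # both lines should be uncommented and set to "root"
sudo systemctl status libvirtd --no-pager        # should report active (running)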
You can use client_setup.sh to install libvirt and rift:
sudo client_setup.sh --only=vm
sudo client_setup.sh --only=rift
4. Check IOMMU Groups
Run the following script on the host system.
#!/bin/bash
for g in /sys/kernel/iommu_groups/*; do
    group_num=$(basename "$g")
    gpu_found=false
    device_lines=""
    for d in "$g"/devices/*; do
        pci_addr=$(basename "$d")
        line=$(lspci -nn -s "$pci_addr")
        device_lines+="$line"$'\n'
        if echo "$line" | grep -qE 'VGA compatible controller|3D controller'; then
            gpu_found=true
        fi
    done
    if $gpu_found; then
        echo "IOMMU Group $group_num:"
        echo "$device_lines"
    fi
done
Typically, you'll see just the GPU, a GPU + audio device, or in some cases GPU + audio + PCI bridges. All of these are normal.
IOMMU Group 112:
a0:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
a0:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
a1:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2b85] (rev a1)
a1:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
IOMMU Group 13:
40:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
40:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
41:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2b85] (rev a1)
41:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
If you see multiple GPUs in the same group or too many devices, your motherboard might not support all the features necessary for virtualization, and it may be tricky to work around this.
5. (Optional) Enable Other Virtualization Options
The iommu=pt and pci=realloc kernel parameters in GRUB are used to optimize I/O operations and improve performance, particularly in virtualized environments or when dealing with GPUs with large amounts of memory. The iommu=pt option enables IOMMU pass-through mode, bypassing DMA translation for better performance, while pci=realloc allows the kernel to reallocate PCI resources if the BIOS's initial allocation is insufficient.
Add the iommu=pt pci=realloc pcie_aspm=off options, together with intel_iommu=on or amd_iommu=on depending on your system, to the following line in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="... iommu=pt pci=realloc pcie_aspm=off amd_iommu=on (or intel_iommu=on)"
Here is a short summary of these options:
iommu=pt: IOMMU pass-through mode tells the kernel to enable the IOMMU but use pass-through mode for DMA mappings by default. For VFIO GPU passthrough, this allows the device to access physical memory directly with minimal performance penalty.
pci=realloc: Forces the kernel to reassign PCI bus resources (MMIO/IO BARs) from scratch, ignoring what the firmware/BIOS assigned. It helps avoid issues when the BIOS didn't allocate enough space for devices (common with large GPUs or multiple devices). Fixes "BAR can't be assigned" or "resource busy" errors.
pcie_aspm=off: Disables PCIe Active State Power Management, a power-saving feature that reduces PCIe link power in idle states. Some PCIe devices (especially GPUs) have trouble retraining links or waking from ASPM low-power states, leading to hangs or "device inaccessible" errors. Disabling ASPM can improve stability.
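Before regenerating GRUB, you can double-check the edited line:
grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub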
Then run the following commands (or defer this until after the next section):
sudo update-initramfs -u -k all
sudo update-grub
sudo reboot
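After the reboot, you can verify that the kernel picked up the new parameters and that the IOMMU is enabled; the exact dmesg wording varies by platform:
cat /proc/cmdline                      # should contain the options added above
sudo dmesg | grep -iE 'iommu|dmar'     # look for lines indicating the IOMMU is enabled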
6. Bind to VFIO Early
For greater stability, it is best to ensure that the VFIO driver is always bound and that no driver changes happen at runtime. To do so, blacklist all other drivers early and disable video mode setting so that the GPU is never claimed by the host system.
First, figure out the PCI vendor and device IDs of your GPUs. To do so, run lspci -nnk | grep NVIDIA. You'll see output like this (RTX 5090):
$ lspci -nnk | grep NVIDIA
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2b85] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:0000]
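If you prefer to extract the vendor:device pairs programmatically, a small sketch like this (assuming GNU grep and NVIDIA's vendor ID 10de) produces a comma-separated list suitable for vfio-pci.ids:
lspci -nn -d 10de: | grep -oP '(?<=\[)10de:[0-9a-f]{4}(?=\])' | sort -u | paste -sd,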
Then add the following options to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, changing the PCI vendor and device IDs appropriately. Keep the other options if needed.
GRUB_CMDLINE_LINUX_DEFAULT="nomodeset video=efifb:off modprobe.blacklist=nouveau,nvidia,nvidiafb,snd_hda_intel vfio-pci.ids=10de:2b85,10de:22e8 ..."
We add the driver binding options to GRUB because vfio is a built-in kernel module in many modern distributions. Many online manuals suggest configuring the vfio modules via /etc/modprobe.d/vfio.conf, but options placed under /etc/modprobe.d/ are ignored for built-in modules, because modprobe does not control them.
We also need to ensure that /etc/initramfs-tools/modules contains these lines:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
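If any of these lines is missing, a small idempotent sketch like the following appends it (assuming the file exists at the path above):
for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
    grep -qxF "$m" /etc/initramfs-tools/modules || echo "$m" | sudo tee -a /etc/initramfs-tools/modules
done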
Also, it is a good idea to disable power management for VFIO devices to improve stability.
echo "options vfio-pci disable_idle_d3=1" | sudo tee /etc/modprobe.d/vfio.conf
Finally, invoke:
sudo update-initramfs -u -k all
sudo update-grub
sudo reboot
After the reboot, check that the VFIO driver is in use. Run lspci -k | grep -A 2 -i nvidia; you should see vfio-pci listed as the kernel driver in use:
81:00.0 VGA compatible controller: NVIDIA Corporation Device 2b85 (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device 416f
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
81:00.1 Audio device: NVIDIA Corporation Device 22e8 (rev a1)
Subsystem: NVIDIA Corporation Device 0000
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
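If you have several GPUs, a short sketch like this lists the driver bound to every NVIDIA PCI function (assuming vendor ID 10de); each should report vfio-pci:
for d in $(lspci -Dn -d 10de: | awk '{print $1}'); do
    if [ -e "/sys/bus/pci/devices/$d/driver" ]; then
        echo "$d -> $(basename "$(readlink "/sys/bus/pci/devices/$d/driver")")"
    else
        echo "$d -> no driver bound"
    fi
done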
Next Steps
Proceed to the Disk Setup guide.