Overview

This video will step you through the process I followed in my own environment, running on a Dell PowerEdge R650. Prior to recording this video, I experienced a few different errors because the BIOS was not up to date and I attempted to enable SR-IOV with three GPUs installed:

  • SR-IOV Enabled / Needs reboot


  • PST0208 System BIOS has halted
  • UEFI0036 Unable to initialize the iDRAC Shared Memory Architecture (SMA) interface
  • UEFI0134 Unable to allocate Memory Mapped Input Output (MMIO) resources for one or more PCIe devices because of insufficient MMIO memory


The video shows how to install the Intel drivers, upgrade the BIOS, enable SR-IOV in the BIOS, and enable SR-IOV on each GPU.

Completing these steps is required to allow the GPU resources to be shared across multiple virtual machines (VMs) in a vSphere 7 or vSphere 8 environment. To get the most out of the Intel Flex GPU line, vSphere 8 is recommended.

Prepare

  1. Download the latest BIOS (.exe file) from Dell for your server. Failure to do so may result in errors when enabling SR-IOV.

  2. Download the latest Intel Drivers for ESXi 7u3 or 8u2 from https://www.intel.com/content/www/us/en/download/786751/intel-data-center-graphics-driver-for-vmware-esxi.html

  3. Extract the .zip so that you have the Intel-idcgpu_.zip and Intel-idcgputools_.zip files ready

  4. Transfer/copy the two Intel zip files to your ESXi host using your preferred method. Be sure to use the driver specified for your ESXi Host version:

scp ./Intel*.zip root@esxi-server:/tmp
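
Before copying, it can help to confirm that both bundles were actually extracted. The following is a minimal sketch, assuming a POSIX shell on your workstation. `check_bundles` is a hypothetical helper name, and the glob patterns are borrowed from the install commands used later in this post.

```shell
# Hypothetical helper: return 0 only if both Intel driver bundles are
# present in the given directory (defaults to the current directory).
check_bundles() {
    dir="${1:-.}"
    for pattern in 'Intel-idcgpu_*.zip' 'Intel-idcgputools*.zip'; do
        # Expand the glob into the positional parameters; if nothing
        # matches, $1 remains the literal pattern and -e fails.
        set -- "$dir"/$pattern
        if [ ! -e "$1" ]; then
            echo "missing: $dir/$pattern" >&2
            return 1
        fi
    done
    return 0
}

# Example usage (not executed here):
#   check_bundles . && scp ./Intel*.zip root@esxi-server:/tmp
```

Checking first avoids an scp that silently copies only one of the two files.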

Instructions

  1. Place the host in maintenance mode using the ESXi Host Client, vCenter, or the following command at the console:
esxcli system maintenanceMode set -e true
  2. Verify that the host is in maintenance mode: esxcli system maintenanceMode get

  3. Open a console or SSH session to your ESXi host and install the Intel drivers:

esxcli software component apply --no-sig-check -d /tmp/*Intel-idcgpu_*.zip
esxcli software component apply --no-sig-check -d /tmp/*Intel-idcgputools*.zip
  4. Use iDRAC to upgrade your BIOS to the latest version. When prompted, choose Apply at next reboot, but DON’T REBOOT yet. If iDRAC is unavailable, use whichever method you have available to upgrade the BIOS

  5. Enable SR-IOV in your BIOS. Using iDRAC, this is found under: Configuration -> BIOS Settings -> Integrated Devices -> SR-IOV Global Enable. When done, select “At Next Reboot”

  6. Reboot your host. When using iDRAC as noted above, the BIOS upgrade is placed in the job queue ahead of the SR-IOV configuration change. The end result is that at the next reboot the BIOS is upgraded, SR-IOV is enabled, and the host boots back into the installed OS


  7. Exit maintenance mode:
esxcli system maintenanceMode set -e false
  8. Using either the ESXi Host Client or vCenter, enable SR-IOV on each GPU listed under PCI Devices, setting the number of VFs to 6 for the Flex 140 (I’m not certain about this value)


  9. Verify that all VFs show as Passthrough PCIe Devices
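
The maintenance-mode check and the driver install above can also be verified from a script rather than by eye. This is a rough sketch and not something shown in the video. It assumes that `esxcli system maintenanceMode get` prints a single word (`Enabled` or `Disabled`) and that `esxcli software component list` puts the component name in the first column. Both function names are hypothetical.

```shell
# Hypothetical helper: succeeds when the first line on stdin is "Enabled".
# Assumption: this matches the output of `esxcli system maintenanceMode get`.
in_maintenance_mode() {
    read -r state
    [ "$state" = "Enabled" ]
}

# Hypothetical helper: succeeds when a component named exactly $1 appears
# in the first column of the listing supplied on stdin.
# Assumption: the listing comes from `esxcli software component list`.
has_component() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example usage on the ESXi host (not executed here):
#   esxcli system maintenanceMode get | in_maintenance_mode \
#       || echo "host is NOT in maintenance mode"
#   esxcli software component list | has_component Intel-idcgpu \
#       || echo "Intel-idcgpu is not installed"
```

Both helpers only read text from standard input, so they can be tried against saved command output before running them on a live host.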


How to Uninstall the Intel Drivers

  1. Place the host in maintenance mode:
esxcli system maintenanceMode set -e true
  2. Use the console or SSH to uninstall the drivers:
esxcli software component remove -n Intel-idcgputools
esxcli software component remove -n Intel-idcgpu
  3. Reboot
  4. Exit maintenance mode:
esxcli system maintenanceMode set -e false
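
After the reboot, you may want to confirm that nothing was left behind. The sketch below is an assumption-laden example rather than part of the original procedure. It assumes the component name is the first column of `esxcli software component list` output, and `leftover_intel_components` is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of any Intel GPU components that
# still appear in the component listing supplied on stdin.
leftover_intel_components() {
    awk '$1 ~ /^Intel-idcgpu/ { print $1 }'
}

# Example usage on the ESXi host (not executed here):
#   left=$(esxcli software component list | leftover_intel_components)
#   [ -z "$left" ] || echo "still installed: $left"
```

No output means both the driver and the tools component are gone.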