.. _sriov_virtualization:

Enable SR-IOV Virtualization
############################

SR-IOV (Single Root Input/Output Virtualization) can isolate PCIe devices
and deliver performance close to bare-metal levels. SR-IOV consists
of two basic units: the PF (Physical Function), which supports the SR-IOV
PCIe extended capability and manages the entire physical device; and the VF
(Virtual Function), a "lightweight" PCIe function that can be passed
through to VMs.

For details, refer to Chapter 9 of PCI-SIG's
`PCI Express Base Specification Revision 4.0, Version 1.0
<https://pcisig.com/pci-express-architecture-configuration-space-test-specification-revision-40-version-10>`_.

SR-IOV Architectural Overview
*****************************

.. figure:: images/sriov-image1.png
   :align: center
   :name: SR-IOV-architecture-overview

   SR-IOV Architectural Overview

-  **SI** - A System Image, also known as a VM.

-  **VI** - A Virtualization Intermediary, also known as a hypervisor.

-  **SR-PCIM** - A Single Root PCI Manager; a software entity for
   SR-IOV management.

-  **PF** - A PCIe Function that supports the SR-IOV capability
   and is accessible to an SR-PCIM, a VI, or an SI.

-  **VF** - A "lightweight" PCIe Function that is directly accessible by an
   SI.

SR-IOV Extended Capability
--------------------------

The SR-IOV Extended Capability defined here is a PCIe extended
capability that must be implemented in each PF device that supports the
SR-IOV feature. This capability is used to describe and control a PF's
SR-IOV capabilities.

.. figure:: images/sriov-image2.png
   :align: center
   :name: SR-IOV-extended-capability

   SR-IOV Extended Capability

-  **PCIe Extended Capability ID** - 0010h.

-  **SR-IOV Capabilities** - VF Migration-Capable and ARI-Capable.

-  **SR-IOV Control** - Enable/Disable VFs; VF migration state query.

-  **SR-IOV Status** - VF Migration Status.

-  **Initial VFs** - Indicates to the SR-PCIM the number of VFs that are
   initially associated with the PF.

-  **Total VFs** - Indicates the maximum number of VFs that can be
   associated with the PF.

-  **Num VFs** - Controls the number of VFs that are visible, where
   *Num VFs* <= *Initial VFs* = *Total VFs*.

-  **Function Link Dependency** - The field used to describe
   dependencies between PFs. VF dependencies are the same as the
   dependencies of their associated PFs.

-  **First VF Offset** - A constant that defines the Routing ID
   offset of the first VF that is associated with the PF that contains
   this Capability structure.

-  **VF Stride** - Defines the Routing ID offset from one VF to the
   next one for all VFs associated with the PF that contains this
   Capability structure.

-  **VF Device ID** - The field that contains the Device ID that should be
   presented for every VF to the SI.

-  **Supported Page Sizes** - The field that indicates the page sizes
   supported by the PF.

-  **System Page Size** - The field that defines the page size the system
   will use to map the VFs' memory addresses. Software must set the
   value of the *System Page Size* to one of the page sizes set in the
   *Supported Page Sizes* field.

-  **VF BARs** - Fields that define the VFs' Base Address
   Registers (BARs). These fields behave like normal PCI BARs.

-  **VF Migration State Array Offset** - Register that contains a
   PF BAR relative pointer to the VF Migration State Array.

-  **VF Migration State Array** - Located using the VF Migration
   State Array Offset register of the SR-IOV Capability block.

For details, refer to the *PCI Express Base Specification Revision 4.0, Version 1.0, Section 9.3.3*.

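As a worked example of how *First VF Offset* and *VF Stride* determine VF
addresses, the following sketch computes the bus/device/function of the n-th
VF from the address of its PF. The helper name ``vf_bdf`` is hypothetical, and
the offset (128) and stride (2) are illustrative values, not readings from a
specific device; the real values are reported in the PF's SR-IOV capability
(for example, by ``lspci -vvv``).

.. code-block:: bash

   # vf_bdf PF_BDF FIRST_VF_OFFSET VF_STRIDE N
   # Print bus:dev.fn of the N-th VF (N starts at 1). PF_BDF is "bus:dev.fn"
   # in hexadecimal without the PCI domain, for example 6d:00.0.
   vf_bdf() {
       local pf="$1" offset="$2" stride="$3" n="$4"
       local bus="${pf%%:*}" devfn="${pf#*:}"
       local dev="${devfn%%.*}" fn="${devfn#*.}"
       # A Routing ID packs bus (8 bits), device (5 bits), and function (3 bits).
       local pf_rid=$(( (0x$bus << 8) | (0x$dev << 3) | $fn ))
       # VF N sits at: PF Routing ID + First VF Offset + (N - 1) * VF Stride.
       local rid=$(( pf_rid + offset + (n - 1) * stride ))
       printf '%02x:%02x.%x\n' $(( rid >> 8 )) $(( (rid >> 3) & 0x1f )) $(( rid & 0x7 ))
   }

   vf_bdf 6d:00.0 128 2 1   # prints 6d:10.0
   vf_bdf 6d:00.0 128 2 2   # prints 6d:10.2

With these assumed values, the first VF of a PF at ``6d:00.0`` lands at
``6d:10.0`` and each following VF is two functions further on.
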
SR-IOV Architecture in ACRN
---------------------------

.. figure:: images/sriov-image3.png
   :align: center
   :name: SR-IOV-architecure-in-acrn

   SR-IOV Architecture in ACRN

1. The hypervisor detects an SR-IOV capable PCIe device during the physical
   PCI device enumeration phase.

2. The hypervisor intercepts accesses to the PF's SR-IOV capability and
   determines whether to enable or disable VF devices based on the
   ``VF_ENABLE`` state. All read/write requests for a PF device pass through
   to the PF physical device.

3. The hypervisor waits for 100 ms after ``VF_ENABLE`` is set and then
   initializes the VF devices. An SR-IOV VF device differs from a normal
   passthrough device in physical device detection, BARs, and MSI-X
   initialization. The hypervisor uses the ``Subsystem Vendor ID`` to detect
   the SR-IOV VF physical device instead of the ``Vendor ID`` because no valid
   ``Vendor ID`` exists for the SR-IOV VF physical device. The VF BARs are
   initialized from the associated PF's SR-IOV capability, not from the PCI
   standard BAR registers. The MSI-X mapping base address also comes from the
   PF's SR-IOV capability, not from the PCI standard BAR registers.

SR-IOV Passthrough VF Architecture in ACRN
------------------------------------------

.. figure:: images/sriov-image4.png
   :align: center
   :name: SR-IOV-vf-passthrough

   SR-IOV VF Passthrough Architecture in ACRN

1. The SR-IOV VF device must be bound to the ``pci-stub`` driver instead of
   the vendor-specific VF driver before the device is passed through.

2. The user configures the ``acrn-dm`` boot parameter with the passthrough
   SR-IOV VF device. When the User VM starts, ``acrn-dm`` invokes a
   hypercall to set the *vdev-VF0* device in the User VM.

3. The hypervisor emulates the ``Device ID/Vendor ID`` and ``Memory Space Enable
   (MSE)`` in the configuration space for an assigned SR-IOV VF device. The
   assigned VF ``Device ID`` comes from its associated PF's capability. The
   ``Vendor ID`` is the same as the PF's ``Vendor ID``, and the ``MSE`` is always
   set when reading the SR-IOV VF device's control register.

4. The vendor-specific VF driver in the target VM probes the assigned SR-IOV
   VF device.

SR-IOV Initialization Flow
--------------------------

.. figure:: images/sriov-image5.png
   :align: center
   :name: SR-IOV-init-flow

   SR-IOV Initialization Flow

When an SR-IOV capable device is initialized, all accesses to the
configuration space pass through to the physical device directly.
The Service VM can identify all capabilities of the device from the SR-IOV
extended capability and then create a *sysfs* node for SR-IOV management.

SR-IOV VF Enable Flow
---------------------

.. figure:: images/sriov-image6.png
   :align: center
   :width: 900px
   :name: SR-IOV-enable-flow

   SR-IOV VF Enable Flow

The application enables ``n`` VF devices via an SR-IOV PF device ``sysfs`` node.
The hypervisor intercepts all SR-IOV capability accesses and checks the
``VF_ENABLE`` state. If ``VF_ENABLE`` is set, the hypervisor creates ``n``
virtual devices after 100 ms so that the VF physical devices have enough time
to be created. The Service VM waits 100 ms and then accesses only the first VF
device's configuration space, including the Class Code, Revision ID, Subsystem
Vendor ID, and Subsystem ID. The Service VM uses the first VF device's
information to initialize subsequent VF devices.

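The application side of this flow can be sketched as follows, assuming the
standard Linux ``sriov_totalvfs`` and ``sriov_numvfs`` attributes in the PF's
``sysfs`` device directory. The function name ``enable_vfs`` is hypothetical,
and the sketch is not part of ACRN.

.. code-block:: bash

   # enable_vfs PF_DEVICE_DIR N
   # Enable N VFs on the PF whose sysfs device directory is PF_DEVICE_DIR,
   # for example /sys/class/net/enp109s0f0/device.
   enable_vfs() {
       local dev_dir="$1" n="$2" total
       total=$(cat "$dev_dir/sriov_totalvfs") || return 1
       if [ "$n" -lt 1 ] || [ "$n" -gt "$total" ]; then
           echo "n must be between 1 and $total" >&2
           return 1
       fi
       # Linux does not allow changing a nonzero VF count to another nonzero
       # value, so reset the count to 0 first if VFs are already enabled.
       if [ "$(cat "$dev_dir/sriov_numvfs")" -ne 0 ]; then
           echo 0 > "$dev_dir/sriov_numvfs" || return 1
       fi
       echo "$n" > "$dev_dir/sriov_numvfs"
   }

   # Example (run as root in the Service VM):
   #   enable_vfs /sys/class/net/enp109s0f0/device 2

Checking ``n`` against ``sriov_totalvfs`` before writing avoids a rejected
write when the request exceeds what the PF supports.
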
SR-IOV VF Disable Flow
----------------------

.. figure:: images/sriov-image7.png
   :align: center
   :name: SR-IOV-disable-flow

   SR-IOV VF Disable Flow

The application disables SR-IOV VF devices by writing zero to the SR-IOV PF
device ``sysfs`` node. The hypervisor intercepts all SR-IOV capability
accesses and checks the ``VF_ENABLE`` state. If ``VF_ENABLE`` is clear, the
hypervisor makes the VF virtual devices invisible to the Service VM so that
all accesses to VF devices return ``0xFFFFFFFF`` as an error. The VF physical
devices are removed within 1 s after ``VF_ENABLE`` is cleared.

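A minimal sketch of the disable step from the application's point of view. It
assumes the standard Linux ``sriov_numvfs`` attribute and the ``virtfnN``
symlinks that Linux creates under the PF's device directory for each VF. The
function names are hypothetical, and the polling loop is only a defensive
check sized to the 1 s window described above.

.. code-block:: bash

   # has_vfs PF_DEVICE_DIR: succeed if any virtfnN link still exists.
   has_vfs() {
       local link
       for link in "$1"/virtfn*; do
           if [ -e "$link" ]; then
               return 0
           fi
       done
       return 1
   }

   # disable_vfs PF_DEVICE_DIR
   # Write 0 to sriov_numvfs, then wait up to about 1 s for the VFs to go away.
   disable_vfs() {
       local dev_dir="$1" tries=10
       echo 0 > "$dev_dir/sriov_numvfs" || return 1
       while [ "$tries" -gt 0 ]; do
           if ! has_vfs "$dev_dir"; then
               return 0
           fi
           sleep 0.1
           tries=$(( tries - 1 ))
       done
       return 1
   }

   # Example (run as root in the Service VM):
   #   disable_vfs /sys/class/net/enp109s0f0/device
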
SR-IOV VF Assignment Policy
---------------------------

.. figure:: images/sriov-image8.png
   :align: center
   :name: SR-IOV-vf-assignment

   SR-IOV VF Assignment

1. All SR-IOV PF devices are managed by the Service VM.

2. An SR-IOV PF cannot be passed through to a User VM.

3. All VFs can be passed through to a User VM, but we do not recommend
   passing them through to high-privilege VMs because the PF device may
   impact the assigned VFs' functionality and stability.

SR-IOV Usage Guide in ACRN
--------------------------

We use the Intel 82576 NIC as an example in the following instructions. Only
LaaG (Linux as a Guest) is supported.

1. Ensure that the 82576 VF driver is compiled into the User VM kernel
   (set ``CONFIG_IGBVF=y`` in the kernel config).

#. After the Service VM boots, the ``lspci -v`` command shows
   that the Intel 82576 NIC devices have SR-IOV capability and that their PF
   driver is ``igb``.

   .. figure:: images/sriov-image9.png
      :align: center
      :name: 82576-pf

      82576 SR-IOV PF Devices

#. Run the ``echo n > /sys/class/net/enp109s0f0/device/sriov_numvfs``
   command in the Service VM to enable *n* VF devices for the first PF
   device (*enp109s0f0*). The number *n* cannot exceed *TotalVFs*,
   which is reported by the
   ``cat /sys/class/net/enp109s0f0/device/sriov_totalvfs`` command. Here we
   use *n = 2* as an example.

   .. figure:: images/sriov-image10.png
      :align: center
      :name: 82576-vf

      82576 SR-IOV VF Devices

   .. figure:: images/sriov-image11.png
      :align: center
      :name: 82576-vf-nic

      82576 SR-IOV VF NIC

#. Pass through an SR-IOV VF device to the guest.

   a. Unbind the VF from the ``igbvf`` driver in the Service VM and bind it
      to ``pci-stub``.

      i.   ``modprobe pci_stub``

      ii.  ``echo "8086 10ca" > /sys/bus/pci/drivers/pci-stub/new_id``

      iii. ``echo "0000:6d:10.0" > /sys/bus/pci/devices/0000:6d:10.0/driver/unbind``

      iv.  ``echo "0000:6d:10.0" > /sys/bus/pci/drivers/pci-stub/bind``

   b. Add the SR-IOV VF device parameter (``-s X,passthru,6d/10/0``) to
      the User VM launch script.

      .. figure:: images/sriov-image12.png
         :align: center
         :name: 82576-nic-passthru

         Configure 82576 NIC as a Passthrough Device

   c. Boot the User VM.

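To find the VF addresses used in the passthrough step, the following sketch
lists the VFs of a PF and prints the matching ``passthru`` argument for each
one. It assumes the ``virtfnN`` symlinks that Linux creates under the PF's
``sysfs`` device directory. The function names are hypothetical, and the slot
number (``X`` in ``-s X,passthru,...``) is still chosen by the user.

.. code-block:: bash

   # bdf_to_passthru DOMAIN:BUS:DEV.FN
   # Convert a sysfs PCI address such as 0000:6d:10.0 to bus/dev/fn form.
   bdf_to_passthru() {
       local bdf="${1#*:}"                 # drop the PCI domain ("0000:")
       local bus="${bdf%%:*}" rest="${bdf#*:}"
       printf '%s/%s/%s\n' "$bus" "${rest%%.*}" "${rest#*.}"
   }

   # passthru_args PF_DEVICE_DIR
   # Print one "passthru,bus/dev/fn" argument per VF of the given PF.
   passthru_args() {
       local link vf
       for link in "$1"/virtfn*; do
           [ -e "$link" ] || continue
           # Each virtfnN link points at the VF's own device directory.
           vf=$(basename "$(readlink "$link")")
           printf 'passthru,%s\n' "$(bdf_to_passthru "$vf")"
       done
   }

   # Example (in the Service VM, after enabling the VFs):
   #   passthru_args /sys/class/net/enp109s0f0/device

For the VF used in this guide, ``bdf_to_passthru 0000:6d:10.0`` prints
``6d/10/0``.
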
SR-IOV Limitations in ACRN
--------------------------

1. The SR-IOV migration feature is not supported.

2. If an SR-IOV PF device is detected during the enumeration phase, but
   not enough room exists for its total VF devices, the PF device will be
   dropped. The ``MAX_PCI_DEV_NUM`` ACRN configuration option sets the
   maximum number of PCI devices that the platform supports. Make sure
   ``MAX_PCI_DEV_NUM`` is greater than the number of all PCI devices,
   including the total SR-IOV VF devices.

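A rough way to estimate how many device entries are needed is to count the PCI
devices visible in Linux ``sysfs`` and add each PF's ``sriov_totalvfs``. The
function name ``required_pci_dev_num`` is hypothetical, and the result is only
an approximation: it covers the devices visible to the VM where the script
runs, which may not be every device that the hypervisor enumerates.

.. code-block:: bash

   # required_pci_dev_num [PCI_DEVICES_DIR]
   # Print the number of PCI devices plus the total VFs of every PF.
   required_pci_dev_num() {
       local dir="${1:-/sys/bus/pci/devices}" count=0 dev total
       for dev in "$dir"/*; do
           [ -e "$dev" ] || continue
           # Skip VFs that are already enabled (they carry a physfn link);
           # they are covered by their PF's sriov_totalvfs.
           if [ -e "$dev/physfn" ]; then
               continue
           fi
           count=$(( count + 1 ))
           if [ -r "$dev/sriov_totalvfs" ]; then
               total=$(cat "$dev/sriov_totalvfs")
               count=$(( count + total ))
           fi
       done
       echo "$count"
   }

   # Example: compare the printed value with MAX_PCI_DEV_NUM.
   #   required_pci_dev_num
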