.. _interrupt-hld:

Physical Interrupt High-Level Design
####################################

Overview
********

The ACRN hypervisor implements a simple but fully functional framework
to manage interrupts and exceptions, as shown in
:numref:`interrupt-modules-overview`. In its native layer, it configures
the physical PIC, IOAPIC, and LAPIC to support different interrupt
sources from the local timer/IPI to the external INTx/MSI. In its virtual guest
layer, it emulates virtual PIC, virtual IOAPIC, and virtual LAPIC/passthrough
LAPIC. It provides full APIs, allowing virtual interrupt injection from
emulated or passthrough devices. This section does not cover
the passthrough LAPIC case. For the passthrough LAPIC, refer to
:ref:`lapic_passthru`.

.. figure:: images/interrupt-image3.png
   :align: center
   :width: 600px
   :name: interrupt-modules-overview

   ACRN Interrupt Modules Overview

In the software modules view shown in :numref:`interrupt-sw-modules`,
the ACRN hypervisor sets up the physical interrupt in its basic
interrupt modules (e.g., IOAPIC/LAPIC/IDT). It dispatches the interrupt
in the hypervisor interrupt flow control layer to the corresponding
handler, which could be for a predefined IPI notification, the timer, or a
runtime-registered passthrough device. The ACRN hypervisor then uses its VM
interfaces, based on the vPIC, vIOAPIC, and vMSI modules, to inject the
necessary virtual interrupt into the specific VM, or directly delivers the
interrupt to the specific RT VM with passthrough LAPIC.

.. figure:: images/interrupt-image2.png
   :align: center
   :width: 600px
   :name: interrupt-sw-modules

   ACRN Interrupt Software Modules Overview


The hypervisor implements the following functionalities for handling
physical interrupts:

- Configure interrupt-related hardware including IDT, PIC, LAPIC, and
  IOAPIC on startup.

- Provide APIs to manipulate the registers of LAPIC and IOAPIC.

- Acknowledge physical interrupts.

- Set up a callback mechanism for the other components in the
  hypervisor to request an interrupt vector and register a
  handler for that interrupt.

HV owns all native physical interrupts and manages 256 vectors per CPU.
All physical interrupts are first handled in VMX root-mode. The
"external-interrupt exiting" bit in the VM-Execution controls field is set
to support this. The ACRN hypervisor also initializes all the
interrupt-related modules such as IDT, PIC, IOAPIC, and LAPIC.

HV does not own any host devices (except UART). All devices are by
default assigned to the Service VM. Any interrupts received by VM
(Service VM or User VM) device drivers are virtual interrupts injected
by HV (via vLAPIC).
HV manages a Host-to-Guest mapping. When a native IRQ/interrupt occurs,
HV decides whether this IRQ/interrupt should be forwarded to a VM and
which VM to forward to (if any). Refer to
:ref:`virt-interrupt-injection` and :ref:`interrupt-remapping` for
more information.

HV does not own any exceptions. Guest VMCS are configured so no VM Exit
happens, with some exceptions such as #INT3 and #MC. This is to
simplify the design as HV does not support any exception handling
itself. HV supports only static memory mapping, so there should be no
#PF or #GP. If HV receives an exception indicating an error, an assert
function prints an error message and the system then halts.

Native interrupts can be generated from one of the following
sources:

- GSI interrupts

  - PIC or Legacy devices IRQ (0-15)
  - IOAPIC pin

- PCI MSI/MSI-X vectors
- Inter CPU IPI
- LAPIC timer

.. _physical-interrupt-initialization:

Physical Interrupt Initialization
*********************************

After the ACRN hypervisor gets control from the bootloader, it
initializes all physical interrupt-related modules for all the CPUs. The ACRN
hypervisor creates a framework to manage the physical interrupts for
hypervisor local devices, passthrough devices, and IPI between CPUs, as
shown in :numref:`hv-interrupt-init`:

.. figure:: images/interrupt-image66.png
   :align: center
   :name: hv-interrupt-init

   Physical Interrupt Initialization

IDT Initialization
==================

The ACRN hypervisor builds its native IDT (interrupt descriptor table)
during interrupt initialization and sets up the following handlers:

- On an exception, the hypervisor dumps its context and halts the current
  physical processor (because physical exceptions are not expected).

- For external interrupts, HV may mask the interrupt (depending on the
  trigger mode), followed by interrupt acknowledgement and dispatch
  to the registered handler, if any.

Most interrupts and exceptions are handled without a stack switch,
except for machine-check, double fault, and stack fault exceptions, which
have their own stack set in TSS.

PIC/IOAPIC Initialization
=========================

The ACRN hypervisor masks all interrupts from the PIC. All legacy interrupts
from PIC (<16) will be linked to IOAPIC, as shown in the connections in
:numref:`hv-pic-config`.

ACRN will pre-allocate vectors and set them for these legacy interrupts
in IOAPIC RTEs. For others (>= 16), ACRN will set them with vector 0 in
RTEs, and valid vectors will be dynamically allocated on demand.

All external IOAPIC pins are categorized as GSI interrupts according to
the ACPI definition. HV supports multiple IOAPIC components. IRQ PIN to GSI
mappings are maintained internally to determine the GSI source IOAPIC.
Native PIC is not used in the system.

.. figure:: images/interrupt-image46.png
   :align: center
   :name: hv-pic-config

   Hypervisor PIC/IOAPIC/LAPIC Configuration

LAPIC Initialization
====================

Physical LAPICs are in x2APIC mode in the ACRN hypervisor. The hypervisor
initializes LAPIC for each physical CPU by masking all interrupts in the
local vector table (LVT), clearing all ISRs, and enabling LAPIC.

APIs are provided so that other components in the hypervisor can access
the LAPIC, for example to program the local timer (TSC Deadline) or to
send IPI notifications. See :ref:`hv_interrupt-data-api`
for a complete list.

HV Interrupt Vectors and Delivery Mode
======================================

The interrupt vectors are assigned as shown here:

**Vector: 0x0-0x1F**
   are exceptions that are not handled by HV. If
   such an exception does occur, the system then halts.

**Vector: 0x20-0x2F**
   are allocated statically for legacy IRQ0-15.

**Vector: 0x30-0xDF**
   are dynamically allocated vectors for PCI devices
   INTx or MSI/MSI-X usage. Depending on the interrupt delivery mode
   (FLAT or PER_CPU mode), an interrupt will be assigned to a vector for
   all the CPUs or a particular CPU.

**Vector: 0xE0-0xFE**
   are high priority vectors reserved by HV for
   dedicated purposes. For example, 0xEF is used for timer, 0xF0 is used
   for IPI.

.. list-table::
   :widths: 30 70
   :header-rows: 1

   * - Vectors
     - Usage

   * - 0x0-0x14
     - Exceptions: NMI, INT3, page fault, GP, debug.

   * - 0x15-0x1F
     - Reserved

   * - 0x20-0x2F
     - Statically allocated for external IRQ (IRQ0-IRQ15)

   * - 0x30-0xDF
     - Dynamically allocated for IOAPIC IRQ from PCI INTx/MSI

   * - 0xE0-0xFE
     - Statically allocated for HV

   * - 0xEF
     - Timer

   * - 0xF0
     - IPI

   * - 0xF2
     - Posted Interrupt

   * - 0xF3
     - Hypervisor Callback HSM

   * - 0xF4
     - Performance Monitoring Interrupt

   * - 0xFF
     - SPURIOUS_APIC_VECTOR

Interrupts from either IOAPIC or MSI can be delivered to a target CPU.
By default, they are configured as Lowest Priority (FLAT mode), meaning they
are delivered to a CPU core that is idle or executing the lowest
priority ISR. There is no guarantee a device's interrupt will be
delivered to a specific Guest's CPU. Timer interrupts are an exception:
these are always delivered to the CPU that programs the LAPIC timer.

x86-64 supports per-CPU IDTs, but ACRN uses a global shared IDT,
with which the interrupt/IRQ to vector mapping is the same on all CPUs. Vector
allocation for CPUs is shown here:

.. figure:: images/interrupt-image89.png
   :align: center

   FLAT Mode Vector Allocation

IRQ Descriptor Table
====================

The ACRN hypervisor maintains a global IRQ Descriptor Table shared among the
physical CPUs, so the same vector will link to the same IRQ number for
all CPUs.

The index of the *irq_desc[]* array represents the IRQ number. A *handle_irq*
function is called from *dispatch_interrupt* to handle edge-triggered and
level-triggered IRQs in a common way and to call the registered *action_fn*.

In addition to the IRQ descriptor table, which maintains the mapping from
IRQ to vector, a reverse mapping from vector to IRQ is also kept.

On initialization, the descriptors of the legacy IRQs are initialized with
proper vectors and the corresponding reverse mapping is set up.
The descriptors of other IRQs are filled with an invalid
vector, which will be updated on IRQ allocation.

For example, if the local timer registers an interrupt with IRQ number 254 and
vector 0xEF, then this data will be set up:

.. code-block:: c

   irq_desc[254].irq = 254;
   irq_desc[254].vector = 0xEF;
   vector_to_irq[0xEF] = 254;

External Interrupt Handling
***************************

The CPU runs under VMX non-root mode while inside Guest VMs.
``MSR_IA32_VMX_PINBASED_CTLS.bit[0]`` and
``MSR_IA32_VMX_EXIT_CTLS.bit[15]`` are set to allow vCPU VM Exit to HV
whenever there are interrupts to that physical CPU under
non-root mode. With these settings, the interrupt is acknowledged on the
VM Exit and the interrupt vector is saved to the relevant VM Exit field
for HV IRQ processing.

Note that as discussed above, an external interrupt causing vCPU VM Exit
to HV does not mean that the interrupt belongs to that Guest VM. When
the CPU executes VM Exit into root-mode, interrupt handling will be enabled
and the interrupt will be delivered and processed as quickly as possible
inside HV. HV may emulate a virtual interrupt and inject it into the Guest if
necessary.

Interrupt and IRQ processing flow diagrams are shown below:

.. figure:: images/interrupt-image48.png
   :align: center
   :name: phy-interrupt-processing

   Processing of Physical Interrupts

When a physical interrupt is raised and delivered to a physical CPU, the
CPU may be running under either VMX root mode or non-root mode.

- If the CPU is running under VMX root mode, the interrupt is handled
  following the standard native IRQ flow: interrupt gate to
  dispatch_interrupt(), IRQ handler, and finally the registered callback.
- If the CPU is running under VMX non-root mode, an external interrupt
  causes a VM exit for reason "external-interrupt", and then the VM
  exit processing flow will call dispatch_interrupt() to dispatch and
  handle the interrupt.

After an interrupt occurs from either path shown in
:numref:`phy-interrupt-processing`, the ACRN hypervisor will jump to
dispatch_interrupt(). This function gets the vector of the generated
interrupt from the context, gets the IRQ number from vector_to_irq[], and
then gets the corresponding irq_desc.

Though there is only one generic IRQ handler for registered interrupts,
there are three different handling flows according to flags:

- ``!IRQF_LEVEL``
- ``IRQF_LEVEL && !IRQF_PT``

  To avoid continuous interrupt triggers, the handler masks the IOAPIC pin and
  unmasks it only after the IRQ action callback is executed.

- ``IRQF_LEVEL && IRQF_PT``

  For passthrough devices, to avoid continuous interrupt triggers, the handler
  masks the IOAPIC pin and leaves it masked until the corresponding vIOAPIC
  pin gets an explicit EOI ACK from the guest.

Since interrupts are not shared among multiple devices, there is only one
IRQ action registered for each interrupt.

The IRQ number inside HV is a software concept to identify GSI and
Vectors. Each GSI will be mapped to one IRQ. The GSI number is usually the same
as the IRQ number. IRQ numbers greater than the max GSI (nr_gsi) number are
dynamically assigned. For example, when HV allocates an interrupt vector to a
PCI device, an IRQ number is then assigned to that vector. When the vector later
reaches a CPU, the corresponding IRQ action function is located and executed.

See :numref:`request-irq` for the request IRQ control flow for different
conditions:

.. figure:: images/interrupt-image76.png
   :align: center
   :name: request-irq

   Request IRQ for Different Conditions

.. _ipi-management:

IPI Management
**************

The only purpose of IPI use in HV is to kick a vCPU out of non-root mode
and into HV mode. This requires that I/O requests and virtual interrupt
injection be distributed to different IPI vectors. The I/O request uses
IPI vector 0xF3 upcall. The virtual interrupt injection uses IPI vector 0xF0.

0xF3 upcall
   A Guest vCPU VM Exit occurs due to an EPT violation or an I/O instruction
   trap. It requires the Device Module to emulate the MMIO/PortIO instruction.
   However, it could be that the Service VM vCPU0 is still in non-root
   mode. So an IPI (0xF3 upcall vector) should be sent to the physical CPU0
   (running in non-root mode as vCPU0 inside the Service VM) to force vCPU0 to
   VM Exit due to the external interrupt. The virtual upcall vector is then
   injected to the Service VM, and the vCPU0 inside the Service VM then picks
   up the I/O request and does the emulation for the other Guest.

0xF0 IPI flow
   If the Device Module inside the Service VM needs to inject an interrupt to
   another Guest vCPU such as vCPU1, it will issue an IPI first to kick CPU1
   (assuming vCPU1 is running on CPU1) to root-mode. CPU1 will inject the
   interrupt before VM Enter.

.. _hv_interrupt-data-api:

Data Structures and Interfaces
******************************

IOAPIC
======

The following APIs are external interfaces for IOAPIC-related
operations.

.. doxygengroup:: ioapic_ext_apis
   :project: Project ACRN
   :content-only:


LAPIC
=====

The following APIs are external interfaces for LAPIC-related operations.

.. doxygengroup:: lapic_ext_apis
   :project: Project ACRN
   :content-only:


IPI
===

The following APIs are external interfaces for IPI-related operations.

.. doxygengroup:: ipi_ext_apis
   :project: Project ACRN
   :content-only:


Physical Interrupt
==================

The following APIs are external interfaces for physical interrupt-related
operations.

.. doxygengroup:: phys_int_ext_apis
   :project: Project ACRN
   :content-only:
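
To illustrate how the IRQ descriptor table, the reverse vector-to-IRQ
mapping, and the dispatch path described in this document fit together, the
following is a minimal, self-contained sketch. It is not the ACRN
implementation: the ``sketch_*`` function names, the ``struct sketch_irq_desc``
layout, the table sizes, and the ``IRQ_INVALID``/``VECTOR_INVALID`` markers are
all assumptions made for illustration. Only the ``irq_desc[]`` and
``vector_to_irq[]`` names and the single-action-per-IRQ rule come from the
text above.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical sizes and markers, chosen only for this sketch. */
   #define NR_IRQS        256U
   #define NR_VECTORS     256U
   #define IRQ_INVALID    0xFFFFFFFFU
   #define VECTOR_INVALID 0U

   typedef void (*irq_action_t)(uint32_t irq, void *data);

   /* Hypothetical descriptor layout: one entry per IRQ number. */
   struct sketch_irq_desc {
       uint32_t irq;
       uint32_t vector;
       irq_action_t action;   /* single action: interrupts are not shared */
       void *data;
   };

   static struct sketch_irq_desc irq_desc[NR_IRQS];
   static uint32_t vector_to_irq[NR_VECTORS];

   /* Fill every descriptor with an invalid vector and clear the reverse map. */
   void sketch_init_irq_descs(void)
   {
       for (uint32_t i = 0U; i < NR_IRQS; i++) {
           irq_desc[i].irq = i;
           irq_desc[i].vector = VECTOR_INVALID;
           irq_desc[i].action = NULL;
           irq_desc[i].data = NULL;
       }
       for (uint32_t v = 0U; v < NR_VECTORS; v++) {
           vector_to_irq[v] = IRQ_INVALID;
       }
   }

   /* Bind an IRQ to a vector and register its one action; -1 on conflict. */
   int sketch_request_irq(uint32_t irq, uint32_t vector,
                          irq_action_t action, void *data)
   {
       if (irq >= NR_IRQS || vector >= NR_VECTORS ||
           vector == VECTOR_INVALID || action == NULL) {
           return -1;
       }
       if (irq_desc[irq].action != NULL ||
           vector_to_irq[vector] != IRQ_INVALID) {
           return -1;
       }
       irq_desc[irq].vector = vector;
       irq_desc[irq].action = action;
       irq_desc[irq].data = data;
       vector_to_irq[vector] = irq;
       return 0;
   }

   /* Look up the IRQ for a vector and run its action; -1 if none registered. */
   int sketch_dispatch_interrupt(uint32_t vector)
   {
       if (vector >= NR_VECTORS) {
           return -1;
       }
       uint32_t irq = vector_to_irq[vector];
       if (irq == IRQ_INVALID || irq_desc[irq].action == NULL) {
           return -1;
       }
       irq_desc[irq].action(irq_desc[irq].irq, irq_desc[irq].data);
       return 0;
   }

   /* Example action: counts how many times the interrupt was dispatched. */
   void sketch_count_handler(uint32_t irq, void *data)
   {
       (void)irq;
       uint32_t *hits = data;
       (*hits)++;
   }

In this sketch, calling ``sketch_request_irq(254U, 0xEFU, sketch_count_handler,
&hits)`` after ``sketch_init_irq_descs()`` produces the same table contents as
the local timer example in the IRQ Descriptor Table section, and a later
``sketch_dispatch_interrupt(0xEFU)`` runs the registered action once. Masking,
acknowledgement, and the ``IRQF_LEVEL``/``IRQF_PT`` flows are deliberately
left out.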