# Xen Hypervisor Command Line Options

This document covers the command line options which the Xen
Hypervisor accepts.

## Types of parameter

Most parameters take the form `option=value`. Different options on
the command line should be space delimited. All options are case
sensitive, as are all values unless explicitly noted.

### Boolean (`<boolean>`)

All boolean options may be explicitly enabled using a `value` of
> `yes`, `on`, `true`, `enable` or `1`

They may be explicitly disabled using a `value` of
> `no`, `off`, `false`, `disable` or `0`

In addition, a boolean option may be enabled by simply stating its
name, and may be disabled by prefixing its name with `no-`.

#### Examples

Enable noreboot mode
> `noreboot=true`

Disable x2apic support (if present)
> `x2apic=off`

Enable synchronous console mode
> `sync_console`

Explicitly specifying any value other than those listed above is
undefined, as is stacking a `no-` prefix with an explicit value.

### Integer (`<integer>`)

An integer parameter will default to decimal and may be prefixed with
a `-` for negative numbers. Alternatively, a hexadecimal number may be
used by prefixing the number with `0x`, or an octal number may be used
if a leading `0` is present.

Providing a string which does not validly convert to an integer is
undefined.

### Size (`<size>`)

A size parameter may be any integer, with a single size suffix

* `T` or `t`: TiB (2^40)
* `G` or `g`: GiB (2^30)
* `M` or `m`: MiB (2^20)
* `K` or `k`: KiB (2^10)
* `B` or `b`: Bytes

Without a size suffix, the value is taken to be in KiB. Providing a suffix
other than those listed above is undefined.

### String

Many parameters are more complicated and require more intricate
configuration. The detailed description of each individual parameter
specifies which values are valid.

### List

Some options take a comma separated list of values.
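
As an illustration of the Integer and Size rules described earlier in this
section, the following Python sketch converts a `<size>` string into a byte
count. It is not Xen's parser (Xen is written in C); the name `parse_size`
and the regular expression are assumptions made for this example. Since
malformed input is documented as undefined, raising `ValueError` is only one
possible choice.

```python
import re

# Shift applied for each documented suffix (matched case-insensitively).
SUFFIX_SHIFT = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}

# Optional sign, then hex (0x...), octal (leading 0) or decimal digits,
# then at most one size suffix. The hex alternative is tried first and is
# greedy, so a trailing "b" after hex digits is read as a hex digit; that
# tie-break is an assumption of this sketch.
_SIZE_RE = re.compile(
    r"(-?)(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)([TtGgMmKkBb]?)"
)


def parse_size(text):
    """Return the number of bytes described by a <size> string."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    sign, digits, suffix = match.groups()

    if digits[:2].lower() == "0x":
        value = int(digits, 16)   # hexadecimal
    elif digits.startswith("0"):
        value = int(digits, 8)    # octal (a lone "0" is simply zero)
    else:
        value = int(digits, 10)   # decimal

    if sign:
        value = -value

    # Without a suffix the value is taken to be in KiB.
    shift = SUFFIX_SHIFT[suffix.lower()] if suffix else 10
    return value << shift
```

For instance, `parse_size("128M")` yields 134217728 bytes, while a bare `512`
is read as 512 KiB.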

### Combination

Some parameters act as combinations of the above, most commonly a mix
of Boolean and String. These are noted in the relevant sections.

## Parameter details

### acpi
> `= force | ht | noirq | <boolean> | verbose`

**String**, or **Boolean** to disable.

By default, Xen will scan the DMI data and blacklist certain systems
which are known to have broken ACPI setups. Providing `acpi=force`
will cause Xen to ignore the blacklist and attempt to use all ACPI
features.

Using `acpi=ht` causes Xen to parse the ACPI tables enough to
enumerate all CPUs, but not to use other ACPI features. This is not
common, and only has an effect if your system is blacklisted.

The `acpi=noirq` option causes Xen to not parse the ACPI MADT table
looking for IO-APIC entries. This is also not common, and any system
which requires this option to function should be blacklisted.
Additionally, this will not prevent Xen from finding IO-APIC entries
from the MP tables.

Further, any of the boolean false options can be used to disable ACPI
usage entirely.

Because responsibility for ACPI processing is shared between Xen and
the domain 0 kernel, this option is automatically propagated to the
domain 0 command line.

Finally, `acpi=verbose` will enable per-processor information logging
which may otherwise be too noisy, in particular on large systems.

### acpi_apic_instance
> `= <integer>`

Specify which ACPI MADT table to parse for APIC information, if more
than one is present.

### acpi_pstate_strict (x86)
> `= <boolean>`

> Default: `false`

Enforce checking that P-state transitions by the ACPI cpufreq driver
actually result in the nominated frequency being established. A warning
message will be logged if that isn't the case.

### acpi_skip_timer_override (x86)
> `= <boolean>`

Instruct Xen to ignore timer-interrupt override.

### acpi_sleep (x86)
> `= s3_bios | s3_mode`

`s3_bios` instructs Xen to invoke video BIOS initialization during S3
resume.

`s3_mode` instructs Xen to set up the boot time (option `vga=`) video
mode during S3 resume.

### allow_unsafe (x86)
> `= <boolean>`

> Default: `false`

Force boot on potentially unsafe systems. By default Xen will refuse
to boot on systems with the following errata:

* AMD Erratum 121. Processors with this erratum are subject to a guest
  triggerable Denial of Service. Override only if you trust all of
  your PV guests.

### altp2m (Intel)
> `= <boolean>`

> Default: `false`

Permit multiple copies of host p2m.

### apic (x86)
> `= bigsmp | default`

Override Xen's logic for choosing the APIC driver. By default, if
there are more than 8 CPUs, Xen will switch to `bigsmp` over
`default`.

### apicv (Intel)
> `= <boolean>`

> Default: `true`

Permit Xen to use APIC Virtualisation Extensions. This is an optimisation
available as part of VT-x, and allows hardware to take care of the guest's APIC
handling, rather than requiring emulation in Xen.

### apic_verbosity (x86)
> `= verbose | debug`

Increase the verbosity of the APIC code from the default value.

### arat (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use "Always Running APIC Timer" support on compatible hardware
in combination with cpuidle. This option is only expected to be useful for
developers wishing Xen to fall back to older timing methods on newer hardware.

### argo
    = List of [ <bool>, mac-permissive=<bool> ]

Controls for the Argo hypervisor-mediated interdomain communication service.

The functionality that this option controls is only available when Xen has been
compiled with the build setting for Argo enabled in the build configuration.

Argo is an interdomain communication mechanism, where Xen acts as the central
point of authority. Guests may register memory rings to receive messages,
query the status of other domains, and send messages by hypercall, all subject
to appropriate auditing by Xen. Argo is disabled by default.

* The `mac-permissive` boolean controls whether wildcard receive rings may be
  registered (`mac-permissive=1`) or may not be registered
  (`mac-permissive=0`).

  This option is disabled by default, to protect domains from a DoS by a
  buggy or malicious other domain spamming the ring.

### asid (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use Address Space Identifiers. This is an optimisation which
tags the TLB entries with an ID per vcpu. This allows for guest TLB flushes
to be performed without the overhead of a complete TLB flush.

### async-show-all (x86)
> `= <boolean>`

> Default: `false`

Forces all CPUs' full state to be logged upon certain fatal asynchronous
exceptions (watchdog NMIs and unexpected MCEs).

### ats (x86)
> `= <boolean>`

> Default: `false`

Permits Xen to set up and use PCI Address Translation Services. This is a
performance optimisation for PCI Passthrough.

**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
loops for Queued Invalidation completions.**

### availmem
> `= <size>`

> Default: `0` (no limit)

Specify a maximum amount of available memory, to which Xen will clamp
the e820 table.

### badpage
> `= List of [ <integer> | <integer>-<integer> ]`

Specify that certain pages, or certain ranges of pages, contain bad
bytes and should not be used. For example, if your memory tester says
that byte `0x12345678` is bad, you would place `badpage=0x12345` on
Xen's command line.
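
The `badpage` example above amounts to dropping the offset within the page
from the byte address. A minimal Python sketch follows, assuming 4 KiB pages
(a 12-bit shift, which is consistent with `0x12345678` mapping to `0x12345`).
The helper name `badpage_option` is hypothetical, and only individual page
numbers are emitted, not ranges.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as implied by the example above


def badpage_option(bad_byte_addresses):
    """Build a badpage= option from byte addresses reported as faulty."""
    # Several bad bytes may fall into the same page, so deduplicate the
    # resulting page numbers before formatting them.
    frames = sorted({addr >> PAGE_SHIFT for addr in bad_byte_addresses})
    return "badpage=" + ",".join(hex(frame) for frame in frames)


print(badpage_option([0x12345678]))  # prints badpage=0x12345
```

Passing several addresses produces a comma separated list, matching the
`List of` form in the option's syntax.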

### bootscrub
> `= idle | <boolean>`

> Default: `idle`

Scrub free RAM during boot. This is a safety feature to prevent
accidentally leaking sensitive VM data into other VMs if Xen crashes
and reboots.

In `idle` mode, RAM is scrubbed in the background on all CPUs during the
idle loop, with a guarantee that memory allocations always provide scrubbed
pages. This option reduces boot time on machines with a large amount of RAM
while still providing security benefits.

### bootscrub_chunk
> `= <size>`

> Default: `128M`

Maximum RAM block size chunks to be scrubbed whilst holding the page heap lock
and not running softirqs. Reduce this if softirqs are not being run frequently
enough. Setting this to a high value may cause boot failure, particularly if
the NMI watchdog is also enabled.

### cet
    = List of [ shstk=<bool>, ibt=<bool> ]

    Applicability: x86

Controls for the use of Control-flow Enforcement Technology. CET is a group
of hardware features designed to combat Return-oriented Programming (ROP, also
call/jmp COP/JOP) attacks.

CET is incompatible with 32bit PV guests. If any CET sub-options are active,
they will override the `pv=32` boolean to `false`. Backwards compatibility
can be maintained with the pv-shim mechanism.

* The `shstk=` boolean controls whether Xen uses Shadow Stacks for its own
  protection.

  The option is available when `CONFIG_XEN_SHSTK` is compiled in, and
  generally defaults to `true` on hardware supporting CET-SS. Specifying
  `cet=no-shstk` will cause Xen not to use Shadow Stacks even when support
  is available in hardware.

  Some hardware suffers from an issue known as Supervisor Shadow Stack
  Fracturing. On such hardware, Xen will default to not using Shadow Stacks
  when virtualised. Specifying `cet=shstk` will override this heuristic and
  enable Shadow Stacks unilaterally.

* The `ibt=` boolean controls whether Xen uses Indirect Branch Tracking for
  its own protection.

  The option is available when `CONFIG_XEN_IBT` is compiled in, and defaults
  to `true` on hardware supporting CET-IBT. Specifying `cet=no-ibt` will
  cause Xen not to use Indirect Branch Tracking even when support is
  available in hardware.

### clocksource (x86)
> `= pit | hpet | acpi | tsc`

If set, override Xen's default choice for the platform timer.
Having TSC as platform timer requires being explicitly set. This is because
TSC can only be safely used if CPU hotplug isn't performed on the system. On
some platforms, the "maxcpus" option may need to be used to further adjust
the number of allowed CPUs. When running on platforms that can guarantee a
monotonic TSC across sockets you may want to adjust the "tsc" command line
parameter to "stable:socket".

### cmci-threshold (Intel)
> `= <integer>`

> Default: `2`

Specify the event count threshold for raising Corrected Machine Check
Interrupts. Specifying zero disables CMCI handling.

### cmos-rtc-probe (x86)
> `= <boolean>`

> Default: `false`

Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
ACPI indicating none to be there.

### com1 (x86)
### com2 (x86)
> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`

Both options `com1` and `com2` follow the same format.

* `<baud>` may be either an integer baud rate, or the string `auto` if
  the bootloader or other earlier firmware has already set it up.
* Optionally, the base baud rate (usually the highest baud rate the
  device can communicate at) can be specified.
* `DPS` represents the number of data bits, the parity, and the number
  of stop bits.
  * `D` is an integer between 5 and 8 for the number of data bits.
  * `P` is a single character representing the type of parity:
    * `n` No
    * `o` Odd
    * `e` Even
    * `m` Mark
    * `s` Space
  * `S` is an integer 1 or 2 for the number of stop bits.
* `<io-base>` is an integer which specifies the IO base port for UART
  registers.
* `<irq>` is the IRQ number to use, or `0` to use the UART in poll
  mode only, or `msi` to set up a Message Signaled Interrupt.
* `<port-bdf>` is the PCI location of the UART, in
  `<bus>:<device>.<function>` notation.
* `<bridge-bdf>` is the PCI bridge behind which is the UART, in
  `<bus>:<device>.<function>` notation.
* `pci` indicates that Xen should scan the PCI bus for the UART,
  avoiding Intel AMT devices.
* `amt` indicates that Xen should scan the PCI bus for the UART,
  including Intel AMT devices if present.

A typical setup for most situations might be `com1=115200,8n1`.

In addition to the above positional specification for UART parameters,
name=value pair specifications are also supported. This is used to add
flexibility for UART devices which require additional UART parameter
configurations.

The comma separation still delineates positional parameters. Hence,
unless the parameter is explicitly specified with a name=value option, it
will be considered a positional parameter.

The syntax consists of
`com1=(comma-separated positional parameters),(comma-separated name-value pairs)`

The accepted name keywords for name=value pairs are:

* `baud` - accepts integer baud rate (e.g. 115200) or `auto`
* `bridge` - Similar to bridge-bdf in positional parameters.
  Used to determine the PCI bridge to access the UART device.
  Notation is xx:xx.x `<bus>:<device>.<function>`
* `clock-hz` - accepts large integers to set up UART clock frequencies.
  Do note - these values are multiplied by 16.
* `data-bits` - integer between 5 and 8
* `dev` - accepted values are `pci` OR `amt`.
  This option specifies whether the serial device is PCI-based. The `io-base`
  cannot be specified when `dev=pci` or `dev=amt` is used.
* `io-base` - accepts an integer which specifies the IO base port for UART
  registers
* `irq` - IRQ number to use
* `parity` - accepted values are same as positional parameters
* `port` - Used to specify which port the PCI serial device is located on
  Notation is xx:xx.x `<bus>:<device>.<function>`
* `reg-shift` - register shifts required to set UART registers
* `reg-width` - register width required to set UART registers
  (only accepts 1 and 4)
* `stop-bits` - only accepts 1 or 2 for the number of stop bits

The following are examples of correct specifications:

    com1=115200,8n1,0x3f8,4
    com1=115200,8n1,0x3f8,4,reg-width=4,reg-shift=2
    com1=baud=115200,parity=n,stop-bits=1,io-base=0x3f8,reg-width=4

### conring_size
> `= <size>`

> Default: `conring_size=16k`

Specify the size of the console ring buffer.

### console
> `= List of [ vga | com1[H,L] | com2[H,L] | pv | dbgp | ehci | xhci | none ]`

> Default: `console=com1,vga`

Specify which console(s) Xen should use.

`vga` indicates that Xen should try and use the vga graphics adapter.

`com1` and `com2` indicate that Xen should use serial ports 1 and 2
respectively. Optionally, these arguments may be followed by an `H` or
`L`. `H` indicates that transmitted characters will have their MSB
set, while received characters must have their MSB set. `L` indicates
the converse; transmitted and received characters will have their MSB
cleared. This allows a single port to be shared by two subsystems
(e.g. console and debugger).

`pv` indicates that Xen should use Xen's PV console. This option is
only available when used together with `pv-in-pvh`.

`dbgp` or `ehci` indicates that Xen should use a USB2 debug port.

`xhci` indicates that Xen should use a USB3 debug port.

`none` indicates that Xen should not use a console. This option only
makes sense on its own.

### console_timestamps
> `= none | date | datems | boot | raw`

> Default: `none`

> Can be modified at runtime

Specify which timestamp format Xen should use for each console line.

* `none`: No timestamps
* `date`: Date and time information
  * `[YYYY-MM-DD HH:MM:SS]`
* `datems`: Date and time, with milliseconds
  * `[YYYY-MM-DD HH:MM:SS.mmm]`
* `boot`: Seconds and microseconds since boot
  * `[SSSSSS.uuuuuu]`
* `raw`: Raw platform ticks, architecture and implementation dependent
  * `[XXXXXXXXXXXXXXXX]`

For compatibility with the older boolean parameter, specifying
`console_timestamps` alone will enable the `date` option.

### console_to_ring
> `= <boolean>`

> Default: `false`

Flag to indicate whether all guest console output should be copied
into the console ring buffer.

### conswitch
> `= <switch char>[x]`

> Default: `conswitch=a`

> Can be modified at runtime

Specify which character should be used to switch serial input between
Xen and dom0. The required sequence is `CTRL-<switch char>` three
times.

The optional trailing `x` indicates that Xen should not automatically
switch the console input to dom0 during boot. Any other value,
including omission, causes Xen to automatically switch to the dom0
console during dom0 boot. Use `conswitch=ax` to keep the default switch
character, but for Xen to keep the console.

### core_parking
> `= power | performance`

> Default: `power`

### cpu_type (x86)
> `= arch_perfmon`

If set, force use of the performance counters for oprofile, rather than detecting
available support.

### cpufreq
> `= none | {{ <boolean> | xen } { [:[powersave|performance|ondemand|userspace][,[<maxfreq>]][,[<minfreq>]]] } [,verbose]} | dom0-kernel | hwp[:[<hdc>][,verbose]]`

> Default: `xen`

Indicate where the responsibility for driving power states lies. Note that the
choice of `dom0-kernel` is deprecated and not supported by all Dom0 kernels.

* Default governor policy is ondemand.
* `<maxfreq>` and `<minfreq>` are integers which represent max and min processor frequencies
  respectively.
* The `verbose` option can be included as a string or also as `verbose=<integer>`
  for `xen`. It is a boolean for `hwp`.
* `hwp` selects Hardware-Controlled Performance States (HWP) on supported Intel
  hardware. HWP is a Skylake+ feature which provides better CPU power
  management. The default is disabled. If `hwp` is selected, but hardware
  support is not available, Xen will fall back to `cpufreq=xen`.
* `<hdc>` is a boolean to enable Hardware Duty Cycling (HDC). HDC enables the
  processor to autonomously force physical package components into idle state.
  The default is enabled, but the option only applies when `hwp` is enabled.

There is also support for `;`-separated fallback options:
`cpufreq=hwp;xen,verbose`. This first tries `hwp` and falls back to `xen` if
unavailable. Note: The `verbose` suboption is handled globally. Setting it
for either the primary or fallback option applies to both irrespective of where
it is specified.

Note: grub2 requires `;` to be escaped or quoted, so `"cpufreq=hwp;xen"` should
be specified within double quotes inside grub.cfg. Refer to the grub2
documentation for more information.

### cpuid (x86)
> `= List of comma separated booleans`

This option allows for fine tuning of the facilities Xen will use, after
accounting for hardware capabilities as enumerated via CPUID.

Unless otherwise noted, options only have any effect in their negative form,
to hide the named feature(s). Ignoring a feature using this mechanism will
cause Xen not to use the feature, nor offer it as usable to guests.

Currently accepted:

The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
applicable. They can all be ignored.

`rdrand` and `rdseed` have multiple interactions.

* For Special Register Buffer Data Sampling (SRBDS, XSA-320, CVE-2020-0543),
  RDRAND and RDSEED can be ignored.

  Due to the absence of microcode to address SRBDS on IvyBridge client
  hardware, the RDRAND feature is hidden by default for guests, unless
  `rdrand` is used in its positive form. Irrespective of the setting here,
  VMs can use RDRAND if explicitly enabled in guest config file, and VMs
  already using RDRAND can migrate in.

* The RDRAND feature is disabled by default on AMD Fam15/16 systems, due to
  possible malfunctions after ACPI S3 suspend/resume. `rdrand` may be used
  in its positive form to override Xen's default behaviour on these systems,
  and make the feature fully usable.

### cpuid_mask_cpu
> `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`

> Applicability: AMD

If none of the other **cpuid_mask_\*** options are given, Xen has a set of
pre-configured masks to make the current processor appear to be the
family/revision specified.

See below for general information on masking.

**Warning: This option is not fully effective on Family 15h processors or
later.**

### cpuid_mask_ecx
### cpuid_mask_edx
### cpuid_mask_ext_ecx
### cpuid_mask_ext_edx
### cpuid_mask_l7s0_eax
### cpuid_mask_l7s0_ebx
### cpuid_mask_thermal_ecx
### cpuid_mask_xsave_eax
> `= <integer>`

> Applicability: x86.
> Default: `~0` (all bits set)

The availability of these options is model specific. Some processors don't
support any of them, and no processor supports all of them. Xen will ignore
options on processors which are lacking support.

These options can be used to alter the features visible via the `CPUID`
instruction. Settings applied here take effect globally, including for Xen
and all guests.

Note: Since Xen 4.7, it is no longer necessary to mask a host to create
migration safety in heterogeneous scenarios. All necessary CPUID settings
should be provided in the VM configuration file. Furthermore, it is
recommended not to use this option, as doing so causes an unnecessary
reduction of features at Xen's disposal to manage guests.

### cpuidle (x86)
> `= <boolean>`

### cpuinfo (x86)
> `= <boolean>`

### crash-debug-debugkey
### crash-debug-hwdom
### crash-debug-kexeccmd
### crash-debug-panic
### crash-debug-watchdog
> `= <string>`

> Can be modified at runtime

Specify debug-key actions in cases of crashes. Each of the parameters applies
to a different crash reason. The `<string>` is a sequence of debug key
characters, with `+` having the special meaning of a 10 millisecond pause.

`crash-debug-debugkey` will be used for crashes induced by the `C` debug
key (i.e. manually induced crash).

`crash-debug-hwdom` denotes a crash of dom0.

`crash-debug-kexeccmd` is an explicit request of dom0 to continue with the
kdump kernel via kexec. Only available on hypervisors built with CONFIG_KEXEC.

`crash-debug-panic` is a crash of the hypervisor.

`crash-debug-watchdog` is a crash due to the watchdog timer expiring.

It should be noted that dumping diagnosis data to the console can fail in
multiple ways (missing data, hanging system, ...) depending on the reason
of the crash, which might have left the hypervisor in a bad state. In case
In case 638a debug-key action leads to another crash recursion will be avoided, so no 639additional debug-key actions will be performed in this case. A crash in the 640early boot phase will not result in any debug-key action, as the system 641might not yet be in a state where the handlers can work. 642 643So e.g. `crash-debug-watchdog=0+0r` would dump dom0 state twice with 10 644milliseconds between the two state dumps, followed by the run queues of the 645hypervisor, if the system crashes due to a watchdog timeout. 646 647Depending on the reason of the system crash it might happen that triggering 648some debug key action will result in a hang instead of dumping data and then 649doing a reboot or crash dump. 650 651### crashinfo_maxaddr 652> `= <size>` 653 654> Default: `4G` 655 656Specify the maximum address to allocate certain structures, if used in 657combination with the **low_crashinfo** command line option. 658 659### crashkernel 660> `= <ramsize-range>:<size>[,...][{@,<}<offset>]` 661> `= <size>[{@,<}<offset>]` 662> `= <size>,below=offset` 663 664Specify sizes and optionally placement of the crash kernel reservation 665area. The `<ramsize-range>:<size>` pairs indicate how much memory to 666set aside for a crash kernel (`<size>`) for a given range of installed 667RAM (`<ramsize-range>`). Each `<ramsize-range>` is of the form 668`<start>-[<end>]`. 669 670A trailing `@<offset>` specifies the exact address this area should be 671placed at, whereas `<` in place of `@` just specifies an upper bound of 672the address range the area should fall into. 

`<` and `below` are synonymous, the latter being useful for grub2 systems
which would otherwise require escaping of the `<` option.

### credit2_balance_over
> `= <integer>`

### credit2_balance_under
> `= <integer>`

### credit2_cap_period_ms
> `= <integer>`

> Default: `10`

Domains subject to a cap receive a replenishment of their runtime budget
once every cap period interval. Default is 10 ms. The amount of budget
they receive depends on their cap. For instance, a domain with a 50% cap
will receive 50% of 10 ms, so 5 ms.

### credit2_load_precision_shift
> `= <integer>`

> Default: `18`

Specify the number of bits to use for the fractional part of the
values involved in Credit2 load tracking and load balancing math.

### credit2_load_window_shift
> `= <integer>`

> Default: `30`

Specify the number of bits to use to represent the length of the
window (in nanoseconds) we use for load tracking inside Credit2.
This means that, with the default value (30), we use a
2^30 nsec ~= 1 sec long window.

Load tracking is done by means of a variation of exponentially
weighted moving average (EWMA). The window length defined here
determines for how long previous history of the load is taken
into account. In fact, after a full window has passed,
all previous history is discarded entirely.

A short window will make the load balancer quick at reacting
to load changes, but also short-sighted about previous history
(and hence, e.g., long term load trends). A long window will
make the load balancer thoughtful of previous history (and
hence capable of capturing, e.g., long term load trends), but
also slow in responding to load changes.

The default value of `1 sec` is rather long.
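
The arithmetic behind `credit2_cap_period_ms` and `credit2_load_window_shift`
can be checked with a few lines of Python. The function names below are
hypothetical, and the code only restates the two formulas given in the text;
it does not model the scheduler itself.

```python
def cap_budget_ms(cap_percent, period_ms=10):
    """Runtime budget replenished per cap period for a capped domain."""
    return period_ms * cap_percent / 100


def load_window_seconds(window_shift=30):
    """Length of the load tracking window: 2^shift nanoseconds, in seconds."""
    return (1 << window_shift) / 1_000_000_000


# A 50% cap with the default 10 ms period gives a 5 ms budget.
print(cap_budget_ms(50))
# The default shift of 30 gives a window of roughly 1.07 seconds.
print(round(load_window_seconds(30), 2))
```

Lowering the shift by one halves the window, which is one way to see how
quickly the balancer's memory shrinks as the value is reduced.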

### credit2_runqueue
> `= cpu | core | socket | node | all`

> Default: `socket`

Specify how host CPUs are arranged in runqueues. Runqueues are kept
balanced with respect to the load generated by the vCPUs running on
them. Smaller runqueues (as with `core`) mean more accurate load
balancing (for instance, it will deal better with hyperthreading),
but also more overhead.

Available alternatives, with their meaning, are:

* `cpu`: one runqueue per logical pCPU of the host;
* `core`: one runqueue per physical core of the host;
* `socket`: one runqueue per physical socket (which often,
  but not always, matches a NUMA node) of the host;
* `node`: one runqueue per NUMA node of the host;
* `all`: just one runqueue shared by all the logical pCPUs of
  the host

Regardless of the above choice, Xen attempts to respect the
`sched_credit2_max_cpus_runqueue` limit, which may mean more than one runqueue
for the `all` value. If that isn't intended, raise
the `sched_credit2_max_cpus_runqueue` value.

### dbgp
> `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
> `= xhci[ <integer> | @pci<bus>:<slot>.<func> ][,share=<bool>|hwdom]`

Specify the USB controller to use, either by instance number (when going
over the PCI busses sequentially) or by PCI device (must be on segment 0).

Use `ehci` for EHCI debug port, use `xhci` for XHCI debug capability.
XHCI driver will wait indefinitely for the debug host to connect - make sure
the cable is connected.
The `share` option for xhci controls who else can use the controller:

* `no`: use the controller exclusively for console, even hardware domain
  (dom0) cannot use it
* `hwdom`: hardware domain may use the controller too, ports not used for debug
  console will be available for normal devices; this is the default
* `yes`: the controller can be assigned to any domain; it is not safe to assign
  the controller to an untrusted domain

Choosing `share=hwdom` (the default) or `share=yes` allows a domain to reset the
controller, which may cause a small portion of the console output to be lost.

The `share=yes` configuration is not security supported.

### debug_stack_lines
> `= <integer>`

> Default: `20`

Limits the number of lines printed in Xen stack traces.

### debugtrace
> `= [cpu:]<size>`

> Default: `128`

Specify the size of the console debug trace buffer. By specifying `cpu:`
additionally a trace buffer of the specified size is allocated per cpu.
The debug trace feature is only enabled in debugging builds of Xen.

### dit (x86/Intel)
> `= <boolean>`

> Default: `CONFIG_DIT_DEFAULT`

Specify whether Xen and guests should operate in Data Independent Timing
mode (Intel calls this DOITM, Data Operand Independent Timing Mode). Note
that enabling this option cannot guarantee anything beyond what underlying
hardware guarantees (with, where available and known to Xen, respective
tweaks applied).

### dma_bits
> `= <integer>`

Specify the bit width of the DMA heap.

### dom0
    = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
                cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86)

    = List of [ sve=<integer> ] (Arm64)

Controls for how dom0 is constructed on x86 systems.

* The `pv` and `pvh` options select the virtualisation mode of dom0.

  The `pv` option is only available when `CONFIG_PV` is compiled in.
  The `pvh` option is only available when `CONFIG_HVM` is compiled in. When
  both options are compiled in, the default is PV.

  In addition, the following requirements must be met:

  * The dom0 kernel selected by the boot loader must be capable of the
    selected mode.
  * For a PVH dom0, the hardware must have VT-x/SVM extensions available.

* The `shadow` boolean allows dom0 to be explicitly constructed using shadow
  paging. This option is unavailable when `CONFIG_SHADOW_PAGING` is
  disabled.

  For PVH, dom0 defaults to using HAP on capable hardware, and falls back to
  shadow paging otherwise. A PVH dom0 cannot be used if Xen is compiled
  without shadow paging support, and the hardware lacks HAP support.

  For PV, the use of dom0 shadow mode is only for development purposes. PV
  guests do not require any paging support by default.

* The `verbose` boolean is intended for diagnostics, and prints out extra
  information during the dom0 build. It defaults to the compile time choice
  of `CONFIG_VERBOSE_DEBUG`.

* The `cpuid-faulting` boolean is an interim option, is only applicable to
  PV dom0, and defaults to true.

  Before Xen 4.13, the domain builder logic for guest construction depended
  on seeing host CPUID values to function correctly. As a result, CPUID
  Faulting was never activated for PV dom0's, even on capable hardware.

  In Xen 4.13, the domain builder logic has been fixed, and no longer has
  this dependency. As a consequence, CPUID Faulting is activated by default
  even for PV dom0's.

  However, as PV dom0's have always seen host CPUID data in the past, there
  is a chance that further dependencies exist. This boolean can be used to
  restore the pre-4.13 behaviour. If specifying `no-cpuid-faulting` fixes
  an issue in dom0, please report a bug.

* The `msr-relaxed` boolean is an interim option, and defaults to false.

    In Xen 4.15, the default behaviour for unhandled MSRs has been changed,
    to avoid leaking host data into guests, and to avoid breaking guest
    logic which uses \#GP probing to identify the availability of MSRs.

    However, this new stricter behaviour has the possibility to break
    guests, and a more 4.14-like behaviour can be selected by specifying
    `dom0=msr-relaxed`.

    If using this option is necessary to fix an issue, please report a bug.

Enables features on dom0 on Arm systems.

*   The `sve` integer parameter enables Arm SVE usage for Dom0 and sets the
    maximum SVE vector length. The option is applicable only to Arm64 Dom0
    kernels.
    A value equal to 0 disables the feature; this is the default value.
    Values below 0 mean the feature uses the maximum SVE vector length
    supported by hardware, if SVE is supported.
    Values above 0 explicitly set the maximum SVE vector length for Dom0;
    allowed values are from 128 to a maximum of 2048, in multiples of 128.
    Please note that when the user explicitly specifies the value, if that
    value is above the hardware supported maximum SVE vector length, the
    domain creation will fail and the system will stop. The same will occur
    if the option is provided with a positive non-zero value but the platform
    doesn't support SVE.

### dom0-cpuid
    = List of comma separated booleans

    Applicability: x86

This option allows for fine tuning of the facilities dom0 will use, after
accounting for hardware capabilities and Xen settings as enumerated via CPUID.

Options are accepted in positive and negative form, to enable or disable
specific features. All selections via this mechanism are subject to normal
CPU Policy safety and dependency logic.

This option is intended for developers to opt dom0 into non-default features,
and is not intended for use in production circumstances.
If using this option
is necessary to fix an issue, please report a bug.

### dom0-iommu
    = List of [ passthrough=<bool>, strict=<bool>, map-inclusive=<bool>,
                map-reserved=<bool>, none ]

Controls for the dom0 IOMMU setup.

*   The `passthrough` boolean controls whether IOMMU translation functionality
    is disabled for devices in dom0 (`passthrough=1`) or whether the IOMMU is
    used to ensure that dom0 can only DMA to its permitted areas of RAM
    (`passthrough=0`).

    This option is only applicable to x86 PV dom0's, and defaults to false.

    Some older Intel VT-d hardware isn't capable of disabling translation
    functionality on a per-device basis, and will cause this option to be
    ignored and assumed to be 0. Similar behaviour on such systems is only
    available by fully disabling all IOMMUs.

    This option is hardwired to false for x86 PVH dom0's (where a non-identity
    transform is required for dom0 to function), and is ignored for ARM.

*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
    false. It controls whether dom0 can have IOMMU mappings for all domain
    RAM in the system, or only for its allocated RAM (and grant mappings etc.)

    This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
    other domains in the system doesn't live in a compatible address space),
    and is ignored for ARM.

*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up
    identity IOMMU mappings for all non-RAM regions below 4GB except for
    unusable ranges, and ranges belonging to Xen.

    Typically, some devices in a system use bits of RAM for communication, and
    these areas should be listed as reserved in the E820 table and identified
    via RMRR or IVMD entries in the ACPI tables, so Xen can ensure that they
    are identity-mapped in the IOMMU.
    However, some firmware makes mistakes,
    and this option is a coarse-grain workaround for those errors.

    Where possible, finer grain corrections should be made with the `rmrr=`,
    `ivmd=`, `ivrs_hpet[]=`, or `ivrs_ioapic[]=` command line options.

    This option is disabled by default, and deprecated and intended for
    removal in future versions of Xen. If specifying `map-inclusive` is the
    only way to make your system boot, please report a bug.

*   The `map-reserved` functionality is very similar to `map-inclusive`.

    The differences from `map-inclusive` are that `map-reserved` is applicable
    to both x86 PV and PVH dom0's, is enabled by default, and represents a
    subset of the correction by only mapping reserved memory regions rather
    than all non-RAM regions.

*   The `none` option is intended for development purposes only, and skips
    certain safety checks pertaining to the correct IOMMU configuration for
    dom0 to boot.

    Incorrect use of this option may result in a malfunctioning system.

### dom0_ioports_disable (x86)
> `= List of <hex>-<hex>`

Specify a list of IO ports to be excluded from dom0 access.

### dom0_max_vcpus

Either:

> `= <integer>`.

The number of VCPUs to give to dom0. This number of VCPUs can be more
than the number of PCPUs on the host. The default is the number of
PCPUs.

Or:

> `= <min>-<max>` where `<min>` and `<max>` are integers.

Gives dom0 a number of VCPUs equal to the number of PCPUs, but always
at least `<min>` and no more than `<max>`. Using `<min>` may give
more VCPUs than PCPUs. `<min>` or `<max>` may be omitted and the
defaults of 1 and unlimited respectively are used instead.
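
The `<min>-<max>` form of `dom0_max_vcpus` amounts to clamping the PCPU count
into the given range. The following is a minimal Python sketch of that rule,
for illustration only: the helper name `dom0_vcpus` and its arguments are
hypothetical, and this is not Xen's actual implementation.

```python
def dom0_vcpus(pcpus, min_vcpus=1, max_vcpus=None):
    """Clamp the host PCPU count into [min_vcpus, max_vcpus].

    Hypothetical helper illustrating the documented rule only.
    max_vcpus=None models the "unlimited" default for an omitted <max>.
    """
    vcpus = max(pcpus, min_vcpus)
    if max_vcpus is not None:
        vcpus = min(vcpus, max_vcpus)
    return vcpus


# Model dom0_max_vcpus=4-8 on hosts with different PCPU counts.
for pcpus in (2, 4, 6, 8, 10):
    print(pcpus, dom0_vcpus(pcpus, 4, 8))
```

Omitting `<max>` (for example `dom0_max_vcpus=4-`) corresponds to leaving
`max_vcpus` as `None` in this sketch, so only the lower bound applies.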

For example, with `dom0_max_vcpus=4-8`:

>        Number of
>  PCPUs | Dom0 VCPUs
>   2    |  4
>   4    |  4
>   6    |  6
>   8    |  8
>  10    |  8

### dom0_mem (ARM)
> `= <size>`

Set the amount of memory for the initial domain (dom0). It must be
greater than zero. This parameter is required.

### dom0_mem (x86)
> `= List of ( min:<sz> | max:<sz> | <sz> )`

Set the amount of memory for the initial domain (dom0). If a size is
positive, it represents an absolute value. If a size is negative, it
is subtracted from the total available memory.

* `<sz>` specifies the exact amount of memory.
* `min:<sz>` specifies the minimum amount of memory.
* `max:<sz>` specifies the maximum amount of memory.

If `<sz>` is not specified, the default is all the available memory
minus some reserve. The reserve is 1/16 of the available memory or
128 MB (whichever is smaller).

The amount of memory will be at least the minimum but never more than
the maximum (i.e., `max` overrides the `min` option). If there isn't
enough memory then as much as possible is allocated.

`max:<sz>` also sets the maximum reservation (the maximum amount of
memory dom0 can balloon up to). If this is omitted then the maximum
reservation is unlimited.

For example, to set dom0's initial memory allocation to 512MB but
allow it to balloon up as far as 1GB use `dom0_mem=512M,max:1G`

> `<sz>` is: `<size> | [<size>+]<frac>%`
> `<frac>` is an integer < 100

* `<frac>` specifies a fraction of host memory size in percent.

So `<sz>` being `1G+25%` on a 256 GB host would result in 65 GB.

If you use this option then it is highly recommended that you disable
any dom0 autoballooning feature present in your toolstack.
See the
_xl.conf(5)_ man page or [Xen Best
Practices](https://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning).

This option has no effect if pv-shim mode is enabled.

### dom0_nodes (x86)

> `= List of [ <integer> | relaxed | strict ]`

> Default: `strict`

Specify the NUMA nodes to place Dom0 on. Defaults for vCPU-s created
and memory assigned to Dom0 will be adjusted to match the node
restrictions set up here. Note that the values to be specified here are
ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
affinities to prefer but be not limited to the specified node(s).

### dom0_vcpus_pin
> `= <boolean>`

> Default: `false`

Pin dom0 vcpus to their respective pcpus.

### dtuart (ARM)
> `= path [:options]`

> Default: `""`

Specify the full path in the device tree for the UART. If the path doesn't
start with `/`, it is assumed to be an alias. The options are device specific.

### e820-mtrr-clip (x86)
> `= <boolean>`

Flag that specifies if RAM should be clipped to the highest cacheable
MTRR.

> Default: `true` on Intel CPUs, otherwise `false`

### e820-verbose (x86)
> `= <boolean>`

> Default: `false`

Flag that enables verbose output when processing e820 information and
applying clipping.

### edd (x86)
> `= off | on | skipmbr`

Control retrieval of Enhanced Disk Drive (EDD) data from the BIOS during
boot.

### edid (x86)
> `= no | force`

Either force retrieval of monitor EDID information via VESA DDC, or
disable it (edid=no). This option should not normally be required
except for debugging purposes.

### efi
    = List of [ rs=<bool>, attr=no|uc ]

Controls for interacting with the system Extensible Firmware Interface.

*   The `rs` boolean controls whether Runtime Services are used. By default,
    Xen uses Runtime Services itself, and proxies certain calls on behalf of
    dom0. Selecting `rs=0` prohibits all use of Runtime Services.

*   The `attr=` string exists to specify what to do with memory regions of
    unknown/unrecognised cacheability. `attr=no` is the default and will
    leave the memory regions unmapped, while `attr=uc` will map them as fully
    uncacheable.

### ept
> `= List of [ ad=<bool>, pml=<bool>, exec-sp=<bool> ]`

> Applicability: Intel

Extended Page Tables are a feature of Intel's VT-x technology, whereby
hardware manages the virtualisation of HVM guest pagetables. EPT was
introduced with the Nehalem architecture.

*   The `ad` boolean controls hardware tracking of Access and Dirty bits in the
    EPT pagetables, and was first introduced in Broadwell Server.

    By default, Xen will use A/D tracking when available in hardware, except
    on Avoton processors affected by erratum AVR41. Explicitly choosing
    `ad=0` will disable the use of A/D tracking on capable hardware, whereas
    choosing `ad=1` will cause tracking to be used even on AVR41-affected
    hardware.

*   The `pml` boolean controls the use of Page Modification Logging, which
    was also introduced in Broadwell Server.

    PML is a feature whereby the processor generates a list of pages which
    have been dirtied. This is necessary information for operations such as
    live migration, and having the processor maintain the list of dirtied
    pages is more efficient than traditional software implementations where
    all guest writes trap into Xen so the dirty bitmap can be maintained.

    By default, Xen will use PML when it is available in hardware. PML
    functionally depends on A/D tracking, so choosing `ad=0` will implicitly
    disable PML.
    `pml=0` can be used to prevent the use of PML on otherwise
    capable hardware.

*   The `exec-sp` boolean controls whether EPT superpages with execute
    permissions are permitted. In general this is good for performance.

    However, on processors vulnerable to CVE-2018-12207, HVM guest kernels
    can use executable superpages to crash the host. By default, executable
    superpages are disabled on affected hardware.

    If HVM guest kernels are trusted not to mount a DoS against the system,
    this option can be enabled to regain performance.

    This boolean may be modified at runtime using `xl set-parameters
    ept=[no-]exec-sp` to switch between fast and secure.

    *   When switching from secure to fast, preexisting HVM domains will run
        at their current performance until they are rebooted; new domains will
        run without any overhead.

    *   When switching from fast to secure, all HVM domains will immediately
        suffer a performance penalty.

    **Warning: No guarantee is made that this runtime option will be retained
      indefinitely, or that it will retain this exact behaviour. It is
      intended as an emergency option for people who first chose fast, then
      change their minds to secure, and wish not to reboot.**

### extra_guest_irqs (x86)
> `= [<domU number>][,<dom0 number>]`

> Default: `32,<variable>`

Change the number of PIRQs available for guests. The optional first number is
common for all domUs, while the optional second number (preceded by a comma)
is for dom0. Changing the setting for domU has no impact on dom0 and vice
versa. For example to change dom0 without changing domU, use
`extra_guest_irqs=,512`. The default value for Dom0 and an eventual separate
hardware domain is architecture dependent. The upper limit for both values on
x86 is such that the resulting total number of IRQs can't be higher than 32768.
Note that specifying zero as domU value means zero, while for dom0 it means
to use the default. Note further that the Dom0 setting has no useful meaning
for the PVH case; use of the option may have an adverse effect there, though.

### ext_regions (Arm)
> `= <boolean>`

> Default : `true`

Flag to enable or disable support for extended regions for Dom0 and
Dom0less DomUs.

Extended regions are ranges of unused address space exposed to the guest
as "safe to use" for special memory mappings. Disable if your board
device tree is incomplete.

### flask
> `= permissive | enforcing | late | disabled`

> Default: `enforcing`

Specify how the FLASK security server should be configured. This option is only
available if the hypervisor was compiled with FLASK support. This can be
enabled by running either:

- `make -C xen config` and enabling XSM and FLASK.
- `make -C xen menuconfig` and enabling 'FLux Advanced Security Kernel support'
  and 'Xen Security Modules support'

* `permissive`: This is intended for development and is not suitable for use
  with untrusted guests. If a policy is provided by the bootloader, it will be
  loaded; errors will be reported to the ring buffer but will not prevent
  booting. The policy can be changed to enforcing mode using "xl setenforce".
* `enforcing`: This will cause the security server to enter enforcing mode prior
  to the creation of domain 0. If a valid policy is not provided by the
  bootloader and no built-in policy is present, the hypervisor will not continue
  booting.
* `late`: This disables loading of the built-in security policy or the policy
  provided by the bootloader. FLASK will be enabled but will not enforce access
  controls until a policy is loaded by a domain using "xl loadpolicy".
  Once a
  policy is loaded, FLASK will run in enforcing mode unless "xl setenforce" has
  changed that setting.
* `disabled`: This causes the XSM framework to revert to the dummy module. The
  dummy module provides the same security policy as is used when compiling the
  hypervisor without support for XSM. The xsm_op hypercall can also be used to
  switch to this mode after boot, but there is no way to re-enable FLASK once
  the dummy module is loaded.

### font
> `= <height>` where height is `8x8 | 8x14 | 8x16`

Specify the font size when using the VESA console driver.

### force-ept (Intel)
> `= <boolean>`

> Default: `false`

Allow EPT to be enabled when VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is not
present.

*Warning:*
Due to CVE-2013-2212, VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is by default
required as a prerequisite for using EPT. If you are not using PCI Passthrough,
or trust the guest administrator who would be using passthrough, then the
requirement can be relaxed. This option is particularly useful for nested
virtualization, to allow the L1 hypervisor to use EPT even if the L0 hypervisor
does not provide `VM_ENTRY_LOAD_GUEST_PAT`.

### gnttab
> `= List of [ max-ver:<integer>, transitive=<bool>, transfer=<bool> ]`

> Default (Arm): `gnttab=max-ver:1`
> Default (x86,PV): `gnttab=max-ver:2,transitive,transfer`
> Default (x86,HVM): `gnttab=max-ver:2,transitive`

Control various aspects of the grant table behaviour available to guests.

* `max-ver` Select the maximum grant table version to offer to guests. Valid
versions are 1 and 2.
* `transitive` Permit or disallow the use of transitive grants. Note that the
use of grant table v2 without transitive grants is an ABI breakage from the
guest's point of view.
* `transfer` Permit or disallow the GNTTABOP_transfer operation of the
grant table hypercall. Note that disallowing GNTTABOP_transfer is an ABI
breakage from the guest's point of view. This option is only available on
hypervisors configured to support PV guests.

The usage of gnttab v2 is not security supported on ARM platforms.

### gnttab_max_frames
> `= <integer>`

> Default: `64`

> Can be modified at runtime

Specify the default upper bound on the number of frames which any domain may
use as part of its grant table unless a different value is specified at domain
creation.

Note this value is the effective upper bound for dom0.

### gnttab_max_maptrack_frames
> `= <integer>`

> Default: `1024`

> Can be modified at runtime

Specify the default upper bound on the number of frames which any domain may
use as part of its maptrack array unless a different value is specified at
domain creation.

Note this value is the effective upper bound for dom0.

### global-pages
    = <boolean>

    Applicability: x86
    Default: true unless running virtualized on AMD or Hygon hardware

Control whether to use global pages for PV guests, and thus the need to
perform TLB flushes by writing to CR4. This is a performance trade-off.

AMD SVM does not support selective trapping of CR4 writes, which means that a
global TLB flush (two CR4 writes) takes two VMExits, and massively outweighs
the benefit of using global pages to begin with. This case is easy for Xen to
spot, and is accounted for in the default setting.

Other cases where this option might be a benefit are on VT-x hardware when
selective CR4 writes are not supported/enabled by the hypervisor, or in any
virtualised case using shadow paging. These are not easy for Xen to spot, so
are not accounted for in the default setting.

### guest_loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `guest_loglvl=none/warning`

> Can be modified at runtime

Set the logging level for Xen guests. Any log message with equal or
more importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.

### hap (x86)
> `= <boolean>`

> Default: `true`

Flag to globally enable or disable support for Hardware Assisted
Paging (HAP).

### hap_1gb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 1 GB host page table support for Hardware Assisted
Paging (HAP).

### hap_2mb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 2 MB host page table support for Hardware Assisted
Paging (HAP).

### hardware_dom
> `= <domid>`

> Default: `0`

Enable late hardware domain creation using the specified domain ID. This is
intended to be used when domain 0 is a stub domain which builds a disaggregated
system including a hardware domain with the specified domain ID. This option is
supported only when compiled with XSM on x86.

### hest_disable
> ` = <boolean>`

> Default: `false`

Control Xen's use of the APEI Hardware Error Source Table, should one be found.

### highmem-start (x86)
> `= <size>`

Specify the memory boundary past which memory will be treated as highmem (x86
debug hypervisor only).

### hmp-unsafe (arm)
> `= <boolean>`

> Default : `false`

Say yes at your own risk if you want to enable heterogeneous computing
(such as big.LITTLE).
This may result in an unstable and insecure
platform, unless you manually specify the cpu affinity of all domains so
that all vcpus are scheduled on the same class of pcpus (big or LITTLE
but not both). vcpu migration between big cores and LITTLE cores is not
supported. See docs/misc/arm/big.LITTLE.txt for more information.

When the hmp-unsafe option is disabled (default), CPUs that are not
identical to the boot CPU will be parked and not used by Xen.

### hpet
    = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ]

    Applicability: x86

Controls Xen's use of the system's High Precision Event Timer. By default,
Xen will use an HPET when available and not subject to errata. Use of the
HPET can be disabled by specifying `hpet=0`.

 *  The `broadcast` boolean is disabled by default, but forces Xen to keep
    using the broadcast for CPUs in deep C-states even when an RTC interrupt is
    enabled. This then also affects raising of the RTC interrupt.

 *  The `legacy-replacement` boolean allows for control over whether Legacy
    Replacement mode is enabled.

    Legacy Replacement mode is intended for hardware which does not have an
    8254 PIT, and allows the HPET to be configured into a compatible mode.
    Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for
    power saving reasons, and there is no platform-agnostic mechanism for
    discovering this.

    By default, Xen will not change hardware configuration, unless the PIT
    appears to be absent, at which point Xen will try to enable Legacy
    Replacement mode before falling back to pre-IO-APIC interrupt routing
    options.

    This behaviour can be inhibited by specifying `legacy-replacement=0`.
    Alternatively, this mode can be enabled unconditionally (if available) by
    specifying `legacy-replacement=1`.

### hpetbroadcast (x86)
> `= <boolean>`

Deprecated alternative of `hpet=broadcast`.

### hvm_debug (x86)
> `= <integer>`

The specified value is a bit mask with the individual bits having the
following meaning:

> Bit  0 - debug level 0 (unused at present)
> Bit  1 - debug level 1 (Control Register logging)
> Bit  2 - debug level 2 (VMX logging of MSR restores when context switching)
> Bit  3 - debug level 3 (unused at present)
> Bit  4 - I/O operation logging
> Bit  5 - vMMU logging
> Bit  6 - vLAPIC general logging
> Bit  7 - vLAPIC timer logging
> Bit  8 - vLAPIC interrupt logging
> Bit  9 - vIOAPIC logging
> Bit 10 - hypercall logging
> Bit 11 - MSR operation logging

Recognized in debug builds of the hypervisor only.

### hvm_fep (x86)
> `= <boolean>`

> Default: `false`

Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
arbitrary instructions.

This option is intended for development and testing purposes.

*Warning*
As this feature opens up the instruction emulator to arbitrary
instructions from an HVM guest, don't use this in a production system. No
security support is provided when this flag is set.

### hvm_port80 (x86)
> `= <boolean>`

> Default: `true`

Specify whether guests are to be given access to physical port 80
(often used for debugging purposes), to override the DMI based
detection of systems known to misbehave upon accesses to that port.
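
Returning to the `hvm_debug` bit mask documented earlier: the value to pass is
the bitwise OR of `1 << bit` for each wanted bit. The Python sketch below
computes such a value. Only the bit positions come from the `hvm_debug` list;
the dictionary keys and the `hvm_debug_mask` helper are hypothetical names
chosen for illustration.

```python
# Bit positions as listed for hvm_debug; the short names are illustrative only.
HVM_DEBUG_BITS = {
    "level0": 0,
    "level1": 1,
    "level2": 2,
    "level3": 3,
    "io": 4,
    "vmmu": 5,
    "vlapic": 6,
    "vlapic_timer": 7,
    "vlapic_interrupt": 8,
    "vioapic": 9,
    "hypercall": 10,
    "msr": 11,
}


def hvm_debug_mask(*names):
    """Return the integer bit mask selecting the named logging classes."""
    mask = 0
    for name in names:
        mask |= 1 << HVM_DEBUG_BITS[name]
    return mask


# Select I/O operation logging (bit 4) and MSR operation logging (bit 11).
print(f"hvm_debug={hvm_debug_mask('io', 'msr'):#x}")  # prints hvm_debug=0x810
```

The hexadecimal form is used here because integer parameters accept a `0x`
prefix, which keeps the individual bits easy to read.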

### idle_latency_factor (x86)
> `= <integer>`

### ioapic_ack (x86)
> `= old | new`

> Default: `new` unless directed-EOI is supported

### iommu
    = List of [ <bool>, verbose, debug, force, required,
                quarantine=<bool>|scratch-page,
                sharept, superpages, intremap, intpost, crash-disable,
                snoop, qinval, igfx, amd-iommu-perdev-intremap,
                dom0-{passthrough,strict} ]

    All sub-options are boolean in nature.

I/O Memory Management Units perform a function similar to the CPU MMU (hence
the name), but typically exist as a discrete device, integrated as part of a
PCI Root Complex. The most common configuration is to have one IOMMU per
package (for on-die PCIe devices and directly attached PCIe lanes), and one
IOMMU covering the remaining I/O in the system.

The functionality in an IOMMU commonly falls into two orthogonal categories:

1.  DMA remapping which uses a pagetable-like hierarchical structure and maps
    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
    terminology).

2.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
    requests, including their routing to specific CPUs.

IOMMU functionality can be used to provide a translation which the hardware
device driver isn't aware of (e.g. PCI Passthrough and a native driver inside
the guest) and/or to enforce fine-grained control over the memory and
interrupts which a device is attempting to access.

By default, IOMMUs are configured for use if they are available. An overall
boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.

*   The `verbose` and `debug` booleans can be used to print additional
    diagnostic information. Neither is active by default.

*   The `force` and `required` booleans are synonymous and, when requested,
    will prevent Xen from booting if IOMMUs aren't discovered and enabled
    successfully.

*   The `quarantine` option can be used to control Xen's behavior when
    de-assigning devices from guests. The default behaviour is chosen at
    compile time, and is one of
    `CONFIG_IOMMU_QUARANTINE_{NONE,BASIC,SCRATCH_PAGE}`.

    When a PCI device is assigned to an untrusted domain, it is possible
    for that domain to program the device to DMA to an arbitrary address.
    The IOMMU is used to protect the host from malicious DMA by making
    sure that the device addresses can only target memory assigned to the
    guest. However, when the guest domain is torn down, assigning the
    device back to the hardware domain would allow any in-flight DMA to
    potentially target critical host data. To avoid this, quarantining
    should be enabled. Quarantining can be done in two ways: In its basic
    form, all in-flight DMA will simply be forced to encounter IOMMU
    faults. Since there are systems where doing so can cause host lockup,
    an alternative form is available where accesses to memory will be directed
    to a scratch page. The implication here is that such accesses will go
    unnoticed, i.e. an admin may not become aware of the underlying problem.

    Therefore, if this option is set to true (the default), Xen always
    quarantines such devices; they must be explicitly assigned back to Dom0
    before they can be used there again. If set to "scratch-page", still
    active DMA operations will additionally be directed to a "scratch" page.
    If set to false, Xen will only quarantine devices the toolstack has
    arranged for getting quarantined, and only in the "basic" form.

    This option is only valid on builds supporting PCI.

*   The `sharept` boolean controls whether the IOMMU pagetables are shared
    with the CPU-side HAP pagetables, or allocated separately. Sharing
    reduces the memory overhead, but doesn't work in combination with CPU-side
    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
    assigned.

    Due to implementation choices, sharing pagetables doesn't work on AMD
    hardware, and this option is ignored. It is enabled by default on Intel
    systems.

    This option is ignored on ARM, and the pagetables are always shared.

*   The `superpages` boolean controls whether superpage mappings may be used
    in IOMMU page tables. If using this option is necessary to fix an issue,
    please report a bug.

    This option is only valid on x86.

*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and
    is active by default on compatible hardware. On x86 systems, the first
    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
    appeared in the second generation.

    This option is only valid on x86.

*   The `intpost` boolean controls the Posted Interrupt sub-feature. In
    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
    be configured to deliver interrupts from assigned PCI devices directly
    into the guest, without trapping out into hypervisor context.

    This option depends on `intremap`, and is disabled by default due to some
    corner cases in the implementation which have yet to be resolved.

    This option is only valid on x86, and only builds of Xen with HVM support.

*   The `crash-disable` boolean controls disabling IOMMU functionality
    (DMAR/IR/QI) before switching to a crash kernel. This option is inactive
    by default and
    is for compatibility with older kdump kernels only.
    Modern kernels copy
    all the necessary tables from the previous one following kexec which makes
    the transition transparent for them with IOMMU functions still on.

The following options are specific to Intel VT-d hardware:

*   The `snoop` boolean controls the Snoop Control sub-feature, and is active
    by default on compatible hardware.

    An incoming DMA request may specify _Snooped_ (query the CPU caches for
    the appropriate lines) or _Non-Snooped_ (don't query the CPU caches).
    _Non-Snooped_ accesses incur less latency, but behind-the-scenes
    hypervisor activity can invalidate the expectations of the device driver,
    and Snoop Control allows the hypervisor to force DMA requests to be
    _Snooped_ when they would otherwise not be.

*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
    active by default on compatible hardware. Queued Invalidation is a
    feature in second-generation IOMMUs and is a functional prerequisite for
    Interrupt Remapping. Note that Xen disregards this setting for Intel VT-d
    version 6 and greater as Register-Based Invalidation isn't supported
    by them.

*   The `igfx` boolean is active by default, and controls whether IOMMUs in
    front of solely graphics devices get enabled or not.

    It is intended as a debugging mechanism for graphics issues, and to be
    similar to Linux's `intel_iommu=igfx_off` option. If specifying `no-igfx`
    fixes anything, please report the problem.

The following options are specific to AMD-Vi hardware:

*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
    remapping table is per device (the default), or a single global table for
    the entire system.

  Using a global table is not security supported as it allows all devices to
  impersonate each other as far as interrupts are concerned (see XSA-36), but
  it is a workaround for SP5100 Erratum 28.

**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
deprecated, and superseded by _dom0-iommu={passthrough,strict}_ respectively -
using both the old and new command line options in combination is undefined.**

### iommu_dev_iotlb_timeout
> `= <integer>`

> Default: `1000`

Specify the timeout of the device IOTLB invalidation in milliseconds.
By default, the timeout is 1000 ms. When you see error 'Queue invalidate
wait descriptor timed out', try increasing this value.

### iommu_inclusive_mapping
> `= <boolean>`

**WARNING: This command line option is deprecated, and superseded by
_dom0-iommu=map-inclusive_ - using both options in combination is undefined.**

### irq-max-guests (x86)
> `= <integer>`

> Default: `32`

Maximum number of guests any individual IRQ could be shared between,
i.e. a limit on the number of guests it is possible to start each having
assigned a device sharing a common interrupt line. Accepts values between
1 and 255.

### irq_ratelimit (x86)
> `= <integer>`

### irq_vector_map (x86)

### ivmd (x86)
> `= <start>[-<end>][=<bdf1>[-<bdf1'>][,<bdf2>[-<bdf2'>][,...]]][;<start>...]`

Define IVMD-like ranges that are missing from ACPI tables along with the
device(s) they belong to, and use them for 1:1 mapping. End addresses can be
omitted when exactly one page is meant. The ranges are inclusive when start
and end are specified. Note that only PCI segment 0 is supported at this time,
but it is fine to specify it explicitly.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Omitting the optional (range of) BDF specifiers signals that the range is to
be applied to all devices.

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and devices 0:0:1a.0...0:0:1a.3 collectively require three pages
(0xd5d46 through 0xd5d48) to be reserved, one usage would be:

> `ivmd=d5d45=0:1d.0;0xd5d46-0xd5d48=0:1a.0-0:1a.3`

Note: grub2 requires special characters such as ';' to be escaped or quoted
when multiple ranges are specified; refer to the grub2 documentation.

### ivrs_hpet[`<hpet>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of HPET
`<hpet>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### ivrs_ioapic[`<ioapic>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of IO-APIC
`<ioapic>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### lapic (x86)
> `= <boolean>`

Force the use of the local APIC on a uniprocessor system, even
if left disabled by the BIOS.

### lapic_timer_c2_ok (x86)
> `= <boolean>`

### ler (x86)
> `= <boolean>`

> Default: `false`

This option is intended for debugging purposes only. Enable MSR_DEBUGCTL.LBR
in hypervisor context to be able to dump the Last Interrupt/Exception To/From
record with other registers.

### lock-depth-size
> `= <integer>`

> Default: `lock-depth-size=64`

Specifies the maximum number of nested locks tested for illegal recursions.
Higher nesting levels still work, but recursion testing is omitted for those
levels.
In case an illegal recursion is detected, the system will crash
immediately. Specifying `0` will disable all testing of illegal lock nesting.

This option is available for hypervisors built with CONFIG_DEBUG_LOCKS only.

### loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `loglvl=info`

> Can be modified at runtime

Set the logging level for Xen. Any log message with equal or greater
importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.

### low_crashinfo
> `= none | min | all`

> Default: `none` if not specified at all, or to `min` if **low_crashinfo** is present without qualification.

This option is only useful for hosts with a 32bit dom0 kernel, wishing
to use kexec functionality in the case of a crash. It represents
which data structures should be deliberately allocated in low memory,
so the crash kernel may find them. Should be used in combination
with **crashinfo_maxaddr**.

### low_mem_virq_limit
> `= <size>`

> Default: `64M`

Specify the threshold below which Xen will inform dom0 that the quantity of
free memory is getting low. Specifying `0` will disable this notification.

### maxcpus
> `= <integer>`

Specify the maximum number of CPUs that should be brought up.

This option is ignored in **pv-shim** mode.

**WARNING: On Arm big.LITTLE systems, when the `hmp-unsafe` option is enabled, this command line
option does not guarantee which CPU types will be used.**

### max_cstate (x86)
> `= <integer>[,<integer>]`

Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used. The latter only applies
to the highest permitted C-state.
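
As a rough illustration of the `<integer>[,<integer>]` syntax, the Python
sketch below splits a `max_cstate` value into its two parts. It is not Xen's
parser: `parse_xen_integer` and `parse_max_cstate` are hypothetical helper
names, and the integer handling only follows the rules given under "Types of
parameter" (decimal by default, `0x` for hexadecimal, a leading `0` for octal).

```python
from typing import Optional, Tuple


def parse_xen_integer(text: str) -> int:
    """Convert an <integer> value: decimal, 0x-prefixed hex, or 0-prefixed octal."""
    text = text.strip()
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if digits.lower().startswith("0x"):
        value = int(digits[2:], 16)
    elif len(digits) > 1 and digits.startswith("0"):
        value = int(digits[1:], 8)
    else:
        value = int(digits, 10)
    return -value if negative else value


def parse_max_cstate(value: str) -> Tuple[int, Optional[int]]:
    """Return (deepest C-state, maximum sub C-state or None if omitted)."""
    parts = value.split(",")
    if len(parts) > 2:
        raise ValueError("expected <integer>[,<integer>]")
    cstate = parse_xen_integer(parts[0])
    sub_cstate = parse_xen_integer(parts[1]) if len(parts) == 2 else None
    return cstate, sub_cstate
```

With this sketch, `max_cstate=3,1` would be read as "C3 at most, and sub-state
1 at most within C3", while `max_cstate=2` leaves the sub-state unrestricted.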

### max_gsi_irqs (x86)
> `= <integer>`

Specifies the number of interrupts to be used for pin (IO-APIC or legacy PIC)
based interrupts. Any higher IRQs will be available for use via PCI MSI.

### max_lpi_bits (arm)
> `= <integer>`

Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
presented as the number of bits needed to encode it. This must be at least
14 and not exceed 32, and each LPI requires one byte (configuration) and
one pending bit to be allocated.
Defaults to 20 bits (to cover at most 1048576 interrupts).

### mce (x86)
> `= <boolean>`

> Default: `true`

Allows disabling the use of Machine Check Exceptions. Note that doing
so may result in silent shutdown of the system in case an event occurs
which would have resulted in raising a Machine Check Exception. Silent
here is as far as Xen is concerned; firmware may offer to retrieve some
collected data.

### mce_fb (Intel)
> `= <boolean>`

> Default: `false`

Force broadcasting of Machine Check Exceptions, suppressing the use of
Local MCE functionality available in newer Intel hardware.

### mce_verbosity (x86)
> `= verbose`

Specify verbose machine check output.

### mem (x86)
> `= <size>`

Specify the maximum address of physical RAM. Any RAM beyond this
limit is ignored by Xen.

### memop-max-order
> `= [<domU>][,[<ctldom>][,[<hwdom>][,<ptdom>]]]`

> x86 default: `9,18,12,12`
> ARM default: `9,18,10,10`

Change the maximum order permitted for allocation (or allocation-like)
requests issued by the various kinds of domains (in this order:
ordinary DomU, control domain, hardware domain, and - when supported
by the platform - DomU with pass-through device assigned).
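
To make the orders more tangible, the following Python sketch overlays a
`memop-max-order` value onto the x86 defaults listed above and converts an
order into a byte count. It is only an illustration: the helper names are
hypothetical, a 4 KiB base page is assumed, and treating an empty field as
"keep the default" is an assumption drawn from the optional fields in the
syntax line.

```python
from typing import Tuple

PAGE_SIZE = 4096  # assumption: 4 KiB base pages

# Order of fields: domU, control domain, hardware domain, pass-through domU.
X86_DEFAULTS = (9, 18, 12, 12)


def apply_memop_max_order(value: str,
                          defaults: Tuple[int, ...] = X86_DEFAULTS) -> Tuple[int, ...]:
    """Overlay the comma separated fields of `value` onto `defaults`.

    Empty fields are assumed to leave the corresponding default untouched.
    """
    orders = list(defaults)
    fields = value.split(",")
    if len(fields) > len(orders):
        raise ValueError("too many fields in memop-max-order value")
    for index, field in enumerate(fields):
        if field.strip():
            orders[index] = int(field)
    return tuple(orders)


def order_to_bytes(order: int) -> int:
    """An order-n request covers 2**n contiguous pages."""
    return PAGE_SIZE << order
```

Under these assumptions, order 9 corresponds to 2 MiB and order 18 to 1 GiB,
and `memop-max-order=,,10` would lower only the hardware domain limit.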

### mmcfg (x86)
> `= <boolean>[,amd-fam10]`

> Default: `1`

Specify if the MMConfig space should be enabled.

### mmio-relax (x86)
> `= <boolean> | all`

> Default: `false`

By default, domains may not create cached mappings to MMIO regions.
This option relaxes the check for Domain 0 (or when using `all`, all PV
domains), to permit the use of cacheable MMIO mappings.

### msi (x86)
> `= <boolean>`

> Default: `true`

Force Xen to (not) use PCI-MSI, even if ACPI FADT says otherwise.

### mtrr.show (x86)
> `= <boolean>`

> Default: `false`

Print boot time MTRR state.

### mwait-idle (x86)
> `= <boolean>`

> Default: `true`

Use the MWAIT idle driver (with model specific C-state knowledge) instead
of the ACPI based one.

### nmi (x86)
> `= ignore | dom0 | fatal`

> Default: `fatal` for a debug build, or `dom0` for a non-debug build

Specify what Xen should do in the event of an NMI parity or I/O error.
`ignore` discards the error; `dom0` causes Xen to report the error to
dom0, while `fatal` causes Xen to print diagnostics and then hang.

### noapic (x86)

Instruct Xen to ignore any IOAPICs that are present in the system, and
instead continue to use the legacy PIC. This is _not_ recommended with
pvops type kernels.

Because responsibility for APIC setup is shared between Xen and the
domain 0 kernel this option is automatically propagated to the domain
0 command line.

### invpcid (x86)
> `= <boolean>`

> Default: `true`

By default, Xen will use the INVPCID instruction for TLB management if
it is available. This option can be used to cause Xen to fall back to
older mechanisms, which are generally slower.

### load-balance-ratelimit
> `= <integer>`

The minimum interval between load balancing events on a given pcpu, in
microseconds. A value of '0' will disable rate limiting. Maximum
value 1 second. At the moment only credit honors this parameter.
Default 1ms.

### noirqbalance (x86)
> `= <boolean>`

Disable software IRQ balancing and affinity. This can be used on
systems such as Dell 1850/2850 that have workarounds in hardware for
IRQ routing issues.

### nolapic (x86)
> `= <boolean>`

> Default: `false`

Ignore the local APIC on a uniprocessor system, even if enabled by the
BIOS.

### no-real-mode (x86)
> `= <boolean>`

Do not execute real-mode bootstrap code when booting Xen. This option
should not be used except for debugging. It will effectively disable
the **vga** option, which relies on real mode to set the video mode.

### noreboot
> `= <boolean>`

Do not automatically reboot after an error. This is useful for
catching debug output. Defaults to automatically reboot after 5
seconds.

### nosmp (x86)
> `= <boolean>`

Disable SMP support. No secondary processors will be booted.
Defaults to booting secondary processors.

This option is ignored in **pv-shim** mode.

### nr_irqs (x86)
> `= <integer>`

### numa (x86)
> `= on | off | fake=<integer> | noacpi`

> Default: `on`

### partial-emulation (arm)
> `= <boolean>`

> Default: `false`

Flag to enable or disable partial emulation of system/coprocessor registers.
Only effective if CONFIG_PARTIAL_EMULATION is enabled.

**WARNING: Enabling this option might result in unwanted/non-spec compliant
behavior.**

### pci
    = List of [ serr=<bool>, perr=<bool> ]

    Default: Signaling left as set by firmware.

Override the firmware settings, and explicitly enable or disable the
signalling of PCI System and Parity errors.

### pci-phantom
> `=[<seg>:]<bus>:<device>,<stride>`

Mark a group of PCI devices as using phantom functions without actually
advertising so, so the IOMMU can create translation contexts for them.

All numbers specified must be hexadecimal ones.

This option can be specified more than once (up to 8 times at present).

### pci-passthrough (arm)
> `= <boolean>`

> Default: `false`

Flag to enable or disable support for PCI passthrough.

### pcid (x86)
> `= <boolean> | xpti=<bool>`

> Default: `xpti`

> Can be modified at runtime (change takes effect only for domains created
  afterwards)

If available, control usage of the PCID feature of the processor for
64-bit pv-domains. PCID can be used either for no domain at all (`false`),
for all of them (`true`), only for those subject to XPTI (`xpti`) or for
those not subject to XPTI (`no-xpti`). The feature is used only in case
INVPCID is supported and not disabled via `invpcid=false`.

### ple_gap
> `= <integer>`

### ple_window (Intel)
> `= <integer>`

### preferred-cstates (x86)
> `= ( <integer> | List of ( C1 | C1E | C2 | ... ) )`

This is a mask of C-states which are to be used preferably. This option is
applicable only on hardware where certain C-states are exclusive of one another.

### probe-port-aliases (x86)
> `= <boolean>`

> Default: `true` outside of shim mode, `false` in shim mode

Certain devices accessible by I/O ports may be accessible also through "alias"
ports (originally a result of incomplete address decoding). When such devices
are solely under Xen's control, Xen disallows even Dom0 access to the "primary"
ports.
When alias probing is active and aliases are detected, "alias" ports
would then be treated similarly to the "primary" ones.

### psr (Intel)
> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`

> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0`

Platform Shared Resource (PSR) Services. Intel Haswell and later server
platforms offer information about the sharing of resources.

To use the PSR monitoring service for a certain domain, a Resource
Monitoring ID (RMID) is used to bind the domain to the corresponding shared
resource. RMID is a hardware-provided layer of abstraction between software
and logical processors.

To use the PSR cache allocation service for a certain domain, a capacity
bitmask (CBM) is used to bind the domain to the corresponding shared resource.
CBM represents cache capacity and indicates the degree of overlap and isolation
between domains. In the hypervisor a Class of Service (COS) ID is allocated for
each unique CBM.

The following resources are available:

* Cache Monitoring Technology (Haswell and later). Information regarding the
  L3 cache occupancy.
  * `cmt` instructs Xen to enable/disable Cache Monitoring Technology.
  * `rmid_max` indicates the max value for rmid.
* Memory Bandwidth Monitoring (Broadwell and later). Information regarding the
  total/local memory bandwidth. It uses the same options as Cache Monitoring
  Technology.

* Cache Allocation Technology (Broadwell and later). Information regarding
  the cache allocation.
  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
  * `cos_max` indicates the max value for COS ID.
* Code and Data Prioritization Technology (Broadwell and later). Information
  regarding the code cache and the data cache allocation. CDP is based on CAT.
  * `cdp` instructs Xen to enable/disable Code and Data Prioritization. Note
    that `cos_max` of CDP is a little different from `cos_max` of CAT. With
    CDP, one COS corresponds to two CBMs rather than one as with CAT. Because
    the total number of CBMs is fixed, the actual `cos_max` in use is
    automatically halved when CDP is enabled.

### pv
    = List of [ 32=<bool> ]

    Applicability: x86

Controls for aspects of PV guest support.

* The `32` boolean controls whether 32bit PV guests can be created. It
  defaults to `true`, and is ignored when `CONFIG_PV32` is compiled out.

  32bit PV guests are incompatible with CET Shadow Stacks. If Xen is using
  shadow stacks, this option will be overridden to `false`. Backwards
  compatibility can be maintained with the `pv-shim` mechanism.

### pv-linear-pt (x86)
> `= <boolean>`

> Default: `true`

Only available if Xen is compiled with `CONFIG_PV_LINEAR_PT` support
enabled.

Allow PV guests to have pagetable entries pointing to other pagetables
of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
This technique is often called "linear pagetables", and is sometimes
used to allow operating systems a simple way to consistently map the
current process's pagetables into its own virtual address space.

Linux and MiniOS don't use this technique. NetBSD and Novell Netware
do; there may be other custom operating systems which do. If you're
certain you don't plan on having PV guests which use this feature,
turning it off can reduce the attack surface.

### pv-l1tf (x86)
> `= List of [ <bool>, dom0=<bool>, domu=<bool> ]`

> Default: `false` on believed-unaffected hardware, or in pv-shim mode.
> `domu` on believed-affected hardware.

Mitigations for L1TF / XSA-273 / CVE-2018-3620 for PV guests.

For backwards compatibility, we may not alter an architecturally-legitimate
pagetable entry a PV guest chooses to write. We can however force such a
guest into shadow mode so that Xen controls the PTEs which are reachable by
the CPU pagewalk.

Shadowing is performed at the point where a PV guest first tries to write an
L1TF-vulnerable PTE. Therefore, a PV guest kernel which has been updated with
its own L1TF mitigations will not trigger shadow mode if it is well behaved.

If `CONFIG_SHADOW_PAGING` is not compiled in, this mitigation instead crashes
the guest when an L1TF-vulnerable PTE is written, which still allows updated,
well-behaved PV guests to run, despite Shadow being compiled out.

In the pv-shim case, Shadow is expected to be compiled out, and a malicious
guest kernel can only leak data from the shim Xen, rather than the host Xen.

### pv-shim (x86)
> `= <boolean>`

> Default: `false`

This option is intended for use by a toolstack, when choosing to run a PV
guest compatibly inside an HVM container.

In this mode, the kernel and initrd passed as modules to the hypervisor are
constructed into a plain unprivileged PV domain.

### rcu-idle-timer-period-ms
> `= <integer>`

> Default: `10`

How frequently a CPU which has gone idle, but with pending RCU callbacks,
should be woken up to check if the grace period has completed, and the
callbacks are safe to be executed. Expressed in milliseconds; maximum is
100, and it can't be 0.

### reboot (x86)
> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`

> Default: `0`

Specify the host reboot method.

`warm` instructs Xen to not set the cold reboot flag.

`cold` instructs Xen to set the cold reboot flag.

`no` instructs Xen to not automatically reboot after panics or crashes.

`triple` instructs Xen to reboot the host by causing a triple fault.

`kbd` instructs Xen to reboot the host via the keyboard controller.

`acpi` instructs Xen to reboot the host using RESET_REG in the ACPI FADT.

`pci` instructs Xen to reboot the host using PCI reset register (port CF9).

`Power` instructs Xen to power-cycle the host using PCI reset register (port CF9).

`efi` instructs Xen to reboot using the EFI reboot call (in EFI mode by
 default it will use that method first).

`xen` instructs Xen to reboot using Xen's SCHEDOP hypercall (this is the default
when running nested Xen).

### rmrr
> `= start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]`

Define RMRR units that are missing from the ACPI table along with the device
they belong to, and use them for 1:1 mapping. End addresses can be omitted and
one page will be mapped. The ranges are inclusive when start and end are specified.
If segment of the first device is not specified, segment zero will be used.
If other segments are not specified, first device segment will be used.
If a segment is specified for other than the first device and it does not match
the one specified for the first one, an error will be reported.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and device 0:0:1a.0 requires three pages (0xd5d46 through 0xd5d48)
to be reserved, one usage would be:

> `rmrr=d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0`

Note: grub2 requires special characters such as ';' to be escaped or quoted
when multiple ranges are specified; refer to the grub2 documentation.
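
The page-number convention used by `rmrr` can be checked with a short Python
sketch. It splits the usage example into inclusive page ranges and converts
them to byte addresses. This is an illustration rather than Xen's parser:
`parse_rmrr` and `page_range_to_bytes` are hypothetical names, 4 KiB pages are
assumed, and the device specifiers are kept as plain strings.

```python
from typing import List, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def parse_rmrr(value: str) -> List[Tuple[int, int, List[str]]]:
    """Split an rmrr value into (first page, last page, devices) tuples."""
    regions = []
    for entry in value.split(";"):
        page_range, _, devices = entry.partition("=")
        start_text, _, end_text = page_range.partition("-")
        # int(..., 16) accepts the digits with or without a leading "0x".
        start = int(start_text, 16)
        end = int(end_text, 16) if end_text else start  # one page if no end
        regions.append((start, end, devices.split(",")))
    return regions


def page_range_to_bytes(start: int, end: int) -> Tuple[int, int]:
    """Convert an inclusive page range to an inclusive byte address range."""
    return start << PAGE_SHIFT, ((end + 1) << PAGE_SHIFT) - 1
```

For the example above, the first entry covers the single page `0xd5d45`, which
under the 4 KiB assumption spans addresses `0xd5d45000` to `0xd5d45fff`.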

### ro-hpet (x86)
> `= <boolean>`

> Default: `true`

Map the HPET page as read only in Dom0. If disabled the page will be mapped
with read and write permissions.

### sched
> `= credit | credit2 | arinc653 | rtds | null`

> Default: `sched=credit2`

Choose the default scheduler. Note the default scheduler is selectable via
Kconfig and depends on enabled schedulers. Check
`CONFIG_SCHED_DEFAULT` to see which scheduler is the default.

### sched_credit2_max_cpus_runqueue
> `= <integer>`

> Default: `16`

Defines how many CPUs will be put, at most, in each Credit2 runqueue.

Runqueues are still arranged according to the host topology (and following
what is indicated by the 'credit2_runqueue' parameter). But we also have a cap
on the number of CPUs that share each runqueue.

A value that is a submultiple of the number of online CPUs is recommended,
as that would likely produce a perfectly balanced runqueue configuration.

### sched_credit2_migrate_resist
> `= <integer>`

### sched_credit_tslice_ms
> `= <integer>`

Set the timeslice of the credit1 scheduler, in milliseconds. The
default is 30ms. Reasonable values may include 10, 5, or even 1 for
very latency-sensitive workloads.

### sched-gran (x86)
> `= cpu | core | socket`

> Default: `sched-gran=cpu`

Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
`core` on a SMT-enabled system, or `socket`) multiple vcpus are assigned
statically to a "scheduling unit" which will then be subject to scheduling.
This assignment of vcpus to scheduling units is fixed.

`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
hyperthread using x86/Intel terminology)

`core`: As many vcpus as there are cpus on a physical core are scheduled
together on a physical core.

`socket`: As many vcpus as there are cpus on a physical socket are scheduled
together on a physical socket.

Note: a value other than `cpu` will result in rejecting a runtime modification
attempt of the "smt" setting.

Note: for AMD x86 processors before Fam17 the terminology in the official data
sheets is different: a cpu is named "core" and multiple "cores" are running
in the same "compute unit". As from Fam17 on, AMD is using the same names as
Intel ("thread" and "core"), so the topology levels are named "cpu", "core" and
"socket" even on older AMD processors.

### sched_ratelimit_us
> `= <integer>`

In order to limit the rate of context switching, set the minimum
amount of time that a vcpu can be scheduled for before preempting it,
in microseconds. The default is 1000us (1ms). Setting this to 0
disables it altogether.

### sched_smt_power_savings
> `= <boolean>`

Normally Xen will try to maximize performance and cache utilization by
spreading out vcpus across as many different divisions as possible
(i.e., numa nodes, sockets, cores, threads, &c). This often maximizes
throughput, but also maximizes energy usage, since it reduces the
depth to which a processor can sleep.

This option inverts the logic, so that the scheduler in effect tries
to keep the vcpus on the smallest amount of silicon possible; i.e.,
first fill up sibling threads, then sibling cores, then sibling
sockets, &c. This will reduce performance somewhat, particularly on
systems with hyperthreading enabled, but should reduce power by
enabling more sockets and cores to go into deeper sleep states.

### scrub-domheap
> `= <boolean>`

> Default: `false`

Scrub domains' freed pages. This is a safety net against a (buggy) domain
accidentally leaking secrets by releasing pages without proper sanitization.

### serial_tx_buffer
> `= <size>`

> Default: `16kB`

Set the serial transmit buffer size.

### serrors (ARM)
> `= diverse | panic`

> Default: `diverse`

This parameter is provided to administrators to determine how the hypervisor
handles SErrors.

* `diverse`:
  The hypervisor will distinguish guest SErrors from hypervisor SErrors:
  - The guest generated SErrors will be forwarded to the currently running
    guest.
  - The hypervisor generated SErrors will cause the whole system to crash.

* `panic`:
  All SErrors will cause the whole system to crash. This option should only
  be used if you trust all your guests and/or they don't have a gadget (e.g.
  device) to generate SErrors in normal run.

### shim_mem (x86)
> `= List of ( min:<size> | max:<size> | <size> )`

Set the amount of memory that xen-shim uses. Only has effect if pv-shim mode is
enabled. Note that this value accounts for the memory used by the shim itself
plus the free memory slack given to the shim for runtime allocations.

* `min:<size>` specifies the minimum amount of memory. Ignored if greater
  than max.
* `max:<size>` specifies the maximum amount of memory.
* `<size>` specifies the exact amount of memory. Overrides both min and max.

By default, the amount of free memory slack given to the shim for runtime usage
is 1MB.

### smap (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Access Prevention.
Use `smap=hvm` to allow SMAP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, due to significant performance impact
in some cases and generally lower security risk, the option defaults to false.

### smep (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Execution Protection.
Use `smep=hvm` to allow SMEP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, due to significant performance impact
in some cases and generally lower security risk, the option defaults to false.

### smt (x86)
> `= <boolean>`

> Default: `true`

Control bring up of multiple hyper-threads per CPU core.

### snb_igd_quirk
> `= <boolean> | cap | <integer>`

A true boolean value enables legacy behavior (1s timeout), while `cap`
enforces the maximum theoretically necessary timeout of 670ms. Any number
is interpreted as a custom timeout in milliseconds. Zero or boolean
false disable the quirk workaround, which is also the default.

### spec-ctrl (Arm)
> `= List of [ ssbd=force-disable|runtime|force-enable ]`

Controls for speculative execution sidechannel mitigations.

The option `ssbd=` is used to control the state of Speculative Store
Bypass Disable (SSBD) mitigation.

* `ssbd=force-disable` will keep the mitigation permanently off. The guest
will not be able to control the state of the mitigation.
* `ssbd=runtime` will always turn on the mitigation when running in the
hypervisor context. The guest will be able to turn on/off the mitigation for
itself by using the firmware interface `ARCH_WORKAROUND_2`.
* `ssbd=force-enable` will keep the mitigation permanently on. The guest will
not be able to control the state of the mitigation.

By default SSBD will be mitigated at runtime (i.e. `ssbd=runtime`).
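
The three `ssbd=` modes can be summarised as a small lookup table. The Python
sketch below is only a restatement of the list above, not Xen code: the names
`SSBD_MODES` and `parse_arm_spec_ctrl` are hypothetical, and letting the last
`ssbd=` item win when several are given is an assumption.

```python
from typing import Dict, Tuple

# mode -> (mitigation active in hypervisor context, guest may toggle it)
SSBD_MODES: Dict[str, Tuple[bool, bool]] = {
    "force-disable": (False, False),
    "runtime": (True, True),
    "force-enable": (True, False),
}


def parse_arm_spec_ctrl(value: str = "ssbd=runtime") -> Tuple[bool, bool]:
    """Return the (hypervisor mitigated, guest controllable) pair for `value`."""
    mode = "runtime"  # documented default
    for item in value.split(","):
        key, _, setting = item.partition("=")
        if key != "ssbd" or setting not in SSBD_MODES:
            raise ValueError(f"unsupported spec-ctrl item: {item!r}")
        mode = setting  # assumption: the last occurrence wins
    return SSBD_MODES[mode]
```

Only `ssbd=runtime` leaves the guest in control of its own mitigation state;
both `force-` variants fix the state for hypervisor and guest alike.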

### spec-ctrl (x86)
> `= List of [ <bool>, xen=<bool>, {pv,hvm}=<bool>,
>              {msr-sc,rsb,verw,{ibpb,bhb}-entry}=<bool>|{pv,hvm}=<bool>,
>              bti-thunk=retpoline|lfence|jmp,bhb-seq=short|tsx|long,
>              {ibrs,ibpb,ssbd,psfd,
>              eager-fpu,l1d-flush,branch-harden,srb-lock,
>              unpriv-mmio,gds-mit,div-scrub,lock-harden,
>              bhi-dis-s}=<bool> ]`

Controls for speculative execution sidechannel mitigations. By default, Xen
will pick the most appropriate mitigations based on compiled in support,
loaded microcode, and hardware details, and will virtualise appropriate
mitigations for guests to use.

**WARNING: Any use of this option may interfere with heuristics. Use with
extreme care.**

An overall boolean value, `spec-ctrl=no`, can be specified to turn off all
mitigations, including pieces of infrastructure used to virtualise certain
mitigation features for guests. This also includes settings which `xpti`,
`smt`, `pv-l1tf`, `tsx` control, unless the respective option(s) have been
specified earlier on the command line.

Alternatively, a slightly more restricted `spec-ctrl=no-xen` can be used to
turn off all of Xen's mitigations, while leaving the virtualisation support
in place for guests to use.

Use of a positive boolean value for either of these options is invalid.

The `pv=`, `hvm=`, `msr-sc=`, `rsb=`, `verw=`, `ibpb-entry=` and `bhb-entry=`
options offer fine grained control over the primitives used by Xen. These impact
Xen's ability to protect itself, and/or Xen's ability to virtualise support
for guests to use.

* `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
  respectively.
* Each other option can be used either as a plain boolean
  (e.g. `spec-ctrl=rsb` to control both the PV and HVM sub-options), or with
  `pv=` or `hvm=` subsuboptions (e.g. `spec-ctrl=rsb=no-hvm` to disable HVM
  RSB only).

* `msr-sc=` offers control over Xen's support for manipulating `MSR_SPEC_CTRL`
  on entry and exit. These blocks are necessary to virtualise support for
  guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc.
* `rsb=` offers control over whether to overwrite the Return Stack Buffer /
  Return Address Stack on entry to Xen and on idle.
* `verw=` offers control over whether to use VERW for its scrubbing side
  effects at appropriate privilege transitions. The exact side effects are
  microarchitecture and microcode specific. *Note: `md-clear=` is accepted as
  a deprecated alias. For compatibility with development versions of XSA-297,
  `mds=` is also accepted on Xen 4.12 and earlier as an alias. Consult vendor
  documentation in preference to here.*
* `ibpb-entry=` offers control over whether IBPB (Indirect Branch Prediction
  Barrier) is used on entry to Xen. This is used by default on hardware
  vulnerable to Branch Type Confusion, and hardware vulnerable to Speculative
  Return Stack Overflow if appropriate microcode has been loaded, but for
  performance reasons dom0 is unprotected by default. If it is necessary to
  protect dom0 too, boot with `spec-ctrl=ibpb-entry`.
* `bhb-entry=` offers control over whether BHB-clearing (Branch History
  Buffer) sequences are used on entry to Xen. This is used by default on
  hardware vulnerable to Branch History Injection, when the BHI_DIS_S control
  is not available (see `bhi-dis-s`). The choice of scrubbing sequence can be
  selected using the `bhb-seq=` option. If it is necessary to protect dom0
  too, boot with `spec-ctrl=bhb-entry`.

If Xen was compiled with `CONFIG_INDIRECT_THUNK` support, `bti-thunk=` can be
used to select which of the thunks gets patched into the
`__x86_indirect_thunk_%reg` locations.
The default thunk is `retpoline`
(generally preferred), with the alternatives being `jmp` (a `jmp *%reg` gadget,
minimal overhead), and `lfence` (an `lfence; jmp *%reg` gadget).

On all hardware, `bhb-seq=` can be used to select which of the BHB-clearing
sequences gets used. This interacts with the `bhb-entry=` and `bhi-dis-s=`
options in order to mitigate Branch History Injection on affected hardware.
The default sequence is `short`, with `tsx` as an alternative available on
capable hardware, and `long` as a sequence that can be opted in to.

On hardware supporting IBRS (Indirect Branch Restricted Speculation), the
`ibrs=` option can be used to force or prevent Xen using the feature itself.
If Xen is not using IBRS itself, functionality is still set up so IBRS can be
virtualised for guests.

On hardware supporting STIBP (Single Thread Indirect Branch Predictors), the
`stibp=` option can be used to force or prevent Xen using the feature itself.
By default, Xen will use STIBP when IBRS is in use (IBRS implies STIBP), and
when hardware hints recommend using it as a blanket setting.

On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
option can be used to force or prevent Xen using the feature itself. The
feature is virtualised for guests, independently of Xen's choice of setting.
On AMD hardware, disabling Xen SSBD usage on the command line (`ssbd=0`, which
is the default value) can lead to Xen running with the guest SSBD selection,
depending on hardware support. On the same hardware, setting `ssbd=1` will
result in SSBD always being enabled, regardless of guest choice.

On hardware supporting PSFD (Predictive Store Forwarding Disable), the `psfd=`
option can be used to force or prevent Xen using the feature itself. By
default, Xen will not use PSFD. PSFD is implied by SSBD, and SSBD is off by
default.

On hardware supporting BHI_DIS_S (Branch History Injection Disable
Supervisor), the `bhi-dis-s=` option can be used to force or prevent Xen using
the feature itself. By default Xen will use BHI_DIS_S on hardware susceptible
to Branch History Injection.

On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
option can be used to force (the default) or prevent Xen from issuing branch
prediction barriers on vcpu context switches.

On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
from using fully eager FPU context switches. This is currently implemented as
a global control. By default, Xen will choose to use fully eager context
switches on hardware believed to speculate past #NM exceptions.

On hardware supporting L1D_FLUSH, the `l1d-flush=` option can be used to force
or prevent Xen from issuing an L1 data cache flush on each VMEntry.
Irrespective of Xen's setting, the feature is virtualised for HVM guests to
use. By default, Xen will enable this mitigation on hardware believed to be
vulnerable to L1TF.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_BRANCH`, the
`branch-harden=` boolean can be used to force or prevent Xen from using
speculation barriers to protect selected conditional branches. By default,
Xen will enable this mitigation.

On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
or prevent Xen from protecting the Special Register Buffer from leaking stale
data. By default, Xen will enable this mitigation, except on parts where MDS
is fixed and TAA is fixed/mitigated and there are no unprivileged MMIO
mappings (in which case, there is believed to be no way for an attacker to
obtain stale data).

The `unpriv-mmio=` boolean indicates whether the system has (or will have)
less than fully privileged domains granted access to MMIO devices.
By default, this option is disabled. If enabled, Xen will use the `FB_CLEAR`
and/or `SRBDS_CTRL` functionality available in the Intel May 2022 microcode
release to mitigate cross-domain leakage of data via the MMIO Stale Data
vulnerabilities.

On all hardware, the `gds-mit=` option can be used to force or prevent Xen
from mitigating the GDS (Gather Data Sampling) vulnerability. By default, Xen
will mitigate GDS on hardware believed to be vulnerable. On hardware
supporting GDS_CTRL (requires the August 2023 microcode), and where firmware
has elected not to lock the configuration, Xen will use GDS_CTRL to mitigate
GDS. Otherwise, Xen will mitigate by disabling AVX, which blocks the use
of the AVX2 Gather instructions.

On all hardware, the `div-scrub=` option can be used to force or prevent Xen
from mitigating the DIV-leakage vulnerability. By default, Xen will mitigate
DIV-leakage on hardware believed to be vulnerable.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_LOCK`, the `lock-harden=`
boolean can be used to force or prevent Xen from using speculation barriers to
protect lock critical regions. This mitigation won't be engaged by default,
and needs to be explicitly enabled on the command line.

### sync_console
> `= <boolean>`

> Default: `false`

Flag to force synchronous console output. Useful for debugging, but
not suitable for production environments due to incurred overhead.

### tboot (x86)
> `= 0x<phys_addr>`

Specify the physical address of the trusted boot shared page.

### tbuf_size
> `= <integer>`

Specify the per-cpu trace buffer size in pages.

### tdt (x86)
> `= <boolean>`

> Default: `true`

Flag to enable TSC deadline as the APIC timer mode.

### tevt_mask
> `= <integer>`

Specify a mask for Xen event tracing.
This allows Xen tracing to be
enabled at boot. Refer to the xentrace(8) documentation for a list of
valid event mask values. In order to enable tracing, a buffer size (in
pages) must also be specified via the **tbuf_size** parameter.

### tickle_one_idle_cpu
> `= <boolean>`

### timer_slop
> `= <integer>`

### tsc (x86)
> `= unstable | skewed | stable:socket`

### tsx
    = <bool>

    Applicability: x86
    Default: false on parts vulnerable to TAA, true otherwise

Controls for the use of Transactional Synchronization eXtensions.

Several microcode updates are relevant:

 * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
   to date. Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts. The
   errata workaround uses Performance Counter 3, so the user can select
   between working TSX and working perfcounters.

 * November 2019, fixing the TSX Async Abort speculative vulnerability.
   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
   and formally retiring HLE from the architecture. The user can disable TSX
   to mitigate TAA, and elect to hide the HLE/RTM CPUID bits. Also causes
   VERW to once again flush the microarchitectural buffers in case a TAA
   mitigation is wanted along with TSX being enabled.

 * June 2021, removing the workaround for March 2019 on client CPUs and
   formally de-featuring TSX on SKL/KBL/WHL/CFL (Note: SKX still retains the
   March 2019 fix). Introduced the ability to hide the HLE/RTM CPUID bits.
   PCR3 works fine, and TSX is disabled by default, but the user can re-enable
   TSX at their own risk, accepting that the memory order erratum is unfixed.

 * February 2022, removing the VERW flushing workaround from November 2019 on
   client CPUs and formally de-featuring TSX on WHL-R/CFL-R (Note: CLX still
   retains the VERW flushing workaround). TSX defaults to disabled, and is
   locked off when SGX is enabled in the BIOS. When SGX is not enabled, TSX
   can be re-enabled at the user's own risk, as it reintroduces the TSX Async
   Abort speculative vulnerability.

On systems with the ability to configure TSX, this boolean offers system wide
control of whether TSX is enabled or disabled.

When TSX is disabled, transactions unconditionally abort. This is compatible
with the TSX spec, which requires software to have a non-transactional path as
a fallback. The RTM and HLE CPUID bits are hidden from VMs by default, but
can be re-enabled if required. This allows VMs which previously saw RTM/HLE
to be migrated in, although any TSX-enabled software will run with reduced
performance.

 * When TSX is locked off by firmware, `tsx=` is ignored and treated as
   `false`.

 * An explicit `tsx=` choice is honoured, even if it is `true` and would
   result in a vulnerable system.

 * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
   mitigated by disabling TSX, as this is the lowest overhead option.

 * When no explicit `tsx=` option is given, parts susceptible to the memory
   ordering errata default to `true` to enable working TSX. Alternatively,
   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.

   SKX and SKL/KBL/WHL/CFL on pre-June 2021 microcode default to `true`.
   Alternatively, selecting `tsx=0` will disable TSX and restore PCR3 to a
   working state.

   SKL/KBL/WHL/CFL on the June 2021 microcode or later default to `false`.
   Alternatively, selecting `tsx=1` will re-enable TSX at the user's own risk.
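
The selection rules for `tsx=` listed above can be read as a small decision
procedure. The following Python sketch is only an illustrative model of that
prose and is not Xen's implementation. The function name and all parameter
names are hypothetical, and the order in which the two "no explicit choice"
checks are applied is an assumption (the text describes them for different
parts, so the order should not matter in practice).

```python
from typing import Optional


def effective_tsx(explicit: Optional[bool], *,
                  locked_off_by_firmware: bool,
                  vulnerable_to_taa: bool,
                  defeatured_by_microcode: bool) -> bool:
    """Model of the documented `tsx=` rules (illustrative, hypothetical names).

    explicit                -- value given via `tsx=`, or None if not given
    locked_off_by_firmware  -- TSX is locked off (e.g. SGX enabled in the BIOS)
    vulnerable_to_taa       -- part is vulnerable to TSX Async Abort
    defeatured_by_microcode -- SKL/KBL/WHL/CFL on June 2021 or later microcode
    """
    # When TSX is locked off by firmware, `tsx=` is ignored and treated as false.
    if locked_off_by_firmware:
        return False
    # An explicit choice is honoured, even if it yields a vulnerable system.
    if explicit is not None:
        return explicit
    # No explicit choice: TAA-vulnerable parts are mitigated by disabling TSX.
    if vulnerable_to_taa:
        return False
    # No explicit choice: de-featured client parts default to false.
    if defeatured_by_microcode:
        return False
    # Everything else, including errata-affected parts on older microcode.
    return True
```

For example, a part that is vulnerable to TAA ends up with TSX disabled
unless the administrator passes `tsx=1`, in which case the explicit choice
wins as long as firmware has not locked TSX off.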

### ucode
> `= List of [ <integer> | scan=<bool>, nmi=<bool>, allow-same=<bool> ]`

    Applicability: x86
    Default: `nmi`

Controls for CPU microcode loading. For early loading, this parameter can
specify how and where to find the microcode update blob. For late loading,
this parameter specifies if the update happens within a NMI handler.

'integer' specifies the CPU microcode update blob module index. When positive,
this specifies the n-th module (in the GRUB entry, zero based) to be used
for updating CPU microcode. When negative, counting starts at the end of
the modules in the GRUB entry (so with the blob commonly being last,
one could specify `ucode=-1`). Note that the value of zero is not valid
here (entry zero, i.e. the first module, is always the Dom0 kernel
image). Note further that use of this option has an unspecified effect
when used with xen.efi (there the concept of modules doesn't exist, and
the blob gets specified via the `ucode=<filename>` config file/section
entry; see [EFI configuration file description](efi.html)).

'scan' instructs the hypervisor to scan the multiboot images for a cpio
image that contains microcode. Depending on the platform the blob with the
microcode in the cpio name space must be:
  - on Intel: kernel/x86/microcode/GenuineIntel.bin
  - on AMD  : kernel/x86/microcode/AuthenticAMD.bin
When using xen.efi, the `ucode=<filename>` config file setting takes
precedence over `scan`.

'nmi' determines whether late loading is performed in an NMI handler or just
in stop_machine context. In an NMI handler, even NMIs are blocked, which is
considered safer. The default value is `true`.

'allow-same' alters the default acceptance policy for new microcode to permit
trying to reload the same version.
Many CPUs will actually reload microcode
of the same version, and this allows for easy testing of the late microcode
loading path.

### unrestricted_guest (Intel)
> `= <boolean>`

### vcpu_migration_delay
> `= <integer>`

> Default: `0`

Specify a delay, in microseconds, between migrations of a VCPU between
PCPUs when using the credit1 scheduler. This prevents rapid fluttering
of a VCPU between CPUs, and reduces the implicit overheads such as
cache-warming. 1ms (1000) has been measured as a good value.

### vesa-ram
> `= <integer>`

> Default: `0`

This allows overriding the amount of video RAM, in MiB, determined to be
present.

### vga
> `= ( ask | current | text-80x<rows> | gfx-<width>x<height>x<depth> | mode-<mode> )[,keep]`

`ask` causes Xen to display a menu of available modes and request the
user to choose one of them.

`current` causes Xen to use the graphics adapter in its current state,
without further setup.

`text-80x<rows>` instructs Xen to set up text mode. Valid values for
`<rows>` are `25, 28, 30, 34, 43, 50, 80`

`gfx-<width>x<height>x<depth>` instructs Xen to set up graphics mode
with the specified width, height and depth.

`mode-<mode>` instructs Xen to use a specific mode, as shown with the
`ask` option. (N.B. menu modes are displayed in hex, so `<mode>`
should be a hexadecimal number)

The optional `keep` parameter causes Xen to continue using the vga
console even after dom0 has been started. The default behaviour is to
relinquish control to dom0.

### viridian-spinlock-retry-count (x86)
> `= <integer>`

> Default: `2047`

Specify the maximum number of retries before an enlightened Windows
guest will notify Xen that it has failed to acquire a spinlock.

### viridian-version (x86)
> `= [<major>],[<minor>],[<build>]`

> Default: `6,0,0x1772`

`<major>`, `<minor>` and `<build>` must be integers. The values will be
encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.

### vm-notify-window (Intel)
> `= <integer>`

> Default: `0`

Specify the value of the VM Notify window used to detect locked VMs. Set to -1
to disable the feature. Value is in units of crystal clock cycles.

Note the hardware might add a threshold to the provided value in order to make
it safe, and hence using 0 is fine.

### vpid (Intel)
> `= <boolean>`

> Default: `true`

Use Virtual Processor ID support if available. This prevents the need for TLB
flushes on VM entry and exit, increasing performance.

### vpmu (x86)
    = List of [ <bool>, bts, ipc, arch, rtm-abort=<bool> ]

    Applicability: x86. Default: false

Controls for Performance Monitoring Unit virtualisation.

Performance monitoring facilities tend to be very hardware specific, and
provide access to a wealth of low level processor information.

* An overall boolean can be used to enable or disable vPMU support. vPMU is
  disabled by default.

  When enabled, guests have full access to all performance counter settings,
  including model specific functionality. This is a superset of the
  functionality offered by `ipc` and/or `arch`, but a subset of the
  functionality offered by `bts`.

  Xen's watchdog functionality is implemented using performance counters.
  As a result, use of the **watchdog** option will override and disable
  vPMU.

* The `bts` option enables performance monitoring, and permits additional
  access to the Branch Trace Store controls. BTS is an Intel feature where
  the processor can write data into a buffer whenever a branch occurs.
  However, as this feature isn't virtualised, a misconfiguration by the
  guest can lock the entire system up.

* The `ipc` option allows access to the most minimal set of counters
  possible: instructions, cycles, and reference cycles. These can be used
  to calculate instructions per cycle (IPC).

* The `arch` option allows access to the pre-defined architectural events.

* The `rtm-abort` boolean has been superseded. Use `tsx=0` instead.

*Warning:*
As the virtualisation is not 100% safe, don't use the vpmu flag on
production systems (see https://xenbits.xen.org/xsa/advisory-163.html)!

### vwfi (arm)
> `= trap | native`

> Default: `trap`

WFI is the ARM instruction to "wait for interrupt". WFE is similar and
means "wait for event". This option, which is ARM specific, changes the
way guest WFI and WFE are implemented in Xen. By default, Xen traps both
instructions. In the case of WFI, Xen blocks the guest vcpu; in the case
of WFE, Xen yields the guest vcpu. When setting vwfi to `native`, Xen
doesn't trap either instruction, running them in guest context. Setting
vwfi to `native` reduces irq latency significantly. It can also lead to
suboptimal scheduling decisions, but only when the system is
oversubscribed (i.e., in total there are more vCPUs than pCPUs).

### watchdog (x86)
> `= force | <boolean>`

> Default: `false`

Run an NMI watchdog on each processor. If a processor is stuck for
longer than the **watchdog_timeout**, a panic occurs. When `force` is
specified, in addition to running an NMI watchdog on each processor,
unknown NMIs will still be processed.

### watchdog_timeout (x86)
> `= <integer>`

> Default: `5`

Set the NMI watchdog timeout in seconds. Specifying `0` will turn off
the watchdog.

### x2apic (x86)
> `= <boolean>`

> Default: `true`

Permit use of x2apic setup for SMP environments.

### x2apic-mode (x86)
> `= physical | cluster | mixed`

> Default: `physical` if **FADT** mandates physical mode, otherwise set at
>          build time by CONFIG_X2APIC_{PHYSICAL,LOGICAL,MIXED}.

In the case that x2apic is in use, this option switches between modes to
address APICs in the system as interrupt destinations.

### x2apic_phys (x86)
> `= <boolean>`

> Default: `true` if **FADT** mandates physical mode or if interrupt remapping
>          is not available, `false` otherwise.

In the case that x2apic is in use, this option switches between physical and
clustered mode. The default, given no hint from the **FADT**, is cluster
mode.

**WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
The latter takes precedence if both are set.**

### xenheap_megabytes (arm32)
> `= <size>`

> Default: `0` (1/32 of RAM)

Amount of RAM to set aside for the Xenheap. Must be an integer multiple of 32.

By default, Xen will use 1/32 of the RAM, up to a maximum of 1GB and with a
minimum of 32M, subject to a suitably aligned and sized contiguous
region of memory being available.

### xpti (x86)
> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`

> Default: `false` on hardware known not to be vulnerable to Meltdown (e.g. AMD)
> Default: `true` everywhere else

Override default selection of whether to isolate 64-bit PV guest page
tables.

`true` activates page table isolation for all domains, even on hardware not
vulnerable to Meltdown.

`false` deactivates page table isolation on all systems for all domains.

`default` sets the default behaviour.
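
As an illustration of how the `xpti=` list syntax composes, the following
Python sketch parses a value into per-domain settings. It is a model of the
syntax line above and not Xen's parser. The names `parse_bool` and
`parse_xpti` are hypothetical, `None` stands for "use the hardware-dependent
default", and the assumption that later list items override earlier ones is
not stated in this section.

```python
from typing import Dict, Optional

# Boolean spellings as listed in the "Types of parameter" section.
TRUE_WORDS = {"yes", "on", "true", "enable", "1"}
FALSE_WORDS = {"no", "off", "false", "disable", "0"}


def parse_bool(text: str) -> bool:
    """Convert one documented boolean spelling (case sensitive) to a bool."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    raise ValueError(f"not a documented boolean value: {text!r}")


def parse_xpti(value: str) -> Dict[str, Optional[bool]]:
    """Parse the right-hand side of `xpti=` into dom0/domu settings.

    Assumption: items are applied left to right, later items overriding
    earlier ones. None means "leave the hardware-dependent default".
    """
    state: Dict[str, Optional[bool]] = {"dom0": None, "domu": None}
    for item in value.split(","):
        key, sep, rest = item.partition("=")
        if item == "default":
            state = {"dom0": None, "domu": None}
        elif sep and key in ("dom0", "domu"):
            state[key] = parse_bool(rest)
        else:
            # A plain boolean applies to all domains.
            flag = parse_bool(item)
            state = {"dom0": flag, "domu": flag}
    return state
```

Under these assumptions, `parse_xpti("dom0=false,domu=true")` leaves dom0
without isolation while isolating guest domains, and `parse_xpti("default")`
leaves both entries as `None`.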

With `dom0` and `domu` it is possible to control page table isolation
for dom0 or guest domains only.

### xsave (x86)
> `= <boolean>`

> Default: `true`

Permit use of the `xsave/xrstor` instructions.

### xsm
> `= dummy | flask | silo`

> Default: selectable via Kconfig. Depends on enabled XSM modules.

Specify which XSM module should be enabled. This option is only available if
the hypervisor was compiled with `CONFIG_XSM` enabled.

* `dummy`: this is the default choice. Basic restriction for common deployment
  (the dummy module) will be applied. It's also used when XSM is compiled out.
* `flask`: this is policy-based access control. To choose this, the
  separate option in Kconfig must also be enabled.
* `silo`: this will deny any unmediated communication channels between
  unprivileged VMs. To choose this, the separate option in Kconfig must also
  be enabled.