# Xen Hypervisor Command Line Options

This document covers the command line options which the Xen
Hypervisor accepts.

## Types of parameter

Most parameters take the form `option=value`. Different options on
the command line should be space delimited. All options are case
sensitive, as are all values unless explicitly noted.

### Boolean (`<boolean>`)

All boolean options may be explicitly enabled using a `value` of
> `yes`, `on`, `true`, `enable` or `1`

They may be explicitly disabled using a `value` of
> `no`, `off`, `false`, `disable` or `0`

In addition, a boolean option may be enabled by simply stating its
name, and may be disabled by prefixing its name with `no-`.

#### Examples

Enable noreboot mode
> `noreboot=true`

Disable x2apic support (if present)
> `x2apic=off`

Enable synchronous console mode
> `sync_console`

Explicitly specifying any value other than those listed above is
undefined, as is stacking a `no-` prefix with an explicit value.

### Integer (`<integer>`)

An integer parameter will default to decimal and may be prefixed with
a `-` for negative numbers. Alternatively, a hexadecimal number may be
used by prefixing the number with `0x`, or an octal number may be used
if a leading `0` is present.

Providing a string which does not validly convert to an integer is
undefined.

### Size (`<size>`)

A size parameter may be any integer, with a single size suffix

* `T` or `t`: TiB (2^40)
* `G` or `g`: GiB (2^30)
* `M` or `m`: MiB (2^20)
* `K` or `k`: KiB (2^10)
* `B` or `b`: Bytes

Without a size suffix, the default unit is KiB. Providing a suffix
other than those listed above is undefined.

### String

Many parameters are more complicated and require more intricate
configuration. The detailed description of each individual parameter
specifies which values are valid.

### List

Some options take a comma separated list of values.
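
As an illustration of the Boolean, Integer and Size rules described above, the following is a minimal Python sketch. It is not Xen's own parser: the function names `parse_boolean`, `parse_integer` and `parse_size` are hypothetical, and where the text leaves behaviour undefined the sketch simply raises an error.

```python
# Illustrative parsers for the <boolean>, <integer> and <size> value types.
# All names here are hypothetical; this is not Xen's implementation.

TRUE_VALUES = {"yes", "on", "true", "enable", "1"}
FALSE_VALUES = {"no", "off", "false", "disable", "0"}

# Multipliers for the documented size suffixes (matched case-insensitively).
SIZE_SUFFIXES = {"t": 2**40, "g": 2**30, "m": 2**20, "k": 2**10, "b": 1}


def parse_boolean(value: str) -> bool:
    """Map one of the documented boolean spellings to True or False."""
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # Other values are undefined in the text; this sketch rejects them.
    raise ValueError(f"not a documented boolean value: {value!r}")


def parse_integer(value: str) -> int:
    """Parse decimal, 0x-prefixed hexadecimal, or 0-prefixed octal."""
    sign = 1
    if value.startswith("-"):
        sign, value = -1, value[1:]
    if value.startswith("0x"):
        return sign * int(value[2:], 16)
    if len(value) > 1 and value.startswith("0"):
        return sign * int(value[1:], 8)
    return sign * int(value, 10)


def parse_size(value: str) -> int:
    """Return a size in bytes; a missing suffix means KiB."""
    # Note: a hexadecimal value ending in 'b' is read as having the
    # Bytes suffix in this sketch.
    if value and value[-1].lower() in SIZE_SUFFIXES:
        return parse_integer(value[:-1]) * SIZE_SUFFIXES[value[-1].lower()]
    return parse_integer(value) * SIZE_SUFFIXES["k"]
```

Under these assumptions, `parse_size("128M")` yields the byte count for 128 MiB, while `parse_size("16")` is read as 16 KiB because no suffix is given.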

### Combination

Some parameters act as combinations of the above, most commonly a mix
of Boolean and String. These are noted in the relevant sections.

## Parameter details

### acpi
> `= force | ht | noirq | <boolean> | verbose`

**String**, or **Boolean** to disable.

By default, Xen will scan the DMI data and blacklist certain systems
which are known to have broken ACPI setups. Providing `acpi=force`
will cause Xen to ignore the blacklist and attempt to use all ACPI
features.

Using `acpi=ht` causes Xen to parse the ACPI tables enough to
enumerate all CPUs, but not to use other ACPI features. This is not
common, and only has an effect if your system is blacklisted.

The `acpi=noirq` option causes Xen to not parse the ACPI MADT table
looking for IO-APIC entries. This is also not common, and any system
which requires this option to function should be blacklisted.
Additionally, this will not prevent Xen from finding IO-APIC entries
from the MP tables.

Further, any of the boolean false options can be used to disable ACPI
usage entirely.

Because responsibility for ACPI processing is shared between Xen and
the domain 0 kernel, this option is automatically propagated to the
domain 0 command line.

Finally, `acpi=verbose` will enable per-processor information logging
which may otherwise be too noisy, in particular on large systems.

### acpi_apic_instance
> `= <integer>`

Specify which ACPI MADT table to parse for APIC information, if more
than one is present.

### acpi_pstate_strict (x86)
> `= <boolean>`

> Default: `false`

Enforce checking that P-state transitions by the ACPI cpufreq driver
actually result in the nominated frequency being established. A warning
message will be logged if that isn't the case.

### acpi_skip_timer_override (x86)
> `= <boolean>`

Instruct Xen to ignore timer-interrupt override.

### acpi_sleep (x86)
> `= s3_bios | s3_mode`

`s3_bios` instructs Xen to invoke video BIOS initialization during S3
resume.

`s3_mode` instructs Xen to set up the boot time (option `vga=`) video
mode during S3 resume.

### allow_unsafe (x86)
> `= <boolean>`

> Default: `false`

Force boot on potentially unsafe systems. By default Xen will refuse
to boot on systems with the following errata:

* AMD Erratum 121. Processors with this erratum are subject to a guest
  triggerable Denial of Service. Override only if you trust all of
  your PV guests.

### altp2m (Intel)
> `= <boolean>`

> Default: `false`

Permit multiple copies of host p2m.

### apic (x86)
> `= bigsmp | default`

Override Xen's logic for choosing the APIC driver. By default, if
there are more than 8 CPUs, Xen will switch to `bigsmp` over
`default`.

### apicv (Intel)
> `= <boolean>`

> Default: `true`

Permit Xen to use APIC Virtualisation Extensions. This is an optimisation
available as part of VT-x, and allows hardware to take care of the guest's APIC
handling, rather than requiring emulation in Xen.

### apic_verbosity (x86)
> `= verbose | debug`

Increase the verbosity of the APIC code from the default value.

### arat (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use "Always Running APIC Timer" support on compatible hardware
in combination with cpuidle. This option is only expected to be useful for
developers wishing Xen to fall back to older timing methods on newer hardware.

### argo
    = List of [ <bool>, mac-permissive=<bool> ]

Controls for the Argo hypervisor-mediated interdomain communication service.

The functionality that this option controls is only available when Xen has been
compiled with the build setting for Argo enabled in the build configuration.

Argo is an interdomain communication mechanism, where Xen acts as the central
point of authority. Guests may register memory rings to receive messages,
query the status of other domains, and send messages by hypercall, all subject
to appropriate auditing by Xen. Argo is disabled by default.

*   The `mac-permissive` boolean controls whether wildcard receive rings may be
    registered (`mac-permissive=1`) or may not be registered
    (`mac-permissive=0`).

    This option is disabled by default, to protect domains from a DoS by a
    buggy or malicious other domain spamming the ring.

### asid (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use Address Space Identifiers. This is an optimisation which
tags the TLB entries with an ID per vcpu. This allows for guest TLB flushes
to be performed without the overhead of a complete TLB flush.

### async-show-all (x86)
> `= <boolean>`

> Default: `false`

Forces all CPUs' full state to be logged upon certain fatal asynchronous
exceptions (watchdog NMIs and unexpected MCEs).

### ats (x86)
> `= <boolean>`

> Default: `false`

Permits Xen to set up and use PCI Address Translation Services. This is a
performance optimisation for PCI Passthrough.

**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
loops for Queued Invalidation completions.**

### availmem
> `= <size>`

> Default: `0` (no limit)

Specify a maximum amount of available memory, to which Xen will clamp
the e820 table.

### badpage
> `= List of [ <integer> | <integer>-<integer> ]`

Specify that certain pages, or certain ranges of pages, contain bad
bytes and should not be used. For example, if your memory tester says
that byte `0x12345678` is bad, you would place `badpage=0x12345` on
Xen's command line.
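
The `badpage` example above maps a byte address to a page number by dropping the low address bits. The following Python sketch shows that arithmetic; it assumes 4 KiB pages (a shift of 12 bits, which is what the example implies), and the helper name `badpage_option` is hypothetical.

```python
# Hypothetical helper: derive a badpage= argument from faulty byte addresses.
# PAGE_SHIFT = 12 assumes 4 KiB pages, consistent with the example
# where byte 0x12345678 lies in page 0x12345.

PAGE_SHIFT = 12


def badpage_option(bad_byte_addresses):
    """Return a badpage= string listing each affected page once."""
    pages = sorted({addr >> PAGE_SHIFT for addr in bad_byte_addresses})
    return "badpage=" + ",".join(hex(page) for page in pages)


print(badpage_option([0x12345678]))
```

Several bad bytes that fall within the same page collapse into a single entry, since the option works at page granularity.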

### bootscrub
> `= idle | <boolean>`

> Default: `idle`

Scrub free RAM during boot. This is a safety feature to prevent
accidentally leaking sensitive VM data into other VMs if Xen crashes
and reboots.

In `idle` mode, RAM is scrubbed in the background on all CPUs during the
idle loop, with a guarantee that memory allocations always provide scrubbed
pages. This option reduces boot time on machines with a large amount of RAM
while still providing security benefits.

### bootscrub_chunk
> `= <size>`

> Default: `128M`

Maximum size of the RAM chunks to be scrubbed whilst holding the page heap lock
and not running softirqs. Reduce this if softirqs are not being run frequently
enough. Setting this to a high value may cause boot failure, particularly if
the NMI watchdog is also enabled.

### buddy-alloc-size (arm64)
> `= <size>`

> Default: `64M`

Amount of memory reserved for the buddy allocator when the colored allocator is
active. This option is available only when `CONFIG_LLC_COLORING` is enabled.
The colored allocator is meant as an alternative to the buddy allocator,
because its allocation policy is by definition incompatible with the generic
one. Since the Xen heap system is not colored yet, we need to support the
coexistence of the two allocators for now. This parameter, which is optional
and intended for experts only, sets the amount of memory reserved for the
buddy allocator.

### cet
    = List of [ <bool>, shstk=<bool>, ibt=<bool> ]

    Applicability: x86

Controls for the use of Control-flow Enforcement Technology. CET is a group
of hardware features designed to combat Return-oriented Programming (ROP, also
call/jmp COP/JOP) attacks.

CET is incompatible with 32bit PV guests. If any CET sub-options are active,
they will override the `pv=32` boolean to `false`.
Backwards compatibility can be maintained with the pv-shim mechanism.

*   An unqualified boolean is a shorthand for setting all suboptions at once.

*   The `shstk=` boolean controls whether Xen uses Shadow Stacks for its own
    protection.

    The option is available when `CONFIG_XEN_SHSTK` is compiled in, and
    generally defaults to `true` on hardware supporting CET-SS. Specifying
    `cet=no-shstk` will cause Xen not to use Shadow Stacks even when support
    is available in hardware.

    Some hardware suffers from an issue known as Supervisor Shadow Stack
    Fracturing. On such hardware, Xen will default to not using Shadow Stacks
    when virtualised. Specifying `cet=shstk` will override this heuristic and
    enable Shadow Stacks unilaterally.

*   The `ibt=` boolean controls whether Xen uses Indirect Branch Tracking for
    its own protection.

    The option is available when `CONFIG_XEN_IBT` is compiled in, and defaults
    to `true` on hardware supporting CET-IBT. Specifying `cet=no-ibt` will
    cause Xen not to use Indirect Branch Tracking even when support is
    available in hardware.

### clocksource (x86)
> `= pit | hpet | acpi | tsc`

If set, override Xen's default choice for the platform timer.
TSC is only used as the platform timer when explicitly selected. This is because
TSC can only be safely used if CPU hotplug isn't performed on the system. On
some platforms, the `maxcpus` option may need to be used to further adjust
the number of allowed CPUs. When running on platforms that can guarantee a
monotonic TSC across sockets you may want to adjust the `tsc` command line
parameter to `stable:socket`.

### cmci-threshold (Intel)
> `= <integer>`

> Default: `2`

Specify the event count threshold for raising Corrected Machine Check
Interrupts. Specifying zero disables CMCI handling.

### cmos-rtc-probe (x86)
> `= <boolean>`

> Default: `false`

Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
ACPI indicating none to be there.

### com1 (x86)
### com2 (x86)
> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`

Both options `com1` and `com2` follow the same format.

* `<baud>` may be either an integer baud rate, or the string `auto` if
  the bootloader or other earlier firmware has already set it up.
* Optionally, the base baud rate (usually the highest baud rate the
  device can communicate at) can be specified.
* `DPS` represents the number of data bits, the parity, and the number
  of stop bits.
  * `D` is an integer between 5 and 8 for the number of data bits.
  * `P` is a single character representing the type of parity:
    * `n` No
    * `o` Odd
    * `e` Even
    * `m` Mark
    * `s` Space
  * `S` is an integer 1 or 2 for the number of stop bits.
* `<io-base>` is an integer which specifies the IO base port for UART
  registers.
* `<irq>` is the IRQ number to use, or `0` to use the UART in poll
  mode only, or `msi` to set up a Message Signaled Interrupt.
* `<port-bdf>` is the PCI location of the UART, in
  `<bus>:<device>.<function>` notation.
* `<bridge-bdf>` is the PCI bridge behind which is the UART, in
  `<bus>:<device>.<function>` notation.
* `pci` indicates that Xen should scan the PCI bus for the UART,
  avoiding Intel AMT devices.
* `amt` indicates that Xen should scan the PCI bus for the UART,
  including Intel AMT devices if present.

A typical setup for most situations might be `com1=115200,8n1`

In addition to the above positional specification for UART parameters,
name=value pair specifications are also supported. This is used to add
flexibility for UART devices which require additional UART parameter
configurations.

The comma separation still delineates positional parameters. Hence,
unless a parameter is explicitly specified with the name=value form, it
will be considered a positional parameter.

The syntax consists of
`com1=(comma-separated positional parameters),(comma-separated name-value pairs)`

The accepted name keywords for name=value pairs are:

* `baud` - accepts an integer baud rate (e.g. 115200) or `auto`
* `bridge` - Similar to bridge-bdf in positional parameters.
  Used to determine the PCI bridge to access the UART device.
  Notation is xx:xx.x `<bus>:<device>.<function>`
* `clock-hz` - accepts large integers to set up UART clock frequencies.
  Do note - these values are multiplied by 16.
* `data-bits` - integer between 5 and 8
* `dev` - accepted values are `pci` or `amt`. This option
  specifies that the serial device is PCI-based. `io-base`
  cannot be specified when `dev=pci` or `dev=amt` is used.
* `io-base` - accepts an integer which specifies the IO base port for UART registers
* `irq` - IRQ number to use
* `parity` - accepted values are the same as for positional parameters
* `port` - Used to specify which port the PCI serial device is located on.
  Notation is xx:xx.x `<bus>:<device>.<function>`
* `reg-shift` - register shifts required to set UART registers
* `reg-width` - register width required to set UART registers
  (only accepts 1 and 4)
* `stop-bits` - only accepts 1 or 2 for the number of stop bits

The following are examples of correct specifications:

    com1=115200,8n1,0x3f8,4
    com1=115200,8n1,0x3f8,4,reg-width=4,reg-shift=2
    com1=baud=115200,parity=n,stop-bits=1,io-base=0x3f8,reg-width=4

### conring_size
> `= <size>`

> Default: `conring_size=16k`

Specify the size of the console ring buffer.
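
Returning to the `com1`/`com2` syntax described earlier, the split between positional fields and name=value pairs can be sketched in Python as follows. This is only an illustration and not Xen's parser: the function names and the slot labels in `POSITIONAL_NAMES` are hypothetical, and the sketch assumes that a positional field's meaning is given by its comma position and that name=value pairs follow the positional fields, as in the examples above.

```python
# Hypothetical sketch of splitting a com1=/com2= value into positional
# fields and name=value pairs. Slot labels are illustrative only.

POSITIONAL_NAMES = ["baud", "dps", "io-base", "irq", "port-bdf", "bridge-bdf"]
PARITY_TYPES = {"n", "o", "e", "m", "s"}


def split_com_option(value):
    """Return (positional, named) dictionaries for a com1=/com2= value."""
    positional, named = {}, {}
    for index, field in enumerate(value.split(",")):
        if "=" in field:
            # Explicit name=value pair.
            name, _, setting = field.partition("=")
            named[name] = setting
        elif field:
            # Anything else is positional; empty fields are skipped.
            positional[POSITIONAL_NAMES[index]] = field
    return positional, named


def parse_dps(dps):
    """Split a DPS string such as '8n1' into (data bits, parity, stop bits)."""
    if len(dps) != 3 or not (dps[0].isdigit() and dps[2].isdigit()):
        raise ValueError(f"malformed DPS field: {dps!r}")
    data_bits, parity, stop_bits = int(dps[0]), dps[1], int(dps[2])
    if not 5 <= data_bits <= 8:
        raise ValueError("data bits must be between 5 and 8")
    if parity not in PARITY_TYPES:
        raise ValueError("parity must be one of n, o, e, m, s")
    if stop_bits not in (1, 2):
        raise ValueError("stop bits must be 1 or 2")
    return data_bits, parity, stop_bits


positional, named = split_com_option("115200,8n1,0x3f8,4,reg-width=4,reg-shift=2")
print(positional, named, parse_dps(positional["dps"]))
```

For the second example specification above, the first four fields land in the positional dictionary and `reg-width` and `reg-shift` land in the named one.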

### console
> `= List of [ vga | com1[H,L] | com2[H,L] | pv | dbgp | ehci | xhci | none ]`

> Default: `console=com1,vga`

Specify which console(s) Xen should use.

`vga` indicates that Xen should try to use the VGA graphics adapter.

`com1` and `com2` indicate that Xen should use serial ports 1 and 2
respectively. Optionally, these arguments may be followed by an `H` or
`L`. `H` indicates that transmitted characters will have their MSB
set, while received characters must have their MSB set. `L` indicates
the converse; transmitted and received characters will have their MSB
cleared. This allows a single port to be shared by two subsystems
(e.g. console and debugger).

`pv` indicates that Xen should use Xen's PV console. This option is
only available when used together with `pv-in-pvh`.

`dbgp` or `ehci` indicates that Xen should use a USB2 debug port.

`xhci` indicates that Xen should use a USB3 debug port.

`none` indicates that Xen should not use a console. This option only
makes sense on its own.

### console_timestamps
> `= none | date | datems | boot | raw`

> Default: `none`

> Can be modified at runtime

Specify which timestamp format Xen should use for each console line.

* `none`: No timestamps
* `date`: Date and time information
  * `[YYYY-MM-DD HH:MM:SS]`
* `datems`: Date and time, with milliseconds
  * `[YYYY-MM-DD HH:MM:SS.mmm]`
* `boot`: Seconds and microseconds since boot
  * `[SSSSSS.uuuuuu]`
* `raw`: Raw platform ticks, architecture and implementation dependent
  * `[XXXXXXXXXXXXXXXX]`

For compatibility with the older boolean parameter, specifying
`console_timestamps` alone will enable the `date` option.

### console_to_ring
> `= <boolean>`

> Default: `false`

Flag to indicate whether all guest console output should be copied
into the console ring buffer.

### conswitch
> `= <switch char>[x]`

> Default: `conswitch=a`

> Can be modified at runtime

Specify which character should be used to switch serial input between
Xen and dom0. The required sequence is `CTRL-<switch char>` three
times.

The optional trailing `x` indicates that Xen should not automatically
switch the console input to dom0 during boot. Any other value,
including omission, causes Xen to automatically switch to the dom0
console during dom0 boot. Use `conswitch=ax` to keep the default switch
character, but have Xen keep the console.

### core_parking
> `= power | performance`

> Default: `power`

### cpu_type (x86)
> `= arch_perfmon`

If set, force use of the performance counters for oprofile, rather than detecting
available support.

### cpufreq
> `= none | {{ <boolean> | xen } { [:[powersave|performance|ondemand|userspace][,[<maxfreq>]][,[<minfreq>]]] } [,verbose]} | dom0-kernel | hwp[:[<hdc>][,verbose]]`

> Default: `xen`

Indicate where the responsibility for driving power states lies. Note that the
choice of `dom0-kernel` is deprecated and not supported by all Dom0 kernels.

* The default governor policy is ondemand.
* `<maxfreq>` and `<minfreq>` are integers which represent the max and min
  processor frequencies respectively.
* The `verbose` option can be included as a string or also as `verbose=<integer>`
  for `xen`. It is a boolean for `hwp`.
* `hwp` selects Hardware-Controlled Performance States (HWP) on supported Intel
  hardware. HWP is a Skylake+ feature which provides better CPU power
  management. The default is disabled. If `hwp` is selected, but hardware
  support is not available, Xen will fall back to `cpufreq=xen`.
* `<hdc>` is a boolean to enable Hardware Duty Cycling (HDC). HDC enables the
  processor to autonomously force physical package components into idle state.
  The default is enabled, but the option only applies when `hwp` is enabled.

There is also support for `;`-separated fallback options:
`cpufreq=hwp;xen,verbose`. This first tries `hwp` and falls back to `xen` if
unavailable. Note: The `verbose` suboption is handled globally. Setting it
for either the primary or fallback option applies to both irrespective of where
it is specified.

Note: grub2 requires `;` to be escaped or quoted, so `"cpufreq=hwp;xen"` should be
specified within double quotes inside grub.cfg. Refer to the grub2
documentation for more information.

### cpuid (x86)
> `= List of comma separated booleans`

This option allows for fine tuning of the facilities Xen will use, after
accounting for hardware capabilities as enumerated via CPUID.

Unless otherwise noted, options only have any effect in their negative form,
to hide the named feature(s). Ignoring a feature using this mechanism will
cause Xen not to use the feature, nor offer it as usable to guests.

Currently accepted:

The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
applicable. They can all be ignored.

`rdrand` and `rdseed` have multiple interactions.

*   For Special Register Buffer Data Sampling (SRBDS, XSA-320, CVE-2020-0543),
    RDRAND and RDSEED can be ignored.

    Due to the absence of microcode to address SRBDS on IvyBridge client
    hardware, the RDRAND feature is hidden by default for guests, unless
    `rdrand` is used in its positive form. Irrespective of the setting here,
    VMs can use RDRAND if explicitly enabled in the guest config file, and VMs
    already using RDRAND can migrate in.

*   The RDRAND feature is disabled by default on AMD Fam15/16 systems, due to
    possible malfunctions after ACPI S3 suspend/resume.
    `rdrand` may be used
    in its positive form to override Xen's default behaviour on these systems,
    and make the feature fully usable.

### cpuid_mask_cpu
> `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`

> Applicability: AMD

If none of the other **cpuid_mask_\*** options are given, Xen has a set of
pre-configured masks to make the current processor appear to be the
specified family/revision.

See below for general information on masking.

**Warning: This option is not fully effective on Family 15h processors or
later.**

### cpuid_mask_ecx
### cpuid_mask_edx
### cpuid_mask_ext_ecx
### cpuid_mask_ext_edx
### cpuid_mask_l7s0_eax
### cpuid_mask_l7s0_ebx
### cpuid_mask_thermal_ecx
### cpuid_mask_xsave_eax
> `= <integer>`

> Applicability: x86. Default: `~0` (all bits set)

The availability of these options is model specific. Some processors don't
support any of them, and no processor supports all of them. Xen will ignore
options on processors which are lacking support.

These options can be used to alter the features visible via the `CPUID`
instruction. Settings applied here take effect globally, including for Xen
and all guests.

Note: Since Xen 4.7, it is no longer necessary to mask a host to create
migration safety in heterogeneous scenarios. All necessary CPUID settings
should be provided in the VM configuration file. Furthermore, it is
recommended not to use this option, as doing so causes an unnecessary
reduction of features at Xen's disposal to manage guests.

### cpuidle (x86)
> `= <boolean>`

### cpuinfo (x86)
> `= <boolean>`

### crash-debug-debugkey
### crash-debug-hwdom
### crash-debug-kexeccmd
### crash-debug-panic
### crash-debug-watchdog
> `= <string>`

> Can be modified at runtime

Specify debug-key actions in cases of crashes.
Each of the parameters applies
to a different crash reason. The `<string>` is a sequence of debug key
characters, with `+` having the special meaning of a 10 millisecond pause.

`crash-debug-debugkey` will be used for crashes induced by the `C` debug
key (i.e. manually induced crash).

`crash-debug-hwdom` denotes a crash of dom0.

`crash-debug-kexeccmd` is an explicit request of dom0 to continue with the
kdump kernel via kexec. Only available on hypervisors built with `CONFIG_KEXEC`.

`crash-debug-panic` is a crash of the hypervisor.

`crash-debug-watchdog` is a crash due to the watchdog timer expiring.

It should be noted that dumping diagnosis data to the console can fail in
multiple ways (missing data, hanging system, ...) depending on the reason
of the crash, which might have left the hypervisor in a bad state. If
a debug-key action leads to another crash, recursion is avoided, so no
additional debug-key actions are performed in this case. A crash in the
early boot phase will not result in any debug-key action, as the system
might not yet be in a state where the handlers can work.

For example, `crash-debug-watchdog=0+0r` would dump dom0 state twice with 10
milliseconds between the two state dumps, followed by the run queues of the
hypervisor, if the system crashes due to a watchdog timeout.

Depending on the reason of the system crash it might happen that triggering
some debug key action will result in a hang instead of dumping data and then
doing a reboot or crash dump.

### crashinfo_maxaddr
> `= <size>`

> Default: `4G`

Specify the maximum address to allocate certain structures, if used in
combination with the **low_crashinfo** command line option.
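
To illustrate the `<string>` format accepted by the `crash-debug-*` options described earlier (a sequence of debug-key characters in which `+` means a 10 millisecond pause), the following Python sketch expands such a string into an ordered list of actions. The function name and the action labels are hypothetical; the sketch only models the format and does not trigger any debug keys.

```python
# Hypothetical expansion of a crash-debug-* string into ordered actions.
# '+' is a 10 millisecond pause; any other character is a debug key.

PAUSE_MS = 10


def crash_debug_actions(spec):
    """Return a list of ("debug_key", char) and ("pause_ms", 10) tuples."""
    actions = []
    for char in spec:
        if char == "+":
            actions.append(("pause_ms", PAUSE_MS))
        else:
            actions.append(("debug_key", char))
    return actions


for action in crash_debug_actions("0+0r"):
    print(action)
```

For the string `0+0r` this lists debug key `0`, a 10 ms pause, debug key `0` again, and then debug key `r`, matching the order in the watchdog example.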

### crashkernel
> `= <ramsize-range>:<size>[,...][{@,<}<offset>]`
> `= <size>[{@,<}<offset>]`
> `= <size>,below=<offset>`

Specify sizes and optionally placement of the crash kernel reservation
area. The `<ramsize-range>:<size>` pairs indicate how much memory to
set aside for a crash kernel (`<size>`) for a given range of installed
RAM (`<ramsize-range>`). Each `<ramsize-range>` is of the form
`<start>-[<end>]`.

A trailing `@<offset>` specifies the exact address this area should be
placed at, whereas `<` in place of `@` just specifies an upper bound of
the address range the area should fall into.

`<` and `below` are synonymous, the latter being useful for grub2 systems
which would otherwise require escaping of the `<` option.

### credit2_balance_over
> `= <integer>`

### credit2_balance_under
> `= <integer>`

### credit2_cap_period_ms
> `= <integer>`

> Default: `10`

Domains subject to a cap receive a replenishment of their runtime budget
once every cap period interval. Default is 10 ms. The amount of budget
they receive depends on their cap. For instance, a domain with a 50% cap
will receive 50% of 10 ms, so 5 ms.

### credit2_load_precision_shift
> `= <integer>`

> Default: `18`

Specify the number of bits to use for the fractional part of the
values involved in Credit2 load tracking and load balancing math.

### credit2_load_window_shift
> `= <integer>`

> Default: `30`

Specify the number of bits to use to represent the length of the
window (in nanoseconds) we use for load tracking inside Credit2.
This means that, with the default value (30), we use a window that is
2^30 nsec (approximately 1 sec) long.

Load tracking is done by means of a variation of exponentially
weighted moving average (EWMA). The window length defined here
determines for how long previous load history is taken into
account.
In fact, after a full window has passed,
all previous history is discarded entirely.

A short window will make the load balancer quick at reacting
to load changes, but also short-sighted about previous history
(and hence, e.g., long term load trends). A long window will
make the load balancer thoughtful of previous history (and
hence capable of capturing, e.g., long term load trends), but
also slow in responding to load changes.

The default value of `1 sec` is rather long.

### credit2_runqueue
> `= cpu | core | socket | node | all`

> Default: `socket`

Specify how host CPUs are arranged in runqueues. Runqueues are kept
balanced with respect to the load generated by the vCPUs running on
them. Smaller runqueues (as with `core`) mean more accurate load
balancing (for instance, it will deal better with hyperthreading),
but also more overhead.

Available alternatives, with their meaning, are:

* `cpu`: one runqueue per logical pCPU of the host;
* `core`: one runqueue per physical core of the host;
* `socket`: one runqueue per physical socket (which often,
  but not always, matches a NUMA node) of the host;
* `node`: one runqueue per NUMA node of the host;
* `all`: just one runqueue shared by all the logical pCPUs of
  the host

Regardless of the above choice, Xen attempts to respect the
`sched_credit2_max_cpus_runqueue` limit, which may mean more than one runqueue
for the `all` value. If that isn't intended, raise
the `sched_credit2_max_cpus_runqueue` value.

### dbgp
> `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
> `= xhci[ <integer> | @pci<bus>:<slot>.<func> ][,share=<bool>|hwdom]`

Specify the USB controller to use, either by instance number (when going
over the PCI buses sequentially) or by PCI device (must be on segment 0).

Use `ehci` for the EHCI debug port, use `xhci` for the XHCI debug capability.
The XHCI driver will wait indefinitely for the debug host to connect - make sure
the cable is connected.
The `share` option for xhci controls who else can use the controller:

* `no`: use the controller exclusively for console, even the hardware domain
  (dom0) cannot use it
* `hwdom`: the hardware domain may use the controller too, ports not used for
  debug console will be available for normal devices; this is the default
* `yes`: the controller can be assigned to any domain; it is not safe to assign
  the controller to an untrusted domain

Choosing `share=hwdom` (the default) or `share=yes` allows a domain to reset the
controller, which may cause a small portion of the console output to be lost.

The `share=yes` configuration is not security supported.

### debug_stack_lines
> `= <integer>`

> Default: `20`

Limits the number of lines printed in Xen stack traces.

### debugtrace
> `= [cpu:]<size>`

> Default: `128`

Specify the size of the console debug trace buffer. By additionally
specifying `cpu:`, a trace buffer of the specified size is allocated per cpu.
The debug trace feature is only enabled in debugging builds of Xen.

### dit (x86/Intel)
> `= <boolean>`

> Default: `CONFIG_DIT_DEFAULT`

Specify whether Xen and guests should operate in Data Independent Timing
mode (Intel calls this DOITM, Data Operand Independent Timing Mode). Note
that enabling this option cannot guarantee anything beyond what underlying
hardware guarantees (with, where available and known to Xen, respective
tweaks applied).

### dma_bits
> `= <integer>`

Specify the bit width of the DMA heap.

### dom0
    = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
                cpuid-faulting=<bool>, msr-relaxed=<bool>,
                pf-fixup=<bool> ] (x86)

    = List of [ sve=<integer> ] (Arm64)

Controls for how dom0 is constructed on x86 systems.

*   The `pv` and `pvh` options select the virtualisation mode of dom0.

    The `pv` option is only available when `CONFIG_PV` is compiled in. The
    `pvh` option is only available when `CONFIG_HVM` is compiled in. When
    both options are compiled in, the default is PV.

    In addition, the following requirements must be met:

    *   The dom0 kernel selected by the boot loader must be capable of the
        selected mode.
    *   For a PVH dom0, the hardware must have VT-x/SVM extensions available.

*   The `shadow` boolean allows dom0 to be explicitly constructed using shadow
    paging. This option is unavailable when `CONFIG_SHADOW_PAGING` is
    disabled.

    For PVH, dom0 defaults to using HAP on capable hardware, and falls back to
    shadow paging otherwise. A PVH dom0 cannot be used if Xen is compiled
    without shadow paging support, and the hardware lacks HAP support.

    For PV, the use of dom0 shadow mode is only for development purposes. PV
    guests do not require any paging support by default.

*   The `verbose` boolean is intended for diagnostics, and prints out extra
    information during the dom0 build. It defaults to the compile time choice
    of `CONFIG_VERBOSE_DEBUG`.

*   The `cpuid-faulting` boolean is an interim option, is only applicable to
    PV dom0, and defaults to true.

    Before Xen 4.13, the domain builder logic for guest construction depended
    on seeing host CPUID values to function correctly. As a result, CPUID
    Faulting was never activated for PV dom0's, even on capable hardware.

    In Xen 4.13, the domain builder logic has been fixed, and no longer has
    this dependency.
    As a consequence, CPUID Faulting is activated by default
    even for PV dom0's.

    However, as PV dom0's have always seen host CPUID data in the past, there
    is a chance that further dependencies exist.  This boolean can be used to
    restore the pre-4.13 behaviour.  If specifying `no-cpuid-faulting` fixes
    an issue in dom0, please report a bug.

*   The `msr-relaxed` boolean is an interim option, and defaults to false.

    In Xen 4.15, the default behaviour for unhandled MSRs has been changed,
    to avoid leaking host data into guests, and to avoid breaking guest
    logic which uses \#GP probing to identify the availability of MSRs.

    However, this new stricter behaviour has the possibility to break
    guests, and a more 4.14-like behaviour can be selected by specifying
    `dom0=msr-relaxed`.

    If using this option is necessary to fix an issue, please report a bug.

*   The `pf-fixup` boolean is only applicable when using a PVH dom0 and
    defaults to false.

    When running dom0 in PVH mode, the dom0 kernel has no way to map MMIO
    regions into its physical memory map; this mode relies on the Xen dom0
    builder populating the physical memory map with all MMIO regions that dom0
    should access.  However, Xen doesn't have a complete picture of the host
    memory map, because it cannot process ACPI dynamic tables.

    The `pf-fixup` option allows Xen to attempt to add missing MMIO regions
    to the dom0 physical memory map in response to page-faults generated by
    dom0 trying to access unpopulated entries in the memory map.

Enables features on dom0 on Arm systems.

*   The `sve` integer parameter enables Arm SVE usage for Dom0 and sets the
    maximum SVE vector length.  The option is applicable only to Arm64 Dom0
    kernels.
    A value equal to 0 disables the feature; this is the default value.
    Values below 0 mean the feature uses the maximum SVE vector length
    supported by hardware, if SVE is supported.
    Values above 0 explicitly set the maximum SVE vector length for Dom0;
    allowed values are multiples of 128, from 128 up to a maximum of 2048.
    Please note that when the user explicitly specifies the value, if that value
    is above the hardware supported maximum SVE vector length, the domain
    creation will fail and the system will stop.  The same will occur if the
    option is provided with a positive non-zero value, but the platform doesn't
    support SVE.

### dom0-cpuid
    = List of comma separated booleans

    Applicability: x86

This option allows for fine tuning of the facilities dom0 will use, after
accounting for hardware capabilities and Xen settings as enumerated via CPUID.

Options are accepted in positive and negative form, to enable or disable
specific features.  All selections via this mechanism are subject to normal
CPU Policy safety and dependency logic.

This option is intended for developers to opt dom0 into non-default features,
and is not intended for use in production circumstances.  If using this option
is necessary to fix an issue, please report a bug.

### dom0-iommu
    = List of [ passthrough=<bool>, strict=<bool>, map-inclusive=<bool>,
                map-reserved=<bool>, none ]

Controls for the dom0 IOMMU setup.

*   The `passthrough` boolean controls whether IOMMU translation functionality
    is disabled for devices in dom0 (`passthrough=1`) or whether the IOMMU is
    used to ensure that dom0 can only DMA to its permitted areas of RAM
    (`passthrough=0`).

    This option is only applicable to x86 PV dom0's, and defaults to false.

    Some older Intel VT-d hardware isn't capable of disabling translation
    functionality on a per-device basis, and will cause this option to be
    ignored and assumed to be 0.
    Similar behaviour on such systems is only
    available by fully disabling all IOMMUs.

    This option is hardwired to false for x86 PVH dom0's (where a non-identity
    transform is required for dom0 to function), and is ignored for ARM.

*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
    false.  It controls whether dom0 can have IOMMU mappings for all domain
    RAM in the system, or only for its allocated RAM (and grant mappings etc.).

    This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
    other domains in the system doesn't live in a compatible address space),
    and is ignored for ARM.

*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up
    identity IOMMU mappings for all non-RAM regions below 4GB except for
    unusable ranges, and ranges belonging to Xen.

    Typically, some devices in a system use bits of RAM for communication, and
    these areas should be listed as reserved in the E820 table and identified
    via RMRR or IVMD entries in the ACPI tables, so Xen can ensure that they
    are identity-mapped in the IOMMU.  However, some firmware makes mistakes,
    and this option is a coarse-grain workaround for those errors.

    Where possible, finer grain corrections should be made with the `rmrr=`,
    `ivmd=`, `ivrs_hpet[]=`, or `ivrs_ioapic[]=` command line options.

    This option is disabled by default, and deprecated and intended for
    removal in future versions of Xen.  If specifying `map-inclusive` is the
    only way to make your system boot, please report a bug.

*   The `map-reserved` functionality is very similar to `map-inclusive`.

    The differences from `map-inclusive` are that `map-reserved` is applicable
    to both x86 PV and PVH dom0's, is enabled by default, and represents a
    subset of the correction by only mapping reserved memory regions rather
    than all non-RAM regions.

*   The `none` option is intended for development purposes only, and skips
    certain safety checks pertaining to the correct IOMMU configuration for
    dom0 to boot.

    Incorrect use of this option may result in a malfunctioning system.

### dom0_ioports_disable (x86)
> `= List of <hex>-<hex>`

Specify a list of IO ports to be excluded from dom0 access.

### dom0-llc-colors (arm64)
> `= List of [ <integer> | <integer>-<integer> ]`

> Default: `All available LLC colors`

Specify dom0 LLC color configuration. This option is available only when
`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
colors are used.

### dom0_max_vcpus

Either:

> `= <integer>`.

The number of VCPUs to give to dom0.  This number of VCPUs can be more
than the number of PCPUs on the host.  The default is the number of
PCPUs.

Or:

> `= <min>-<max>` where `<min>` and `<max>` are integers.

Gives dom0 a number of VCPUs equal to the number of PCPUs, but always
at least `<min>` and no more than `<max>`.  Using `<min>` may give
more VCPUs than PCPUs.  `<min>` or `<max>` may be omitted and the
defaults of 1 and unlimited respectively are used instead.

For example, with `dom0_max_vcpus=4-8`:

>        Number of
>  PCPUs | Dom0 VCPUs
>   2    |  4
>   4    |  4
>   6    |  6
>   8    |  8
>  10    |  8

### dom0_mem (ARM)
> `= <size>`

Set the amount of memory for the initial domain (dom0). It must be
greater than zero. This parameter is required (and only used) when the initial
domain is not described in the Device-Tree.

### dom0_mem (x86)
> `= List of ( min:<sz> | max:<sz> | <sz> )`

Set the amount of memory for the initial domain (dom0).  If a size is
positive, it represents an absolute value.  If a size is negative, it
is subtracted from the total available memory.

* `<sz>` specifies the exact amount of memory.
* `min:<sz>` specifies the minimum amount of memory.
* `max:<sz>` specifies the maximum amount of memory.

If `<sz>` is not specified, the default is all the available memory
minus some reserve.  The reserve is 1/16 of the available memory or
128 MB (whichever is smaller).

The amount of memory will be at least the minimum but never more than
the maximum (i.e., `max` overrides the `min` option).  If there isn't
enough memory then as much as possible is allocated.

`max:<sz>` also sets the maximum reservation (the maximum amount of
memory dom0 can balloon up to).  If this is omitted then the maximum
reservation is unlimited.

For example, to set dom0's initial memory allocation to 512MB but
allow it to balloon up as far as 1GB use `dom0_mem=512M,max:1G`

> `<sz>` is: `<size> | [<size>+]<frac>%`
> `<frac>` is an integer < 100

* `<frac>` specifies a fraction of host memory size in percent.

So `<sz>` being `1G+25%` on a 256 GB host would result in 65 GB.

If you use this option then it is highly recommended that you disable
any dom0 autoballooning feature present in your toolstack. See the
_xl.conf(5)_ man page or [Xen Best
Practices](https://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning).

This option has no effect if pv-shim mode is enabled.

### dom0_nodes (x86)

> `= List of [ <integer> | relaxed | strict ]`

> Default: `strict`

Specify the NUMA nodes to place Dom0 on. Defaults for vCPU-s created
and memory assigned to Dom0 will be adjusted to match the node
restrictions set up here. Note that the values to be specified here are
ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
affinities to prefer, but not be limited to, the specified node(s).
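
As an illustration of the `dom0_mem` (x86) `<sz>` grammar described above (`<size> | [<size>+]<frac>%`), the following Python sketch evaluates a single `<sz>` value against a given host memory size. It is not Xen's parser: the helper names `parse_size` and `eval_sz` are hypothetical, only decimal numbers are handled, and any rounding Xen may apply is not modelled.

```python
import re

# Multipliers from the "Size" parameter type; a bare number counts as KiB.
_SUFFIX = {"b": 1, "k": 1 << 10, "m": 1 << 20, "g": 1 << 30, "t": 1 << 40}


def parse_size(text):
    """Convert a <size> string such as '512M' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(-?\d+)([bkmgtBKMGT]?)", text)
    if match is None:
        raise ValueError(f"not a valid <size>: {text!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIX[suffix.lower() or "k"]


def eval_sz(sz, host_bytes):
    """Evaluate '<size> | [<size>+]<frac>%' against the host memory size."""
    if not sz.endswith("%"):
        return parse_size(sz)
    # Split an optional '<size>+' prefix from the percentage.
    base, _, frac_text = sz[:-1].rpartition("+")
    frac = int(frac_text)
    if not 0 <= frac < 100:
        raise ValueError("<frac> must be an integer below 100")
    return (parse_size(base) if base else 0) + host_bytes * frac // 100


GIB = 1 << 30
print(eval_sz("1G+25%", 256 * GIB) // GIB)  # the 256 GB example from the text
```

With a 256 GiB host, `1G+25%` evaluates to 1 GiB plus a quarter of host memory, matching the 65 GB figure quoted in the `dom0_mem` description.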

### dom0_vcpus_pin
> `= <boolean>`

> Default: `false`

Pin dom0 vcpus to their respective pcpus.

### dtuart (ARM)
> `= path [:options]`

> Default: `""`

Specify the full path in the device tree for the UART.  If the path doesn't
start with `/`, it is assumed to be an alias.  The options are device specific.

### e820-mtrr-clip (x86)
> `= <boolean>`

Flag that specifies if RAM should be clipped to the highest cacheable
MTRR.

> Default: `true` on Intel CPUs, otherwise `false`

### e820-verbose (x86)
> `= <boolean>`

> Default: `false`

Flag that enables verbose output when processing e820 information and
applying clipping.

### edd (x86)
> `= off | on | skipmbr`

Control retrieval of Extended Disc Data (EDD) from the BIOS during
boot.

### edid (x86)
> `= no | force`

Either force retrieval of monitor EDID information via VESA DDC, or
disable it (`edid=no`). This option should not normally be required
except for debugging purposes.

### efi
    = List of [ rs=<bool>, attr=no|uc ]

Controls for interacting with the system Extensible Firmware Interface.

*   The `rs` boolean controls whether Runtime Services are used.  By default,
    Xen uses Runtime Services itself, and proxies certain calls on behalf of
    dom0.  Selecting `rs=0` prohibits all use of Runtime Services.

*   The `attr=` string exists to specify what to do with memory regions of
    unknown/unrecognised cacheability.  `attr=no` is the default and will
    leave the memory regions unmapped, while `attr=uc` will map them as fully
    uncacheable.

### ept
> `= List of [ ad=<bool>, pml=<bool>, exec-sp=<bool> ]`

> Applicability: Intel

Extended Page Tables are a feature of Intel's VT-x technology, whereby
hardware manages the virtualisation of HVM guest pagetables.
EPT was
introduced with the Nehalem architecture.

*   The `ad` boolean controls hardware tracking of Access and Dirty bits in the
    EPT pagetables, and was first introduced in Broadwell Server.

    By default, Xen will use A/D tracking when available in hardware, except
    on Avoton processors affected by erratum AVR41.  Explicitly choosing
    `ad=0` will disable the use of A/D tracking on capable hardware, whereas
    choosing `ad=1` will cause tracking to be used even on AVR41-affected
    hardware.

*   The `pml` boolean controls the use of Page Modification Logging, which
    was also introduced in Broadwell Server.

    PML is a feature whereby the processor generates a list of pages which
    have been dirtied.  This is necessary information for operations such as
    live migration, and having the processor maintain the list of dirtied
    pages is more efficient than traditional software implementations where
    all guest writes trap into Xen so the dirty bitmap can be maintained.

    By default, Xen will use PML when it is available in hardware.  PML
    functionally depends on A/D tracking, so choosing `ad=0` will implicitly
    disable PML.  `pml=0` can be used to prevent the use of PML on otherwise
    capable hardware.

*   The `exec-sp` boolean controls whether EPT superpages with execute
    permissions are permitted.  In general this is good for performance.

    However, on processors vulnerable to CVE-2018-12207, HVM guest kernels can
    use executable superpages to crash the host.  By default, executable
    superpages are disabled on affected hardware.

    If HVM guest kernels are trusted not to mount a DoS against the system,
    this option can be enabled to regain performance.

    This boolean may be modified at runtime using
    `xl set-parameters ept=[no-]exec-sp` to switch between fast and secure.

    *   When switching from secure to fast, preexisting HVM domains will run
        at their current performance until they are rebooted; new domains will
        run without any overhead.

    *   When switching from fast to secure, all HVM domains will immediately
        suffer a performance penalty.

    **Warning: No guarantee is made that this runtime option will be retained
      indefinitely, or that it will retain this exact behaviour.  It is
      intended as an emergency option for people who first chose fast, then
      change their minds to secure, and wish not to reboot.**

### extra_guest_irqs (x86)
> `= [<domU number>][,<dom0 number>]`

> Default: `32,<variable>`

Change the number of PIRQs available for guests.  The optional first number is
common for all domUs, while the optional second number (preceded by a comma)
is for dom0.  Changing the setting for domU has no impact on dom0 and vice
versa.  For example, to change dom0 without changing domU, use
`extra_guest_irqs=,512`.  The default value for Dom0 and an eventual separate
hardware domain is architecture dependent.  The upper limit for both values on
x86 is such that the resulting total number of IRQs can't be higher than 32768.
Note that specifying zero as domU value means zero, while for dom0 it means
to use the default.  Note further that the Dom0 setting has no useful meaning
for the PVH case; use of the option may have an adverse effect there, though.

### ext_regions (Arm)
> `= <boolean>`

> Default: `true`

Flag to enable or disable support for extended regions for Dom0 and
Dom0less DomUs.

Extended regions are ranges of unused address space exposed to the guest
as "safe to use" for special memory mappings. Disable if your board
device tree is incomplete.
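
For the `extra_guest_irqs` syntax described earlier (`[<domU number>][,<dom0 number>]`), the Python sketch below shows one way to read the two optional fields. The function name `parse_extra_guest_irqs` is hypothetical, and `None` stands in for the architecture-dependent dom0 default, which the text does not give as a number.

```python
def parse_extra_guest_irqs(value, domu_default=32, dom0_default=None):
    """Split '[<domU>][,<dom0>]' into a (domU, dom0) pair.

    dom0_default=None is a placeholder for the architecture-dependent
    default; it is an assumption of this sketch, not a Xen value.
    """
    domu_text, _, dom0_text = value.partition(",")
    domu = int(domu_text) if domu_text else domu_default
    dom0 = int(dom0_text) if dom0_text else dom0_default
    # Zero is taken literally for domU, but selects the default for dom0.
    if dom0 == 0:
        dom0 = dom0_default
    return domu, dom0


print(parse_extra_guest_irqs(",512"))  # dom0 changed, domU left at its default
```

The asymmetry in the zero handling mirrors the note in the option description: a domU value of zero means zero, while a dom0 value of zero means "use the default".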

### flask
> `= permissive | enforcing | late | disabled`

> Default: `enforcing`

Specify how the FLASK security server should be configured.  This option is only
available if the hypervisor was compiled with FLASK support.  This can be
enabled by running either:
- `make -C xen config` and enabling XSM and FLASK.
- `make -C xen menuconfig` and enabling 'FLux Advanced Security Kernel support' and 'Xen Security Modules support'

* `permissive`: This is intended for development and is not suitable for use
  with untrusted guests.  If a policy is provided by the bootloader, it will be
  loaded; errors will be reported to the ring buffer but will not prevent
  booting.  The policy can be changed to enforcing mode using "xl setenforce".
* `enforcing`: This will cause the security server to enter enforcing mode prior
  to the creation of domain 0.  If a valid policy is not provided by the
  bootloader and no built-in policy is present, the hypervisor will not continue
  booting.
* `late`: This disables loading of the built-in security policy or the policy
  provided by the bootloader.  FLASK will be enabled but will not enforce access
  controls until a policy is loaded by a domain using "xl loadpolicy".  Once a
  policy is loaded, FLASK will run in enforcing mode unless "xl setenforce" has
  changed that setting.
* `disabled`: This causes the XSM framework to revert to the dummy module.  The
  dummy module provides the same security policy as is used when compiling the
  hypervisor without support for XSM.  The xsm_op hypercall can also be used to
  switch to this mode after boot, but there is no way to re-enable FLASK once
  the dummy module is loaded.

### font
> `= <height>` where height is `8x8 | 8x14 | 8x16`

Specify the font size when using the VESA console driver.

### force-ept (Intel)
> `= <boolean>`

> Default: `false`

Allow EPT to be enabled when VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is not
present.

*Warning:*
Due to CVE-2013-2212, VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is by default
required as a prerequisite for using EPT.  If you are not using PCI Passthrough,
or trust the guest administrator who would be using passthrough, then the
requirement can be relaxed.  This option is particularly useful for nested
virtualization, to allow the L1 hypervisor to use EPT even if the L0 hypervisor
does not provide `VM_ENTRY_LOAD_GUEST_PAT`.

### gnttab
> `= List of [ max-ver:<integer>, transitive=<bool>, transfer=<bool> ]`

> Default (Arm): `gnttab=max-ver:1`
> Default (x86,PV): `gnttab=max-ver:2,transitive,transfer`
> Default (x86,HVM): `gnttab=max-ver:2,transitive`

Control various aspects of the grant table behaviour available to guests.

* `max-ver` Select the maximum grant table version to offer to guests.  Valid
  versions are 1 and 2.
* `transitive` Permit or disallow the use of transitive grants.  Note that the
  use of grant table v2 without transitive grants is an ABI breakage from the
  guest's point of view.
* `transfer` Permit or disallow the GNTTABOP_transfer operation of the
  grant table hypercall.  Note that disallowing GNTTABOP_transfer is an ABI
  breakage from the guest's point of view.  This option is only available on
  hypervisors configured to support PV guests.

The usage of gnttab v2 is not security supported on ARM platforms.

### gnttab_max_frames
> `= <integer>`

> Default: `64`

> Can be modified at runtime

Specify the default upper bound on the number of frames which any domain may
use as part of its grant table unless a different value is specified at domain
creation.

Note this value is the effective upper bound for dom0.

### gnttab_max_maptrack_frames
> `= <integer>`

> Default: `1024`

> Can be modified at runtime

Specify the default upper bound on the number of frames which any domain may
use as part of its maptrack array unless a different value is specified at
domain creation.

Note this value is the effective upper bound for dom0.

### global-pages
    = <boolean>

    Applicability: x86
    Default: true unless running virtualized on AMD or Hygon hardware

Control whether to use global pages for PV guests, and thus the need to
perform TLB flushes by writing to CR4.  This is a performance trade-off.

AMD SVM does not support selective trapping of CR4 writes, which means that a
global TLB flush (two CR4 writes) takes two VMExits, and massively outweighs
the benefit of using global pages to begin with.  This case is easy for Xen to
spot, and is accounted for in the default setting.

Other cases where this option might be a benefit are on VT-x hardware when
selective CR4 writes are not supported/enabled by the hypervisor, or in any
virtualised case using shadow paging.  These are not easy for Xen to spot, so
are not accounted for in the default setting.

### guest_loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `guest_loglvl=none/warning`

> Can be modified at runtime

Set the logging level for Xen guests.  Any log message with equal or
more importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.
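
A minimal Python sketch of how a `guest_loglvl` value splits into its two parts, assuming only the `<level>[/<rate-limited level>]` grammar and the level names listed above. The function name `parse_loglvl` is hypothetical, and the sketch does not model what Xen does when the rate-limited level is omitted.

```python
# Level names exactly as listed for guest_loglvl.
LEVELS = ("none", "error", "warning", "info", "debug", "all")


def parse_loglvl(value):
    """Split '<level>[/<rate-limited level>]' and validate both names."""
    level, sep, limited = value.partition("/")
    names = (level, limited) if sep else (level,)
    for name in names:
        if name not in LEVELS:
            raise ValueError(f"unknown log level: {name!r}")
    # None signals that no rate-limited level was given on the command line.
    return level, (limited if sep else None)


print(parse_loglvl("none/warning"))  # the documented default value
```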

### hap (x86)
> `= <boolean>`

> Default: `true`

Flag to globally enable or disable support for Hardware Assisted
Paging (HAP).

### hap_1gb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 1 GB host page table support for Hardware Assisted
Paging (HAP).

### hap_2mb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 2 MB host page table support for Hardware Assisted
Paging (HAP).

### hardware_dom
> `= <domid>`

> Default: `0`

Enable late hardware domain creation using the specified domain ID.  This is
intended to be used when domain 0 is a stub domain which builds a disaggregated
system including a hardware domain with the specified domain ID.  This option is
supported only when compiled with XSM on x86.

### hest_disable
> `= <boolean>`

> Default: `false`

Control Xen's use of the APEI Hardware Error Source Table, should one be found.

### highmem-start (x86)
> `= <size>`

Specify the memory boundary past which memory will be treated as highmem (x86
debug hypervisor only).

### hmp-unsafe (arm)
> `= <boolean>`

> Default: `false`

Say yes at your own risk if you want to enable heterogeneous computing
(such as big.LITTLE). This may result in an unstable and insecure
platform, unless you manually specify the cpu affinity of all domains so
that all vcpus are scheduled on the same class of pcpus (big or LITTLE
but not both). vcpu migration between big cores and LITTLE cores is not
supported. See docs/misc/arm/big.LITTLE.txt for more information.

When the hmp-unsafe option is disabled (default), CPUs that are not
identical to the boot CPU will be parked and not used by Xen.

### hpet
    = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ]

    Applicability: x86

Controls Xen's use of the system's High Precision Event Timer.  By default,
Xen will use an HPET when available and not subject to errata.  Use of the
HPET can be disabled by specifying `hpet=0`.

*   The `broadcast` boolean is disabled by default, but forces Xen to keep
    using the broadcast for CPUs in deep C-states even when an RTC interrupt is
    enabled.  This then also affects raising of the RTC interrupt.

*   The `legacy-replacement` boolean allows for control over whether Legacy
    Replacement mode is enabled.

    Legacy Replacement mode is intended for hardware which does not have an
    8254 PIT, and allows the HPET to be configured into a compatible mode.
    Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for
    power saving reasons, and there is no platform-agnostic mechanism for
    discovering this.

    By default, Xen will not change hardware configuration, unless the PIT
    appears to be absent, at which point Xen will try to enable Legacy
    Replacement mode before falling back to pre-IO-APIC interrupt routing
    options.

    This behaviour can be inhibited by specifying `legacy-replacement=0`.
    Alternatively, this mode can be enabled unconditionally (if available) by
    specifying `legacy-replacement=1`.

### hpetbroadcast (x86)
> `= <boolean>`

Deprecated alternative of `hpet=broadcast`.

### hvm_debug (x86)
> `= <integer>`

The specified value is a bit mask with the individual bits having the
following meaning:

>     Bit  0 - debug level 0 (unused at present)
>     Bit  1 - debug level 1 (Control Register logging)
>     Bit  2 - debug level 2 (VMX logging of MSR restores when context switching)
>     Bit  3 - debug level 3 (unused at present)
>     Bit  4 - I/O operation logging
>     Bit  5 - vMMU logging
>     Bit  6 - vLAPIC general logging
>     Bit  7 - vLAPIC timer logging
>     Bit  8 - vLAPIC interrupt logging
>     Bit  9 - vIOAPIC logging
>     Bit 10 - hypercall logging
>     Bit 11 - MSR operation logging

Recognized in debug builds of the hypervisor only.

### hvm_fep (x86)
> `= <boolean>`

> Default: `false`

Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
arbitrary instructions.

This option is intended for development and testing purposes.

*Warning*
As this feature opens up the instruction emulator to arbitrary
instructions from an HVM guest, don't use this in a production system.  No
security support is provided when this flag is set.

### hvm_port80 (x86)
> `= <boolean>`

> Default: `true`

Specify whether guests are to be given access to physical port 80
(often used for debugging purposes), to override the DMI based
detection of systems known to misbehave upon accesses to that port.

### idle_latency_factor (x86)
> `= <integer>`

### ioapic_ack (x86)
> `= old | new`

> Default: `new` unless directed-EOI is supported

### iommu
    = List of [ <bool>, verbose, debug, force, required,
                quarantine=<bool>|scratch-page,
                sharept, superpages, intremap, intpost, crash-disable,
                snoop, qinval, igfx, amd-iommu-perdev-intremap,
                dom0-{passthrough,strict} ]

    All sub-options are boolean in nature.

I/O Memory Management Units perform a function similar to the CPU MMU (hence
the name), but typically exist as a discrete device, integrated as part of a
PCI Root Complex.  The most common configuration is to have one IOMMU per
package (for on-die PCIe devices and directly attached PCIe lanes), and one
IOMMU covering the remaining I/O in the system.

The functionality in an IOMMU commonly falls into two orthogonal categories:

1.  DMA remapping which uses a pagetable-like hierarchical structure and maps
    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
    terminology).

2.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
    requests, including their routing to specific CPUs.

IOMMU functionality can be used to provide a translation which the hardware
device driver isn't aware of (e.g. PCI Passthrough and a native driver inside
the guest) and/or to enforce fine-grained control over the memory and
interrupts which a device is attempting to access.

By default, IOMMUs are configured for use if they are available.  An overall
boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.

*   The `verbose` and `debug` booleans can be used to print additional
    diagnostic information.  Neither is active by default.

*   The `force` and `required` booleans are synonymous and, when requested,
    will prevent Xen from booting if IOMMUs aren't discovered and enabled
    successfully.

*   The `quarantine` option can be used to control Xen's behavior when
    de-assigning devices from guests.  The default behaviour is chosen at
    compile time, and is one of `CONFIG_IOMMU_QUARANTINE_{NONE,BASIC,SCRATCH_PAGE}`.

    When a PCI device is assigned to an untrusted domain, it is possible
    for that domain to program the device to DMA to an arbitrary address.
    The IOMMU is used to protect the host from malicious DMA by making
    sure that the device addresses can only target memory assigned to the
    guest.  However, when the guest domain is torn down, assigning the
    device back to the hardware domain would allow any in-flight DMA to
    potentially target critical host data.  To avoid this, quarantining
    should be enabled.  Quarantining can be done in two ways: In its basic
    form, all in-flight DMA will simply be forced to encounter IOMMU
    faults.  Since there are systems where doing so can cause host lockup,
    an alternative form is available where accesses to memory will be directed
    to a scratch page.  The implication here is that such accesses will go
    unnoticed, i.e. an admin may not become aware of the underlying problem.

    Therefore, if this option is set to true (the default), Xen always
    quarantines such devices; they must be explicitly assigned back to Dom0
    before they can be used there again.  If set to "scratch-page", still
    active DMA operations will additionally be directed to a "scratch" page.  If
    set to false, Xen will only quarantine devices the toolstack has arranged
    for getting quarantined, and only in the "basic" form.

    This option is only valid on builds supporting PCI.

*   The `sharept` boolean controls whether the IOMMU pagetables are shared
    with the CPU-side HAP pagetables, or allocated separately.  Sharing
    reduces the memory overhead, but doesn't work in combination with CPU-side
    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
    assigned.

    Due to implementation choices, sharing pagetables doesn't work on AMD
    hardware, and this option is ignored.  It is enabled by default on Intel
    systems.

    This option is ignored on ARM, and the pagetables are always shared.

*   The `superpages` boolean controls whether superpage mappings may be used
    in IOMMU page tables.  If using this option is necessary to fix an issue,
    please report a bug.

    This option is only valid on x86.

*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and
    is active by default on compatible hardware.  On x86 systems, the first
    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
    appeared in the second generation.

    This option is only valid on x86.

*   The `intpost` boolean controls the Posted Interrupt sub-feature.  In
    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
    be configured to deliver interrupts from assigned PCI devices directly
    into the guest, without trapping out into hypervisor context.

    This option depends on `intremap`, and is disabled by default due to some
    corner cases in the implementation which have yet to be resolved.

    This option is only valid on x86, and only on builds of Xen with HVM
    support.

*   The `crash-disable` boolean controls disabling IOMMU functionality (DMAR/IR/QI)
    before switching to a crash kernel.  This option is inactive by default and
    is for compatibility with older kdump kernels only.  Modern kernels copy
    all the necessary tables from the previous one following kexec which makes
    the transition transparent for them with IOMMU functions still on.

The following options are specific to Intel VT-d hardware:

*   The `snoop` boolean controls the Snoop Control sub-feature, and is active
    by default on compatible hardware.

    An incoming DMA request may specify _Snooped_ (query the CPU caches for
    the appropriate lines) or _Non-Snooped_ (don't query the CPU caches).
    _Non-Snooped_ accesses incur less latency, but behind-the-scenes
    hypervisor activity can invalidate the expectations of the device driver,
    and Snoop Control allows the hypervisor to force DMA requests to be
    _Snooped_ when they would otherwise not be.

*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
    active by default on compatible hardware. Queued Invalidation is a
    feature in second-generation IOMMUs and is a functional prerequisite for
    Interrupt Remapping. Note that Xen disregards this setting for Intel VT-d
    version 6 and greater as Register-Based Invalidation isn't supported
    by them.

*   The `igfx` boolean is active by default, and controls whether IOMMUs in
    front of solely graphics devices get enabled or not.

    It is intended as a debugging mechanism for graphics issues, and to be
    similar to Linux's `intel_iommu=igfx_off` option. If specifying `no-igfx`
    fixes anything, please report the problem.

The following options are specific to AMD-Vi hardware:

*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
    remapping table is per device (the default), or a single global table for
    the entire system.

    Using a global table is not security supported as it allows all devices to
    impersonate each other as far as interrupts are concerned (see XSA-36), but
    it is a workaround for SP5100 Erratum 28.

**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
deprecated, and superseded by _dom0-iommu={passthrough,strict}_ respectively -
using both the old and new command line options in combination is undefined.**

### iommu_dev_iotlb_timeout
> `= <integer>`

> Default: `1000`

Specify the timeout of the device IOTLB invalidation in milliseconds.
By default, the timeout is 1000 ms.
If you see the error 'Queue invalidate
wait descriptor timed out', try increasing this value.

### iommu_inclusive_mapping
> `= <boolean>`

**WARNING: This command line option is deprecated, and superseded by
_dom0-iommu=map-inclusive_ - using both options in combination is undefined.**

### irq-max-guests (x86)
> `= <integer>`

> Default: `32`

Maximum number of guests any individual IRQ could be shared between,
i.e. a limit on the number of guests it is possible to start each having
assigned a device sharing a common interrupt line. Accepts values between
1 and 255.

### irq_ratelimit (x86)
> `= <integer>`

### irq_vector_map (x86)

### ivmd (x86)
> `= <start>[-<end>][=<bdf1>[-<bdf1'>][,<bdf2>[-<bdf2'>][,...]]][;<start>...]`

Define IVMD-like ranges that are missing from ACPI tables along with the
device(s) they belong to, and use them for 1:1 mapping. End addresses can be
omitted when exactly one page is meant. The ranges are inclusive when start
and end are specified. Note that only PCI segment 0 is supported at this time,
but it is fine to specify it explicitly.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Omitting the optional (range of) BDF specifiers signals that the range is to
be applied to all devices.

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and devices 0:0:1a.0...0:0:1a.3 collectively require three pages
(0xd5d46 thru 0xd5d48) to be reserved, one usage would be:

ivmd=d5d45=0:1d.0;0xd5d46-0xd5d48=0:1a.0-0:1a.3

Note: grub2 requires special characters, like ';', to be escaped or quoted
when multiple ranges are specified - refer to the grub2 documentation.
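
To make the `ivmd` syntax concrete, the following Python sketch splits a
value into its page ranges and device specifiers. It is illustrative only and
is not how Xen parses the option: `parse_ivmd` is a hypothetical helper, it
does no validation, and it leaves the BDF specifiers as uninterpreted strings.

```python
def parse_ivmd(value):
    """Split an ivmd= value into (first_page, last_page, device_specs) tuples.

    Hypothetical helper for illustration; not part of Xen.
    """
    ranges = []
    # Multiple ranges are separated by ';'.
    for entry in value.split(";"):
        # The first '=' separates the page range from the optional device list.
        pages, _, devices = entry.partition("=")
        start, _, end = pages.partition("-")
        # Page numbers are hexadecimal, with an optional "0x" prefix.
        first = int(start, 16)
        # Omitting the end means exactly one page; ranges are inclusive.
        last = int(end, 16) if end else first
        # An empty device list means the range applies to all devices.
        specs = devices.split(",") if devices else []
        ranges.append((first, last, specs))
    return ranges


example = parse_ivmd("d5d45=0:1d.0;0xd5d46-0xd5d48=0:1a.0-0:1a.3")
for first, last, specs in example:
    count = last - first + 1
    print(f"pages {first:#x}-{last:#x} ({count}): {specs or 'all devices'}")
```

Applied to the usage example above, the first entry covers a single page and
the second covers three pages, matching the description.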

### ivrs_hpet[`<hpet>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of HPET
`<hpet>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### ivrs_ioapic[`<ioapic>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of IO-APIC
`<ioapic>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### lapic (x86)
> `= <boolean>`

Force the use of the local APIC on a uniprocessor system, even
if left disabled by the BIOS.

### lapic_timer_c2_ok (x86)
> `= <boolean>`

### ler (x86)
> `= <boolean>`

> Default: `false`

This option is intended for debugging purposes only. Enable MSR_DEBUGCTL.LBR
in hypervisor context to be able to dump the Last Interrupt/Exception To/From
record with other registers.

### llc-coloring (arm64)
> `= <boolean>`

> Default: `false`

Flag to enable or disable LLC coloring support at runtime. This option is
available only when `CONFIG_LLC_COLORING` is enabled. See the general
cache coloring documentation for more info.

### llc-nr-ways (arm64)
> `= <integer>`

> Default: `Obtained from hardware`

Specify the number of ways of the Last Level Cache. This option is available
only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
to find the number of supported cache colors. By default the value is
automatically computed by probing the hardware, but in case of specific needs,
it can be manually set. Such needs include probing failures and
debugging/testing, where it allows emulating platforms with a different number
of supported colors. If set, "llc-size" must also be set, otherwise the default
will be used.
Note that using both options implies "llc-coloring=on" unless an
earlier "llc-coloring=off" is there.

### llc-size (arm64)
> `= <size>`

> Default: `Obtained from hardware`

Specify the size of the Last Level Cache. This option is available only when
`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
the number of supported cache colors. By default the value is automatically
computed by probing the hardware, but in case of specific needs, it can be
manually set. Such needs include probing failures and debugging/testing, where
it allows emulating platforms with a different number of supported
colors. If set, "llc-nr-ways" must also be set, otherwise the default will be
used. Note that using both options implies "llc-coloring=on" unless an
earlier "llc-coloring=off" is there.

### lock-depth-size
> `= <integer>`

> Default: `lock-depth-size=64`

Specifies the maximum number of nested locks tested for illegal recursions.
Higher nesting levels still work, but recursion testing is omitted for those
levels. In case an illegal recursion is detected the system will crash
immediately. Specifying `0` will disable all testing of illegal lock nesting.

This option is available for hypervisors built with CONFIG_DEBUG_LOCKS only.

### loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `loglvl=info`

> Can be modified at runtime

Set the logging level for Xen. Any log message with equal or more
importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.

### low_crashinfo
> `= none | min | all`

> Default: `none` if not specified at all, or to `min` if **low_crashinfo** is present without qualification.

This option is only useful for hosts with a 32bit dom0 kernel, wishing
to use kexec functionality in the case of a crash. It represents
which data structures should be deliberately allocated in low memory,
so the crash kernel may find them. Should be used in combination
with **crashinfo_maxaddr**.

### low_mem_virq_limit
> `= <size>`

> Default: `64M`

Specify the threshold below which Xen will inform dom0 that the quantity of
free memory is getting low. Specifying `0` will disable this notification.

### maxcpus
> `= <integer>`

Specify the maximum number of CPUs that should be brought up.

This option is ignored in **pv-shim** mode.

**WARNING: On Arm big.LITTLE systems, when the `hmp-unsafe` option is enabled,
this command line option does not guarantee which CPU types will be used.**

### max_cstate (x86)
> `= <integer>[,<integer>]`

Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used. The latter only applies
to the highest permitted C-state.

### max_gsi_irqs (x86)
> `= <integer>`

Specifies the number of interrupts to be used for pin (IO-APIC or legacy PIC)
based interrupts. Any higher IRQs will be available for use via PCI MSI.

### max_lpi_bits (arm)
> `= <integer>`

Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
presented as the number of bits needed to encode it. This must be at least
14 and not exceed 32, and each LPI requires one byte (configuration) and
one pending bit to be allocated.
Defaults to 20 bits (to cover at most 1048576 interrupts).

### mce (x86)
> `= <boolean>`

> Default: `true`

Allows disabling the use of Machine Check Exceptions.
Note that doing
so may result in silent shutdown of the system in case an event occurs
which would have resulted in raising a Machine Check Exception. Silent
here is as far as Xen is concerned; firmware may offer to retrieve some
collected data.

### mce_fb (Intel)
> `= <boolean>`

> Default: `false`

Force broadcasting of Machine Check Exceptions, suppressing the use of
Local MCE functionality available in newer Intel hardware.

### mce_verbosity (x86)
> `= verbose`

Specify verbose machine check output.

### mem (x86)
> `= <size>`

Specify the maximum address of physical RAM. Any RAM beyond this
limit is ignored by Xen.

### memop-max-order
> `= [<domU>][,[<ctldom>][,[<hwdom>][,<ptdom>]]]`

> x86 default: `9,18,12,12`
> ARM default: `9,18,10,10`

Change the maximum order permitted for allocation (or allocation-like)
requests issued by the various kinds of domains (in this order:
ordinary DomU, control domain, hardware domain, and - when supported
by the platform - DomU with pass-through device assigned).

### mmcfg (x86)
> `= <boolean>[,amd-fam10]`

> Default: `1`

Specify if the MMConfig space should be enabled.

### mmio-relax (x86)
> `= <boolean> | all`

> Default: `false`

By default, domains may not create cached mappings to MMIO regions.
This option relaxes the check for Domain 0 (or when using `all`, all PV
domains), to permit the use of cacheable MMIO mappings.

### msi (x86)
> `= <boolean>`

> Default: `true`

Force Xen to (not) use PCI-MSI, even if ACPI FADT says otherwise.

### mtrr.show (x86)
> `= <boolean>`

> Default: `false`

Print boot time MTRR state.

### mwait-idle (x86)
> `= <boolean>`

> Default: `true`

Use the MWAIT idle driver (with model specific C-state knowledge) instead
of the ACPI based one.

### nmi (x86)
> `= ignore | dom0 | fatal`

> Default: `fatal` for a debug build, or `dom0` for a non-debug build

Specify what Xen should do in the event of an NMI parity or I/O error.
`ignore` discards the error; `dom0` causes Xen to report the error to
dom0, while `fatal` causes Xen to print diagnostics and then hang.

### noapic (x86)

Instruct Xen to ignore any IOAPICs that are present in the system, and
instead continue to use the legacy PIC. This is _not_ recommended with
pvops type kernels.

Because responsibility for APIC setup is shared between Xen and the
domain 0 kernel this option is automatically propagated to the domain
0 command line.

### invpcid (x86)
> `= <boolean>`

> Default: `true`

By default, Xen will use the INVPCID instruction for TLB management if
it is available. This option can be used to cause Xen to fall back to
older mechanisms, which are generally slower.

### load-balance-ratelimit
> `= <integer>`

The minimum interval between load balancing events on a given pcpu, in
microseconds. A value of '0' will disable rate limiting. Maximum
value 1 second. At the moment only credit honors this parameter.
Default 1ms.

### noirqbalance (x86)
> `= <boolean>`

Disable software IRQ balancing and affinity. This can be used on
systems such as Dell 1850/2850 that have workarounds in hardware for
IRQ routing issues.

### nolapic (x86)
> `= <boolean>`

> Default: `false`

Ignore the local APIC on a uniprocessor system, even if enabled by the
BIOS.

### no-real-mode (x86)
> `= <boolean>`

Do not execute real-mode bootstrap code when booting Xen.
This option
should not be used except for debugging. It will effectively disable
the **vga** option, which relies on real mode to set the video mode.

### noreboot
> `= <boolean>`

Do not automatically reboot after an error. This is useful for
catching debug output. Defaults to automatically reboot after 5
seconds.

### nosmp (x86)
> `= <boolean>`

Disable SMP support. No secondary processors will be booted.
Defaults to booting secondary processors.

This option is ignored in **pv-shim** mode.

### nr_irqs (x86)
> `= <integer>`

### numa (x86)
> `= on | off | fake=<integer> | noacpi`

> Default: `on`

### partial-emulation (arm)
> `= <boolean>`

> Default: `false`

Flag to enable or disable partial emulation of system/coprocessor registers.
Only effective if CONFIG_PARTIAL_EMULATION is enabled.

**WARNING: Enabling this option might result in unwanted/non-spec compliant
behavior.**

### pci
    = List of [ serr=<bool>, perr=<bool> ]

    Default: Signaling left as set by firmware.

Override the firmware settings, and explicitly enable or disable the
signalling of PCI System and Parity errors.

### pci-phantom
> `=[<seg>:]<bus>:<device>,<stride>`

Mark a group of PCI devices as using phantom functions without actually
advertising so, so the IOMMU can create translation contexts for them.

All numbers specified must be hexadecimal ones.

This option can be specified more than once (up to 8 times at present).

### pci-passthrough (arm)
> `= <boolean>`

> Default: `false`

Flag to enable or disable support for PCI passthrough.

### pcid (x86)
> `= <boolean> | xpti=<bool>`

> Default: `xpti`

> Can be modified at runtime (change takes effect only for domains created
  afterwards)

If available, control usage of the PCID feature of the processor for
64-bit pv-domains. PCID can be used either for no domain at all (`false`),
for all of them (`true`), only for those subject to XPTI (`xpti`) or for
those not subject to XPTI (`no-xpti`). The feature is used only in case
INVPCID is supported and not disabled via `invpcid=false`.

### pdx-compress
> `= <boolean>`

> Default: `true` if CONFIG_PDX_NONE is unset

Only relevant when the hypervisor is built with PFN PDX compression. Controls
whether Xen will engage in PFN compression. The algorithm used for PFN
compression is selected at build time from Kconfig.

### ple_gap
> `= <integer>`

### ple_window (Intel)
> `= <integer>`

### preferred-cstates (x86)
> `= ( <integer> | List of ( C1 | C1E | C2 | ... ) )`

This is a mask of C-states which are to be used preferably. This option is
applicable only on hardware where certain C-states are exclusive of one another.

### probe-port-aliases (x86)
> `= <boolean>`

> Default: `true` outside of shim mode, `false` in shim mode

Certain devices accessible by I/O ports may be accessible also through "alias"
ports (originally a result of incomplete address decoding). When such devices
are solely under Xen's control, Xen disallows even Dom0 access to the "primary"
ports. When alias probing is active and aliases are detected, "alias" ports
would then be treated similarly to the "primary" ones.

### psr (Intel)
> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`

> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0`

Platform Shared Resource (PSR) Services. Intel Haswell and later server
platforms offer information about the sharing of resources.

To use the PSR monitoring service for a certain domain, a Resource
Monitoring ID (RMID) is used to bind the domain to the corresponding shared
resource. RMID is a hardware-provided layer of abstraction between software
and logical processors.

To use the PSR cache allocation service for a certain domain, a capacity
bitmask (CBM) is used to bind the domain to the corresponding shared resource.
CBM represents cache capacity and indicates the degree of overlap and isolation
between domains. In the hypervisor, a Class of Service (COS) ID is allocated
for each unique CBM.

The following resources are available:

* Cache Monitoring Technology (Haswell and later). Information regarding the
  L3 cache occupancy.
  * `cmt` instructs Xen to enable/disable Cache Monitoring Technology.
  * `rmid_max` indicates the max value for rmid.
* Memory Bandwidth Monitoring (Broadwell and later). Information regarding the
  total/local memory bandwidth. It uses the same options as Cache Monitoring
  Technology.

* Cache Allocation Technology (Broadwell and later). Information regarding
  the cache allocation.
  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
  * `cos_max` indicates the max value for COS ID.
* Code and Data Prioritization Technology (Broadwell and later). Information
  regarding the code cache and the data cache allocation. CDP is based on CAT.
  * `cdp` instructs Xen to enable/disable Code and Data Prioritization. Note
    that `cos_max` of CDP is a little different from `cos_max` of CAT.
    With CDP, one COS corresponds to two CBMs rather than one as with CAT.
    Because the sum of CBMs is fixed, the actual `cos_max` in use is
    automatically halved when CDP is enabled.

### pv
    = List of [ 32=<bool> ]

    Applicability: x86

Controls for aspects of PV guest support.

*   The `32` boolean controls whether 32bit PV guests can be created. It
    defaults to `true`, and is ignored when `CONFIG_PV32` is compiled out.

    32bit PV guests are incompatible with CET Shadow Stacks. If Xen is using
    shadow stacks, this option will be overridden to `false`. Backwards
    compatibility can be maintained with the `pv-shim` mechanism.

### pv-linear-pt (x86)
> `= <boolean>`

> Default: `true`

Only available if Xen is compiled with `CONFIG_PV_LINEAR_PT` support
enabled.

Allow PV guests to have pagetable entries pointing to other pagetables
of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
This technique is often called "linear pagetables", and is sometimes
used to allow operating systems a simple way to consistently map the
current process's pagetables into its own virtual address space.

Linux and MiniOS don't use this technique. NetBSD and Novell Netware
do; there may be other custom operating systems which do. If you're
certain you don't plan on having PV guests which use this feature,
turning it off can reduce the attack surface.

### pv-l1tf (x86)
> `= List of [ <bool>, dom0=<bool>, domu=<bool> ]`

> Default: `false` on believed-unaffected hardware, or in pv-shim mode.
>          `domu`  on believed-affected hardware.

Mitigations for L1TF / XSA-273 / CVE-2018-3620 for PV guests.

For backwards compatibility, we may not alter an architecturally-legitimate
pagetable entry a PV guest chooses to write.
We can however force such a
guest into shadow mode so that Xen controls the PTEs which are reachable by
the CPU pagewalk.

Shadowing is performed at the point where a PV guest first tries to write an
L1TF-vulnerable PTE. Therefore, a PV guest kernel which has been updated with
its own L1TF mitigations will not trigger shadow mode if it is well behaved.

If `CONFIG_SHADOW_PAGING` is not compiled in, this mitigation instead crashes
the guest when an L1TF-vulnerable PTE is written, which still allows updated,
well-behaved PV guests to run, despite Shadow being compiled out.

In the pv-shim case, Shadow is expected to be compiled out, and a malicious
guest kernel can only leak data from the shim Xen, rather than the host Xen.

### pv-shim (x86)
> `= <boolean>`

> Default: `false`

This option is intended for use by a toolstack, when choosing to run a PV
guest compatibly inside an HVM container.

In this mode, the kernel and initrd passed as modules to the hypervisor are
constructed into a plain unprivileged PV domain.

### rcu-idle-timer-period-ms
> `= <integer>`

> Default: `10`

How frequently a CPU which has gone idle, but with pending RCU callbacks,
should be woken up to check if the grace period has completed, and the
callbacks are safe to be executed. Expressed in milliseconds; maximum is
100, and it can't be 0.

### reboot (x86)
> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`

> Default: system dependent

Specify the host reboot method.

`warm` instructs Xen to not set the cold reboot flag.

`cold` instructs Xen to set the cold reboot flag.

`no` instructs Xen to not automatically reboot after panics or crashes.

`triple` instructs Xen to reboot the host by causing a triple fault.

`kbd` instructs Xen to reboot the host via the keyboard controller.

`acpi` instructs Xen to reboot the host using RESET_REG in the ACPI FADT (this
is default mode if available).

`pci` instructs Xen to reboot the host using PCI reset register (port CF9).

`Power` instructs Xen to power-cycle the host using PCI reset register (port CF9).

`efi` instructs Xen to reboot using the EFI reboot call.

`xen` instructs Xen to reboot using Xen's SCHEDOP hypercall (this is the default
when running nested Xen).

### rmrr
> `= start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]`

Define RMRR units that are missing from the ACPI table along with the devices
they belong to, and use them for 1:1 mapping. End addresses can be omitted, in
which case one page will be mapped. The ranges are inclusive when start and end
are specified.
If segment of the first device is not specified, segment zero will be used.
If other segments are not specified, first device segment will be used.
If a segment is specified for other than the first device and it does not match
the one specified for the first one, an error will be reported.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and device 0:0:1a.0 requires three pages (0xd5d46 thru 0xd5d48)
to be reserved, one usage would be:

rmrr=d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0

Note: grub2 requires special characters, namely ';', to be escaped or quoted
when multiple ranges are specified - refer to the grub2 documentation.

### ro-hpet (x86)
> `= <boolean>`

> Default: `true`

Map the HPET page as read only in Dom0. If disabled the page will be mapped
with read and write permissions.

### sched
> `= credit | credit2 | arinc653 | rtds | null`

> Default: `sched=credit2`

Choose the default scheduler. Note the default scheduler is selectable via
Kconfig and depends on enabled schedulers. Check
`CONFIG_SCHED_DEFAULT` to see which scheduler is the default.

### sched_credit2_max_cpus_runqueue
> `= <integer>`

> Default: `16`

Defines how many CPUs will be put, at most, in each Credit2 runqueue.

Runqueues are still arranged according to the host topology (and following
what is indicated by the 'credit2_runqueue' parameter). But we also have a cap
on the number of CPUs that share each runqueue.

A value that is a submultiple of the number of online CPUs is recommended,
as that would likely produce a perfectly balanced runqueue configuration.

### sched_credit2_migrate_resist
> `= <integer>`

### sched_credit_tslice_ms
> `= <integer>`

Set the timeslice of the credit1 scheduler, in milliseconds. The
default is 30ms. Reasonable values may include 10, 5, or even 1 for
very latency-sensitive workloads.

### sched-gran (x86)
> `= cpu | core | socket`

> Default: `sched-gran=cpu`

Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
`core` on a SMT-enabled system, or `socket`) multiple vcpus are assigned
statically to a "scheduling unit" which will then be subject to scheduling.
This assignment of vcpus to scheduling units is fixed.

`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
hyperthread using x86/Intel terminology)

`core`: As many vcpus as there are cpus on a physical core are scheduled
together on a physical core.

`socket`: As many vcpus as there are cpus on a physical socket are scheduled
together on a physical socket.

Note: a value other than `cpu` will result in rejecting a runtime modification
attempt of the "smt" setting.

Note: for AMD x86 processors before Fam17 the terminology in the official data
sheets is different: a cpu is named "core" and multiple "cores" are running
in the same "compute unit". As from Fam17 on AMD is using the same names as
Intel ("thread" and "core") the topology levels are named "cpu", "core" and
"socket" even on older AMD processors.

### sched_ratelimit_us
> `= <integer>`

In order to limit the rate of context switching, set the minimum
amount of time that a vcpu can be scheduled for before preempting it,
in microseconds. The default is 1000us (1ms). Setting this to 0
disables it altogether.

### sched_smt_power_savings
> `= <boolean>`

Normally Xen will try to maximize performance and cache utilization by
spreading out vcpus across as many different divisions as possible
(i.e., NUMA nodes, sockets, cores, threads, &c). This often maximizes
throughput, but also maximizes energy usage, since it reduces the
depth to which a processor can sleep.

This option inverts the logic, so that the scheduler in effect tries
to keep the vcpus on the smallest amount of silicon possible; i.e.,
first fill up sibling threads, then sibling cores, then sibling
sockets, &c. This will reduce performance somewhat, particularly on
systems with hyperthreading enabled, but should reduce power by
enabling more sockets and cores to go into deeper sleep states.

### scrub-domheap
> `= <boolean>`

> Default: `false`

Scrub domains' freed pages. This is a safety net against a (buggy) domain
accidentally leaking secrets by releasing pages without proper sanitization.

### serial_tx_buffer
> `= <size>`

> Default: `CONFIG_SERIAL_TX_BUFSIZE`

Set the serial transmit buffer size.

### serrors (ARM)
> `= diverse | panic`

> Default: `diverse`

This parameter is provided to administrators to determine how the hypervisor
handles SErrors.

* `diverse`:
  The hypervisor will distinguish guest SErrors from hypervisor SErrors:
    - The guest generated SErrors will be forwarded to the currently running
      guest.
    - The hypervisor generated SErrors will cause the whole system to crash.

* `panic`:
  All SErrors will cause the whole system to crash. This option should only
  be used if you trust all your guests and/or they don't have a gadget (e.g.
  device) to generate SErrors in normal run.

### shim_mem (x86)
> `= List of ( min:<size> | max:<size> | <size> )`

Set the amount of memory that xen-shim uses. Only has effect if pv-shim mode is
enabled. Note that this value accounts for the memory used by the shim itself
plus the free memory slack given to the shim for runtime allocations.

* `min:<size>` specifies the minimum amount of memory. Ignored if greater
   than max.
* `max:<size>` specifies the maximum amount of memory.
* `<size>` specifies the exact amount of memory. Overrides both min and max.

By default, the amount of free memory slack given to the shim for runtime usage
is 1MB.

### smap (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Access Prevention.
Use `smap=hvm` to allow SMAP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false due to
the significant performance impact in some cases and the generally lower
security risk.

### smep (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Execution Protection.
Use `smep=hvm` to allow SMEP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false due to
the significant performance impact in some cases and the generally lower
security risk.

### smt (x86)
> `= <boolean>`

> Default: `true`

Control bring up of multiple hyper-threads per CPU core.

### snb_igd_quirk
> `= <boolean> | cap | <integer>`

A true boolean value enables legacy behavior (1s timeout), while `cap`
enforces the maximum theoretically necessary timeout of 670ms. Any number
is interpreted as a custom timeout in milliseconds. Zero or boolean
false disables the quirk workaround, which is also the default.

### spec-ctrl (Arm)
> `= List of [ ssbd=force-disable|runtime|force-enable ]`

Controls for speculative execution sidechannel mitigations.

The option `ssbd=` is used to control the state of Speculative Store
Bypass Disable (SSBD) mitigation.

* `ssbd=force-disable` will keep the mitigation permanently off. The guest
will not be able to control the state of the mitigation.
* `ssbd=runtime` will always turn on the mitigation when running in the
hypervisor context. The guest will be able to turn on/off the mitigation for
itself by using the firmware interface `ARCH_WORKAROUND_2`.
* `ssbd=force-enable` will keep the mitigation permanently on. The guest will
not be able to control the state of the mitigation.

By default SSBD will be mitigated at runtime (i.e. `ssbd=runtime`).

### spec-ctrl (x86)
> `= List of [ <bool>, xen=<bool>, {pv,hvm}=<bool>,
>              {msr-sc,rsb,verw,{ibpb,bhb}-entry}=<bool>|{pv,hvm}=<bool>,
>              bti-thunk=retpoline|lfence|jmp,bhb-seq=short|tsx|long,
>              {ibrs,ibpb,ssbd,psfd,
>              eager-fpu,l1d-flush,branch-harden,srb-lock,
>              unpriv-mmio,gds-mit,div-scrub,lock-harden,
>              bhi-dis-s,bp-spec-reduce,ibpb-alt}=<bool> ]`

Controls for speculative execution sidechannel mitigations. By default, Xen
will pick the most appropriate mitigations based on compiled-in support,
loaded microcode, and hardware details, and will virtualise appropriate
mitigations for guests to use.

**WARNING: Any use of this option may interfere with heuristics. Use with
extreme care.**

An overall boolean value, `spec-ctrl=no`, can be specified to turn off all
mitigations, including pieces of infrastructure used to virtualise certain
mitigation features for guests. This also includes settings which `xpti`,
`smt`, `pv-l1tf`, `tsx` control, unless the respective option(s) have been
specified earlier on the command line.

Alternatively, a slightly more restricted `spec-ctrl=no-xen` can be used to
turn off all of Xen's mitigations, while leaving the virtualisation support
in place for guests to use.

Use of a positive boolean value for either of these options is invalid.

The `pv=`, `hvm=`, `msr-sc=`, `rsb=`, `verw=`, `ibpb-entry=` and `bhb-entry=`
options offer fine grained control over the primitives used by Xen. These
impact Xen's ability to protect itself, and/or Xen's ability to virtualise
support for guests to use.

* `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
  respectively.
* Each other option can be used either as a plain boolean
  (e.g. `spec-ctrl=rsb` to control both the PV and HVM sub-options), or with
  `pv=` or `hvm=` subsuboptions (e.g.
  `spec-ctrl=rsb=no-hvm` to disable HVM
  RSB only).

* `msr-sc=` offers control over Xen's support for manipulating `MSR_SPEC_CTRL`
  on entry and exit. These blocks are necessary to virtualise support for
  guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc.
* `rsb=` offers control over whether to overwrite the Return Stack Buffer /
  Return Address Stack on entry to Xen and on idle.
* `verw=` offers control over whether to use VERW for its scrubbing side
  effects at appropriate privilege transitions. The exact side effects are
  microarchitecture and microcode specific. *Note: `md-clear=` is accepted as
  a deprecated alias. For compatibility with development versions of XSA-297,
  `mds=` is also accepted on Xen 4.12 and earlier as an alias. Consult vendor
  documentation in preference to here.*
* `ibpb-entry=` offers control over whether IBPB (Indirect Branch Prediction
  Barrier) is used on entry to Xen. This is used by default on hardware
  vulnerable to Branch Type Confusion, and hardware vulnerable to Speculative
  Return Stack Overflow if appropriate microcode has been loaded, but for
  performance reasons dom0 is unprotected by default. If it is necessary to
  protect dom0 too, boot with `spec-ctrl=ibpb-entry`.
* `bhb-entry=` offers control over whether BHB-clearing (Branch History
  Buffer) sequences are used on entry to Xen. This is used by default on
  hardware vulnerable to Branch History Injection, when the BHI_DIS_S control
  is not available (see `bhi-dis-s`). The choice of scrubbing sequence can be
  selected using the `bhb-seq=` option. If it is necessary to protect dom0
  too, boot with `spec-ctrl=bhb-entry`.

If Xen was compiled with `CONFIG_INDIRECT_THUNK` support, `bti-thunk=` can be
used to select which of the thunks gets patched into the
`__x86_indirect_thunk_%reg` locations.
The default thunk is `retpoline`
(generally preferred), with the alternatives being `jmp` (a `jmp *%reg` gadget,
minimal overhead), and `lfence` (an `lfence; jmp *%reg` gadget).

On all hardware, `bhb-seq=` can be used to select which of the BHB-clearing
sequences gets used. This interacts with the `bhb-entry=` and `bhi-dis-s=`
options in order to mitigate Branch History Injection on affected hardware.
The default sequence is `short`, with `tsx` as an alternative available on
capable hardware, and `long` available as an opt-in.

On hardware supporting IBRS (Indirect Branch Restricted Speculation), the
`ibrs=` option can be used to force or prevent Xen using the feature itself.
If Xen is not using IBRS itself, functionality is still set up so IBRS can be
virtualised for guests.

On hardware supporting STIBP (Single Thread Indirect Branch Predictors), the
`stibp=` option can be used to force or prevent Xen using the feature itself.
By default, Xen will use STIBP when IBRS is in use (IBRS implies STIBP), and
when hardware hints recommend using it as a blanket setting.

On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
option can be used to force or prevent Xen using the feature itself. The
feature is virtualised for guests, independently of Xen's choice of setting.
On AMD hardware, disabling Xen SSBD usage on the command line (`ssbd=0`, which
is the default value) can lead to Xen running with the guest SSBD selection,
depending on hardware support. On the same hardware, setting `ssbd=1` will
result in SSBD always being enabled, regardless of guest choice.

On hardware supporting PSFD (Predictive Store Forwarding Disable), the `psfd=`
option can be used to force or prevent Xen using the feature itself. By
default, Xen will not use PSFD. PSFD is implied by SSBD, and SSBD is off by
default.

On hardware supporting BHI_DIS_S (Branch History Injection Disable
Supervisor), the `bhi-dis-s=` option can be used to force or prevent Xen using
the feature itself. By default Xen will use BHI_DIS_S on hardware susceptible
to Branch History Injection.

On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
option can be used to force (the default) or prevent Xen from issuing branch
prediction barriers on vcpu context switches.

On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
from using fully eager FPU context switches. This is currently implemented as
a global control. By default, Xen will choose to use fully eager context
switches on hardware believed to speculate past #NM exceptions.

On hardware supporting L1D_FLUSH, the `l1d-flush=` option can be used to force
or prevent Xen from issuing an L1 data cache flush on each VMEntry.
Irrespective of Xen's setting, the feature is virtualised for HVM guests to
use. By default, Xen will enable this mitigation on hardware believed to be
vulnerable to L1TF.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_BRANCH`, the
`branch-harden=` boolean can be used to force or prevent Xen from using
speculation barriers to protect selected conditional branches. By default,
Xen will enable this mitigation.

On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
or prevent Xen from protecting the Special Register Buffer from leaking stale
data. By default, Xen will enable this mitigation, except on parts where MDS
is fixed and TAA is fixed/mitigated and there are no unprivileged MMIO
mappings (in which case, there is believed to be no way for an attacker to
obtain stale data).

The `unpriv-mmio=` boolean indicates whether the system has (or will have)
less than fully privileged domains granted access to MMIO devices.
By default, this option is disabled. If enabled, Xen will use the `FB_CLEAR`
and/or `SRBDS_CTRL` functionality available in the Intel May 2022 microcode
release to mitigate cross-domain leakage of data via the MMIO Stale Data
vulnerabilities.

On all hardware, the `gds-mit=` option can be used to force or prevent Xen
from mitigating the GDS (Gather Data Sampling) vulnerability. By default, Xen
will mitigate GDS on hardware believed to be vulnerable. On hardware
supporting GDS_CTRL (requires the August 2023 microcode), and where firmware
has elected not to lock the configuration, Xen will use GDS_CTRL to mitigate
GDS. Otherwise, Xen will mitigate by disabling AVX, which blocks the use
of the AVX2 Gather instructions.

On all hardware, the `div-scrub=` option can be used to force or prevent Xen
from mitigating the DIV-leakage vulnerability. By default, Xen will mitigate
DIV-leakage on hardware believed to be vulnerable.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_LOCK`, the `lock-harden=`
boolean can be used to force or prevent Xen from using speculation barriers to
protect lock critical regions. This mitigation won't be engaged by default,
and needs to be explicitly enabled on the command line.

On hardware supporting SRSO_MSR_FIX, the `bp-spec-reduce=` option can be used
to force or prevent Xen from using MSR_BP_CFG.BP_SPEC_REDUCE to mitigate the
SRSO (Speculative Return Stack Overflow) vulnerability. Xen will use
bp-spec-reduce when available, as it is preferable to using `ibpb-entry=hvm`
to mitigate SRSO for HVM guests, and because it is a prerequisite to advertise
SRSO_U/S_NO to PV guests.

On Sapphire and Emerald Rapids CPUs with May 2025 microcode or later, the
`ibpb-alt=` option can be used to switch to the alternative mitigation for
Intel SA-00982. Intel suggests that some workloads will benefit from this.
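
The way a single `msr-sc`, `rsb`, `verw`, `ibpb-entry` or `bhb-entry` entry
maps onto its PV and HVM sub-options can be sketched in Python. This is an
illustrative model only: `parse_bool` and `parse_entry` are hypothetical names,
the boolean words and the `no-` prefix follow the "Types of parameter" section,
and none of Xen's defaults or hardware heuristics are modelled.

```python
# Illustrative model only; names are hypothetical and this is not Xen's parser.
TRUE_WORDS = {"yes", "on", "true", "enable", "1"}
FALSE_WORDS = {"no", "off", "false", "disable", "0"}
GUEST_TYPES = ("pv", "hvm")
ENTRY_OPTIONS = {"msr-sc", "rsb", "verw", "ibpb-entry", "bhb-entry"}


def parse_bool(text):
    """Map one of the documented boolean words to True or False."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    raise ValueError(f"not a boolean: {text!r}")


def parse_entry(token):
    """Return (option, {guest_type: enabled}) for one spec-ctrl list entry."""
    name, _, value = token.partition("=")
    negated = name.startswith("no-")
    if negated:
        name = name[3:]
    if name not in ENTRY_OPTIONS:
        raise ValueError(f"not a pv/hvm-capable option: {name!r}")
    if not value:
        # Bare name (or no- prefix): controls both PV and HVM sub-options.
        return name, {guest: not negated for guest in GUEST_TYPES}
    if negated:
        # Stacking a no- prefix with an explicit value is documented as undefined.
        raise ValueError("no- prefix combined with an explicit value")
    sub, _, sub_value = value.partition("=")
    sub_negated = sub.startswith("no-")
    guest = sub[3:] if sub_negated else sub
    if guest in GUEST_TYPES:
        # Sub-suboption form: pv, hvm, no-pv, no-hvm, pv=<bool>, hvm=<bool>.
        if sub_value:
            if sub_negated:
                raise ValueError("no- prefix combined with an explicit value")
            return name, {guest: parse_bool(sub_value)}
        return name, {guest: not sub_negated}
    # Plain boolean form: applies to both guest types.
    state = parse_bool(value)
    return name, {guest_type: state for guest_type in GUEST_TYPES}
```

With this model, `parse_entry("rsb=no-hvm")` changes only the HVM setting,
while `parse_entry("rsb")` touches both, matching the two examples in the text.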

### sync_console
> `= <boolean>`

> Default: `false`

Flag to force synchronous console output. Useful for debugging, but
not suitable for production environments due to incurred overhead.

### tboot (x86)
> `= 0x<phys_addr>`

Specify the physical address of the trusted boot shared page.

### tbuf_size
> `= <integer>`

Specify the per-cpu trace buffer size in pages.

### tdt (x86)
> `= <boolean>`

> Default: `true`

Flag to enable TSC deadline as the APIC timer mode.

### tee (arm)
> `= <string>`

Specify the TEE mediator to be probed and used.

The default behaviour is to probe all TEEs supported by Xen and use
the first one successfully probed. When this parameter is passed, Xen will
probe only the TEE mediator passed as argument and boot will fail if this
mediator is not properly probed or if the requested TEE is not supported by
Xen.

This parameter can be set to `optee` or `ffa` if the corresponding mediators
are compiled in.

### tevt_mask
> `= <integer>`

Specify a mask for Xen event tracing. This allows Xen tracing to be
enabled at boot. Refer to the xentrace(8) documentation for a list of
valid event mask values. In order to enable tracing, a buffer size (in
pages) must also be specified via the tbuf_size parameter.

### tickle_one_idle_cpu
> `= <boolean>`

### timer_slop
> `= <integer>`

### tsc (x86)
> `= unstable | skewed | stable:socket`

### tsx
    = <bool>

    Applicability: x86 with CONFIG_INTEL active
    Default: false on parts vulnerable to TAA, true otherwise

Controls for the use of Transactional Synchronization eXtensions.

Several microcode updates are relevant:

 * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
   to date.
   Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts. The
   errata workaround uses Performance Counter 3, so the user can select
   between working TSX and working perfcounters.

 * November 2019, fixing the TSX Async Abort speculative vulnerability.
   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
   and formally retiring HLE from the architecture. The user can disable TSX
   to mitigate TAA, and elect to hide the HLE/RTM CPUID bits. Also causes
   VERW to once again flush the microarchitectural buffers in case a TAA
   mitigation is wanted along with TSX being enabled.

 * June 2021, removing the workaround for March 2019 on client CPUs and
   formally de-featuring TSX on SKL/KBL/WHL/CFL (Note: SKX still retains the
   March 2019 fix). Introduced the ability to hide the HLE/RTM CPUID bits.
   PCR3 works fine, and TSX is disabled by default, but the user can re-enable
   TSX at their own risk, accepting that the memory order erratum is unfixed.

 * February 2022, removing the VERW flushing workaround from November 2019 on
   client CPUs and formally de-featuring TSX on WHL-R/CFL-R (Note: CLX still
   retains the VERW flushing workaround). TSX defaults to disabled, and is
   locked off when SGX is enabled in the BIOS. When SGX is not enabled, TSX
   can be re-enabled at the user's own risk, as it reintroduces the TSX Async
   Abort speculative vulnerability.

On systems with the ability to configure TSX, this boolean offers system wide
control of whether TSX is enabled or disabled.

When TSX is disabled, transactions unconditionally abort. This is compatible
with the TSX spec, which requires software to have a non-transactional path as
a fallback. The RTM and HLE CPUID bits are hidden from VMs by default, but
can be re-enabled if required.
This allows VMs which previously saw RTM/HLE
to be migrated in, although any TSX-enabled software will run with reduced
performance.

 * When TSX is locked off by firmware, `tsx=` is ignored and treated as
   `false`.

 * An explicit `tsx=` choice is honoured, even if it is `true` and would
   result in a vulnerable system.

 * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
   mitigated by disabling TSX, as this is the lowest overhead option.

 * When no explicit `tsx=` option is given, parts susceptible to the memory
   ordering errata default to `true` to enable working TSX. Alternatively,
   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.

   SKX and SKL/KBL/WHL/CFL on pre-June 2021 microcode default to `true`.
   Alternatively, selecting `tsx=0` will disable TSX and restore PCR3 to a
   working state.

   SKL/KBL/WHL/CFL on the June 2021 microcode or later default to `false`.
   Alternatively, selecting `tsx=1` will re-enable TSX at the user's own risk.

### ucode
> `= List of [ <integer> | scan=<bool>, nmi=<bool>, digest-check=<bool> ]`

    Applicability: x86
    Default: `scan` is selectable via Kconfig, `nmi,digest-check`

Controls for CPU microcode loading. For early loading, this parameter can
specify how and where to find the microcode update blob. For late loading,
this parameter specifies whether the update happens within an NMI handler.

'integer' specifies the CPU microcode update blob module index. When positive,
this specifies the n-th module (in the GRUB entry, zero based) to be used
for updating CPU microcode. When negative, counting starts at the end of
the modules in the GRUB entry (so with the blob commonly being last,
one could specify `ucode=-1`). Note that the value of zero is not valid
here (entry zero, i.e. the first module, is always the Dom0 kernel
image).
Note further that use of this option has an unspecified effect
when used with xen.efi (there the concept of modules doesn't exist, and
the blob gets specified via the `ucode=<filename>` config file/section
entry; see [EFI configuration file description](efi.html)).

'scan' instructs the hypervisor to scan the multiboot images for a cpio
image that contains microcode. Depending on the platform the blob with the
microcode in the cpio name space must be:

  - on Intel: kernel/x86/microcode/GenuineIntel.bin
  - on AMD  : kernel/x86/microcode/AuthenticAMD.bin

When using xen.efi, the `ucode=<filename>` config file setting takes
precedence over `scan`. The default value for `scan` is set with
`CONFIG_UCODE_SCAN_DEFAULT`.

'nmi' determines whether late loading is performed in the NMI handler or just
in stop_machine context. In the NMI handler, even NMIs are blocked, which is
considered safer. The default value is `true`.

The `digest-check=` option is active by default and controls whether to
perform additional authenticity checks. Collisions in the signature algorithm
used by AMD Fam17h/19h processors have been found. Xen contains a table of
digests of microcode patches with known-good provenance, and will block
loading of patches that do not match.

### unrestricted_guest (Intel)
> `= <boolean>`

### vcpu_migration_delay
> `= <integer>`

> Default: `0`

Specify a delay, in microseconds, between migrations of a VCPU between
PCPUs when using the credit1 scheduler. This prevents rapid fluttering
of a VCPU between CPUs, and reduces the implicit overheads such as
cache-warming. 1ms (1000) has been measured as a good value.

### vesa-ram
> `= <integer>`

> Default: `0`

This allows overriding the amount of video RAM, in MiB, determined to be
present.
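
For the `ucode=<integer>` form described earlier, the module numbering can be
pictured with Python list indexing, which happens to follow the same
convention (zero based from the front, negative values counting from the
end). This is only a sketch: `select_ucode_module` is a hypothetical helper
and the module names are made up; it does not reflect how Xen walks the
multiboot module list.

```python
def select_ucode_module(modules, index):
    """Pick the microcode blob from the boot modules for ucode=<integer>."""
    if index == 0:
        # Module zero is always the Dom0 kernel image, so zero is not valid.
        raise ValueError("ucode=0 is not a valid module index")
    if not -len(modules) <= index < len(modules):
        raise ValueError(f"no module at index {index}")
    return modules[index]


# Hypothetical boot entry: Dom0 kernel, initrd, then the microcode blob.
modules = ["vmlinuz", "initrd.img", "ucode.bin"]
```

With the blob listed last, both `ucode=-1` and `ucode=2` select it in this
model, while index zero is rejected.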

### vga
> `= ( ask | current | text-80x<rows> | gfx-<width>x<height>x<depth> | mode-<mode> )[,keep]`

`ask` causes Xen to display a menu of available modes and request the
user to choose one of them.

`current` causes Xen to use the graphics adapter in its current state,
without further setup.

`text-80x<rows>` instructs Xen to set up text mode. Valid values for
`<rows>` are `25, 28, 30, 34, 43, 50, 80`.

`gfx-<width>x<height>x<depth>` instructs Xen to set up graphics mode
with the specified width, height and depth.

`mode-<mode>` instructs Xen to use a specific mode, as shown with the
`ask` option. (N.B. menu modes are displayed in hex, so `<mode>`
should be a hexadecimal number.)

The optional `keep` parameter causes Xen to continue using the vga
console even after dom0 has been started. The default behaviour is to
relinquish control to dom0.

### viridian-spinlock-retry-count (x86)
> `= <integer>`

> Default: `2047`

Specify the maximum number of retries before an enlightened Windows
guest will notify Xen that it has failed to acquire a spinlock.

### viridian-version (x86)
> `= [<major>],[<minor>],[<build>]`

> Default: `6,0,0x1772`

`<major>`, `<minor>` and `<build>` must be integers. The values will be
encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.

### vm-notify-window (Intel)
> `= <integer>`

> Default: `0`

Specify the value of the VM Notify window used to detect locked VMs. Set to -1
to disable the feature. Value is in units of crystal clock cycles.

Note the hardware might add a threshold to the provided value in order to make
it safe, and hence using 0 is fine.

### vpid (Intel)
> `= <boolean>`

> Default: `true`

Use Virtual Processor ID support if available.
This prevents the need for TLB
flushes on VM entry and exit, increasing performance.

### vpmu (x86)
    = List of [ <bool>, bts, ipc, arch, rtm-abort=<bool> ]

    Applicability: x86.  Default: false

Controls for Performance Monitoring Unit virtualisation.

Performance monitoring facilities tend to be very hardware specific, and
provide access to a wealth of low level processor information.

*   An overall boolean can be used to enable or disable vPMU support. vPMU is
    disabled by default.

    When enabled, guests have full access to all performance counter settings,
    including model specific functionality. This is a superset of the
    functionality offered by `ipc` and/or `arch`, but a subset of the
    functionality offered by `bts`.

    Xen's watchdog functionality is implemented using performance counters.
    As a result, use of the **watchdog** option will override and disable
    vPMU.

*   The `bts` option enables performance monitoring, and permits additional
    access to the Branch Trace Store controls. BTS is an Intel feature where
    the processor can write data into a buffer whenever a branch occurs.
    However, as this feature isn't virtualised, a misconfiguration by the
    guest can lock the entire system up.

*   The `ipc` option allows access to the most minimal set of counters
    possible: instructions, cycles, and reference cycles. These can be used
    to calculate instructions per cycle (IPC).

*   The `arch` option allows access to the pre-defined architectural events.

*   The `rtm-abort` boolean has been superseded. Use `tsx=0` instead.

*Warning:*
As the virtualisation is not 100% safe, don't use the vpmu flag on
production systems (see https://xenbits.xen.org/xsa/advisory-163.html)!

### vwfi (arm)
> `= trap | native`

> Default: `trap`

WFI is the ARM instruction to "wait for interrupt".
WFE is similar and
means "wait for event". This option, which is ARM specific, changes the
way guest WFI and WFE are implemented in Xen. By default, Xen traps both
instructions. In the case of WFI, Xen blocks the guest vcpu; in the case
of WFE, Xen yields the guest vcpu. When setting vwfi to `native`, Xen
doesn't trap either instruction, running them in guest context. Setting
vwfi to `native` reduces irq latency significantly. It can also lead to
suboptimal scheduling decisions, but only when the system is
oversubscribed (i.e., in total there are more vCPUs than pCPUs).

### wallclock (x86)
> `= auto | xen | cmos | efi`

> Default: `auto`

Allow forcing the usage of a specific wallclock source.

 * `auto` lets the hypervisor select the clocksource based on internal
   heuristics.

 * `xen` forces usage of the Xen shared_info wallclock when booted as a Xen
   guest. This option is only available if the hypervisor was compiled with
   `CONFIG_XEN_GUEST` enabled.

 * `cmos` forces usage of the CMOS RTC wallclock.

 * `efi` forces usage of the EFI_GET_TIME run-time method when booted from EFI
   firmware.

If the selected option is invalid or not available Xen will default to `auto`.

### watchdog (x86)
> `= force | <boolean>`

> Default: `false`

Run an NMI watchdog on each processor. If a processor is stuck for
longer than the **watchdog_timeout**, a panic occurs. When `force` is
specified, in addition to running an NMI watchdog on each processor,
unknown NMIs will still be processed.

### watchdog_timeout (x86)
> `= <integer>`

> Default: `5`

Set the NMI watchdog timeout in seconds. Specifying `0` will turn off
the watchdog.

### x2apic (x86)
> `= <boolean>`

> Default: `true`

Permit use of x2apic setup for SMP environments.

### x2apic-mode (x86)
> `= physical | mixed`

> Default: `physical` if **FADT** mandates physical mode, otherwise set at
>          build time by CONFIG_X2APIC_{PHYSICAL,MIXED}.

In the case that x2apic is in use, this option switches between modes to
address APICs in the system as interrupt destinations.

### x2apic_phys (x86)
> `= <boolean>`

> Default: `true` if **FADT** mandates physical mode or if interrupt remapping
>          is not available, `false` otherwise.

In the case that x2apic is in use, this option switches between physical and
clustered mode. The default, given no hint from the **FADT**, is cluster
mode.

**WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
The latter takes precedence if both are set.**

### xen-llc-colors (arm64)
> `= List of [ <integer> | <integer>-<integer> ]`

> Default: `0: the lowermost color`

Specify Xen LLC color configuration. This option is available only when
`CONFIG_LLC_COLORING` is enabled.
Two colors are most likely needed on platforms where private caches are
physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.

### xenheap_megabytes (arm32)
> `= <size>`

> Default: `0` (1/32 of RAM)

Amount of RAM to set aside for the Xenheap. Must be an integer multiple of 32.

By default, Xen will use 1/32 of the RAM up to a maximum of 1GB and with a
minimum of 32M, subject to a suitably aligned and sized contiguous
region of memory being available.

### xpti (x86)
> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`

> Default: `false` on hardware known not to be vulnerable to Meltdown (e.g. AMD)
> Default: `true` everywhere else

Override default selection of whether to isolate 64-bit PV guest page
tables.

`true` activates page table isolation for all domains, even on hardware not
vulnerable to Meltdown.

`false` deactivates page table isolation on all systems for all domains.

`default` sets the default behaviour.

With `dom0` and `domu` it is possible to control page table isolation
for dom0 or guest domains only.

### xsave (x86)
> `= <boolean>`

> Default: `true`

Permit use of the `xsave/xrstor` instructions.

### xsm
> `= dummy | flask | silo`

> Default: selectable via Kconfig. Depends on enabled XSM modules.

Specify which XSM module should be enabled. This option is only available if
the hypervisor was compiled with `CONFIG_XSM` enabled.

* `dummy`: this is the default choice. Basic restriction for common deployment
  (the dummy module) will be applied. It's also used when XSM is compiled out.
* `flask`: this is the policy based access control. To choose this, the
  separate option in Kconfig must also be enabled.
* `silo`: this will deny any unmediated communication channels between
  unprivileged VMs. To choose this, the separate option in Kconfig must also
  be enabled.
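
As a rough illustration of how the `xpti` list forms described earlier
combine, the following Python sketch resolves an option string into per-domain
settings. The `resolve_xpti` helper is hypothetical. It assumes that list
entries are applied left to right, and that `default` means "enabled only
when the hardware is believed vulnerable to Meltdown", as the documented
defaults suggest. Interactions with `spec-ctrl` are not modelled.

```python
# Illustrative model only; not Xen's implementation.
TRUE_WORDS = {"yes", "on", "true", "enable", "1"}
FALSE_WORDS = {"no", "off", "false", "disable", "0"}


def parse_bool(text):
    """Map one of the documented boolean words to True or False."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    raise ValueError(f"not a boolean: {text!r}")


def resolve_xpti(value, vulnerable):
    """Return {'dom0': bool, 'domu': bool} for an xpti= list.

    'vulnerable' states whether the hardware is believed vulnerable to
    Meltdown (an input to this sketch, not something it detects).
    """
    state = {"dom0": vulnerable, "domu": vulnerable}
    for item in value.split(","):
        key, sep, arg = item.partition("=")
        if sep:
            if key not in state:
                raise ValueError(f"unknown xpti key: {key!r}")
            # dom0=<bool> or domu=<bool> changes one domain type only.
            state[key] = parse_bool(arg)
        elif key == "default":
            state = {"dom0": vulnerable, "domu": vulnerable}
        else:
            # A plain boolean applies to all domains.
            enabled = parse_bool(key)
            state = {"dom0": enabled, "domu": enabled}
    return state
```

For example, under these assumptions `xpti=dom0=0` on vulnerable hardware
leaves isolation enabled for guest domains and disabled for dom0.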