1# Xen Hypervisor Command Line Options
2
This document covers the command line options which the Xen
Hypervisor accepts.
5
6## Types of parameter
7
8Most parameters take the form `option=value`.  Different options on
9the command line should be space delimited.  All options are case
10sensitive, as are all values unless explicitly noted.
11
12### Boolean (`<boolean>`)
13
All boolean options may be explicitly enabled using a `value` of
15> `yes`, `on`, `true`, `enable` or `1`
16
17They may be explicitly disabled using a `value` of
18> `no`, `off`, `false`, `disable` or `0`
19
20In addition, a boolean option may be enabled by simply stating its
21name, and may be disabled by prefixing its name with `no-`.
22
#### Examples
24
25Enable noreboot mode
26> `noreboot=true`
27
28Disable x2apic support (if present)
29> `x2apic=off`
30
31Enable synchronous console mode
32> `sync_console`
33
34Explicitly specifying any value other than those listed above is
35undefined, as is stacking a `no-` prefix with an explicit value.
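
The rules above can be sketched as a small parser. This is only an illustration of the documented behaviour, not Xen's implementation; the function name `parse_boolean` and its return convention are hypothetical.

```python
# Illustrative sketch only; the names below are hypothetical, not Xen's parser.
TRUE_VALUES = {"yes", "on", "true", "enable", "1"}
FALSE_VALUES = {"no", "off", "false", "disable", "0"}


def parse_boolean(token, name):
    """Interpret one command line token for the boolean option `name`.

    Returns True or False, or None if the token addresses another option.
    Raises ValueError for the forms described above as undefined.
    """
    key, has_value, value = token.partition("=")
    negated = key.startswith("no-")
    if negated:
        key = key[len("no-"):]
    if key != name:
        return None
    if not has_value:
        # A bare name enables the option, a `no-` prefix disables it.
        return not negated
    if negated:
        raise ValueError("'no-' prefix stacked with an explicit value")
    # Values are case sensitive, so only exact matches are accepted.
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"unrecognised boolean value: {value!r}")
```

Rejecting the undefined forms with an exception is a choice made for this sketch; the text above only says that their behaviour is undefined.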
36
37### Integer (`<integer>`)
38
39An integer parameter will default to decimal and may be prefixed with
40a `-` for negative numbers.  Alternatively, a hexadecimal number may be
41used by prefixing the number with `0x`, or an octal number may be used
42if a leading `0` is present.
43
44Providing a string which does not validly convert to an integer is
45undefined.
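
As a rough illustration of the prefixes described above, the following sketch converts a string to an integer. The helper name `parse_integer` is hypothetical, and raising an error on invalid input is an assumption of the sketch, since the behaviour is documented as undefined.

```python
import string

# Digits permitted for each base (illustrative helper table).
_DIGITS = {8: "01234567", 10: string.digits, 16: string.hexdigits}


def parse_integer(text):
    """Convert `text` using the decimal / `0x` / leading-`0` rules above."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if digits.startswith("0x"):
        base, digits = 16, digits[2:]
    elif len(digits) > 1 and digits.startswith("0"):
        base, digits = 8, digits[1:]
    else:
        base = 10
    if not digits or any(c not in _DIGITS[base] for c in digits):
        raise ValueError(f"not a valid integer: {text!r}")
    value = int(digits, base)
    return -value if negative else value
```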
46
47### Size (`<size>`)
48
49A size parameter may be any integer, with a single size suffix
50
51* `T` or `t`: TiB (2^40)
52* `G` or `g`: GiB (2^30)
53* `M` or `m`: MiB (2^20)
54* `K` or `k`: KiB (2^10)
55* `B` or `b`: Bytes
56
57Without a size suffix, the default will be kilo.  Providing a suffix
58other than those listed above is undefined.
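
The suffix handling can be sketched as follows. The function name `parse_size` is hypothetical, and the sketch deliberately accepts only non-negative decimal numbers so that a trailing `b` cannot be confused with a hexadecimal digit.

```python
# Shift applied for each (lower-cased) suffix; illustrative only.
_SHIFT = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}


def parse_size(text):
    """Return the size in bytes for a decimal number with an optional suffix."""
    suffix = text[-1:].lower()
    if suffix in _SHIFT:
        number, shift = text[:-1], _SHIFT[suffix]
    else:
        # Without a suffix the value is taken to be in KiB.
        number, shift = text, 10
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"not a valid size: {text!r}")
    return int(number) << shift
```

Under these assumptions, `parse_size("128M")` yields 128 MiB in bytes, while a bare `512` is read as 512 KiB.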
59
60### String
61
Many parameters are more complicated and require more intricate
configuration.  The detailed description of each individual parameter
specifies which values are valid.
65
66### List
67
68Some options take a comma separated list of values.
69
70### Combination
71
72Some parameters act as combinations of the above, most commonly a mix
73of Boolean and String.  These are noted in the relevant sections.
74
75## Parameter details
76
77### acpi
78> `= force | ht | noirq | <boolean> | verbose`
79
80**String**, or **Boolean** to disable.
81
82By default, Xen will scan the DMI data and blacklist certain systems
83which are known to have broken ACPI setups.  Providing `acpi=force`
84will cause Xen to ignore the blacklist and attempt to use all ACPI
85features.
86
87Using `acpi=ht` causes Xen to parse the ACPI tables enough to
88enumerate all CPUs, but will not use other ACPI features.  This is not
89common, and only has an effect if your system is blacklisted.
90
91The `acpi=noirq` option causes Xen to not parse the ACPI MADT table
92looking for IO-APIC entries.  This is also not common, and any system
93which requires this option to function should be blacklisted.
94Additionally, this will not prevent Xen from finding IO-APIC entries
95from the MP tables.
96
97Further, any of the boolean false options can be used to disable ACPI
98usage entirely.
99
100Because responsibility for ACPI processing is shared between Xen and
101the domain 0 kernel this option is automatically propagated to the
102domain 0 command line.
103
Finally, `acpi=verbose` will enable per-processor information logging,
which may otherwise be too noisy, in particular on large systems.
106
107### acpi_apic_instance
108> `= <integer>`
109
110Specify which ACPI MADT table to parse for APIC information, if more
111than one is present.
112
113### acpi_pstate_strict (x86)
114> `= <boolean>`
115
116> Default: `false`
117
Enforce checking that P-state transitions by the ACPI cpufreq driver
actually result in the nominated frequency being established. A warning
message will be logged if that isn't the case.
121
122### acpi_skip_timer_override (x86)
123> `= <boolean>`
124
125Instruct Xen to ignore timer-interrupt override.
126
127### acpi_sleep (x86)
128> `= s3_bios | s3_mode`
129
130`s3_bios` instructs Xen to invoke video BIOS initialization during S3
131resume.
132
133`s3_mode` instructs Xen to set up the boot time (option `vga=`) video
134mode during S3 resume.
135
136### allow_unsafe (x86)
137> `= <boolean>`
138
139> Default: `false`
140
141Force boot on potentially unsafe systems. By default Xen will refuse
142to boot on systems with the following errata:
143
144* AMD Erratum 121. Processors with this erratum are subject to a guest
145  triggerable Denial of Service. Override only if you trust all of
146  your PV guests.
147
148### altp2m (Intel)
149> `= <boolean>`
150
151> Default: `false`
152
153Permit multiple copies of host p2m.
154
155### apic (x86)
156> `= bigsmp | default`
157
158Override Xen's logic for choosing the APIC driver.  By default, if
159there are more than 8 CPUs, Xen will switch to `bigsmp` over
160`default`.
161
162### apicv (Intel)
163> `= <boolean>`
164
165> Default: `true`
166
167Permit Xen to use APIC Virtualisation Extensions.  This is an optimisation
available as part of VT-x, and allows hardware to take care of the guest's APIC
169handling, rather than requiring emulation in Xen.
170
171### apic_verbosity (x86)
172> `= verbose | debug`
173
174Increase the verbosity of the APIC code from the default value.
175
176### arat (x86)
177> `= <boolean>`
178
179> Default: `true`
180
181Permit Xen to use "Always Running APIC Timer" support on compatible hardware
182in combination with cpuidle.  This option is only expected to be useful for
183developers wishing Xen to fall back to older timing methods on newer hardware.
184
185### argo
186    = List of [ <bool>, mac-permissive=<bool> ]
187
188Controls for the Argo hypervisor-mediated interdomain communication service.
189
190The functionality that this option controls is only available when Xen has been
191compiled with the build setting for Argo enabled in the build configuration.
192
Argo is an interdomain communication mechanism, where Xen acts as the central
point of authority.  Guests may register memory rings to receive messages,
195query the status of other domains, and send messages by hypercall, all subject
196to appropriate auditing by Xen.  Argo is disabled by default.
197
198*   The `mac-permissive` boolean controls whether wildcard receive rings may be
199    registered (`mac-permissive=1`) or may not be registered
200    (`mac-permissive=0`).
201
202    This option is disabled by default, to protect domains from a DoS by a
203    buggy or malicious other domain spamming the ring.
204
205### asid (x86)
206> `= <boolean>`
207
208> Default: `true`
209
210Permit Xen to use Address Space Identifiers.  This is an optimisation which
211tags the TLB entries with an ID per vcpu.  This allows for guest TLB flushes
212to be performed without the overhead of a complete TLB flush.
213
214### async-show-all (x86)
215> `= <boolean>`
216
217> Default: `false`
218
219Forces all CPUs' full state to be logged upon certain fatal asynchronous
220exceptions (watchdog NMIs and unexpected MCEs).
221
222### ats (x86)
223> `= <boolean>`
224
225> Default: `false`
226
227Permits Xen to set up and use PCI Address Translation Services.  This is a
228performance optimisation for PCI Passthrough.
229
230**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
231loops for Queued Invalidation completions.**
232
233### availmem
234> `= <size>`
235
236> Default: `0` (no limit)
237
238Specify a maximum amount of available memory, to which Xen will clamp
239the e820 table.
240
241### badpage
242> `= List of [ <integer> | <integer>-<integer> ]`
243
244Specify that certain pages, or certain ranges of pages contain bad
245bytes and should not be used.  For example, if your memory tester says
246that byte `0x12345678` is bad, you would place `badpage=0x12345` on
247Xen's command line.
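
The conversion in the example above is a right shift of the byte address by the page shift. The sketch below assumes 4 KiB pages (a shift of 12), which is consistent with the example; the helper name `badpage_option` is hypothetical.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, matching the example above


def badpage_option(bad_byte_addresses):
    """Build a `badpage=` option from byte addresses reported as bad."""
    # Several bad bytes may fall into the same page, so deduplicate frames.
    frames = sorted({addr >> PAGE_SHIFT for addr in bad_byte_addresses})
    return "badpage=" + ",".join(hex(frame) for frame in frames)
```

For instance, `badpage_option([0x12345678])` produces the `badpage=0x12345` string used above.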
248
249### bootscrub
250> `= idle | <boolean>`
251
252> Default: `idle`
253
254Scrub free RAM during boot.  This is a safety feature to prevent
255accidentally leaking sensitive VM data into other VMs if Xen crashes
256and reboots.
257
258In `idle` mode, RAM is scrubbed in background on all CPUs during idle-loop
259with a guarantee that memory allocations always provide scrubbed pages.
260This option reduces boot time on machines with a large amount of RAM while
261still providing security benefits.
262
263### bootscrub_chunk
264> `= <size>`
265
266> Default: `128M`
267
268Maximum RAM block size chunks to be scrubbed whilst holding the page heap lock
269and not running softirqs. Reduce this if softirqs are not being run frequently
270enough. Setting this to a high value may cause boot failure, particularly if
271the NMI watchdog is also enabled.
272
273### buddy-alloc-size (arm64)
274> `= <size>`
275
276> Default: `64M`
277
Amount of memory reserved for the buddy allocator when the colored allocator is
active. This option is available only when `CONFIG_LLC_COLORING` is enabled.
The colored allocator is meant as an alternative to the buddy allocator,
because its allocation policy is by definition incompatible with the generic
one. Since the Xen heap is not colored yet, the two allocators need to
coexist for now. This parameter is optional and intended for experts only.
286
287### cet
288    = List of [ <bool>, shstk=<bool>, ibt=<bool> ]
289
290    Applicability: x86
291
Controls for the use of Control-flow Enforcement Technology.  CET is a group
of hardware features designed to combat Return-oriented Programming (ROP, also
294call/jmp COP/JOP) attacks.
295
296CET is incompatible with 32bit PV guests.  If any CET sub-options are active,
297they will override the `pv=32` boolean to `false`.  Backwards compatibility
298can be maintained with the pv-shim mechanism.
299
300*   An unqualified boolean is a shorthand for setting all suboptions at once.
301
302*   The `shstk=` boolean controls whether Xen uses Shadow Stacks for its own
303    protection.
304
305    The option is available when `CONFIG_XEN_SHSTK` is compiled in, and
306    generally defaults to `true` on hardware supporting CET-SS.  Specifying
307    `cet=no-shstk` will cause Xen not to use Shadow Stacks even when support
308    is available in hardware.
309
310    Some hardware suffers from an issue known as Supervisor Shadow Stack
311    Fracturing.  On such hardware, Xen will default to not using Shadow Stacks
312    when virtualised.  Specifying `cet=shstk` will override this heuristic and
313    enable Shadow Stacks unilaterally.
314
315*   The `ibt=` boolean controls whether Xen uses Indirect Branch Tracking for
316    its own protection.
317
318    The option is available when `CONFIG_XEN_IBT` is compiled in, and defaults
319    to `true` on hardware supporting CET-IBT.  Specifying `cet=no-ibt` will
320    cause Xen not to use Indirect Branch Tracking even when support is
321    available in hardware.
322
323### clocksource (x86)
324> `= pit | hpet | acpi | tsc`
325
326If set, override Xen's default choice for the platform timer.
Using TSC as the platform timer requires it to be selected explicitly. This is because
328TSC can only be safely used if CPU hotplug isn't performed on the system. On
329some platforms, the "maxcpus" option may need to be used to further adjust
330the number of allowed CPUs.  When running on platforms that can guarantee a
331monotonic TSC across sockets you may want to adjust the "tsc" command line
332parameter to "stable:socket".
333
334### cmci-threshold (Intel)
335> `= <integer>`
336
337> Default: `2`
338
339Specify the event count threshold for raising Corrected Machine Check
340Interrupts.  Specifying zero disables CMCI handling.
341
342### cmos-rtc-probe (x86)
343> `= <boolean>`
344
345> Default: `false`
346
347Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
348ACPI indicating none to be there.
349
350### com1 (x86)
351### com2 (x86)
352> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`
353
Both options `com1` and `com2` follow the same format.
355
356* `<baud>` may be either an integer baud rate, or the string `auto` if
357  the bootloader or other earlier firmware has already set it up.
358* Optionally, the base baud rate (usually the highest baud rate the
359  device can communicate at) can be specified.
360* `DPS` represents the number of data bits, the parity, and the number
361  of stop bits.
362  * `D` is an integer between 5 and 8 for the number of data bits.
363  * `P` is a single character representing the type of parity:
364      * `n` No
365      * `o` Odd
366      * `e` Even
367      * `m` Mark
368      * `s` Space
369  * `S` is an integer 1 or 2 for the number of stop bits.
370* `<io-base>` is an integer which specifies the IO base port for UART
371  registers.
372* `<irq>` is the IRQ number to use, or `0` to use the UART in poll
373  mode only, or `msi` to set up a Message Signaled Interrupt.
374* `<port-bdf>` is the PCI location of the UART, in
375  `<bus>:<device>.<function>` notation.
376* `<bridge-bdf>` is the PCI bridge behind which is the UART, in
377  `<bus>:<device>.<function>` notation.
378* `pci` indicates that Xen should scan the PCI bus for the UART,
379  avoiding Intel AMT devices.
* `amt` indicates that Xen should scan the PCI bus for the UART,
  including Intel AMT devices if present.
382
383A typical setup for most situations might be `com1=115200,8n1`
384
385In addition to the above positional specification for UART parameters,
name=value pair specifications are also supported. This is used to add
387flexibility for UART devices which require additional UART parameter
388configurations.
389
390The comma separation still delineates positional parameters. Hence,
391unless the parameter is explicitly specified with name=value option, it
392will be considered a positional parameter.
393
394The syntax consists of
395com1=(comma-separated positional parameters),(comma separated name-value pairs)
396
397The accepted name keywords for name=value pairs are:
398
399* `baud` - accepts integer baud rate (eg. 115200) or `auto`
400* `bridge`- Similar to bridge-bdf in positional parameters.
401            Used to determine the PCI bridge to access the UART device.
402            Notation is xx:xx.x `<bus>:<device>.<function>`
403* `clock-hz`- accepts large integers to setup UART clock frequencies.
404              Do note - these values are multiplied by 16.
405* `data-bits` - integer between 5 and 8
* `dev` - accepted values are `pci` OR `amt`. This option
          is used to specify that the serial device is PCI-based. The `io-base`
          cannot be specified when `dev=pci` or `dev=amt` is used.
* `io-base` - accepts an integer which specifies the IO base port for UART registers
410* `irq` - IRQ number to use
411* `parity` - accepted values are same as positional parameters
412* `port` - Used to specify which port the PCI serial device is located on
413           Notation is xx:xx.x `<bus>:<device>.<function>`
414* `reg-shift` - register shifts required to set UART registers
415* `reg-width` - register width required to set UART registers
416                (only accepts 1 and 4)
417* `stop-bits` - only accepts 1 or 2 for the number of stop bits
418
419The following are examples of correct specifications:
420
421    com1=115200,8n1,0x3f8,4
422    com1=115200,8n1,0x3f8,4,reg-width=4,reg-shift=2
423    com1=baud=115200,parity=n,stop-bits=1,io-base=0x3f8,reg-width=4
424
425### conring_size
426> `= <size>`
427
428> Default: `conring_size=16k`
429
430Specify the size of the console ring buffer.
431
432### console
433> `= List of [ vga | com1[H,L] | com2[H,L] | pv | dbgp | ehci | xhci | none ]`
434
435> Default: `console=com1,vga`
436
437Specify which console(s) Xen should use.
438
439`vga` indicates that Xen should try and use the vga graphics adapter.
440
`com1` and `com2` indicate that Xen should use serial ports 1 and 2
442respectively.  Optionally, these arguments may be followed by an `H` or
443`L`.  `H` indicates that transmitted characters will have their MSB
444set, while received characters must have their MSB set.  `L` indicates
445the converse; transmitted and received characters will have their MSB
446cleared.  This allows a single port to be shared by two subsystems
447(e.g. console and debugger).
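
The `H`/`L` sharing scheme amounts to using the most significant bit of each byte as a channel tag. A minimal sketch, assuming 8-bit characters and using hypothetical helper names:

```python
MSB = 0x80  # most significant bit of an 8-bit character


def tag_tx(byte, mode):
    """Return the byte as transmitted by a console in `H` or `L` mode."""
    return (byte | MSB) if mode == "H" else (byte & ~MSB & 0xFF)


def accept_rx(byte, mode):
    """Return True if a received byte belongs to a console in this mode."""
    msb_set = bool(byte & MSB)
    return msb_set if mode == "H" else not msb_set
```

With this tagging, two subsystems on the same port can each discard the bytes that carry the other one's tag.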
448
449`pv` indicates that Xen should use Xen's PV console. This option is
450only available when used together with `pv-in-pvh`.
451
452`dbgp` or `ehci` indicates that Xen should use a USB2 debug port.
453
454`xhci` indicates that Xen should use a USB3 debug port.
455
456`none` indicates that Xen should not use a console.  This option only
457makes sense on its own.
458
459### console_timestamps
460> `= none | date | datems | boot | raw`
461
462> Default: `none`
463
464> Can be modified at runtime
465
466Specify which timestamp format Xen should use for each console line.
467
468* `none`: No timestamps
469* `date`: Date and time information
470    * `[YYYY-MM-DD HH:MM:SS]`
471* `datems`: Date and time, with milliseconds
472    * `[YYYY-MM-DD HH:MM:SS.mmm]`
473* `boot`: Seconds and microseconds since boot
474    * `[SSSSSS.uuuuuu]`
* `raw`: Raw platform ticks, architecture and implementation dependent
476    * `[XXXXXXXXXXXXXXXX]`
477
478For compatibility with the older boolean parameter, specifying
479`console_timestamps` alone will enable the `date` option.
480
481### console_to_ring
482> `= <boolean>`
483
484> Default: `false`
485
486Flag to indicate whether all guest console output should be copied
487into the console ring buffer.
488
489### conswitch
490> `= <switch char>[x]`
491
492> Default: `conswitch=a`
493
494> Can be modified at runtime
495
496Specify which character should be used to switch serial input between
497Xen and dom0.  The required sequence is CTRL-&lt;switch char&gt; three
498times.
499
500The optional trailing `x` indicates that Xen should not automatically
501switch the console input to dom0 during boot.  Any other value,
502including omission, causes Xen to automatically switch to the dom0
console during dom0 boot.  Use `conswitch=ax` to keep the default switch
character, but have Xen keep the console.
505
506### core_parking
507> `= power | performance`
508
509> Default: `power`
510
511### cpu_type (x86)
512> `= arch_perfmon`
513
514If set, force use of the performance counters for oprofile, rather than detecting
515available support.
516
517### cpufreq
518> `= none | {{ <boolean> | xen } { [:[powersave|performance|ondemand|userspace][,[<maxfreq>]][,[<minfreq>]]] } [,verbose]} | dom0-kernel | hwp[:[<hdc>][,verbose]]`
519
520> Default: `xen`
521
522Indicate where the responsibility for driving power states lies.  Note that the
523choice of `dom0-kernel` is deprecated and not supported by all Dom0 kernels.
524
525* Default governor policy is ondemand.
526* `<maxfreq>` and `<minfreq>` are integers which represent max and min processor frequencies
527  respectively.
528* `verbose` option can be included as a string or also as `verbose=<integer>`
529  for `xen`.  It is a boolean for `hwp`.
530* `hwp` selects Hardware-Controlled Performance States (HWP) on supported Intel
531  hardware.  HWP is a Skylake+ feature which provides better CPU power
532  management.  The default is disabled.  If `hwp` is selected, but hardware
  support is not available, Xen will fall back to `cpufreq=xen`.
534* `<hdc>` is a boolean to enable Hardware Duty Cycling (HDC).  HDC enables the
535  processor to autonomously force physical package components into idle state.
536  The default is enabled, but the option only applies when `hwp` is enabled.
537
538There is also support for `;`-separated fallback options:
539`cpufreq=hwp;xen,verbose`.  This first tries `hwp` and falls back to `xen` if
540unavailable.  Note: The `verbose` suboption is handled globally.  Setting it
541for either the primary or fallback option applies to both irrespective of where
542it is specified.
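
The fallback syntax can be pictured with the sketch below, which splits the value on `;` and treats `verbose` as a global flag, as described above. It is only an illustration; the function name `parse_cpufreq` is hypothetical, and the governor, frequency and `<hdc>` suboptions are not interpreted.

```python
def parse_cpufreq(value):
    """Return (drivers in order of preference, global verbose flag)."""
    drivers = []
    verbose = False
    for alternative in value.split(";"):
        head, *suboptions = alternative.split(",")
        # Drop any ':<suboption>' tail to obtain the driver name.
        drivers.append(head.partition(":")[0])
        # `verbose` or `verbose=<integer>` on any alternative applies to all.
        if any(s.partition("=")[0] == "verbose" for s in suboptions):
            verbose = True
    return drivers, verbose
```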
543
Note: grub2 requires `;` to be escaped or quoted, so `"cpufreq=hwp;xen"` should be
specified within double quotes inside grub.cfg.  Refer to the grub2
documentation for more information.
547
548### cpuid (x86)
549> `= List of comma separated booleans`
550
551This option allows for fine tuning of the facilities Xen will use, after
552accounting for hardware capabilities as enumerated via CPUID.
553
554Unless otherwise noted, options only have any effect in their negative form,
to hide the named feature(s).  Ignoring a feature using this mechanism will
cause Xen not to use the feature, nor offer it as usable to guests.
557
558Currently accepted:
559
560The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
561`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
562applicable.  They can all be ignored.
563
564`rdrand` and `rdseed` have multiple interactions.
565
566*   For Special Register Buffer Data Sampling (SRBDS, XSA-320, CVE-2020-0543),
567    RDRAND and RDSEED can be ignored.
568
569    Due to the absence of microcode to address SRBDS on IvyBridge client
570    hardware, the RDRAND feature is hidden by default for guests, unless
571    `rdrand` is used in its positive form.  Irrespective of the setting here,
572    VMs can use RDRAND if explicitly enabled in guest config file, and VMs
573    already using RDRAND can migrate in.
574
575*   The RDRAND feature is disabled by default on AMD Fam15/16 systems, due to
576    possible malfunctions after ACPI S3 suspend/resume.  `rdrand` may be used
577    in its positive form to override Xen's default behaviour on these systems,
578    and make the feature fully usable.
579
580### cpuid_mask_cpu
581> `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`
582
583> Applicability: AMD
584
585If none of the other **cpuid_mask_\*** options are given, Xen has a set of
pre-configured masks to make the current processor appear to be of the
specified family/revision.
588
589See below for general information on masking.
590
591**Warning: This option is not fully effective on Family 15h processors or
592later.**
593
594### cpuid_mask_ecx
595### cpuid_mask_edx
596### cpuid_mask_ext_ecx
597### cpuid_mask_ext_edx
598### cpuid_mask_l7s0_eax
599### cpuid_mask_l7s0_ebx
600### cpuid_mask_thermal_ecx
601### cpuid_mask_xsave_eax
602> `= <integer>`
603
604> Applicability: x86.  Default: `~0` (all bits set)
605
The availability of these options is model specific.  Some processors don't
607support any of them, and no processor supports all of them.  Xen will ignore
608options on processors which are lacking support.
609
610These options can be used to alter the features visible via the `CPUID`
611instruction.  Settings applied here take effect globally, including for Xen
612and all guests.
613
614Note: Since Xen 4.7, it is no longer necessary to mask a host to create
615migration safety in heterogeneous scenarios.  All necessary CPUID settings
616should be provided in the VM configuration file.  Furthermore, it is
617recommended not to use this option, as doing so causes an unnecessary
618reduction of features at Xen's disposal to manage guests.
619
620### cpuidle (x86)
621> `= <boolean>`
622
623### cpuinfo (x86)
624> `= <boolean>`
625
626### crash-debug-debugkey
627### crash-debug-hwdom
628### crash-debug-kexeccmd
629### crash-debug-panic
630### crash-debug-watchdog
631> `= <string>`
632
633> Can be modified at runtime
634
635Specify debug-key actions in cases of crashes. Each of the parameters applies
636to a different crash reason. The `<string>` is a sequence of debug key
637characters, with `+` having the special meaning of a 10 millisecond pause.
638
639`crash-debug-debugkey` will be used for crashes induced by the `C` debug
640key (i.e. manually induced crash).
641
642`crash-debug-hwdom` denotes a crash of dom0.
643
644`crash-debug-kexeccmd` is an explicit request of dom0 to continue with the
645kdump kernel via kexec. Only available on hypervisors built with CONFIG_KEXEC.
646
647`crash-debug-panic` is a crash of the hypervisor.
648
649`crash-debug-watchdog` is a crash due to the watchdog timer expiring.
650
651It should be noted that dumping diagnosis data to the console can fail in
652multiple ways (missing data, hanging system, ...) depending on the reason
of the crash, which might have left the hypervisor in a bad state. If
a debug-key action leads to another crash, recursion will be avoided, so no
additional debug-key actions will be performed. A crash in the
656early boot phase will not result in any debug-key action, as the system
657might not yet be in a state where the handlers can work.
658
659So e.g. `crash-debug-watchdog=0+0r` would dump dom0 state twice with 10
660milliseconds between the two state dumps, followed by the run queues of the
661hypervisor, if the system crashes due to a watchdog timeout.
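
The way such a string is read can be sketched as a simple expansion into steps. The function name `expand_debug_keys` and the tuple representation are hypothetical and serve only to illustrate the `+` convention.

```python
def expand_debug_keys(sequence, pause_ms=10):
    """Expand a crash-debug string into ('key', char) and ('pause', ms) steps."""
    return [
        ("pause", pause_ms) if char == "+" else ("key", char)
        for char in sequence
    ]
```

For `0+0r` this gives the `0` key, a 10 millisecond pause, the `0` key again, and finally the `r` key.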
662
663Depending on the reason of the system crash it might happen that triggering
664some debug key action will result in a hang instead of dumping data and then
665doing a reboot or crash dump.
666
667### crashinfo_maxaddr
668> `= <size>`
669
670> Default: `4G`
671
672Specify the maximum address to allocate certain structures, if used in
673combination with the **low_crashinfo** command line option.
674
675### crashkernel
676> `= <ramsize-range>:<size>[,...][{@,<}<offset>]`
677> `= <size>[{@,<}<offset>]`
> `= <size>,below=<offset>`
679
680Specify sizes and optionally placement of the crash kernel reservation
681area.  The `<ramsize-range>:<size>` pairs indicate how much memory to
682set aside for a crash kernel (`<size>`) for a given range of installed
683RAM (`<ramsize-range>`).  Each `<ramsize-range>` is of the form
684`<start>-[<end>]`.
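
As a rough sketch of how the range form selects a reservation size, the code below picks the first range containing the installed RAM. It assumes that ranges are half-open (`start <= RAM < end`), that an omitted `<end>` means no upper bound, and that sizes are decimal with the suffixes described earlier; the function names are hypothetical.

```python
_SHIFT = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}


def _size(text):
    """Decimal size with optional suffix; no suffix means KiB."""
    suffix = text[-1].lower()
    if suffix in _SHIFT:
        return int(text[:-1]) << _SHIFT[suffix]
    return int(text) << 10


def crash_area_size(ranges, installed_ram):
    """Return the crash kernel size in bytes for `installed_ram`, or 0."""
    for entry in ranges.split(","):
        span, _, size = entry.partition(":")
        start, _, end = span.partition("-")
        low = _size(start)
        high = _size(end) if end else float("inf")
        if low <= installed_ram < high:  # assumed half-open range
            return _size(size)
    return 0
```

With `512M-2G:64M,2G-:128M`, a machine with 1 GiB of RAM would get a 64 MiB area and one with 8 GiB a 128 MiB area under these assumptions.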
685
686A trailing `@<offset>` specifies the exact address this area should be
687placed at, whereas `<` in place of `@` just specifies an upper bound of
688the address range the area should fall into.
689
`<` and `below` are synonymous, the latter being useful for grub2 systems
which would otherwise require escaping of the `<` option.
692
693
694### credit2_balance_over
695> `= <integer>`
696
697### credit2_balance_under
698> `= <integer>`
699
700### credit2_cap_period_ms
701> `= <integer>`
702
703> Default: `10`
704
705Domains subject to a cap receive a replenishment of their runtime budget
706once every cap period interval. Default is 10 ms. The amount of budget
707they receive depends on their cap. For instance, a domain with a 50% cap
708will receive 50% of 10 ms, so 5 ms.
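
The arithmetic above can be written out as a short helper. The function name `budget_per_period_us` is hypothetical, and the result is expressed in microseconds purely to keep the calculation in integers.

```python
def budget_per_period_us(cap_percent, period_ms=10):
    """Runtime budget (in microseconds) replenished every cap period."""
    return period_ms * 1000 * cap_percent // 100
```

A 50% cap with the default 10 ms period gives 5000 microseconds, i.e. the 5 ms mentioned above.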
709
710### credit2_load_precision_shift
711> `= <integer>`
712
713> Default: `18`
714
715Specify the number of bits to use for the fractional part of the
716values involved in Credit2 load tracking and load balancing math.
717
718### credit2_load_window_shift
719> `= <integer>`
720
721> Default: `30`
722
723Specify the number of bits to use to represent the length of the
724window (in nanoseconds) we use for load tracking inside Credit2.
725This means that, with the default value (30), we use
7262^30 nsec ~= 1 sec long window.
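
The relationship between the shift and the window length is a plain power of two, as this small sketch shows (the function name `window_seconds` is hypothetical):

```python
def window_seconds(shift):
    """Length of the load tracking window, in seconds, for a given shift."""
    return (1 << shift) / 1e9  # 2^shift nanoseconds
```

For the default shift of 30 this evaluates to roughly 1.07 seconds, and each decrement of the shift halves the window.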
727
728Load tracking is done by means of a variation of exponentially
729weighted moving average (EWMA). The window length defined here
730is what tells for how long we give value to previous history
731of the load itself. In fact, after a full window has passed,
732what happens is that we discard all previous history entirely.
733
734A short window will make the load balancer quick at reacting
735to load changes, but also short-sighted about previous history
736(and hence, e.g., long term load trends). A long window will
737make the load balancer thoughtful of previous history (and
738hence capable of capturing, e.g., long term load trends), but
739also slow in responding to load changes.
740
741The default value of `1 sec` is rather long.
742
743### credit2_runqueue
744> `= cpu | core | socket | node | all`
745
746> Default: `socket`
747
748Specify how host CPUs are arranged in runqueues. Runqueues are kept
749balanced with respect to the load generated by the vCPUs running on
them. Smaller runqueues (as with `core`) mean more accurate load
751balancing (for instance, it will deal better with hyperthreading),
752but also more overhead.
753
754Available alternatives, with their meaning, are:
* `cpu`: one runqueue per logical pCPU of the host;
* `core`: one runqueue per physical core of the host;
* `socket`: one runqueue per physical socket (which often,
            but not always, matches a NUMA node) of the host;
* `node`: one runqueue per NUMA node of the host;
* `all`: just one runqueue shared by all the logical pCPUs of
         the host
762
Regardless of the above choice, Xen attempts to respect the
`sched_credit2_max_cpus_runqueue` limit, which may mean more than one runqueue
765for the `all` value. If that isn't intended, raise
766the `sched_credit2_max_cpus_runqueue` value.
767
768### dbgp
769> `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
770> `= xhci[ <integer> | @pci<bus>:<slot>.<func> ][,share=<bool>|hwdom]`
771
772Specify the USB controller to use, either by instance number (when going
773over the PCI busses sequentially) or by PCI device (must be on segment 0).
774
Use `ehci` for the EHCI debug port, and `xhci` for the XHCI debug capability.
The XHCI driver will wait indefinitely for the debug host to connect, so make sure
the cable is connected.
778The `share` option for xhci controls who else can use the controller:
779* `no`: use the controller exclusively for console, even hardware domain
780  (dom0) cannot use it
781* `hwdom`: hardware domain may use the controller too, ports not used for debug
782  console will be available for normal devices; this is the default
* `yes`: the controller can be assigned to any domain; it is not safe to assign
  the controller to an untrusted domain
785
Choosing `share=hwdom` (the default) or `share=yes` allows a domain to reset the
controller, which may cause a small portion of the console output to be lost.
788
789The `share=yes` configuration is not security supported.
790
791### debug_stack_lines
792> `= <integer>`
793
794> Default: `20`
795
Limits the number of lines printed in Xen stack traces.
797
798### debugtrace
799> `= [cpu:]<size>`
800
801> Default: `128`
802
Specify the size of the console debug trace buffer. When `cpu:` is
additionally specified, a trace buffer of the specified size is allocated per CPU.
The debug trace feature is only enabled in debugging builds of Xen.
806
807### dit (x86/Intel)
808> `= <boolean>`
809
810> Default: `CONFIG_DIT_DEFAULT`
811
812Specify whether Xen and guests should operate in Data Independent Timing
813mode (Intel calls this DOITM, Data Operand Independent Timing Mode). Note
814that enabling this option cannot guarantee anything beyond what underlying
815hardware guarantees (with, where available and known to Xen, respective
816tweaks applied).
817
818### dma_bits
819> `= <integer>`
820
821Specify the bit width of the DMA heap.
822
823### dom0
824    = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
825                cpuid-faulting=<bool>, msr-relaxed=<bool>,
826                pf-fixup=<bool> ] (x86)
827
828    = List of [ sve=<integer> ] (Arm64)
829
830Controls for how dom0 is constructed on x86 systems.
831
832*   The `pv` and `pvh` options select the virtualisation mode of dom0.
833
834    The `pv` option is only available when `CONFIG_PV` is compiled in.  The
835    `pvh` option is only available when `CONFIG_HVM` is compiled in.  When
836    both options are compiled in, the default is PV.
837
838    In addition, the following requirements must be met:
839
840    *   The dom0 kernel selected by the boot loader must be capable of the
841        selected mode.
842    *   For a PVH dom0, the hardware must have VT-x/SVM extensions available.
843
844*   The `shadow` boolean allows dom0 to be explicitly constructed using shadow
845    paging.  This option is unavailable when `CONFIG_SHADOW_PAGING` is
846    disabled.
847
848    For PVH, dom0 defaults to using HAP on capable hardware, and falls back to
849    shadow paging otherwise.  A PVH dom0 cannot be used if Xen is compiled
850    without shadow paging support, and the hardware lacks HAP support.
851
    For PV, the use of dom0 shadow mode is only for development purposes.  PV
    guests do not require any paging support by default.
854
855*   The `verbose` boolean is intended for diagnostics, and prints out extra
856    information during the dom0 build.  It defaults to the compile time choice
857    of `CONFIG_VERBOSE_DEBUG`.
858
859*   The `cpuid-faulting` boolean is an interim option, is only applicable to
860    PV dom0, and defaults to true.
861
862    Before Xen 4.13, the domain builder logic for guest construction depended
863    on seeing host CPUID values to function correctly.  As a result, CPUID
864    Faulting was never activated for PV dom0's, even on capable hardware.
865
866    In Xen 4.13, the domain builder logic has been fixed, and no longer has
867    this dependency.  As a consequence, CPUID Faulting is activated by default
868    even for PV dom0's.
869
870    However, as PV dom0's have always seen host CPUID data in the past, there
871    is a chance that further dependencies exist.  This boolean can be used to
872    restore the pre-4.13 behaviour.  If specifying `no-cpuid-faulting` fixes
873    an issue in dom0, please report a bug.
874
875*   The `msr-relaxed` boolean is an interim option, and defaults to false.
876
877    In Xen 4.15, the default behaviour for unhandled MSRs has been changed,
878    to avoid leaking host data into guests, and to avoid breaking guest
879    logic which uses \#GP probing to identify the availability of MSRs.
880
881    However, this new stricter behaviour has the possibility to break
882    guests, and a more 4.14-like behaviour can be selected by specifying
883    `dom0=msr-relaxed`.
884
885    If using this option is necessary to fix an issue, please report a bug.
886
887*   The `pf-fixup` boolean is only applicable when using a PVH dom0 and
888    defaults to false.
889
    When running dom0 in PVH mode, the dom0 kernel has no way to map MMIO
    regions into its physical memory map.  This mode relies on the Xen dom0
    builder populating the physical memory map with all MMIO regions that dom0
    should access.  However, Xen doesn't have a complete picture of the host
    memory map, because it cannot process ACPI dynamic tables.
895
896    The `pf-fixup` option allows Xen to attempt to add missing MMIO regions
897    to the dom0 physical memory map in response to page-faults generated by
898    dom0 trying to access unpopulated entries in the memory map.
899
900Enables features on dom0 on Arm systems.
901
*   The `sve` integer parameter enables Arm SVE usage for Dom0 and sets the
    maximum SVE vector length.  The option is applicable only to Arm64 Dom0
    kernels.

    *   A value of 0 disables the feature.  This is the default.
    *   Values below 0 mean the feature uses the maximum SVE vector length
        supported by hardware, if SVE is supported.
    *   Values above 0 explicitly set the maximum SVE vector length for Dom0.
        Allowed values range from 128 to 2048 and must be a multiple of 128.

    When a value is specified explicitly and it exceeds the hardware supported
    maximum SVE vector length, domain creation will fail and the system will
    stop.  The same occurs if a positive value is given but the platform
    doesn't support SVE.
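
The value rules for `sve` can be sketched as a small classifier.  This is an
illustrative sketch only, not Xen's parser: the function name `classify_sve`
and its return labels are hypothetical, and the hardware limit check (which
depends on the platform) is deliberately left out.

```python
def classify_sve(value):
    """Classify a dom0 `sve=<integer>` value following the rules in the text.

    Returns a (mode, vector_length) tuple.  The mode labels are hypothetical
    names chosen for this sketch.
    """
    if value == 0:
        # Default: the feature is disabled.
        return ("disabled", None)
    if value < 0:
        # Use whatever maximum the hardware supports, if SVE is supported.
        return ("hardware-maximum", None)
    # Positive values must be a multiple of 128, from 128 up to 2048.
    if value > 2048 or value % 128 != 0:
        raise ValueError(f"invalid sve value: {value}")
    return ("explicit", value)


print(classify_sve(512))
```

A value that passes this check can still exceed what the hardware supports, in
which case the text above says domain creation fails.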
915
916### dom0-cpuid
917    = List of comma separated booleans
918
919    Applicability: x86
920
921This option allows for fine tuning of the facilities dom0 will use, after
922accounting for hardware capabilities and Xen settings as enumerated via CPUID.
923
924Options are accepted in positive and negative form, to enable or disable
925specific features.  All selections via this mechanism are subject to normal
926CPU Policy safety and dependency logic.
927
928This option is intended for developers to opt dom0 into non-default features,
929and is not intended for use in production circumstances.  If using this option
930is necessary to fix an issue, please report a bug.
931
932### dom0-iommu
933    = List of [ passthrough=<bool>, strict=<bool>, map-inclusive=<bool>,
934                map-reserved=<bool>, none ]
935
936Controls for the dom0 IOMMU setup.
937
938*   The `passthrough` boolean controls whether IOMMU translation functionality
939    is disabled for devices in dom0 (`passthrough=1`) or whether the IOMMU is
940    used to ensure that dom0 can only DMA to its permitted areas of RAM
941    (`passthrough=0`).
942
943    This option is only applicable to x86 PV dom0's, and defaults to false.
944
945    Some older Intel VT-d hardware isn't capable of disabling translation
946    functionality on a per-device basis, and will cause this option to be
947    ignored and assumed to be 0.  Similar behaviour on such systems is only
948    available by fully disabling all IOMMUs.
949
950    This option is hardwired to false for x86 PVH dom0's (where a non-identity
951    transform is required for dom0 to function), and is ignored for ARM.
952
953*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
954    false.  It controls whether dom0 can have IOMMU mappings for all domain
955    RAM in the system, or only for its allocated RAM (and grant mappings etc.)
956
957    This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
958    other domains in the system don't live in a compatible address space), and
959    is ignored for ARM.
960
961*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up
962    identity IOMMU mappings for all non-RAM regions below 4GB except for
963    unusable ranges, and ranges belonging to Xen.
964
965    Typically, some devices in a system use bits of RAM for communication, and
966    these areas should be listed as reserved in the E820 table and identified
967    via RMRR or IVMD entries in the ACPI tables, so Xen can ensure that they
968    are identity-mapped in the IOMMU.  However, some firmware makes mistakes,
969    and this option is a coarse-grain workaround for those errors.
970
971    Where possible, finer grain corrections should be made with the `rmrr=`,
972    `ivmd=`, `ivrs_hpet[]=`, or `ivrs_ioapic[]=` command line options.
973
974    This option is disabled by default, and deprecated and intended for
975    removal in future versions of Xen.  If specifying `map-inclusive` is the
976    only way to make your system boot, please report a bug.
977
978*   The `map-reserved` functionality is very similar to `map-inclusive`.
979
980    The differences from `map-inclusive` are that `map-reserved` is applicable
981    to both x86 PV and PVH dom0's, is enabled by default, and represents a
982    subset of the correction by only mapping reserved memory regions rather
983    than all non-RAM regions.
984
985*   The `none` option is intended for development purposes only, and skips
986    certain safety checks pertaining to the correct IOMMU configuration for
987    dom0 to boot.
988
989    Incorrect use of this option may result in a malfunctioning system.
990
991### dom0_ioports_disable (x86)
992> `= List of <hex>-<hex>`
993
994Specify a list of IO ports to be excluded from dom0 access.
995
996### dom0-llc-colors (arm64)
997> `= List of [ <integer> | <integer>-<integer> ]`
998
999> Default: `All available LLC colors`
1000
1001Specify dom0 LLC color configuration. This option is available only when
1002`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
1003colors are used.
1004
1005### dom0_max_vcpus
1006
1007Either:
1008
1009> `= <integer>`.
1010
1011The number of VCPUs to give to dom0.  This number of VCPUs can be more
1012than the number of PCPUs on the host.  The default is the number of
1013PCPUs.
1014
1015Or:
1016
1017> `= <min>-<max>` where `<min>` and `<max>` are integers.
1018
1019Gives dom0 a number of VCPUs equal to the number of PCPUs, but always
1020at least `<min>` and no more than `<max>`.  Using `<min>` may give
1021more VCPUs than PCPUs.  `<min>` or `<max>` may be omitted and the
1022defaults of 1 and unlimited respectively are used instead.
1023
1024For example, with `dom0_max_vcpus=4-8`:
1025
1026>        Number of
1027>     PCPUs | Dom0 VCPUs
1028>      2    |  4
1029>      4    |  4
1030>      6    |  6
1031>      8    |  8
1032>     10    |  8
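
The table can be reproduced with a simple clamp.  This is a minimal sketch of
the rule as described here, not Xen's implementation; the function name
`dom0_vcpus` and its parameter names are hypothetical, and letting the upper
bound win when `<min>` exceeds `<max>` is an assumption of the sketch.

```python
def dom0_vcpus(pcpus, min_vcpus=1, max_vcpus=None):
    """Clamp the PCPU count into [min_vcpus, max_vcpus].

    max_vcpus=None models an omitted <max> (unlimited); min_vcpus=1 models
    an omitted <min>.
    """
    vcpus = max(pcpus, min_vcpus)
    if max_vcpus is not None:
        vcpus = min(vcpus, max_vcpus)
    return vcpus


# Reproduce the dom0_max_vcpus=4-8 table.
for pcpus in (2, 4, 6, 8, 10):
    print(pcpus, dom0_vcpus(pcpus, 4, 8))
```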
1033
1034### dom0_mem (ARM)
1035> `= <size>`
1036
1037Set the amount of memory for the initial domain (dom0). It must be
1038greater than zero. This parameter is required (and only used) when the initial
1039domain is not described in the Device-Tree.
1040
1041### dom0_mem (x86)
1042> `= List of ( min:<sz> | max:<sz> | <sz> )`
1043
1044Set the amount of memory for the initial domain (dom0). If a size is
1045positive, it represents an absolute value.  If a size is negative, it
1046is subtracted from the total available memory.
1047
1048* `<sz>` specifies the exact amount of memory.
1049* `min:<sz>` specifies the minimum amount of memory.
1050* `max:<sz>` specifies the maximum amount of memory.
1051
1052If `<sz>` is not specified, the default is all the available memory
1053minus some reserve.  The reserve is 1/16 of the available memory or
1054128 MB (whichever is smaller).
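
As a rough illustration of the default, the reserve calculation can be written
as below.  This is a sketch under stated assumptions: the function name
`default_dom0_mem` is hypothetical, "128 MB" is interpreted as 128 MiB, and any
further adjustments Xen may apply are ignored.

```python
MIB = 1 << 20


def default_dom0_mem(avail_bytes):
    """Default dom0 memory: all available memory minus the reserve.

    The reserve is 1/16 of available memory or 128 MiB, whichever is smaller.
    """
    reserve = min(avail_bytes // 16, 128 * MIB)
    return avail_bytes - reserve


# With 1 GiB available the reserve is 64 MiB; with 4 GiB it is capped at 128 MiB.
print(default_dom0_mem(1024 * MIB) // MIB)
print(default_dom0_mem(4096 * MIB) // MIB)
```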
1055
1056The amount of memory will be at least the minimum but never more than
1057the maximum (i.e., `max` overrides the `min` option).  If there isn't
1058enough memory then as much as possible is allocated.
1059
1060`max:<sz>` also sets the maximum reservation (the maximum amount of
1061memory dom0 can balloon up to).  If this is omitted then the maximum
1062reservation is unlimited.
1063
1064For example, to set dom0's initial memory allocation to 512MB but
1065allow it to balloon up as far as 1GB use `dom0_mem=512M,max:1G`
1066
1067> `<sz>` is: `<size> | [<size>+]<frac>%`
1068> `<frac>` is an integer < 100
1069
1070* `<frac>` specifies a fraction of host memory size in percent.
1071
1072So `<sz>` being `1G+25%` on a 256 GB host would result in 65 GB.
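
The arithmetic behind that example can be checked with a short sketch.  The
helper names `parse_size` and `eval_sz` are hypothetical, only decimal integers
are handled, and the code is an illustration of the grammar given above rather
than Xen's own parser.

```python
import re

# Size suffixes as powers of two; a bare number is treated as KiB.
SUFFIX_SHIFT = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}


def parse_size(text):
    """Parse a <size>: a decimal integer with an optional single suffix."""
    match = re.fullmatch(r"(-?\d+)([tgmkbTGMKB]?)", text)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, suffix = match.groups()
    shift = SUFFIX_SHIFT[suffix.lower()] if suffix else 10
    return int(number) << shift


def eval_sz(text, host_bytes):
    """Evaluate <sz> = <size> | [<size>+]<frac>% against host memory size."""
    if not text.endswith("%"):
        return parse_size(text)
    base, _, frac = text[:-1].rpartition("+")
    fraction = int(frac)
    if not 0 <= fraction < 100:
        raise ValueError(f"invalid fraction: {frac!r}")
    fixed = parse_size(base) if base else 0
    return fixed + host_bytes * fraction // 100


GIB = 1 << 30
print(eval_sz("1G+25%", 256 * GIB) // GIB)  # 1 GiB + 25% of 256 GiB
```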
1073
1074If you use this option then it is highly recommended that you disable
1075any dom0 autoballooning feature present in your toolstack. See the
1076_xl.conf(5)_ man page or [Xen Best
1077Practices](https://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning).
1078
This option has no effect if pv-shim mode is enabled.
1080
1081### dom0_nodes (x86)
1082
1083> `= List of [ <integer> | relaxed | strict ]`
1084
1085> Default: `strict`
1086
1087Specify the NUMA nodes to place Dom0 on. Defaults for vCPU-s created
1088and memory assigned to Dom0 will be adjusted to match the node
1089restrictions set up here. Note that the values to be specified here are
1090ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
1091affinities to prefer but be not limited to the specified node(s).
1092
1093### dom0_vcpus_pin
1094> `= <boolean>`
1095
1096> Default: `false`
1097
Pin dom0 vcpus to their respective pcpus.
1099
1100### dtuart (ARM)
1101> `= path [:options]`
1102
1103> Default: `""`
1104
1105Specify the full path in the device tree for the UART.  If the path doesn't
1106start with `/`, it is assumed to be an alias.  The options are device specific.
1107
1108### e820-mtrr-clip (x86)
1109> `= <boolean>`
1110
1111Flag that specifies if RAM should be clipped to the highest cacheable
1112MTRR.
1113
1114> Default: `true` on Intel CPUs, otherwise `false`
1115
1116### e820-verbose (x86)
1117> `= <boolean>`
1118
1119> Default: `false`
1120
1121Flag that enables verbose output when processing e820 information and
1122applying clipping.
1123
1124### edd (x86)
1125> `= off | on | skipmbr`
1126
1127Control retrieval of Extended Disc Data (EDD) from the BIOS during
1128boot.
1129
1130### edid (x86)
1131> `= no | force`
1132
1133Either force retrieval of monitor EDID information via VESA DDC, or
1134disable it (edid=no). This option should not normally be required
1135except for debugging purposes.
1136
1137### efi
1138    = List of [ rs=<bool>, attr=no|uc ]
1139
Controls for interacting with the system Extensible Firmware Interface.
1141
1142*   The `rs` boolean controls whether Runtime Services are used.  By default,
1143    Xen uses Runtime Services itself, and proxies certain calls on behalf of
1144    dom0.  Selecting `rs=0` prohibits all use of Runtime Services.
1145
1146*   The `attr=` string exists to specify what to do with memory regions of
1147    unknown/unrecognised cacheability.  `attr=no` is the default and will
1148    leave the memory regions unmapped, while `attr=uc` will map them as fully
1149    uncacheable.
1150
1151### ept
1152> `= List of [ ad=<bool>, pml=<bool>, exec-sp=<bool> ]`
1153
1154> Applicability: Intel
1155
1156Extended Page Tables are a feature of Intel's VT-x technology, whereby
1157hardware manages the virtualisation of HVM guest pagetables.  EPT was
1158introduced with the Nehalem architecture.
1159
1160*   The `ad` boolean controls hardware tracking of Access and Dirty bits in the
1161    EPT pagetables, and was first introduced in Broadwell Server.
1162
1163    By default, Xen will use A/D tracking when available in hardware, except
1164    on Avoton processors affected by erratum AVR41.  Explicitly choosing
1165    `ad=0` will disable the use of A/D tracking on capable hardware, whereas
1166    choosing `ad=1` will cause tracking to be used even on AVR41-affected
1167    hardware.
1168
*   The `pml` boolean controls the use of Page Modification Logging, which was
    also introduced in Broadwell Server.
1171
1172    PML is a feature whereby the processor generates a list of pages which
1173    have been dirtied.  This is necessary information for operations such as
1174    live migration, and having the processor maintain the list of dirtied
1175    pages is more efficient than traditional software implementations where
1176    all guest writes trap into Xen so the dirty bitmap can be maintained.
1177
1178    By default, Xen will use PML when it is available in hardware.  PML
1179    functionally depends on A/D tracking, so choosing `ad=0` will implicitly
1180    disable PML.  `pml=0` can be used to prevent the use of PML on otherwise
1181    capable hardware.
1182
1183*   The `exec-sp` boolean controls whether EPT superpages with execute
1184    permissions are permitted.  In general this is good for performance.
1185
    However, on processors vulnerable to CVE-2018-12207, HVM guest kernels can
    use executable superpages to crash the host.  By default, executable
    superpages are disabled on affected hardware.

    If HVM guest kernels are trusted not to mount a DoS against the system,
    this option can be enabled to regain performance.
1192
1193    This boolean may be modified at runtime using `xl set-parameters
1194    ept=[no-]exec-sp` to switch between fast and secure.
1195
1196    *   When switching from secure to fast, preexisting HVM domains will run
1197        at their current performance until they are rebooted; new domains will
1198        run without any overhead.
1199
1200    *   When switching from fast to secure, all HVM domains will immediately
1201        suffer a performance penalty.
1202
1203    **Warning: No guarantee is made that this runtime option will be retained
1204      indefinitely, or that it will retain this exact behaviour.  It is
1205      intended as an emergency option for people who first chose fast, then
1206      change their minds to secure, and wish not to reboot.**
1207
1208### extra_guest_irqs (x86)
1209> `= [<domU number>][,<dom0 number>]`
1210
1211> Default: `32,<variable>`
1212
1213Change the number of PIRQs available for guests.  The optional first number is
1214common for all domUs, while the optional second number (preceded by a comma)
1215is for dom0.  Changing the setting for domU has no impact on dom0 and vice
versa.  For example, to change dom0 without changing domU, use
1217`extra_guest_irqs=,512`.  The default value for Dom0 and an eventual separate
1218hardware domain is architecture dependent.  The upper limit for both values on
1219x86 is such that the resulting total number of IRQs can't be higher than 32768.
1220Note that specifying zero as domU value means zero, while for dom0 it means
1221to use the default.  Note further that the Dom0 setting has no useful meaning
1222for the PVH case; use of the option may have an adverse effect there, though.
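
The two optional fields can be split as in the following sketch.  The function
name `parse_extra_guest_irqs` is hypothetical, only decimal numbers are
handled, and `None` is used here to mean "field omitted, setting unchanged";
the special meaning of zero for dom0 described above is not modelled.

```python
def parse_extra_guest_irqs(value):
    """Split `[<domU number>][,<dom0 number>]` into (domU, dom0).

    An omitted field is returned as None.
    """
    domu_text, _, dom0_text = value.partition(",")
    domu = int(domu_text) if domu_text else None
    dom0 = int(dom0_text) if dom0_text else None
    return domu, dom0


# Change dom0 only, leaving the domU setting alone.
print(parse_extra_guest_irqs(",512"))
```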
1223
1224### ext_regions (Arm)
1225> `= <boolean>`
1226
> Default: `true`
1228
1229Flag to enable or disable support for extended regions for Dom0 and
1230Dom0less DomUs.
1231
1232Extended regions are ranges of unused address space exposed to the guest
1233as "safe to use" for special memory mappings. Disable if your board
1234device tree is incomplete.
1235
1236### flask
1237> `= permissive | enforcing | late | disabled`
1238
1239> Default: `enforcing`
1240
1241Specify how the FLASK security server should be configured.  This option is only
1242available if the hypervisor was compiled with FLASK support.  This can be
1243enabled by running either:
- `make -C xen config` and enabling XSM and FLASK.
- `make -C xen menuconfig` and enabling 'FLux Advanced Security Kernel support' and 'Xen Security Modules support'
1246
1247* `permissive`: This is intended for development and is not suitable for use
1248  with untrusted guests.  If a policy is provided by the bootloader, it will be
1249  loaded; errors will be reported to the ring buffer but will not prevent
1250  booting.  The policy can be changed to enforcing mode using "xl setenforce".
* `enforcing`: This will cause the security server to enter enforcing mode prior
  to the creation of domain 0.  If a valid policy is not provided by the
  bootloader and no built-in policy is present, the hypervisor will not continue
  booting.
1255* `late`: This disables loading of the built-in security policy or the policy
1256  provided by the bootloader.  FLASK will be enabled but will not enforce access
1257  controls until a policy is loaded by a domain using "xl loadpolicy".  Once a
1258  policy is loaded, FLASK will run in enforcing mode unless "xl setenforce" has
1259  changed that setting.
1260* `disabled`: This causes the XSM framework to revert to the dummy module.  The
1261  dummy module provides the same security policy as is used when compiling the
1262  hypervisor without support for XSM.  The xsm_op hypercall can also be used to
1263  switch to this mode after boot, but there is no way to re-enable FLASK once
1264  the dummy module is loaded.
1265
1266### font
1267> `= <height>` where height is `8x8 | 8x14 | 8x16`
1268
1269Specify the font size when using the VESA console driver.
1270
1271### force-ept (Intel)
1272> `= <boolean>`
1273
1274> Default: `false`
1275
1276Allow EPT to be enabled when VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is not
1277present.
1278
1279*Warning:*
1280Due to CVE-2013-2212, VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is by default
1281required as a prerequisite for using EPT.  If you are not using PCI Passthrough,
1282or trust the guest administrator who would be using passthrough, then the
1283requirement can be relaxed.  This option is particularly useful for nested
1284virtualization, to allow the L1 hypervisor to use EPT even if the L0 hypervisor
1285does not provide `VM_ENTRY_LOAD_GUEST_PAT`.
1286
1287### gnttab
1288> `= List of [ max-ver:<integer>, transitive=<bool>, transfer=<bool> ]`
1289
1290> Default (Arm): `gnttab=max-ver:1`
1291> Default (x86,PV): `gnttab=max-ver:2,transitive,transfer`
1292> Default (x86,HVM): `gnttab=max-ver:2,transitive`
1293
1294Control various aspects of the grant table behaviour available to guests.
1295
* `max-ver` Select the maximum grant table version to offer to guests.  Valid
versions are 1 and 2.
* `transitive` Permit or disallow the use of transitive grants.  Note that the
use of grant table v2 without transitive grants is an ABI breakage from the
guest's point of view.
* `transfer` Permit or disallow the GNTTABOP_transfer operation of the
grant table hypercall.  Note that disallowing GNTTABOP_transfer is an ABI
breakage from the guest's point of view.  This option is only available on
hypervisors configured to support PV guests.
1305
1306The usage of gnttab v2 is not security supported on ARM platforms.
1307
1308### gnttab_max_frames
1309> `= <integer>`
1310
1311> Default: `64`
1312
1313> Can be modified at runtime
1314
1315Specify the default upper bound on the number of frames which any domain may
1316use as part of its grant table unless a different value is specified at domain
1317creation.
1318
1319Note this value is the effective upper bound for dom0.
1320
1321### gnttab_max_maptrack_frames
1322> `= <integer>`
1323
1324> Default: `1024`
1325
1326> Can be modified at runtime
1327
1328Specify the default upper bound on the number of frames which any domain may
1329use as part of its maptrack array unless a different value is specified at
1330domain creation.
1331
1332Note this value is the effective upper bound for dom0.
1333
1334### global-pages
1335    = <boolean>
1336
1337    Applicability: x86
1338    Default: true unless running virtualized on AMD or Hygon hardware
1339
1340Control whether to use global pages for PV guests, and thus the need to
1341perform TLB flushes by writing to CR4.  This is a performance trade-off.
1342
AMD SVM does not support selective trapping of CR4 writes, which means that a
global TLB flush (two CR4 writes) takes two VMExits, the cost of which
massively outweighs the benefit of using global pages to begin with.  This
case is easy for Xen to spot, and is accounted for in the default setting.

Other cases where this option might be a benefit are on VT-x hardware when
selective CR4 writes are not supported/enabled by the hypervisor, or in any
virtualised case using shadow paging.  These are not easy for Xen to spot, so
are not accounted for in the default setting.
1352
1353### guest_loglvl
1354> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`
1355
1356> Default: `guest_loglvl=none/warning`
1357
1358> Can be modified at runtime
1359
Set the logging level for Xen guests.  Any log message with equal or
greater importance will be printed.
1362
The optional `<rate-limited level>` option specifies which severities
should be rate limited.
1365
1366### hap (x86)
1367> `= <boolean>`
1368
1369> Default: `true`
1370
Flag to globally enable or disable support for Hardware Assisted
Paging (HAP).
1373
1374### hap_1gb (x86)
1375> `= <boolean>`
1376
1377> Default: `true`
1378
1379Flag to enable 1 GB host page table support for Hardware Assisted
1380Paging (HAP).
1381
1382### hap_2mb (x86)
1383> `= <boolean>`
1384
1385> Default: `true`
1386
1387Flag to enable 2 MB host page table support for Hardware Assisted
1388Paging (HAP).
1389
1390### hardware_dom
1391> `= <domid>`
1392
1393> Default: `0`
1394
1395Enable late hardware domain creation using the specified domain ID.  This is
1396intended to be used when domain 0 is a stub domain which builds a disaggregated
1397system including a hardware domain with the specified domain ID.  This option is
1398supported only when compiled with XSM on x86.
1399
1400### hest_disable
1401> ` = <boolean>`
1402
1403> Default: `false`
1404
Control Xen's use of the APEI Hardware Error Source Table, should one be found.
1406
1407### highmem-start (x86)
1408> `= <size>`
1409
1410Specify the memory boundary past which memory will be treated as highmem (x86
1411debug hypervisor only).
1412
1413### hmp-unsafe (arm)
1414> `= <boolean>`
1415
> Default: `false`
1417
Say yes at your own risk if you want to enable heterogeneous computing
(such as big.LITTLE). This may result in an unstable and insecure
platform, unless you manually specify the cpu affinity of all domains so
that all vcpus are scheduled on the same class of pcpus (big or LITTLE
but not both). vcpu migration between big cores and LITTLE cores is not
supported. See docs/misc/arm/big.LITTLE.txt for more information.
1424
1425When the hmp-unsafe option is disabled (default), CPUs that are not
1426identical to the boot CPU will be parked and not used by Xen.
1427
1428### hpet
1429    = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ]
1430
1431    Applicability: x86
1432
1433Controls Xen's use of the system's High Precision Event Timer.  By default,
1434Xen will use an HPET when available and not subject to errata.  Use of the
1435HPET can be disabled by specifying `hpet=0`.
1436
1437 * The `broadcast` boolean is disabled by default, but forces Xen to keep
1438   using the broadcast for CPUs in deep C-states even when an RTC interrupt is
1439   enabled.  This then also affects raising of the RTC interrupt.
1440
1441 * The `legacy-replacement` boolean allows for control over whether Legacy
1442   Replacement mode is enabled.
1443
1444   Legacy Replacement mode is intended for hardware which does not have an
1445   8254 PIT, and allows the HPET to be configured into a compatible mode.
1446   Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for
1447   power saving reasons, and there is no platform-agnostic mechanism for
1448   discovering this.
1449
1450   By default, Xen will not change hardware configuration, unless the PIT
1451   appears to be absent, at which point Xen will try to enable Legacy
1452   Replacement mode before falling back to pre-IO-APIC interrupt routing
1453   options.
1454
1455   This behaviour can be inhibited by specifying `legacy-replacement=0`.
1456   Alternatively, this mode can be enabled unconditionally (if available) by
1457   specifying `legacy-replacement=1`.
1458
1459### hpetbroadcast (x86)
1460> `= <boolean>`
1461
Deprecated alternative to `hpet=broadcast`.
1463
1464### hvm_debug (x86)
1465> `= <integer>`
1466
1467The specified value is a bit mask with the individual bits having the
1468following meaning:
1469
1470>     Bit  0 - debug level 0 (unused at present)
1471>     Bit  1 - debug level 1 (Control Register logging)
1472>     Bit  2 - debug level 2 (VMX logging of MSR restores when context switching)
1473>     Bit  3 - debug level 3 (unused at present)
1474>     Bit  4 - I/O operation logging
1475>     Bit  5 - vMMU logging
1476>     Bit  6 - vLAPIC general logging
1477>     Bit  7 - vLAPIC timer logging
1478>     Bit  8 - vLAPIC interrupt logging
1479>     Bit  9 - vIOAPIC logging
1480>     Bit 10 - hypercall logging
1481>     Bit 11 - MSR operation logging
1482
1483Recognized in debug builds of the hypervisor only.
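
Since the value is a plain bit mask, it can be composed by OR-ing together the
wanted bits.  The sketch below is illustrative: the short names in
`HVM_DEBUG_BITS` are hypothetical labels for the bit positions listed above
and are not identifiers used by Xen.

```python
# Bit positions from the list above; the dictionary keys are made-up labels.
HVM_DEBUG_BITS = {
    "control-regs": 1,
    "msr-restore": 2,
    "io": 4,
    "vmmu": 5,
    "vlapic": 6,
    "vlapic-timer": 7,
    "vlapic-interrupt": 8,
    "vioapic": 9,
    "hypercall": 10,
    "msr": 11,
}


def hvm_debug_mask(*names):
    """Return the integer mask with the bit for each named category set."""
    mask = 0
    for name in names:
        mask |= 1 << HVM_DEBUG_BITS[name]
    return mask


# I/O operation logging (bit 4) plus MSR operation logging (bit 11).
print(f"hvm_debug={hvm_debug_mask('io', 'msr'):#x}")
```

The result is printed in hexadecimal with a `0x` prefix, which is one of the
accepted integer forms.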
1484
1485### hvm_fep (x86)
1486> `= <boolean>`
1487
1488> Default: `false`
1489
1490Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
1491arbitrary instructions.
1492
1493This option is intended for development and testing purposes.
1494
*Warning*
As this feature opens up the instruction emulator to arbitrary
instructions from an HVM guest, don't use this in production systems. No
security support is provided when this flag is set.
1499
1500### hvm_port80 (x86)
1501> `= <boolean>`
1502
1503> Default: `true`
1504
1505Specify whether guests are to be given access to physical port 80
1506(often used for debugging purposes), to override the DMI based
1507detection of systems known to misbehave upon accesses to that port.
1508
1509### idle_latency_factor (x86)
1510> `= <integer>`
1511
1512### ioapic_ack (x86)
1513> `= old | new`
1514
1515> Default: `new` unless directed-EOI is supported
1516
1517### iommu
1518    = List of [ <bool>, verbose, debug, force, required,
1519                quarantine=<bool>|scratch-page,
1520                sharept, superpages, intremap, intpost, crash-disable,
1521                snoop, qinval, igfx, amd-iommu-perdev-intremap,
1522                dom0-{passthrough,strict} ]
1523
1524    All sub-options are boolean in nature.
1525
I/O Memory Management Units perform a function similar to the CPU MMU (hence the
1527name), but typically exist as a discrete device, integrated as part of a PCI
1528Root Complex.  The most common configuration is to have one IOMMU per package
1529(for on-die PCIe devices and directly attached PCIe lanes), and one IOMMU
1530covering the remaining I/O in the system.
1531
1532The functionality in an IOMMU commonly falls into two orthogonal categories:
1533
15341.  DMA remapping which uses a pagetable-like hierarchical structure and maps
1535    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
1536    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
1537    terminology).
1538
15392.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
1540    requests, including their routing to specific CPUs.
1541
1542IOMMU functionality can be used to provide a translation which the hardware
1543device driver isn't aware of (e.g. PCI Passthrough and a native driver inside
1544the guest) and/or to enforce fine-grained control over the memory and
1545interrupts which a device is attempting to access.
1546
1547By default, IOMMUs are configured for use if they are available.  An overall
1548boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.
1549
1550*   The `verbose` and `debug` booleans can be used to print additional
    diagnostic information.  Neither is active by default.
1552
1553*   The `force` and `required` booleans are synonymous and, when requested,
1554    will prevent Xen from booting if IOMMUs aren't discovered and enabled
1555    successfully.
1556
1557*   The `quarantine` option can be used to control Xen's behavior when
1558    de-assigning devices from guests.  The default behaviour is chosen at
1559    compile time, and is one of `CONFIG_IOMMU_QUARANTINE_{NONE,BASIC,SCRATCH_PAGE}`.
1560
1561    When a PCI device is assigned to an untrusted domain, it is possible
1562    for that domain to program the device to DMA to an arbitrary address.
1563    The IOMMU is used to protect the host from malicious DMA by making
1564    sure that the device addresses can only target memory assigned to the
1565    guest.  However, when the guest domain is torn down, assigning the
1566    device back to the hardware domain would allow any in-flight DMA to
1567    potentially target critical host data.  To avoid this, quarantining
1568    should be enabled.  Quarantining can be done in two ways: In its basic
1569    form, all in-flight DMA will simply be forced to encounter IOMMU
1570    faults.  Since there are systems where doing so can cause host lockup,
1571    an alternative form is available where accesses to memory will be directed
1572    to a scratch page. The implication here is that such accesses will go
1573    unnoticed, i.e. an admin may not become aware of the underlying problem.
1574
1575    Therefore, if this option is set to true (the default), Xen always
1576    quarantines such devices; they must be explicitly assigned back to Dom0
1577    before they can be used there again.  If set to "scratch-page", still
1578    active DMA operations will additionally be directed to a "scratch" page.  If
1579    set to false, Xen will only quarantine devices the toolstack has arranged
1580    for getting quarantined, and only in the "basic" form.
1581
1582    This option is only valid on builds supporting PCI.
1583
1584*   The `sharept` boolean controls whether the IOMMU pagetables are shared
1585    with the CPU-side HAP pagetables, or allocated separately.  Sharing
1586    reduces the memory overhead, but doesn't work in combination with CPU-side
1587    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
1588    assigned.
1589
1590    Due to implementation choices, sharing pagetables doesn't work on AMD
1591    hardware, and this option is ignored.  It is enabled by default on Intel
1592    systems.
1593
1594    This option is ignored on ARM, and the pagetables are always shared.
1595
1596*   The `superpages` boolean controls whether superpage mappings may be used
1597    in IOMMU page tables.  If using this option is necessary to fix an issue,
1598    please report a bug.
1599
1600    This option is only valid on x86.
1601
1602*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and
1603    is active by default on compatible hardware.  On x86 systems, the first
1604    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
1605    appeared in the second generation.
1606
1607    This option is only valid on x86.
1608
1609*   The `intpost` boolean controls the Posted Interrupt sub-feature.  In
1610    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
1611    be configured to deliver interrupts from assigned PCI devices directly
1612    into the guest, without trapping out into hypervisor context.
1613
1614    This option depends on `intremap`, and is disabled by default due to some
1615    corner cases in the implementation which have yet to be resolved.
1616
    This option is only valid on x86, and only on builds of Xen with HVM support.
1618
1619*   The `crash-disable` boolean controls disabling IOMMU functionality (DMAR/IR/QI)
1620    before switching to a crash kernel. This option is inactive by default and
1621    is for compatibility with older kdump kernels only. Modern kernels copy
1622    all the necessary tables from the previous one following kexec which makes
1623    the transition transparent for them with IOMMU functions still on.
1624
1625The following options are specific to Intel VT-d hardware:
1626
1627*   The `snoop` boolean controls the Snoop Control sub-feature, and is active
1628    by default on compatible hardware.
1629
1630    An incoming DMA request may specify _Snooped_ (query the CPU caches for
1631    the appropriate lines) or _Non-Snooped_ (don't query the CPU caches).
1632    _Non-Snooped_ accesses incur less latency, but behind-the-scenes
1633    hypervisor activity can invalidate the expectations of the device driver,
1634    and Snoop Control allows the hypervisor to force DMA requests to be
1635    _Snooped_ when they would otherwise not be.
1636
1637*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
1638    active by default on compatible hardware.  Queued Invalidation is a
1639    feature in second-generation IOMMUs and is a functional prerequisite for
    Interrupt Remapping. Note that Xen disregards this setting for Intel VT-d
    version 6 and greater, as Register-Based Invalidation isn't supported
    by them.
1643
1644*   The `igfx` boolean is active by default, and controls whether IOMMUs in
1645    front of solely graphics devices get enabled or not.
1646
1647    It is intended as a debugging mechanism for graphics issues, and to be
1648    similar to Linux's `intel_iommu=igfx_off` option.  If specifying `no-igfx`
1649    fixes anything, please report the problem.
1650
1651The following options are specific to AMD-Vi hardware:
1652
1653*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
1654    remapping table is per device (the default), or a single global table for
1655    the entire system.
1656
1657    Using a global table is not security supported as it allows all devices to
    impersonate each other as far as interrupts are concerned (see XSA-36), but
1659    it is a workaround for SP5100 Erratum 28.
1660
1661**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
1662deprecated, and superseded by _dom0-iommu={passthrough,strict}_ respectively -
1663using both the old and new command line options in combination is undefined.**
1664
1665### iommu_dev_iotlb_timeout
1666> `= <integer>`
1667
1668> Default: `1000`
1669
1670Specify the timeout of the device IOTLB invalidation in milliseconds.
1671By default, the timeout is 1000 ms. When you see error 'Queue invalidate
1672wait descriptor timed out', try increasing this value.
1673
1674### iommu_inclusive_mapping
1675> `= <boolean>`
1676
1677**WARNING: This command line option is deprecated, and superseded by
1678_dom0-iommu=map-inclusive_ - using both options in combination is undefined.**
1679
1680### irq-max-guests (x86)
1681> `= <integer>`
1682
1683> Default: `32`
1684
1685Maximum number of guests any individual IRQ could be shared between,
1686i.e. a limit on the number of guests it is possible to start each having
1687assigned a device sharing a common interrupt line.  Accepts values between
16881 and 255.
1689
1690### irq_ratelimit (x86)
1691> `= <integer>`
1692
1693### irq_vector_map (x86)
1694
1695### ivmd (x86)
1696> `= <start>[-<end>][=<bdf1>[-<bdf1'>][,<bdf2>[-<bdf2'>][,...]]][;<start>...]`
1697
1698Define IVMD-like ranges that are missing from ACPI tables along with the
1699device(s) they belong to, and use them for 1:1 mapping.  End addresses can be
1700omitted when exactly one page is meant.  The ranges are inclusive when start
1701and end are specified.  Note that only PCI segment 0 is supported at this time,
1702but it is fine to specify it explicitly.
1703
1704'start' and 'end' values are page numbers (not full physical addresses),
1705in hexadecimal format (can optionally be preceded by "0x").
1706
Omitting the optional (range of) BDF specifiers signals that the range is to
1708be applied to all devices.
1709
1710Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
1711reserved, and devices 0:0:1a.0...0:0:1a.3 collectively require three pages
1712(0xd5d46 thru 0xd5d48) to be reserved, one usage would be:
1713
> `ivmd=d5d45=0:1d.0;0xd5d46-0xd5d48=0:1a.0-0:1a.3`
1715
Note: grub2 requires special characters, such as ';', to be escaped or quoted
when multiple ranges are specified. Refer to the grub2 documentation.
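
The syntax above can be explored with a small parser.  The following Python
sketch is illustrative only and is not part of Xen: the names `IvmdRange` and
`parse_ivmd` are hypothetical, BDF specifiers are kept as plain strings, and
none of the validation performed by the hypervisor is attempted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IvmdRange:
    start: int          # first page number (inclusive)
    end: int            # last page number (inclusive)
    devices: List[str]  # BDF specifiers as written; empty means all devices


def parse_ivmd(value: str) -> List[IvmdRange]:
    """Split the value of an ivmd= option into page ranges and BDF specifiers."""
    ranges = []
    for entry in value.split(";"):
        # Text before the first '=' is the page range, the rest lists devices.
        pages, _, bdfs = entry.partition("=")
        first, _, last = pages.partition("-")
        start = int(first, 16)                  # hexadecimal, "0x" is optional
        end = int(last, 16) if last else start  # omitted end: exactly one page
        devices = bdfs.split(",") if bdfs else []
        ranges.append(IvmdRange(start, end, devices))
    return ranges
```

Feeding it the value from the usage example yields two ranges, the first
covering the single page 0xd5d45 and the second covering pages 0xd5d46
through 0xd5d48.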
1718
1719### ivrs_hpet[`<hpet>`] (AMD)
1720> `=[<seg>:]<bus>:<device>.<func>`
1721
1722Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of HPET
1723`<hpet>` instead of the one specified by the IVHD sub-tables of the IVRS
1724ACPI table.
1725
1726### ivrs_ioapic[`<ioapic>`] (AMD)
1727> `=[<seg>:]<bus>:<device>.<func>`
1728
1729Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of IO-APIC
1730`<ioapic>` instead of the one specified by the IVHD sub-tables of the IVRS
1731ACPI table.
1732
1733### lapic (x86)
1734> `= <boolean>`
1735
Force the use of the local APIC on a uniprocessor system, even
1737if left disabled by the BIOS.
1738
1739### lapic_timer_c2_ok (x86)
1740> `= <boolean>`
1741
1742### ler (x86)
1743> `= <boolean>`
1744
> Default: `false`
1746
1747This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
1748in hypervisor context to be able to dump the Last Interrupt/Exception To/From
1749record with other registers.
1750
1751### llc-coloring (arm64)
1752> `= <boolean>`
1753
1754> Default: `false`
1755
1756Flag to enable or disable LLC coloring support at runtime. This option is
1757available only when `CONFIG_LLC_COLORING` is enabled. See the general
1758cache coloring documentation for more info.
1759
1760### llc-nr-ways (arm64)
1761> `= <integer>`
1762
1763> Default: `Obtained from hardware`
1764
1765Specify the number of ways of the Last Level Cache. This option is available
1766only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
1767to find the number of supported cache colors. By default the value is
1768automatically computed by probing the hardware, but in case of specific needs,
it can be manually set. Such needs include probing failures and debugging or
testing, e.g. emulating platforms with a different number of supported colors.
If set, "llc-size" must also be set, otherwise the default will be used. Note
that using both options implies "llc-coloring=on" unless an earlier
"llc-coloring=off" is present.
1774
1775### llc-size (arm64)
1776> `= <size>`
1777
1778> Default: `Obtained from hardware`
1779
1780Specify the size of the Last Level Cache. This option is available only when
1781`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
1782the number of supported cache colors. By default the value is automatically
1783computed by probing the hardware, but in case of specific needs, it can be
manually set. Such needs include probing failures and debugging or testing,
e.g. emulating platforms with a different number of supported colors. If set,
"llc-nr-ways" must also be set, otherwise the default will be used. Note that
using both options implies "llc-coloring=on" unless an earlier
"llc-coloring=off" is present.
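
As a rough illustration of how the LLC size and the number of ways relate to
the number of colors, the Python sketch below assumes that the color count is
the size of one cache way divided by the page size, and that pages are 4 KiB.
Both are assumptions made for this example (see the cache coloring
documentation for the authoritative description), and the function name
`llc_colors` is hypothetical.

```python
def llc_colors(llc_size: int, nr_ways: int, page_size: int = 4096) -> int:
    """Estimate the number of cache colors (assumed formula, see lead-in)."""
    way_size = llc_size // nr_ways  # bytes covered by a single cache way
    return way_size // page_size    # pages per way, taken as the color count


# A 1 MiB, 16-way LLC has 64 KiB ways, i.e. 16 pages per way.
colors = llc_colors(1 << 20, 16)
```

Under these assumptions, changing "llc-size" and "llc-nr-ways" together is
what allows a different color count to be emulated for testing.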
1789
1790### lock-depth-size
1791> `= <integer>`
1792
1793> Default: `lock-depth-size=64`
1794
1795Specifies the maximum number of nested locks tested for illegal recursions.
1796Higher nesting levels still work, but recursion testing is omitted for those
1797levels. In case an illegal recursion is detected the system will crash
1798immediately. Specifying `0` will disable all testing of illegal lock nesting.
1799
1800This option is available for hypervisors built with CONFIG_DEBUG_LOCKS only.
1801
1802### loglvl
1803> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`
1804
1805> Default: `loglvl=info`
1806
1807> Can be modified at runtime
1808
Set the logging level for Xen.  Any log message with equal or more
importance will be printed.
1811
The optional `<rate-limited level>` specifies which severities
should be rate limited.
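
The filtering rule can be modelled in a few lines.  The Python sketch below is
only a model of the description above: the numeric ranks, the treatment of
`all` as equivalent to `debug`, and the name `is_printed` are assumptions, and
the rate-limited part of the value is deliberately ignored.

```python
# Lower rank means more important.  The ranks are an assumption of this sketch.
SEVERITY = {"error": 1, "warning": 2, "info": 3, "debug": 4}
THRESHOLD = {"none": 0, **SEVERITY, "all": 4}


def is_printed(message_level: str, loglvl: str = "info") -> bool:
    """Return True if a message of message_level passes the loglvl setting."""
    level = loglvl.split("/")[0]  # drop the optional rate-limited level
    return SEVERITY[message_level] <= THRESHOLD[level]
```

With the default `loglvl=info`, this model prints errors, warnings and
informational messages, and drops debug messages.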
1814
1815### low_crashinfo
1816> `= none | min | all`
1817
> Default: `none` if not specified at all, or `min` if **low_crashinfo** is present without qualification.
1819
1820This option is only useful for hosts with a 32bit dom0 kernel, wishing
1821to use kexec functionality in the case of a crash.  It represents
1822which data structures should be deliberately allocated in low memory,
so the crash kernel may find them.  Should be used in combination
1824with **crashinfo_maxaddr**.
1825
1826### low_mem_virq_limit
1827> `= <size>`
1828
1829> Default: `64M`
1830
1831Specify the threshold below which Xen will inform dom0 that the quantity of
1832free memory is getting low.  Specifying `0` will disable this notification.
1833
1834### maxcpus
1835> `= <integer>`
1836
1837Specify the maximum number of CPUs that should be brought up.
1838
1839This option is ignored in **pv-shim** mode.
1840
**WARNING: On Arm big.LITTLE systems, when the `hmp-unsafe` option is enabled, this command line
option does not guarantee which CPU types will be used.**
1843
1844### max_cstate (x86)
1845> `= <integer>[,<integer>]`
1846
1847Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used.  The latter only applies
1849to the highest permitted C-state.
1850
1851### max_gsi_irqs (x86)
1852> `= <integer>`
1853
Specifies the number of interrupts to be used for pin (IO-APIC or legacy PIC)
1855based interrupts. Any higher IRQs will be available for use via PCI MSI.
1856
1857### max_lpi_bits (arm)
1858> `= <integer>`
1859
1860Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
1861presented as the number of bits needed to encode it. This must be at least
186214 and not exceed 32, and each LPI requires one byte (configuration) and
1863one pending bit to be allocated.
1864Defaults to 20 bits (to cover at most 1048576 interrupts).
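
The memory cost implied by the description above can be estimated as follows.
This Python sketch is illustrative only: it sizes both tables for the whole
interrupt ID space and ignores any alignment or padding requirements, and the
name `lpi_table_bytes` is hypothetical.

```python
def lpi_table_bytes(max_lpi_bits: int) -> tuple:
    """Return (interrupt IDs, configuration bytes, pending bytes)."""
    if not 14 <= max_lpi_bits <= 32:
        raise ValueError("max_lpi_bits must be between 14 and 32")
    ids = 1 << max_lpi_bits   # size of the ID space for this many bits
    config_bytes = ids        # one configuration byte per interrupt
    pending_bytes = ids // 8  # one pending bit per interrupt
    return ids, config_bytes, pending_bytes


# The default of 20 bits: 1048576 IDs, 1 MiB of configuration, 128 KiB pending.
default_cost = lpi_table_bytes(20)
```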
1865
1866### mce (x86)
1867> `= <boolean>`
1868
1869> Default: `true`
1870
Allows disabling the use of Machine Check Exceptions.  Note that doing
1872so may result in silent shutdown of the system in case an event occurs
1873which would have resulted in raising a Machine Check Exception.  Silent
1874here is as far as Xen is concerned; firmware may offer to retrieve some
1875collected data.
1876
1877### mce_fb (Intel)
1878> `= <boolean>`
1879
1880> Default: `false`
1881
1882Force broadcasting of Machine Check Exceptions, suppressing the use of
1883Local MCE functionality available in newer Intel hardware.
1884
1885### mce_verbosity (x86)
1886> `= verbose`
1887
1888Specify verbose machine check output.
1889
1890### mem (x86)
1891> `= <size>`
1892
1893Specify the maximum address of physical RAM.  Any RAM beyond this
1894limit is ignored by Xen.
1895
1896### memop-max-order
1897> `= [<domU>][,[<ctldom>][,[<hwdom>][,<ptdom>]]]`
1898
1899> x86 default: `9,18,12,12`
1900> ARM default: `9,18,10,10`
1901
1902Change the maximum order permitted for allocation (or allocation-like)
1903requests issued by the various kinds of domains (in this order:
1904ordinary DomU, control domain, hardware domain, and - when supported
1905by the platform - DomU with pass-through device assigned).
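
To get a feel for the defaults, the orders can be converted into request
sizes.  The Python sketch below assumes that an order `n` denotes a request of
2^n pages and that pages are 4 KiB.  The function name `max_request_bytes` is
hypothetical, and the sketch requires all four fields to be present, which
the option itself does not.

```python
KINDS = ("domU", "ctldom", "hwdom", "ptdom")


def max_request_bytes(value: str, page_size: int = 4096) -> dict:
    """Map each domain kind to the largest request size, in bytes."""
    orders = [int(field) for field in value.split(",")]
    # An order-n request covers 2**n pages (assumption stated in the lead-in).
    return {kind: page_size << order for kind, order in zip(KINDS, orders)}


x86_defaults = max_request_bytes("9,18,12,12")
```

With the x86 defaults, this gives 2 MiB for ordinary DomUs, 1 GiB for the
control domain, and 16 MiB for the hardware domain and for DomUs with
pass-through devices.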
1906
1907### mmcfg (x86)
1908> `= <boolean>[,amd-fam10]`
1909
1910> Default: `1`
1911
1912Specify if the MMConfig space should be enabled.
1913
1914### mmio-relax (x86)
1915> `= <boolean> | all`
1916
1917> Default: `false`
1918
1919By default, domains may not create cached mappings to MMIO regions.
1920This option relaxes the check for Domain 0 (or when using `all`, all PV
1921domains), to permit the use of cacheable MMIO mappings.
1922
1923### msi (x86)
1924> `= <boolean>`
1925
1926> Default: `true`
1927
1928Force Xen to (not) use PCI-MSI, even if ACPI FADT says otherwise.
1929
1930### mtrr.show (x86)
1931> `= <boolean>`
1932
1933> Default: `false`
1934
1935Print boot time MTRR state.
1936
1937### mwait-idle (x86)
1938> `= <boolean>`
1939
1940> Default: `true`
1941
1942Use the MWAIT idle driver (with model specific C-state knowledge) instead
1943of the ACPI based one.
1944
1945### nmi (x86)
1946> `= ignore | dom0 | fatal`
1947
1948> Default: `fatal` for a debug build, or `dom0` for a non-debug build
1949
1950Specify what Xen should do in the event of an NMI parity or I/O error.
1951`ignore` discards the error; `dom0` causes Xen to report the error to
dom0, while `fatal` causes Xen to print diagnostics and then hang.
1953
1954### noapic (x86)
1955
1956Instruct Xen to ignore any IOAPICs that are present in the system, and
1957instead continue to use the legacy PIC. This is _not_ recommended with
1958pvops type kernels.
1959
1960Because responsibility for APIC setup is shared between Xen and the
1961domain 0 kernel this option is automatically propagated to the domain
19620 command line.
1963
1964### invpcid (x86)
1965> `= <boolean>`
1966
1967> Default: `true`
1968
1969By default, Xen will use the INVPCID instruction for TLB management if
1970it is available.  This option can be used to cause Xen to fall back to
1971older mechanisms, which are generally slower.
1972
1973### load-balance-ratelimit
1974> `= <integer>`
1975
1976The minimum interval between load balancing events on a given pcpu, in
1977microseconds.  A value of '0' will disable rate limiting.  Maximum
1978value 1 second. At the moment only credit honors this parameter.
1979Default 1ms.
1980
1981### noirqbalance (x86)
1982> `= <boolean>`
1983
1984Disable software IRQ balancing and affinity. This can be used on
1985systems such as Dell 1850/2850 that have workarounds in hardware for
1986IRQ routing issues.
1987
1988### nolapic (x86)
1989> `= <boolean>`
1990
1991> Default: `false`
1992
1993Ignore the local APIC on a uniprocessor system, even if enabled by the
1994BIOS.
1995
1996### no-real-mode (x86)
1997> `= <boolean>`
1998
1999Do not execute real-mode bootstrap code when booting Xen. This option
2000should not be used except for debugging. It will effectively disable
2001the **vga** option, which relies on real mode to set the video mode.
2002
2003### noreboot
2004> `= <boolean>`
2005
2006Do not automatically reboot after an error.  This is useful for
2007catching debug output.  Defaults to automatically reboot after 5
2008seconds.
2009
2010### nosmp (x86)
2011> `= <boolean>`
2012
2013Disable SMP support.  No secondary processors will be booted.
2014Defaults to booting secondary processors.
2015
2016This option is ignored in **pv-shim** mode.
2017
2018### nr_irqs (x86)
2019> `= <integer>`
2020
2021### numa (x86)
2022> `= on | off | fake=<integer> | noacpi`
2023
2024> Default: `on`
2025
2026### partial-emulation (arm)
2027> `= <boolean>`
2028
2029> Default: `false`
2030
2031Flag to enable or disable partial emulation of system/coprocessor registers.
2032Only effective if CONFIG_PARTIAL_EMULATION is enabled.
2033
2034**WARNING: Enabling this option might result in unwanted/non-spec compliant
2035behavior.**
2036
2037### pci
2038    = List of [ serr=<bool>, perr=<bool> ]
2039
2040    Default: Signaling left as set by firmware.
2041
2042Override the firmware settings, and explicitly enable or disable the
2043signalling of PCI System and Parity errors.
2044
2045### pci-phantom
2046> `=[<seg>:]<bus>:<device>,<stride>`
2047
2048Mark a group of PCI devices as using phantom functions without actually
2049advertising so, so the IOMMU can create translation contexts for them.
2050
2051All numbers specified must be hexadecimal ones.
2052
2053This option can be specified more than once (up to 8 times at present).
2054
2055### pci-passthrough (arm)
2056> `= <boolean>`
2057
2058> Default: `false`
2059
Flag to enable or disable support for PCI passthrough.
2061
2062### pcid (x86)
2063> `= <boolean> | xpti=<bool>`
2064
2065> Default: `xpti`
2066
2067> Can be modified at runtime (change takes effect only for domains created
2068  afterwards)
2069
2070If available, control usage of the PCID feature of the processor for
207164-bit pv-domains. PCID can be used either for no domain at all (`false`),
2072for all of them (`true`), only for those subject to XPTI (`xpti`) or for
2073those not subject to XPTI (`no-xpti`). The feature is used only in case
2074INVPCID is supported and not disabled via `invpcid=false`.
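
The four settings and the INVPCID dependency can be summarised as a decision
function.  The Python sketch below is a model of the paragraph above rather
than Xen's implementation: the name `pcid_in_use` is hypothetical, only the
canonical spellings of each setting are handled, and it applies to 64-bit PV
domains only.

```python
def pcid_in_use(setting: str, domain_has_xpti: bool,
                invpcid_usable: bool = True) -> bool:
    """Model whether a 64-bit PV domain would run with PCID."""
    if not invpcid_usable:
        # PCID is only used when INVPCID is supported and not disabled.
        return False
    if setting == "true":
        return True
    if setting == "false":
        return False
    if setting == "xpti":
        return domain_has_xpti
    if setting == "no-xpti":
        return not domain_has_xpti
    raise ValueError(f"unhandled pcid setting: {setting!r}")
```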
2075
2076### pdx-compress
2077> `= <boolean>`
2078
2079> Default: `true` if CONFIG_PDX_NONE is unset
2080
Only relevant when the hypervisor is built with PFN PDX compression. Controls
2082whether Xen will engage in PFN compression.  The algorithm used for PFN
2083compression is selected at build time from Kconfig.
2084
2085### ple_gap
2086> `= <integer>`
2087
2088### ple_window (Intel)
2089> `= <integer>`
2090
2091### preferred-cstates (x86)
> `= ( <integer> | List of ( C1 | C1E | C2 | ... ) )`
2093
2094This is a mask of C-states which are to be used preferably.  This option is
applicable only on hardware where certain C-states are exclusive of one another.
2096
2097### probe-port-aliases (x86)
2098> `= <boolean>`
2099
2100> Default: `true` outside of shim mode, `false` in shim mode
2101
2102Certain devices accessible by I/O ports may be accessible also through "alias"
2103ports (originally a result of incomplete address decoding).  When such devices
2104are solely under Xen's control, Xen disallows even Dom0 access to the "primary"
2105ports.  When alias probing is active and aliases are detected, "alias" ports
would then be treated similarly to the "primary" ones.
2107
2108### psr (Intel)
2109> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`
2110
2111> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0`
2112
Platform Shared Resource (PSR) Services.  Intel Haswell and later server
2114platforms offer information about the sharing of resources.
2115
To use the PSR monitoring service for a certain domain, a Resource
Monitoring ID (RMID) is used to bind the domain to the corresponding shared
resource.  RMID is a hardware-provided layer of abstraction between software
and logical processors.
2120
To use the PSR cache allocation service for a certain domain, a capacity
bitmask (CBM) is used to bind the domain to the corresponding shared resource.
CBM represents cache capacity and indicates the degree of overlap and isolation
between domains. In the hypervisor, a Class of Service (COS) ID is allocated
for each unique CBM.
2126
2127The following resources are available:
2128
2129* Cache Monitoring Technology (Haswell and later).  Information regarding the
2130  L3 cache occupancy.
2131  * `cmt` instructs Xen to enable/disable Cache Monitoring Technology.
2132  * `rmid_max` indicates the max value for rmid.
* Memory Bandwidth Monitoring (Broadwell and later). Information regarding the
  total/local memory bandwidth. It uses the same options as Cache Monitoring
  Technology.
2136
2137* Cache Allocation Technology (Broadwell and later).  Information regarding
2138  the cache allocation.
2139  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
2140  * `cos_max` indicates the max value for COS ID.
2141* Code and Data Prioritization Technology (Broadwell and later). Information
2142  regarding the code cache and the data cache allocation. CDP is based on CAT.
  * `cdp` instructs Xen to enable/disable Code and Data Prioritization. Note
    that `cos_max` of CDP is a little different from `cos_max` of CAT. With
    CDP, one COS corresponds to two CBMs rather than one as with CAT. Since
    the total number of CBMs is fixed, the actual `cos_max` in use is
    automatically halved when CDP is enabled.
2148
2149### pv
2150    = List of [ 32=<bool> ]
2151
2152    Applicability: x86
2153
2154Controls for aspects of PV guest support.
2155
2156*   The `32` boolean controls whether 32bit PV guests can be created.  It
2157    defaults to `true`, and is ignored when `CONFIG_PV32` is compiled out.
2158
2159    32bit PV guests are incompatible with CET Shadow Stacks.  If Xen is using
2160    shadow stacks, this option will be overridden to `false`.  Backwards
2161    compatibility can be maintained with the `pv-shim` mechanism.
2162
2163### pv-linear-pt (x86)
2164> `= <boolean>`
2165
2166> Default: `true`
2167
2168Only available if Xen is compiled with `CONFIG_PV_LINEAR_PT` support
2169enabled.
2170
2171Allow PV guests to have pagetable entries pointing to other pagetables
2172of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
2173This technique is often called "linear pagetables", and is sometimes
2174used to allow operating systems a simple way to consistently map the
2175current process's pagetables into its own virtual address space.
2176
2177Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
2178do; there may be other custom operating systems which do.  If you're
2179certain you don't plan on having PV guests which use this feature,
2180turning it off can reduce the attack surface.
2181
2182### pv-l1tf (x86)
2183> `= List of [ <bool>, dom0=<bool>, domu=<bool> ]`
2184
2185> Default: `false` on believed-unaffected hardware, or in pv-shim mode.
2186>          `domu`  on believed-affected hardware.
2187
2188Mitigations for L1TF / XSA-273 / CVE-2018-3620 for PV guests.
2189
2190For backwards compatibility, we may not alter an architecturally-legitimate
2191pagetable entry a PV guest chooses to write.  We can however force such a
2192guest into shadow mode so that Xen controls the PTEs which are reachable by
2193the CPU pagewalk.
2194
2195Shadowing is performed at the point where a PV guest first tries to write an
2196L1TF-vulnerable PTE.  Therefore, a PV guest kernel which has been updated with
2197its own L1TF mitigations will not trigger shadow mode if it is well behaved.
2198
2199If `CONFIG_SHADOW_PAGING` is not compiled in, this mitigation instead crashes
2200the guest when an L1TF-vulnerable PTE is written, which still allows updated,
2201well-behaved PV guests to run, despite Shadow being compiled out.
2202
2203In the pv-shim case, Shadow is expected to be compiled out, and a malicious
2204guest kernel can only leak data from the shim Xen, rather than the host Xen.
2205
2206### pv-shim (x86)
2207> `= <boolean>`
2208
2209> Default: `false`
2210
2211This option is intended for use by a toolstack, when choosing to run a PV
2212guest compatibly inside an HVM container.
2213
2214In this mode, the kernel and initrd passed as modules to the hypervisor are
2215constructed into a plain unprivileged PV domain.
2216
2217### rcu-idle-timer-period-ms
2218> `= <integer>`
2219
2220> Default: `10`
2221
2222How frequently a CPU which has gone idle, but with pending RCU callbacks,
2223should be woken up to check if the grace period has completed, and the
2224callbacks are safe to be executed. Expressed in milliseconds; maximum is
2225100, and it can't be 0.
2226
2227### reboot (x86)
2228> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
2229
2230> Default: system dependent
2231
2232Specify the host reboot method.
2233
2234`warm` instructs Xen to not set the cold reboot flag.
2235
2236`cold` instructs Xen to set the cold reboot flag.
2237
2238`no` instructs Xen to not automatically reboot after panics or crashes.
2239
2240`triple` instructs Xen to reboot the host by causing a triple fault.
2241
2242`kbd` instructs Xen to reboot the host via the keyboard controller.
2243
2244`acpi` instructs Xen to reboot the host using RESET_REG in the ACPI FADT (this
is the default mode if available).
2246
2247`pci` instructs Xen to reboot the host using PCI reset register (port CF9).
2248
2249`Power` instructs Xen to power-cycle the host using PCI reset register (port CF9).
2250
2251`efi` instructs Xen to reboot using the EFI reboot call.
2252
2253`xen` instructs Xen to reboot using Xen's SCHEDOP hypercall (this is the default
when running nested Xen).
2255
2256### rmrr
2257> `= start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]`
2258
Define RMRR units that are missing from the ACPI table, along with the device(s)
they belong to, and use them for 1:1 mapping. End addresses can be omitted, in
which case one page will be mapped. The ranges are inclusive when start and end
are specified.
If the segment of the first device is not specified, segment zero will be used.
If the segments of other devices are not specified, the first device's segment
will be used. If a segment is specified for a device other than the first and it
does not match the one specified for the first, an error will be reported.
2266
2267'start' and 'end' values are page numbers (not full physical addresses),
2268in hexadecimal format (can optionally be preceded by "0x").
2269
2270Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
2271reserved, and device 0:0:1a.0 requires three pages (0xd5d46 thru 0xd5d48)
2272to be reserved, one usage would be:
2273
> `rmrr=d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0`
2275
Note: grub2 requires special characters, namely ';', to be escaped or quoted.
Refer to the grub2 documentation if multiple ranges are specified.
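
When the ranges come from a script, the option string can be assembled
mechanically.  The Python sketch below is a hypothetical helper (`build_rmrr`
is not an existing tool) that only formats its input according to the syntax
above.  It performs no validation, and quoting or escaping the result for the
boot loader remains the caller's responsibility.

```python
def build_rmrr(ranges) -> str:
    """Format (start_page, end_page_or_None, [bdf, ...]) tuples as rmrr=..."""
    parts = []
    for start, end, devices in ranges:
        if end is None or end == start:
            pages = f"{start:x}"        # single page: the end may be omitted
        else:
            pages = f"{start:x}-{end:x}"  # inclusive range of page numbers
        parts.append(f"{pages}={','.join(devices)}")
    return "rmrr=" + ";".join(parts)


option = build_rmrr([
    (0xd5d45, None, ["0:0:1d.0"]),
    (0xd5d46, 0xd5d48, ["0:0:1a.0"]),
])
```

The resulting string is equivalent to the usage example, with the optional
"0x" prefixes left out.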
2278
2279### ro-hpet (x86)
2280> `= <boolean>`
2281
2282> Default: `true`
2283
2284Map the HPET page as read only in Dom0. If disabled the page will be mapped
2285with read and write permissions.
2286
2287### sched
2288> `= credit | credit2 | arinc653 | rtds | null`
2289
2290> Default: `sched=credit2`
2291
2292Choose the default scheduler. Note the default scheduler is selectable via
2293Kconfig and depends on enabled schedulers. Check
2294`CONFIG_SCHED_DEFAULT` to see which scheduler is the default.
2295
2296### sched_credit2_max_cpus_runqueue
2297> `= <integer>`
2298
2299> Default: `16`
2300
2301Defines how many CPUs will be put, at most, in each Credit2 runqueue.
2302
Runqueues are still arranged according to the host topology (and following
what is indicated by the 'credit2_runqueue' parameter), but this option caps
the number of CPUs that share each runqueue.
2306
2307A value that is a submultiple of the number of online CPUs is recommended,
2308as that would likely produce a perfectly balanced runqueue configuration.
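
Candidate values following this recommendation are simply the divisors of the
online CPU count.  The Python sketch below lists them up to a chosen upper
bound.  The name `runqueue_size_candidates` and the bound of 16 (taken from
the default above) are choices made for this example.

```python
def runqueue_size_candidates(online_cpus: int, upper: int = 16) -> list:
    """List the divisors of online_cpus that do not exceed upper."""
    limit = min(upper, online_cpus)
    return [n for n in range(1, limit + 1) if online_cpus % n == 0]


# On a host with 24 online CPUs, 12 is the largest divisor not exceeding 16.
candidates = runqueue_size_candidates(24)
```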
2309
2310### sched_credit2_migrate_resist
2311> `= <integer>`
2312
2313### sched_credit_tslice_ms
2314> `= <integer>`
2315
2316Set the timeslice of the credit1 scheduler, in milliseconds.  The
2317default is 30ms.  Reasonable values may include 10, 5, or even 1 for
2318very latency-sensitive workloads.
2319
2320### sched-gran (x86)
2321> `= cpu | core | socket`
2322
2323> Default: `sched-gran=cpu`
2324
Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
`core` on an SMT-enabled system, or `socket`) multiple vcpus are assigned
2327statically to a "scheduling unit" which will then be subject to scheduling.
2328This assignment of vcpus to scheduling units is fixed.
2329
2330`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
2331hyperthread using x86/Intel terminology)
2332
2333`core`: As many vcpus as there are cpus on a physical core are scheduled
2334together on a physical core.
2335
`socket`: As many vcpus as there are cpus on a physical socket are scheduled
2337together on a physical socket.
2338
2339Note: a value other than `cpu` will result in rejecting a runtime modification
2340attempt of the "smt" setting.
2341
Note: for AMD x86 processors before Fam17 the terminology in the official data
sheets is different: a cpu is named "core" and multiple "cores" are running
in the same "compute unit". Since AMD has used the same names as Intel
("thread" and "core") from Fam17 onwards, the topology levels are named "cpu",
"core" and "socket" even on older AMD processors.
2347
2348### sched_ratelimit_us
2349> `= <integer>`
2350
2351In order to limit the rate of context switching, set the minimum
2352amount of time that a vcpu can be scheduled for before preempting it,
2353in microseconds.  The default is 1000us (1ms).  Setting this to 0
2354disables it altogether.
2355
2356### sched_smt_power_savings
2357> `= <boolean>`
2358
2359Normally Xen will try to maximize performance and cache utilization by
2360spreading out vcpus across as many different divisions as possible
(i.e., NUMA nodes, sockets, cores, threads, &c).  This often maximizes
2362throughput, but also maximizes energy usage, since it reduces the
2363depth to which a processor can sleep.
2364
2365This option inverts the logic, so that the scheduler in effect tries
2366to keep the vcpus on the smallest amount of silicon possible; i.e.,
2367first fill up sibling threads, then sibling cores, then sibling
2368sockets, &c.  This will reduce performance somewhat, particularly on
2369systems with hyperthreading enabled, but should reduce power by
2370enabling more sockets and cores to go into deeper sleep states.
2371
2372### scrub-domheap
2373> `= <boolean>`
2374
2375> Default: `false`
2376
2377Scrub domains' freed pages. This is a safety net against a (buggy) domain
2378accidentally leaking secrets by releasing pages without proper sanitization.
2379
2380### serial_tx_buffer
2381> `= <size>`
2382
2383> Default: `CONFIG_SERIAL_TX_BUFSIZE`
2384
2385Set the serial transmit buffer size.
2386
2387### serrors (ARM)
2388> `= diverse | panic`
2389
2390> Default: `diverse`
2391
2392This parameter is provided to administrators to determine how the hypervisor
2393handles SErrors.
2394
2395* `diverse`:
2396  The hypervisor will distinguish guest SErrors from hypervisor SErrors:
2397    - The guest generated SErrors will be forwarded to the currently running
2398      guest.
    - The hypervisor generated SErrors will cause the whole system to crash.
2400
2401* `panic`:
2402  All SErrors will cause the whole system to crash. This option should only
2403  be used if you trust all your guests and/or they don't have a gadget (e.g.
2404  device) to generate SErrors in normal run.
2405
2406### shim_mem (x86)
2407> `= List of ( min:<size> | max:<size> | <size> )`
2408
Set the amount of memory that xen-shim uses. Only has an effect if pv-shim mode is
2410enabled. Note that this value accounts for the memory used by the shim itself
2411plus the free memory slack given to the shim for runtime allocations.
2412
2413* `min:<size>` specifies the minimum amount of memory. Ignored if greater
2414   than max.
2415* `max:<size>` specifies the maximum amount of memory.
2416* `<size>` specifies the exact amount of memory. Overrides both min and max.
2417
2418By default, the amount of free memory slack given to the shim for runtime usage
2419is 1MB.
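
The interaction between `min:`, `max:` and a bare `<size>` can be sketched as
below.  This is an illustrative Python model rather than Xen's parser: the
names `parse_size` and `parse_shim_mem` are hypothetical, only decimal sizes
are handled, and representing an exact size as `min == max` is a modelling
choice made for this example.

```python
# Size suffixes from the "Size" section of this document.
SUFFIXES = {"b": 1, "k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}


def parse_size(text):
    """Parse a decimal <size>; without a suffix the unit is taken as KiB."""
    unit = text[-1].lower()
    if unit in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[unit]
    return int(text) * 2**10


def parse_shim_mem(value):
    """Return the resolved {'min': ..., 'max': ...} limits in bytes."""
    limits = {"min": None, "max": None}
    exact = None
    for item in value.split(","):
        key, sep, size = item.partition(":")
        if sep and key in limits:
            limits[key] = parse_size(size)
        else:
            exact = parse_size(item)
    if exact is not None:
        # An exact size overrides both min and max.
        return {"min": exact, "max": exact}
    if None not in limits.values() and limits["min"] > limits["max"]:
        # min is ignored if greater than max.
        limits["min"] = None
    return limits
```

For instance, `parse_shim_mem("min:256M,max:128M")` drops the minimum and keeps
only the 128 MiB maximum.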
2420
2421### smap (x86)
2422> `= <boolean> | hvm`
2423
2424> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware
2425
Flag to enable Supervisor Mode Access Prevention.
Use `smap=hvm` to allow SMAP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to `false`,
due to a significant performance impact in some cases and a generally lower
security risk.
2431
2432### smep (x86)
2433> `= <boolean> | hvm`
2434
2435> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware
2436
Flag to enable Supervisor Mode Execution Protection.
Use `smep=hvm` to allow SMEP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to `false`,
due to a significant performance impact in some cases and a generally lower
security risk.
2442
2443### smt (x86)
2444> `= <boolean>`
2445
> Default: `true`
2447
2448Control bring up of multiple hyper-threads per CPU core.
2449
2450### snb_igd_quirk
2451> `= <boolean> | cap | <integer>`
2452
A true boolean value enables legacy behavior (1s timeout), while `cap`
enforces the maximum theoretically necessary timeout of 670ms. Any number
is interpreted as a custom timeout in milliseconds. Zero or boolean
false disables the quirk workaround, which is also the default.
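
The mapping from option value to timeout can be sketched as follows.  This is
an illustrative Python model, not Xen's parser: the function name
`snb_igd_timeout_ms` is hypothetical, and it assumes that boolean spellings
are tried before integers, so that `1` is read as boolean true rather than as
a 1ms timeout.

```python
# Boolean spellings from the "Boolean" section of this document.
TRUE_WORDS = {"yes", "on", "true", "enable", "1"}
FALSE_WORDS = {"no", "off", "false", "disable", "0"}

LEGACY_TIMEOUT_MS = 1000  # legacy behavior (1s timeout)
CAP_TIMEOUT_MS = 670      # maximum theoretically necessary timeout


def snb_igd_timeout_ms(value):
    """Return the quirk timeout in ms, or 0 when the workaround is disabled."""
    if value in FALSE_WORDS:
        return 0
    if value in TRUE_WORDS:  # assumption: booleans take priority over integers
        return LEGACY_TIMEOUT_MS
    if value == "cap":
        return CAP_TIMEOUT_MS
    # Anything else is treated as a custom timeout (decimal or 0x-prefixed hex).
    return int(value, 0)
```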
2457
2458### spec-ctrl (Arm)
2459> `= List of [ ssbd=force-disable|runtime|force-enable ]`
2460
2461Controls for speculative execution sidechannel mitigations.
2462
2463The option `ssbd=` is used to control the state of Speculative Store
2464Bypass Disable (SSBD) mitigation.
2465
2466* `ssbd=force-disable` will keep the mitigation permanently off. The guest
2467will not be able to control the state of the mitigation.
* `ssbd=runtime` will always turn on the mitigation when running in the
hypervisor context. The guest will be able to turn the mitigation on or off
for itself by using the firmware interface `ARCH_WORKAROUND_2`.
2471* `ssbd=force-enable` will keep the mitigation permanently on. The guest will
2472not be able to control the state of the mitigation.
2473
By default SSBD will be mitigated at runtime (i.e. `ssbd=runtime`).
2475
2476### spec-ctrl (x86)
2477> `= List of [ <bool>, xen=<bool>, {pv,hvm}=<bool>,
2478>              {msr-sc,rsb,verw,{ibpb,bhb}-entry}=<bool>|{pv,hvm}=<bool>,
2479>              bti-thunk=retpoline|lfence|jmp,bhb-seq=short|tsx|long,
2480>              {ibrs,ibpb,ssbd,psfd,
2481>              eager-fpu,l1d-flush,branch-harden,srb-lock,
2482>              unpriv-mmio,gds-mit,div-scrub,lock-harden,
2483>              bhi-dis-s,bp-spec-reduce,ibpb-alt}=<bool> ]`
2484
2485Controls for speculative execution sidechannel mitigations.  By default, Xen
2486will pick the most appropriate mitigations based on compiled in support,
2487loaded microcode, and hardware details, and will virtualise appropriate
2488mitigations for guests to use.
2489
2490**WARNING: Any use of this option may interfere with heuristics.  Use with
2491extreme care.**
2492
2493An overall boolean value, `spec-ctrl=no`, can be specified to turn off all
2494mitigations, including pieces of infrastructure used to virtualise certain
2495mitigation features for guests.  This also includes settings which `xpti`,
2496`smt`, `pv-l1tf`, `tsx` control, unless the respective option(s) have been
2497specified earlier on the command line.
2498
2499Alternatively, a slightly more restricted `spec-ctrl=no-xen` can be used to
2500turn off all of Xen's mitigations, while leaving the virtualisation support
2501in place for guests to use.
2502
2503Use of a positive boolean value for either of these options is invalid.
2504
The `pv=`, `hvm=`, `msr-sc=`, `rsb=`, `verw=`, `ibpb-entry=` and `bhb-entry=`
options offer fine grained control over the primitives used by Xen.  These
impact Xen's ability to protect itself, and/or Xen's ability to virtualise
support for guests to use.
2509
2510* `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
2511  respectively.
2512* Each other option can be used either as a plain boolean
2513  (e.g. `spec-ctrl=rsb` to control both the PV and HVM sub-options), or with
2514  `pv=` or `hvm=` subsuboptions (e.g. `spec-ctrl=rsb=no-hvm` to disable HVM
2515  RSB only).
2516
2517* `msr-sc=` offers control over Xen's support for manipulating `MSR_SPEC_CTRL`
2518  on entry and exit.  These blocks are necessary to virtualise support for
2519  guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc.
2520* `rsb=` offers control over whether to overwrite the Return Stack Buffer /
2521  Return Address Stack on entry to Xen and on idle.
2522* `verw=` offers control over whether to use VERW for its scrubbing side
2523  effects at appropriate privilege transitions.  The exact side effects are
2524  microarchitecture and microcode specific.  *Note: `md-clear=` is accepted as
2525  a deprecated alias.  For compatibility with development versions of XSA-297,
2526  `mds=` is also accepted on Xen 4.12 and earlier as an alias.  Consult vendor
2527  documentation in preference to here.*
2528* `ibpb-entry=` offers control over whether IBPB (Indirect Branch Prediction
2529  Barrier) is used on entry to Xen.  This is used by default on hardware
2530  vulnerable to Branch Type Confusion, and hardware vulnerable to Speculative
2531  Return Stack Overflow if appropriate microcode has been loaded, but for
2532  performance reasons dom0 is unprotected by default.  If it is necessary to
2533  protect dom0 too, boot with `spec-ctrl=ibpb-entry`.
2534* `bhb-entry=` offers control over whether BHB-clearing (Branch History
2535  Buffer) sequences are used on entry to Xen.  This is used by default on
2536  hardware vulnerable to Branch History Injection, when the BHI_DIS_S control
2537  is not available (see `bhi-dis-s`).  The choice of scrubbing sequence can be
2538  selected using the `bhb-seq=` option.  If it is necessary to protect dom0
2539  too, boot with `spec-ctrl=bhb-entry`.
2540
2541If Xen was compiled with `CONFIG_INDIRECT_THUNK` support, `bti-thunk=` can be
2542used to select which of the thunks gets patched into the
2543`__x86_indirect_thunk_%reg` locations.  The default thunk is `retpoline`
2544(generally preferred), with the alternatives being `jmp` (a `jmp *%reg` gadget,
2545minimal overhead), and `lfence` (an `lfence; jmp *%reg` gadget).
2546
2547On all hardware, `bhb-seq=` can be used to select which of the BHB-clearing
2548sequences gets used.  This interacts with the `bhb-entry=` and `bhi-dis-s=`
2549options in order to mitigate Branch History Injection on affected hardware.
The default sequence is `short`, with `tsx` available as an alternative on
capable hardware, and `long` available on an opt-in basis.
2552
2553On hardware supporting IBRS (Indirect Branch Restricted Speculation), the
2554`ibrs=` option can be used to force or prevent Xen using the feature itself.
2555If Xen is not using IBRS itself, functionality is still set up so IBRS can be
2556virtualised for guests.
2557
2558On hardware supporting STIBP (Single Thread Indirect Branch Predictors), the
2559`stibp=` option can be used to force or prevent Xen using the feature itself.
2560By default, Xen will use STIBP when IBRS is in use (IBRS implies STIBP), and
2561when hardware hints recommend using it as a blanket setting.
2562
2563On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
2564option can be used to force or prevent Xen using the feature itself.  The
2565feature is virtualised for guests, independently of Xen's choice of setting.
On AMD hardware, disabling Xen SSBD usage on the command line (`ssbd=0`, which
is the default value) can lead to Xen running with the guest SSBD selection,
depending on hardware support.  On the same hardware, setting `ssbd=1` will
result in SSBD always being enabled, regardless of guest choice.
2570
2571On hardware supporting PSFD (Predictive Store Forwarding Disable), the `psfd=`
2572option can be used to force or prevent Xen using the feature itself.  By
2573default, Xen will not use PSFD.  PSFD is implied by SSBD, and SSBD is off by
2574default.
2575
2576On hardware supporting BHI_DIS_S (Branch History Injection Disable
2577Supervisor), the `bhi-dis-s=` option can be used to force or prevent Xen using
2578the feature itself.  By default Xen will use BHI_DIS_S on hardware susceptible
2579to Branch History Injection.
2580
2581On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
2582option can be used to force (the default) or prevent Xen from issuing branch
2583prediction barriers on vcpu context switches.
2584
2585On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
2586from using fully eager FPU context switches.  This is currently implemented as
2587a global control.  By default, Xen will choose to use fully eager context
2588switches on hardware believed to speculate past #NM exceptions.
2589
2590On hardware supporting L1D_FLUSH, the `l1d-flush=` option can be used to force
2591or prevent Xen from issuing an L1 data cache flush on each VMEntry.
2592Irrespective of Xen's setting, the feature is virtualised for HVM guests to
2593use.  By default, Xen will enable this mitigation on hardware believed to be
2594vulnerable to L1TF.
2595
2596If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_BRANCH`, the
2597`branch-harden=` boolean can be used to force or prevent Xen from using
2598speculation barriers to protect selected conditional branches.  By default,
2599Xen will enable this mitigation.
2600
2601On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
or prevent Xen from protecting the Special Register Buffer from leaking stale
2603data. By default, Xen will enable this mitigation, except on parts where MDS
2604is fixed and TAA is fixed/mitigated and there are no unprivileged MMIO
2605mappings (in which case, there is believed to be no way for an attacker to
2606obtain stale data).
2607
2608The `unpriv-mmio=` boolean indicates whether the system has (or will have)
2609less than fully privileged domains granted access to MMIO devices.  By
2610default, this option is disabled.  If enabled, Xen will use the `FB_CLEAR`
2611and/or `SRBDS_CTRL` functionality available in the Intel May 2022 microcode
2612release to mitigate cross-domain leakage of data via the MMIO Stale Data
2613vulnerabilities.
2614
2615On all hardware, the `gds-mit=` option can be used to force or prevent Xen
2616from mitigating the GDS (Gather Data Sampling) vulnerability.  By default, Xen
2617will mitigate GDS on hardware believed to be vulnerable.  On hardware
2618supporting GDS_CTRL (requires the August 2023 microcode), and where firmware
has elected not to lock the configuration, Xen will use GDS_CTRL to mitigate
GDS.  Otherwise, Xen will mitigate by disabling AVX, which blocks the use
of the AVX2 Gather instructions.
2622
2623On all hardware, the `div-scrub=` option can be used to force or prevent Xen
2624from mitigating the DIV-leakage vulnerability.  By default, Xen will mitigate
2625DIV-leakage on hardware believed to be vulnerable.
2626
2627If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_LOCK`, the `lock-harden=`
2628boolean can be used to force or prevent Xen from using speculation barriers to
2629protect lock critical regions.  This mitigation won't be engaged by default,
2630and needs to be explicitly enabled on the command line.
2631
2632On hardware supporting SRSO_MSR_FIX, the `bp-spec-reduce=` option can be used
2633to force or prevent Xen from using MSR_BP_CFG.BP_SPEC_REDUCE to mitigate the
2634SRSO (Speculative Return Stack Overflow) vulnerability.  Xen will use
2635bp-spec-reduce when available, as it is preferable to using `ibpb-entry=hvm`
2636to mitigate SRSO for HVM guests, and because it is a prerequisite to advertise
2637SRSO_U/S_NO to PV guests.
2638
On Sapphire and Emerald Rapids CPUs with May 2025 microcode or later, the
`ibpb-alt=` option can be used to switch to the alternative mitigation for
Intel SA-00982.  Intel suggests that some workloads will benefit from this.
2642
2643### sync_console
2644> `= <boolean>`
2645
2646> Default: `false`
2647
2648Flag to force synchronous console output.  Useful for debugging, but
2649not suitable for production environments due to incurred overhead.
2650
2651### tboot (x86)
2652> `= 0x<phys_addr>`
2653
2654Specify the physical address of the trusted boot shared page.
2655
2656### tbuf_size
2657> `= <integer>`
2658
2659Specify the per-cpu trace buffer size in pages.
2660
2661### tdt (x86)
2662> `= <boolean>`
2663
2664> Default: `true`
2665
2666Flag to enable TSC deadline as the APIC timer mode.
2667
2668### tee (arm)
2669> `= <string>`
2670
Specify the TEE mediator to be probed and used.
2672
2673The default behaviour is to probe all TEEs supported by Xen and use
2674the first one successfully probed. When this parameter is passed, Xen will
2675probe only the TEE mediator passed as argument and boot will fail if this
2676mediator is not properly probed or if the requested TEE is not supported by
2677Xen.
2678
2679This parameter can be set to `optee` or `ffa` if the corresponding mediators
2680are compiled in.
2681
2682### tevt_mask
2683> `= <integer>`
2684
2685Specify a mask for Xen event tracing. This allows Xen tracing to be
2686enabled at boot. Refer to the xentrace(8) documentation for a list of
2687valid event mask values. In order to enable tracing, a buffer size (in
2688pages) must also be specified via the tbuf_size parameter.
2689
2690### tickle_one_idle_cpu
2691> `= <boolean>`
2692
2693### timer_slop
2694> `= <integer>`
2695
2696### tsc (x86)
2697> `= unstable | skewed | stable:socket`
2698
2699### tsx
2700    = <bool>
2701
2702    Applicability: x86 with CONFIG_INTEL active
2703    Default: false on parts vulnerable to TAA, true otherwise
2704
2705Controls for the use of Transactional Synchronization eXtensions.
2706
2707Several microcode updates are relevant:
2708
2709 * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
2710   to date.  Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts.  The
2711   errata workaround uses Performance Counter 3, so the user can select
2712   between working TSX and working perfcounters.
2713
2714 * November 2019, fixing the TSX Async Abort speculative vulnerability.
2715   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
2716   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
2717   and formally retiring HLE from the architecture.  The user can disable TSX
2718   to mitigate TAA, and elect to hide the HLE/RTM CPUID bits.  Also causes
   VERW to once-again flush the microarchitectural buffers in case a TAA
2720   mitigation is wanted along with TSX being enabled.
2721
 * June 2021, removing the workaround for March 2019 on client CPUs and
   formally de-featuring TSX on SKL/KBL/WHL/CFL (Note: SKX still retains the
2724   March 2019 fix).  Introduced the ability to hide the HLE/RTM CPUID bits.
2725   PCR3 works fine, and TSX is disabled by default, but the user can re-enable
2726   TSX at their own risk, accepting that the memory order erratum is unfixed.
2727
2728 * February 2022, removing the VERW flushing workaround from November 2019 on
2729   client CPUs and formally de-featuring TSX on WHL-R/CFL-R (Note: CLX still
2730   retains the VERW flushing workaround).  TSX defaults to disabled, and is
2731   locked off when SGX is enabled in the BIOS.  When SGX is not enabled, TSX
   can be re-enabled at the user's own risk, as it reintroduces the TSX Async
2733   Abort speculative vulnerability.
2734
2735On systems with the ability to configure TSX, this boolean offers system wide
2736control of whether TSX is enabled or disabled.
2737
2738When TSX is disabled, transactions unconditionally abort.  This is compatible
2739with the TSX spec, which requires software to have a non-transactional path as
2740a fallback.  The RTM and HLE CPUID bits are hidden from VMs by default, but
2741can be re-enabled if required.  This allows VMs which previously saw RTM/HLE
2742to be migrated in, although any TSX-enabled software will run with reduced
2743performance.
2744
2745 * When TSX is locked off by firmware, `tsx=` is ignored and treated as
2746   `false`.
2747
2748 * An explicit `tsx=` choice is honoured, even if it is `true` and would
2749   result in a vulnerable system.
2750
2751 * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
2752   mitigated by disabling TSX, as this is the lowest overhead option.
2753
2754 * When no explicit `tsx=` option is given, parts susceptible to the memory
2755   ordering errata default to `true` to enable working TSX.  Alternatively,
2756   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.
2757
2758   SKX and SKL/KBL/WHL/CFL on pre-June 2021 microcode default to `true`.
2759   Alternatively, selecting `tsx=0` will disable TSX and restore PCR3 to a
2760   working state.
2761
2762   SKL/KBL/WHL/CFL on the June 2021 microcode or later default to `false`.
   Alternatively, selecting `tsx=1` will re-enable TSX at the user's own risk.
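
One possible reading of the rules above, expressed as a small decision
function.  This is a hedged Python sketch and not Xen's implementation: the
function name and all four parameter names are hypothetical, and `defeatured`
stands for SKL/KBL/WHL/CFL parts running the June 2021 microcode or later.

```python
def effective_tsx(explicit=None, locked_off=False,
                  taa_vulnerable=False, defeatured=False):
    """Return whether TSX ends up enabled.

    explicit       -- True/False for an explicit tsx= choice, None if absent
    locked_off     -- firmware has locked TSX off
    taa_vulnerable -- part is vulnerable to TAA
    defeatured     -- SKL/KBL/WHL/CFL on June 2021 microcode or later
    """
    if locked_off:
        return False        # tsx= is ignored and treated as false
    if explicit is not None:
        return explicit     # honoured, even if it leaves the system vulnerable
    if taa_vulnerable or defeatured:
        return False        # default: mitigate by disabling TSX
    return True             # default everywhere else
```

Under this reading, `effective_tsx(explicit=True, locked_off=True)` is still
`False`, because a firmware lock takes priority over the command line.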
2764
2765### ucode
2766> `= List of [ <integer> | scan=<bool>, nmi=<bool>, digest-check=<bool> ]`
2767
2768    Applicability: x86
2769    Default: `scan` is selectable via Kconfig, `nmi,digest-check`
2770
2771Controls for CPU microcode loading. For early loading, this parameter can
2772specify how and where to find the microcode update blob. For late loading,
this parameter specifies if the update happens within an NMI handler.
2774
'integer' specifies the CPU microcode update blob module index. When positive,
this specifies the n-th module (in the GRUB entry, zero based) to be used
for updating CPU microcode. When negative, counting starts at the end of
the modules in the GRUB entry (so with the blob commonly being last,
one could specify `ucode=-1`). Note that the value of zero is not valid
2780here (entry zero, i.e. the first module, is always the Dom0 kernel
2781image). Note further that use of this option has an unspecified effect
2782when used with xen.efi (there the concept of modules doesn't exist, and
2783the blob gets specified via the `ucode=<filename>` config file/section
2784entry; see [EFI configuration file description](efi.html)).
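
The index arithmetic for the integer form can be sketched as below.  This is
an illustrative Python model only: the function name `ucode_module_index` is
hypothetical, and rejecting any value that resolves to module zero or falls
outside the module list is an assumption made for the example.

```python
def ucode_module_index(n, module_count):
    """Map ucode=<n> to a zero-based index into the boot module list."""
    if n == 0:
        raise ValueError("module 0 is always the Dom0 kernel image")
    # Negative values count back from the end of the module list.
    index = n if n > 0 else module_count + n
    if not 0 < index < module_count:
        raise ValueError("no such microcode module")
    return index
```

With three modules (Dom0 kernel, initrd, microcode blob), both `ucode=2` and
`ucode=-1` select the last module.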
2785
'scan' instructs the hypervisor to scan the multiboot images for a cpio
2787image that contains microcode. Depending on the platform the blob with the
2788microcode in the cpio name space must be:
2789  - on Intel: kernel/x86/microcode/GenuineIntel.bin
2790  - on AMD  : kernel/x86/microcode/AuthenticAMD.bin
2791When using xen.efi, the `ucode=<filename>` config file setting takes
2792precedence over `scan`. The default value for `scan` is set with
2793`CONFIG_UCODE_SCAN_DEFAULT`.
2794
'nmi' determines whether late loading is performed in the NMI handler or just
in stop_machine context. In the NMI handler, even NMIs are blocked, which is
considered safer. The default value is `true`.
2798
2799The `digest-check=` option is active by default and controls whether to
2800perform additional authenticity checks.  Collisions in the signature algorithm
2801used by AMD Fam17h/19h processors have been found.  Xen contains a table of
2802digests of microcode patches with known-good provenance, and will block
2803loading of patches that do not match.
2804
2805### unrestricted_guest (Intel)
2806> `= <boolean>`
2807
2808### vcpu_migration_delay
2809> `= <integer>`
2810
2811> Default: `0`
2812
2813Specify a delay, in microseconds, between migrations of a VCPU between
2814PCPUs when using the credit1 scheduler. This prevents rapid fluttering
2815of a VCPU between CPUs, and reduces the implicit overheads such as
2816cache-warming. 1ms (1000) has been measured as a good value.
2817
2818### vesa-ram
2819> `= <integer>`
2820
2821> Default: `0`
2822
This allows overriding the amount of video RAM, in MiB, determined to be
2824present.
2825
2826### vga
2827> `= ( ask | current | text-80x<rows> | gfx-<width>x<height>x<depth> | mode-<mode> )[,keep]`
2828
2829`ask` causes Xen to display a menu of available modes and request the
2830user to choose one of them.
2831
2832`current` causes Xen to use the graphics adapter in its current state,
2833without further setup.
2834
2835`text-80x<rows>` instructs Xen to set up text mode.  Valid values for
2836`<rows>` are `25, 28, 30, 34, 43, 50, 80`
2837
2838`gfx-<width>x<height>x<depth>` instructs Xen to set up graphics mode
2839with the specified width, height and depth.
2840
2841`mode-<mode>` instructs Xen to use a specific mode, as shown with the
`ask` option.  (N.B. menu modes are displayed in hex, so `<mode>`
should be a hexadecimal number.)
2844
2845The optional `keep` parameter causes Xen to continue using the vga
2846console even after dom0 has been started.  The default behaviour is to
2847relinquish control to dom0.
2848
2849### viridian-spinlock-retry-count (x86)
2850> `= <integer>`
2851
2852> Default: `2047`
2853
2854Specify the maximum number of retries before an enlightened Windows
2855guest will notify Xen that it has failed to acquire a spinlock.
2856
2857### viridian-version (x86)
2858> `= [<major>],[<minor>],[<build>]`
2859
2860> Default: `6,0,0x1772`
2861
`<major>`, `<minor>` and `<build>` must be integers. The values will be
2863encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.
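
A minimal sketch of how the three optional fields could be read.  This is
illustrative Python, not Xen's parser: the function name
`parse_viridian_version` is hypothetical, and the sketch assumes that an
omitted field keeps its default value.

```python
DEFAULT_VERSION = (6, 0, 0x1772)  # default major, minor, build


def parse_viridian_version(value):
    """Return (major, minor, build); empty fields fall back to the defaults."""
    fields = value.split(",")
    if len(fields) > 3:
        raise ValueError("expected at most three comma separated fields")
    fields += [""] * (3 - len(fields))
    # int(text, 0) accepts both decimal and 0x-prefixed hexadecimal.
    return tuple(int(text, 0) if text else default
                 for text, default in zip(fields, DEFAULT_VERSION))
```

For instance, `parse_viridian_version(",,0x1db1")` changes only the build
number and leaves major and minor at 6 and 0.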
2864
2865### vm-notify-window (Intel)
2866> `= <integer>`
2867
2868> Default: `0`
2869
2870Specify the value of the VM Notify window used to detect locked VMs. Set to -1
2871to disable the feature.  Value is in units of crystal clock cycles.
2872
2873Note the hardware might add a threshold to the provided value in order to make
2874it safe, and hence using 0 is fine.
2875
2876### vpid (Intel)
2877> `= <boolean>`
2878
2879> Default: `true`
2880
2881Use Virtual Processor ID support if available.  This prevents the need for TLB
2882flushes on VM entry and exit, increasing performance.
2883
2884### vpmu (x86)
2885    = List of [ <bool>, bts, ipc, arch, rtm-abort=<bool> ]
2886
2887    Applicability: x86.  Default: false
2888
2889Controls for Performance Monitoring Unit virtualisation.
2890
2891Performance monitoring facilities tend to be very hardware specific, and
2892provide access to a wealth of low level processor information.
2893
2894*   An overall boolean can be used to enable or disable vPMU support.  vPMU is
2895    disabled by default.
2896
2897    When enabled, guests have full access to all performance counter settings,
2898    including model specific functionality.  This is a superset of the
2899    functionality offered by `ipc` and/or `arch`, but a subset of the
2900    functionality offered by `bts`.
2901
2902    Xen's watchdog functionality is implemented using performance counters.
2903    As a result, use of the **watchdog** option will override and disable
2904    vPMU.
2905
2906*   The `bts` option enables performance monitoring, and permits additional
2907    access to the Branch Trace Store controls.  BTS is an Intel feature where
2908    the processor can write data into a buffer whenever a branch occurs.
2909    However, as this feature isn't virtualised, a misconfiguration by the
2910    guest can lock the entire system up.
2911
2912*   The `ipc` option allows access to the most minimal set of counters
2913    possible: instructions, cycles, and reference cycles.  These can be used
2914    to calculate instructions per cycle (IPC).
2915
2916*   The `arch` option allows access to the pre-defined architectural events.
2917
2918*   The `rtm-abort` boolean has been superseded.  Use `tsx=0` instead.
2919
2920*Warning:*
2921As the virtualisation is not 100% safe, don't use the vpmu flag on
2922production systems (see https://xenbits.xen.org/xsa/advisory-163.html)!
2923
2924### vwfi (arm)
2925> `= trap | native`
2926
2927> Default: `trap`
2928
2929WFI is the ARM instruction to "wait for interrupt". WFE is similar and
2930means "wait for event". This option, which is ARM specific, changes the
2931way guest WFI and WFE are implemented in Xen. By default, Xen traps both
2932instructions. In the case of WFI, Xen blocks the guest vcpu; in the case
of WFE, Xen yields the guest vcpu. When setting vwfi to `native`, Xen
2934doesn't trap either instruction, running them in guest context. Setting
2935vwfi to `native` reduces irq latency significantly. It can also lead to
2936suboptimal scheduling decisions, but only when the system is
2937oversubscribed (i.e., in total there are more vCPUs than pCPUs).
2938
2939### wallclock (x86)
2940> `= auto | xen | cmos | efi`
2941
2942> Default: `auto`
2943
2944Allow forcing the usage of a specific wallclock source.
2945
 * `auto` lets the hypervisor select the clocksource based on internal
   heuristics.

 * `xen` forces usage of the Xen shared_info wallclock when booted as a Xen
   guest.  This option is only available if the hypervisor was compiled with
   `CONFIG_XEN_GUEST` enabled.

 * `cmos` forces usage of the CMOS RTC wallclock.

 * `efi` forces usage of the EFI_GET_TIME run-time method when booted from EFI
   firmware.

If the selected option is invalid or not available, Xen will default to `auto`.
2959
2960### watchdog (x86)
2961> `= force | <boolean>`
2962
2963> Default: `false`
2964
2965Run an NMI watchdog on each processor.  If a processor is stuck for
2966longer than the **watchdog_timeout**, a panic occurs.  When `force` is
2967specified, in addition to running an NMI watchdog on each processor,
2968unknown NMIs will still be processed.
2969
2970### watchdog_timeout (x86)
2971> `= <integer>`
2972
2973> Default: `5`
2974
2975Set the NMI watchdog timeout in seconds.  Specifying `0` will turn off
2976the watchdog.
2977
2978### x2apic (x86)
2979> `= <boolean>`
2980
2981> Default: `true`
2982
2983Permit use of x2apic setup for SMP environments.
2984
2985### x2apic-mode (x86)
2986> `= physical | mixed`
2987
2988> Default: `physical` if **FADT** mandates physical mode, otherwise set at
2989>          build time by CONFIG_X2APIC_{PHYSICAL,MIXED}.
2990
2991In the case that x2apic is in use, this option switches between modes to
2992address APICs in the system as interrupt destinations.
2993
2994### x2apic_phys (x86)
2995> `= <boolean>`
2996
2997> Default: `true` if **FADT** mandates physical mode or if interrupt remapping
2998>          is not available, `false` otherwise.
2999
3000In the case that x2apic is in use, this option switches between physical and
3001clustered mode.  The default, given no hint from the **FADT**, is cluster
3002mode.
3003
3004**WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
3005The latter takes precedence if both are set.**
3006
3007### xen-llc-colors (arm64)
3008> `= List of [ <integer> | <integer>-<integer> ]`
3009
3010> Default: `0: the lowermost color`
3011
Specify Xen LLC color configuration. This option is available only when
3013`CONFIG_LLC_COLORING` is enabled.
3014Two colors are most likely needed on platforms where private caches are
3015physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
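
The list syntax, a mix of single colors and inclusive ranges, can be expanded
as sketched below.  This is illustrative Python rather than Xen's parser; the
function name `parse_llc_colors` is hypothetical, and treating a range as
inclusive at both ends is an assumption.

```python
def parse_llc_colors(value):
    """Expand e.g. "0,2-4" into the list of individual colors [0, 2, 3, 4]."""
    colors = []
    for item in value.split(","):
        first, sep, last = item.partition("-")
        start = int(first)
        end = int(last) if sep else start
        if end < start:
            raise ValueError("range end is below range start: " + item)
        colors.extend(range(start, end + 1))
    return colors
```

Under these assumptions, `xen-llc-colors=0-1` would correspond to the two
lowermost colors.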
3016
3017### xenheap_megabytes (arm32)
3018> `= <size>`
3019
3020> Default: `0` (1/32 of RAM)
3021
3022Amount of RAM to set aside for the Xenheap. Must be an integer multiple of 32.
3023
By default, Xen will use 1/32 of the RAM up to a maximum of 1GB and with a
minimum of 32MB, subject to a suitably aligned and sized contiguous
3026region of memory being available.
3027
3028### xpti (x86)
3029> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`
3030
3031> Default: `false` on hardware known not to be vulnerable to Meltdown (e.g. AMD)
3032> Default: `true` everywhere else
3033
3034Override default selection of whether to isolate 64-bit PV guest page
3035tables.
3036
`true` activates page table isolation even on hardware not vulnerable to
3038Meltdown for all domains.
3039
3040`false` deactivates page table isolation on all systems for all domains.
3041
3042`default` sets the default behaviour.
3043
3044With `dom0` and `domu` it is possible to control page table isolation
3045for dom0 or guest domains only.
3046
3047### xsave (x86)
3048> `= <boolean>`
3049
3050> Default: `true`
3051
3052Permit use of the `xsave/xrstor` instructions.
3053
3054### xsm
3055> `= dummy | flask | silo`
3056
3057> Default: selectable via Kconfig.  Depends on enabled XSM modules.
3058
3059Specify which XSM module should be enabled.  This option is only available if
3060the hypervisor was compiled with `CONFIG_XSM` enabled.
3061
3062* `dummy`: this is the default choice.  Basic restriction for common deployment
3063  (the dummy module) will be applied.  It's also used when XSM is compiled out.
* `flask`: this is the policy based access control.  To choose this, the
  separate option in Kconfig must also be enabled.
* `silo`: this will deny any unmediated communication channels between
  unprivileged VMs.  To choose this, the separate option in Kconfig must also
  be enabled.
3069