docs/misc/xen-command-line.pandoc

# Xen Hypervisor Command Line Options

This document covers the command line options which the Xen
Hypervisor accepts.

## Types of parameter

Most parameters take the form `option=value`.  Different options on
the command line should be space delimited.  All options are case
sensitive, as are all values unless explicitly noted.

### Boolean (`<boolean>`)

All boolean options may be explicitly enabled using a `value` of
> `yes`, `on`, `true`, `enable` or `1`

They may be explicitly disabled using a `value` of
> `no`, `off`, `false`, `disable` or `0`

In addition, a boolean option may be enabled by simply stating its
name, and may be disabled by prefixing its name with `no-`.

#### Examples

Enable noreboot mode
> `noreboot=true`

Disable x2apic support (if present)
> `x2apic=off`

Enable synchronous console mode
> `sync_console`

Explicitly specifying any value other than those listed above is
undefined, as is stacking a `no-` prefix with an explicit value.
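
The rules above can be sketched in a few lines of Python.  This is only an
illustration of the documented behaviour, not Xen's own parser; the helper
name `parse_boolean_option` and the use of `None` to mark undefined cases are
assumptions made for the example.

```python
# Hypothetical helper, for illustration only; not Xen's implementation.
TRUE_VALUES = {"yes", "on", "true", "enable", "1"}
FALSE_VALUES = {"no", "off", "false", "disable", "0"}


def parse_boolean_option(token):
    """Return (name, value) for one boolean command line token.

    value is True, False, or None where the text above calls the
    result undefined.
    """
    name, sep, value = token.partition("=")
    if not sep:
        # A bare name enables the option; a "no-" prefix disables it.
        if name.startswith("no-"):
            return name[3:], False
        return name, True
    if name.startswith("no-"):
        # Stacking "no-" with an explicit value is undefined.
        return name[3:], None
    if value in TRUE_VALUES:
        return name, True
    if value in FALSE_VALUES:
        return name, False
    # Any other explicit value is undefined (values are case sensitive).
    return name, None
```

With this sketch, `noreboot=true` and `sync_console` both evaluate to enabled,
while `x2apic=off` and `no-x2apic` both evaluate to disabled.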

### Integer (`<integer>`)

An integer parameter will default to decimal and may be prefixed with
a `-` for negative numbers.  Alternatively, a hexadecimal number may be
used by prefixing the number with `0x`, or an octal number may be used
if a leading `0` is present.

Providing a string which does not validly convert to an integer is
undefined.
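
As an illustration of the base selection described above, the following Python
sketch converts a string to an integer.  The function name `parse_integer` and
the choice to return `None` for undefined input are assumptions made for this
example; Xen's own conversion code may differ in its details.

```python
import string


def parse_integer(text):
    """Return the integer value of text, or None where the result is undefined."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    if body.startswith("0x"):
        # Hexadecimal: a "0x" prefix.
        base, body, allowed = 16, body[2:], string.hexdigits
    elif len(body) > 1 and body.startswith("0"):
        # Octal: a leading "0" followed by further digits.
        base, body, allowed = 8, body[1:], string.octdigits
    else:
        # Decimal is the default.
        base, allowed = 10, string.digits
    if not body or any(ch not in allowed for ch in body):
        return None
    value = int(body, base)
    return -value if negative else value
```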

### Size (`<size>`)

A size parameter may be any integer, with a single size suffix

* `T` or `t`: TiB (2^40)
* `G` or `g`: GiB (2^30)
* `M` or `m`: MiB (2^20)
* `K` or `k`: KiB (2^10)
* `B` or `b`: Bytes

Without a size suffix, the default will be kilo.  Providing a suffix
other than those listed above is undefined.
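
The suffix handling can be sketched as follows.  The function name
`parse_size` is hypothetical, and the sketch assumes a plain decimal number
for simplicity; it is meant to show the arithmetic, not to mirror Xen's
parser.

```python
# Shift applied for each suffix listed above (matched case-insensitively).
SIZE_SHIFTS = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}


def parse_size(text):
    """Return the size in bytes, or None where the result is undefined."""
    if text and text[-1].lower() in SIZE_SHIFTS:
        number, shift = text[:-1], SIZE_SHIFTS[text[-1].lower()]
    else:
        number, shift = text, 10  # no suffix: the default unit is KiB
    # Assumption: only plain decimal digits are handled in this sketch.
    if not (number.isascii() and number.isdigit()):
        return None
    return int(number) << shift
```

Under these assumptions `128M` yields 134217728 bytes, and a bare `512` yields
524288 bytes because it is taken as KiB.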

### String

Many parameters are more complicated and require more intricate
configuration.  The detailed description of each individual parameter
specifies which values are valid.

### List

Some options take a comma separated list of values.

### Combination

Some parameters act as combinations of the above, most commonly a mix
of Boolean and String.  These are noted in the relevant sections.

## Parameter details

### acpi
> `= force | ht | noirq | <boolean> | verbose`

**String**, or **Boolean** to disable.

By default, Xen will scan the DMI data and blacklist certain systems
which are known to have broken ACPI setups.  Providing `acpi=force`
will cause Xen to ignore the blacklist and attempt to use all ACPI
features.

Using `acpi=ht` causes Xen to parse the ACPI tables enough to
enumerate all CPUs, but will not use other ACPI features.  This is not
common, and only has an effect if your system is blacklisted.

The `acpi=noirq` option causes Xen to not parse the ACPI MADT table
looking for IO-APIC entries.  This is also not common, and any system
which requires this option to function should be blacklisted.
Additionally, this will not prevent Xen from finding IO-APIC entries
from the MP tables.

Further, any of the boolean false options can be used to disable ACPI
usage entirely.

Because responsibility for ACPI processing is shared between Xen and
the domain 0 kernel this option is automatically propagated to the
domain 0 command line.

Finally, `acpi=verbose` will enable per-processor information logging
which may otherwise be too noisy in particular on large systems.

### acpi_apic_instance
> `= <integer>`

Specify which ACPI MADT table to parse for APIC information, if more
than one is present.

### acpi_pstate_strict (x86)
> `= <boolean>`

> Default: `false`

Enforce checking that P-state transitions by the ACPI cpufreq driver
actually result in the nominated frequency being established. A warning
message will be logged if that isn't the case.

### acpi_skip_timer_override (x86)
> `= <boolean>`

Instruct Xen to ignore timer-interrupt override.

### acpi_sleep (x86)
> `= s3_bios | s3_mode`

`s3_bios` instructs Xen to invoke video BIOS initialization during S3
resume.

`s3_mode` instructs Xen to set up the boot time (option `vga=`) video
mode during S3 resume.

### allow_unsafe (x86)
> `= <boolean>`

> Default: `false`

Force boot on potentially unsafe systems. By default Xen will refuse
to boot on systems with the following errata:

* AMD Erratum 121. Processors with this erratum are subject to a guest
  triggerable Denial of Service. Override only if you trust all of
  your PV guests.

### altp2m (Intel)
> `= <boolean>`

> Default: `false`

Permit multiple copies of host p2m.

### apic (x86)
> `= bigsmp | default`

Override Xen's logic for choosing the APIC driver.  By default, if
there are more than 8 CPUs, Xen will switch to `bigsmp` over
`default`.

### apicv (Intel)
> `= <boolean>`

> Default: `true`

Permit Xen to use APIC Virtualisation Extensions.  This is an optimisation
available as part of VT-x, and allows hardware to take care of the guest's APIC
handling, rather than requiring emulation in Xen.

### apic_verbosity (x86)
> `= verbose | debug`

Increase the verbosity of the APIC code from the default value.

### arat (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use "Always Running APIC Timer" support on compatible hardware
in combination with cpuidle.  This option is only expected to be useful for
developers wishing Xen to fall back to older timing methods on newer hardware.

### argo
    = List of [ <bool>, mac-permissive=<bool> ]

Controls for the Argo hypervisor-mediated interdomain communication service.

The functionality that this option controls is only available when Xen has been
compiled with Argo enabled in the build configuration.

Argo is an interdomain communication mechanism, where Xen acts as the central
point of authority.  Guests may register memory rings to receive messages,
query the status of other domains, and send messages by hypercall, all subject
to appropriate auditing by Xen.  Argo is disabled by default.

*   The `mac-permissive` boolean controls whether wildcard receive rings may be
    registered (`mac-permissive=1`) or may not be registered
    (`mac-permissive=0`).

    This option is disabled by default, to protect domains from a DoS by a
    buggy or malicious other domain spamming the ring.

### asid (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use Address Space Identifiers.  This is an optimisation which
tags the TLB entries with an ID per vcpu.  This allows for guest TLB flushes
to be performed without the overhead of a complete TLB flush.

### async-show-all (x86)
> `= <boolean>`

> Default: `false`

Forces all CPUs' full state to be logged upon certain fatal asynchronous
exceptions (watchdog NMIs and unexpected MCEs).

### ats (x86)
> `= <boolean>`

> Default: `false`

Permits Xen to set up and use PCI Address Translation Services.  This is a
performance optimisation for PCI Passthrough.

**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
loops for Queued Invalidation completions.**

### availmem
> `= <size>`

> Default: `0` (no limit)

Specify a maximum amount of available memory, to which Xen will clamp
the e820 table.

### badpage
> `= List of [ <integer> | <integer>-<integer> ]`

Specify that certain pages, or certain ranges of pages contain bad
bytes and should not be used.  For example, if your memory tester says
that byte `0x12345678` is bad, you would place `badpage=0x12345` on
Xen's command line.
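
The example above drops the low 12 bits of the byte address to obtain a page
number.  A minimal Python sketch of that conversion, assuming 4 KiB pages
(the constant `PAGE_SHIFT` and the helper name are chosen for illustration
only):

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as implied by the example above


def badpage_for_address(addr):
    """Return the badpage= argument covering the byte at physical address addr."""
    return "badpage=" + hex(addr >> PAGE_SHIFT)
```

For the address in the example, `badpage_for_address(0x12345678)` gives
`badpage=0x12345`.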

### bootscrub
> `= idle | <boolean>`

> Default: `idle`

Scrub free RAM during boot.  This is a safety feature to prevent
accidentally leaking sensitive VM data into other VMs if Xen crashes
and reboots.

In `idle` mode, RAM is scrubbed in background on all CPUs during idle-loop
with a guarantee that memory allocations always provide scrubbed pages.
This option reduces boot time on machines with a large amount of RAM while
still providing security benefits.

### bootscrub_chunk
> `= <size>`

> Default: `128M`

Maximum size of the RAM chunks to be scrubbed whilst holding the page heap lock
and not running softirqs. Reduce this if softirqs are not being run frequently
enough. Setting this to a high value may cause boot failure, particularly if
the NMI watchdog is also enabled.

### buddy-alloc-size (arm64)
> `= <size>`

> Default: `64M`

Amount of memory reserved for the buddy allocator when colored allocator is
active. This option is available only when `CONFIG_LLC_COLORING` is enabled.
The colored allocator is meant as an alternative to the buddy allocator,
because its allocation policy is by definition incompatible with the generic
one. Since the Xen heap system is not colored yet, we need to support the
coexistence of the two allocators for now. This parameter, which is optional
and for experts only, sets the amount of memory reserved for the buddy
allocator.

### cet
    = List of [ <bool>, shstk=<bool>, ibt=<bool> ]

    Applicability: x86

Controls for the use of Control-flow Enforcement Technology.  CET is a group
of hardware features designed to combat Return-oriented Programming (ROP, also
call/jmp COP/JOP) attacks.

CET is incompatible with 32bit PV guests.  If any CET sub-options are active,
they will override the `pv=32` boolean to `false`.  Backwards compatibility
can be maintained with the pv-shim mechanism.

*   An unqualified boolean is a shorthand for setting all suboptions at once.

*   The `shstk=` boolean controls whether Xen uses Shadow Stacks for its own
    protection.

    The option is available when `CONFIG_XEN_SHSTK` is compiled in, and
    generally defaults to `true` on hardware supporting CET-SS.  Specifying
    `cet=no-shstk` will cause Xen not to use Shadow Stacks even when support
    is available in hardware.

    Some hardware suffers from an issue known as Supervisor Shadow Stack
    Fracturing.  On such hardware, Xen will default to not using Shadow Stacks
    when virtualised.  Specifying `cet=shstk` will override this heuristic and
    enable Shadow Stacks unilaterally.

*   The `ibt=` boolean controls whether Xen uses Indirect Branch Tracking for
    its own protection.

    The option is available when `CONFIG_XEN_IBT` is compiled in, and defaults
    to `true` on hardware supporting CET-IBT.  Specifying `cet=no-ibt` will
    cause Xen not to use Indirect Branch Tracking even when support is
    available in hardware.

### clocksource (x86)
> `= pit | hpet | acpi | tsc`

If set, override Xen's default choice for the platform timer.
Using TSC as the platform timer requires it to be selected explicitly. This is because
TSC can only be safely used if CPU hotplug isn't performed on the system. On
some platforms, the "maxcpus" option may need to be used to further adjust
the number of allowed CPUs.  When running on platforms that can guarantee a
monotonic TSC across sockets you may want to adjust the "tsc" command line
parameter to "stable:socket".

### cmci-threshold (Intel)
> `= <integer>`

> Default: `2`

Specify the event count threshold for raising Corrected Machine Check
Interrupts.  Specifying zero disables CMCI handling.

### cmos-rtc-probe (x86)
> `= <boolean>`

> Default: `false`

Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
ACPI indicating none to be there.

### com1 (x86)
### com2 (x86)
> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`

Both options `com1` and `com2` follow the same format.

* `<baud>` may be either an integer baud rate, or the string `auto` if
  the bootloader or other earlier firmware has already set it up.
* Optionally, the base baud rate (usually the highest baud rate the
  device can communicate at) can be specified.
* `DPS` represents the number of data bits, the parity, and the number
  of stop bits.
  * `D` is an integer between 5 and 8 for the number of data bits.
  * `P` is a single character representing the type of parity:
      * `n` No
      * `o` Odd
      * `e` Even
      * `m` Mark
      * `s` Space
  * `S` is an integer 1 or 2 for the number of stop bits.
* `<io-base>` is an integer which specifies the IO base port for UART
  registers.
* `<irq>` is the IRQ number to use, or `0` to use the UART in poll
  mode only, or `msi` to set up a Message Signaled Interrupt.
* `<port-bdf>` is the PCI location of the UART, in
  `<bus>:<device>.<function>` notation.
* `<bridge-bdf>` is the PCI bridge behind which is the UART, in
  `<bus>:<device>.<function>` notation.
* `pci` indicates that Xen should scan the PCI bus for the UART,
  avoiding Intel AMT devices.
* `amt` indicates that Xen should scan the PCI bus for the UART,
  including Intel AMT devices if present.

A typical setup for most situations might be `com1=115200,8n1`.

In addition to the above positional specification for UART parameters,
name=value pair specifications are also supported. This is used to add
flexibility for UART devices which require additional UART parameter
configurations.

The comma separation still delineates positional parameters. Hence,
unless the parameter is explicitly specified with name=value option, it
will be considered a positional parameter.

The syntax consists of
com1=(comma-separated positional parameters),(comma separated name-value pairs)

The accepted name keywords for name=value pairs are:

* `baud` - accepts integer baud rate (eg. 115200) or `auto`
* `bridge`- Similar to bridge-bdf in positional parameters.
            Used to determine the PCI bridge to access the UART device.
            Notation is xx:xx.x `<bus>:<device>.<function>`
* `clock-hz`- accepts large integers to setup UART clock frequencies.
              Do note - these values are multiplied by 16.
* `data-bits` - integer between 5 and 8
* `dev` - accepted values are `pci` OR `amt`. This option is used to
          specify whether the serial device is PCI-based. The `io-base`
          cannot be specified when `dev=pci` or `dev=amt` is used.
* `io-base` - accepts an integer which specifies the IO base port for UART registers
* `irq` - IRQ number to use
* `parity` - accepted values are same as positional parameters
* `port` - Used to specify which port the PCI serial device is located on
           Notation is xx:xx.x `<bus>:<device>.<function>`
* `reg-shift` - register shifts required to set UART registers
* `reg-width` - register width required to set UART registers
                (only accepts 1 and 4)
* `stop-bits` - only accepts 1 or 2 for the number of stop bits

The following are examples of correct specifications:

    com1=115200,8n1,0x3f8,4
    com1=115200,8n1,0x3f8,4,reg-width=4,reg-shift=2
    com1=baud=115200,parity=n,stop-bits=1,io-base=0x3f8,reg-width=4

### conring_size
> `= <size>`

> Default: `conring_size=16k`

Specify the size of the console ring buffer.

### console
> `= List of [ vga | com1[H,L] | com2[H,L] | pv | dbgp | ehci | xhci | none ]`

> Default: `console=com1,vga`

Specify which console(s) Xen should use.

`vga` indicates that Xen should try and use the vga graphics adapter.

`com1` and `com2` indicate that Xen should use serial ports 1 and 2
respectively.  Optionally, these arguments may be followed by an `H` or
`L`.  `H` indicates that transmitted characters will have their MSB
set, while received characters must have their MSB set.  `L` indicates
the converse; transmitted and received characters will have their MSB
cleared.  This allows a single port to be shared by two subsystems
(e.g. console and debugger).

`pv` indicates that Xen should use Xen's PV console. This option is
only available when used together with `pv-in-pvh`.

`dbgp` or `ehci` indicates that Xen should use a USB2 debug port.

`xhci` indicates that Xen should use a USB3 debug port.

`none` indicates that Xen should not use a console.  This option only
makes sense on its own.

### console_timestamps
> `= none | date | datems | boot | raw`

> Default: `none`

> Can be modified at runtime

Specify which timestamp format Xen should use for each console line.

* `none`: No timestamps
* `date`: Date and time information
    * `[YYYY-MM-DD HH:MM:SS]`
* `datems`: Date and time, with milliseconds
    * `[YYYY-MM-DD HH:MM:SS.mmm]`
* `boot`: Seconds and microseconds since boot
    * `[SSSSSS.uuuuuu]`
* `raw`: Raw platform ticks, architecture and implementation dependent
    * `[XXXXXXXXXXXXXXXX]`

For compatibility with the older boolean parameter, specifying
`console_timestamps` alone will enable the `date` option.

### console_to_ring
> `= <boolean>`

> Default: `false`

Flag to indicate whether all guest console output should be copied
into the console ring buffer.

### conswitch
> `= <switch char>[x]`

> Default: `conswitch=a`

> Can be modified at runtime

Specify which character should be used to switch serial input between
Xen and dom0.  The required sequence is CTRL-&lt;switch char&gt; three
times.

The optional trailing `x` indicates that Xen should not automatically
switch the console input to dom0 during boot.  Any other value,
including omission, causes Xen to automatically switch to the dom0
console during dom0 boot.  Use `conswitch=ax` to keep the default switch
character, but have Xen keep the console.
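
The interpretation of the value can be sketched in Python as follows.  The
function name `parse_conswitch` and its return shape are hypothetical, chosen
only to illustrate the rule that a trailing `x`, and nothing else, suppresses
the automatic switch.

```python
def parse_conswitch(value="a"):
    """Return (switch_char, auto_switch_to_dom0) for a conswitch value."""
    if not value:
        return None
    switch_char = value[0]
    # Only a trailing "x" suppresses the automatic switch to dom0.
    auto_switch = value[1:2] != "x"
    return switch_char, auto_switch
```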

### core_parking
> `= power | performance`

> Default: `power`

### cpu_type (x86)
> `= arch_perfmon`

If set, force use of the performance counters for oprofile, rather than detecting
available support.

### cpufreq
> `= none | {{ <boolean> | xen } { [:[powersave|performance|ondemand|userspace][,[<maxfreq>]][,[<minfreq>]]] } [,verbose]} | dom0-kernel | hwp[:[<hdc>][,verbose]]`

> Default: `xen`

Indicate where the responsibility for driving power states lies.  Note that the
choice of `dom0-kernel` is deprecated and not supported by all Dom0 kernels.

* Default governor policy is ondemand.
* `<maxfreq>` and `<minfreq>` are integers which represent max and min processor frequencies
  respectively.
* `verbose` option can be included as a string or also as `verbose=<integer>`
  for `xen`.  It is a boolean for `hwp`.
* `hwp` selects Hardware-Controlled Performance States (HWP) on supported Intel
  hardware.  HWP is a Skylake+ feature which provides better CPU power
  management.  The default is disabled.  If `hwp` is selected, but hardware
  support is not available, Xen will fall back to `cpufreq=xen`.
* `<hdc>` is a boolean to enable Hardware Duty Cycling (HDC).  HDC enables the
  processor to autonomously force physical package components into idle state.
  The default is enabled, but the option only applies when `hwp` is enabled.

There is also support for `;`-separated fallback options:
`cpufreq=hwp;xen,verbose`.  This first tries `hwp` and falls back to `xen` if
unavailable.  Note: The `verbose` suboption is handled globally.  Setting it
for either the primary or fallback option applies to both irrespective of where
it is specified.

Note: grub2 requires `;` to be escaped or quoted, so `"cpufreq=hwp;xen"` should be
specified within double quotes inside grub.cfg.  Refer to the grub2
documentation for more information.

### cpuid (x86)
> `= List of comma separated booleans`

This option allows for fine tuning of the facilities Xen will use, after
accounting for hardware capabilities as enumerated via CPUID.

Unless otherwise noted, options only have any effect in their negative form,
to hide the named feature(s).  Ignoring a feature using this mechanism will
cause Xen not to use the feature, nor offer it as usable to guests.

Currently accepted:

The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
applicable.  They can all be ignored.

`rdrand` and `rdseed` have multiple interactions.

*   For Special Register Buffer Data Sampling (SRBDS, XSA-320, CVE-2020-0543),
    RDRAND and RDSEED can be ignored.

    Due to the absence of microcode to address SRBDS on IvyBridge client
    hardware, the RDRAND feature is hidden by default for guests, unless
    `rdrand` is used in its positive form.  Irrespective of the setting here,
    VMs can use RDRAND if explicitly enabled in guest config file, and VMs
    already using RDRAND can migrate in.

*   The RDRAND feature is disabled by default on AMD Fam15/16 systems, due to
    possible malfunctions after ACPI S3 suspend/resume.  `rdrand` may be used
    in its positive form to override Xen's default behaviour on these systems,
    and make the feature fully usable.

### cpuid_mask_cpu
> `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`

> Applicability: AMD

If none of the other **cpuid_mask_\*** options are given, Xen has a set of
pre-configured masks to make the current processor appear to be of the
specified family/revision.

See below for general information on masking.

**Warning: This option is not fully effective on Family 15h processors or
later.**

### cpuid_mask_ecx
### cpuid_mask_edx
### cpuid_mask_ext_ecx
### cpuid_mask_ext_edx
### cpuid_mask_l7s0_eax
### cpuid_mask_l7s0_ebx
### cpuid_mask_thermal_ecx
### cpuid_mask_xsave_eax
> `= <integer>`

> Applicability: x86.  Default: `~0` (all bits set)

The availability of these options is model specific.  Some processors don't
support any of them, and no processor supports all of them.  Xen will ignore
options on processors which are lacking support.

These options can be used to alter the features visible via the `CPUID`
instruction.  Settings applied here take effect globally, including for Xen
and all guests.

Note: Since Xen 4.7, it is no longer necessary to mask a host to create
migration safety in heterogeneous scenarios.  All necessary CPUID settings
should be provided in the VM configuration file.  Furthermore, it is
recommended not to use this option, as doing so causes an unnecessary
reduction of features at Xen's disposal to manage guests.

### cpuidle (x86)
> `= <boolean>`

### cpuinfo (x86)
> `= <boolean>`

### crash-debug-debugkey
### crash-debug-hwdom
### crash-debug-kexeccmd
### crash-debug-panic
### crash-debug-watchdog
> `= <string>`

> Can be modified at runtime

Specify debug-key actions in cases of crashes. Each of the parameters applies
to a different crash reason. The `<string>` is a sequence of debug key
characters, with `+` having the special meaning of a 10 millisecond pause.

`crash-debug-debugkey` will be used for crashes induced by the `C` debug
key (i.e. manually induced crash).

`crash-debug-hwdom` denotes a crash of dom0.

`crash-debug-kexeccmd` is an explicit request of dom0 to continue with the
kdump kernel via kexec. Only available on hypervisors built with CONFIG_KEXEC.

`crash-debug-panic` is a crash of the hypervisor.

`crash-debug-watchdog` is a crash due to the watchdog timer expiring.

It should be noted that dumping diagnosis data to the console can fail in
multiple ways (missing data, hanging system, ...) depending on the reason
of the crash, which might have left the hypervisor in a bad state. In case
a debug-key action leads to another crash, recursion will be avoided, so no
additional debug-key actions will be performed in this case. A crash in the
early boot phase will not result in any debug-key action, as the system
might not yet be in a state where the handlers can work.

So e.g. `crash-debug-watchdog=0+0r` would dump dom0 state twice with 10
milliseconds between the two state dumps, followed by the run queues of the
hypervisor, if the system crashes due to a watchdog timeout.
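
As a rough illustration of how such a string decomposes, the Python sketch
below expands it into a list of steps.  The function name and the tuple labels
`debug_key` and `pause_ms` are made up for this example and do not correspond
to anything in Xen.

```python
PAUSE_MS = 10  # "+" stands for a 10 millisecond pause


def expand_debug_keys(keys):
    """Expand a crash-debug-* string into an ordered list of steps."""
    steps = []
    for ch in keys:
        if ch == "+":
            steps.append(("pause_ms", PAUSE_MS))
        else:
            steps.append(("debug_key", ch))
    return steps
```

For `0+0r` this yields the debug key `0`, a 10 millisecond pause, the debug
key `0` again, and finally the debug key `r`.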

Depending on the reason of the system crash it might happen that triggering
some debug key action will result in a hang instead of dumping data and then
doing a reboot or crash dump.

### crashinfo_maxaddr
> `= <size>`

> Default: `4G`

Specify the maximum address to allocate certain structures, if used in
combination with the **low_crashinfo** command line option.

### crashkernel
> `= <ramsize-range>:<size>[,...][{@,<}<offset>]`
> `= <size>[{@,<}<offset>]`
> `= <size>,below=offset`

Specify sizes and optionally placement of the crash kernel reservation
area.  The `<ramsize-range>:<size>` pairs indicate how much memory to
set aside for a crash kernel (`<size>`) for a given range of installed
RAM (`<ramsize-range>`).  Each `<ramsize-range>` is of the form
`<start>-[<end>]`.
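
The range selection can be sketched in Python as below.  This is an
illustration only: the function name is hypothetical, ranges are passed in
already converted to bytes, and the sketch assumes that the first matching
range wins and that `<end>` is exclusive, neither of which is stated above.

```python
GIB = 1 << 30
MIB = 1 << 20


def pick_crash_size(ranges, ram_bytes):
    """Return the reservation size for the installed RAM, or None.

    ranges is a list of (start, end, size) tuples in bytes, where end is
    None for an open-ended range such as "2G-".
    """
    for start, end, size in ranges:
        # Assumption: start is inclusive and end is exclusive.
        if ram_bytes >= start and (end is None or ram_bytes < end):
            return size
    return None


# Hypothetical example: the ranges of crashkernel=0-2G:64M,2G-:128M
EXAMPLE_RANGES = [(0, 2 * GIB, 64 * MIB), (2 * GIB, None, 128 * MIB)]
```

Under these assumptions a host with 1 GiB of RAM would reserve 64 MiB, and a
host with 8 GiB would reserve 128 MiB.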

A trailing `@<offset>` specifies the exact address this area should be
placed at, whereas `<` in place of `@` just specifies an upper bound of
the address range the area should fall into.

`<` and `below` are synonymous, the latter being useful for grub2 systems
which would otherwise require escaping of the `<` option.


### credit2_balance_over
> `= <integer>`

### credit2_balance_under
> `= <integer>`

### credit2_cap_period_ms
> `= <integer>`

> Default: `10`

Domains subject to a cap receive a replenishment of their runtime budget
once every cap period interval. Default is 10 ms. The amount of budget
they receive depends on their cap. For instance, a domain with a 50% cap
will receive 50% of 10 ms, so 5 ms.
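
The arithmetic is simply the cap percentage applied to the period.  A small
Python sketch, with a hypothetical function name chosen for illustration:

```python
def budget_per_period_ms(cap_percent, period_ms=10):
    """Runtime budget, in milliseconds, replenished once per cap period."""
    return period_ms * cap_percent / 100
```

With the default period, `budget_per_period_ms(50)` gives 5 ms, matching the
example above.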

### credit2_load_precision_shift
> `= <integer>`

> Default: `18`

Specify the number of bits to use for the fractional part of the
values involved in Credit2 load tracking and load balancing math.

### credit2_load_window_shift
> `= <integer>`

> Default: `30`

Specify the number of bits to use to represent the length of the
window (in nanoseconds) we use for load tracking inside Credit2.
This means that, with the default value (30), we use
2^30 nsec ~= 1 sec long window.
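
The relation between the shift and the window length can be checked with a
short Python sketch (the function name is an assumption made for this
example):

```python
def load_window_seconds(shift=30):
    """Length of the load tracking window, in seconds, for a given shift."""
    return (1 << shift) / 1e9  # 2^shift nanoseconds
```

Each decrement of the shift halves the window, so a value of 29 would give
roughly half a second.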

Load tracking is done by means of a variation of exponentially
weighted moving average (EWMA). The window length defined here
determines for how long previous history of the load is taken into
account. In fact, after a full window has passed, all previous
history is discarded entirely.

A short window will make the load balancer quick at reacting
to load changes, but also short-sighted about previous history
(and hence, e.g., long term load trends). A long window will
make the load balancer thoughtful of previous history (and
hence capable of capturing, e.g., long term load trends), but
also slow in responding to load changes.

The default value of `1 sec` is rather long.

### credit2_runqueue
> `= cpu | core | socket | node | all`

> Default: `socket`

Specify how host CPUs are arranged in runqueues. Runqueues are kept
balanced with respect to the load generated by the vCPUs running on
them. Smaller runqueues (as with `core`) mean more accurate load
balancing (for instance, it will deal better with hyperthreading),
but also more overhead.

Available alternatives, with their meaning, are:
* `cpu`: one runqueue per logical pCPU of the host;
* `core`: one runqueue per physical core of the host;
* `socket`: one runqueue per physical socket (which often,
            but not always, matches a NUMA node) of the host;
* `node`: one runqueue per NUMA node of the host;
* `all`: just one runqueue shared by all the logical pCPUs of
         the host

Regardless of the above choice, Xen attempts to respect
`sched_credit2_max_cpus_runqueue` limit, which may mean more than one runqueue
for the `all` value. If that isn't intended, raise
the `sched_credit2_max_cpus_runqueue` value.

### dbgp
> `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
> `= xhci[ <integer> | @pci<bus>:<slot>.<func> ][,share=<bool>|hwdom]`

Specify the USB controller to use, either by instance number (when going
over the PCI busses sequentially) or by PCI device (must be on segment 0).

Use `ehci` for EHCI debug port, use `xhci` for XHCI debug capability.
XHCI driver will wait indefinitely for the debug host to connect - make sure
the cable is connected.
The `share` option for xhci controls who else can use the controller:
* `no`: use the controller exclusively for console, even hardware domain
  (dom0) cannot use it
* `hwdom`: hardware domain may use the controller too, ports not used for debug
  console will be available for normal devices; this is the default
* `yes`: the controller can be assigned to any domain; it is not safe to assign
  the controller to an untrusted domain

Choosing `share=hwdom` (the default) or `share=yes` allows a domain to reset the
controller, which may cause a small portion of the console output to be lost.

The `share=yes` configuration is not security supported.

### debug_stack_lines
> `= <integer>`

> Default: `20`

Limits the number of lines printed in Xen stack traces.

### debugtrace
> `= [cpu:]<size>`

> Default: `128`

Specify the size of the console debug trace buffer. By specifying `cpu:`
additionally a trace buffer of the specified size is allocated per cpu.
The debug trace feature is only enabled in debugging builds of Xen.

### dit (x86/Intel)
> `= <boolean>`

> Default: `CONFIG_DIT_DEFAULT`

Specify whether Xen and guests should operate in Data Independent Timing
mode (Intel calls this DOITM, Data Operand Independent Timing Mode). Note
that enabling this option cannot guarantee anything beyond what underlying
hardware guarantees (with, where available and known to Xen, respective
tweaks applied).

### dma_bits
> `= <integer>`

Specify the bit width of the DMA heap.

### dom0
    = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
                cpuid-faulting=<bool>, msr-relaxed=<bool>,
                pf-fixup=<bool> ] (x86)

    = List of [ sve=<integer> ] (Arm64)

Controls for how dom0 is constructed on x86 systems.

*   The `pv` and `pvh` options select the virtualisation mode of dom0.

    The `pv` option is only available when `CONFIG_PV` is compiled in.  The
    `pvh` option is only available when `CONFIG_HVM` is compiled in.  When
    both options are compiled in, the default is PV.

    In addition, the following requirements must be met:

    *   The dom0 kernel selected by the boot loader must be capable of the
        selected mode.
    *   For a PVH dom0, the hardware must have VT-x/SVM extensions available.

*   The `shadow` boolean allows dom0 to be explicitly constructed using shadow
    paging.  This option is unavailable when `CONFIG_SHADOW_PAGING` is
    disabled.

    For PVH, dom0 defaults to using HAP on capable hardware, and falls back to
    shadow paging otherwise.  A PVH dom0 cannot be used if Xen is compiled
    without shadow paging support, and the hardware lacks HAP support.

    For PV, the use of dom0 shadow mode is only for development purposes.  PV
    guests do not require any paging support by default.

*   The `verbose` boolean is intended for diagnostics, and prints out extra
    information during the dom0 build.  It defaults to the compile time choice
    of `CONFIG_VERBOSE_DEBUG`.

*   The `cpuid-faulting` boolean is an interim option, is only applicable to
    PV dom0, and defaults to true.

    Before Xen 4.13, the domain builder logic for guest construction depended
    on seeing host CPUID values to function correctly.  As a result, CPUID
    Faulting was never activated for PV dom0's, even on capable hardware.

    In Xen 4.13, the domain builder logic has been fixed, and no longer has
    this dependency.  As a consequence, CPUID Faulting is activated by default
    even for PV dom0's.

    However, as PV dom0's have always seen host CPUID data in the past, there
    is a chance that further dependencies exist.  This boolean can be used to
    restore the pre-4.13 behaviour.  If specifying `no-cpuid-faulting` fixes
    an issue in dom0, please report a bug.

*   The `msr-relaxed` boolean is an interim option, and defaults to false.

    In Xen 4.15, the default behaviour for unhandled MSRs has been changed,
    to avoid leaking host data into guests, and to avoid breaking guest
    logic which uses \#GP probing to identify the availability of MSRs.

    However, this new stricter behaviour has the possibility to break
    guests, and a more 4.14-like behaviour can be selected by specifying
    `dom0=msr-relaxed`.

    If using this option is necessary to fix an issue, please report a bug.

*   The `pf-fixup` boolean is only applicable when using a PVH dom0 and
    defaults to false.

    When running dom0 in PVH mode the dom0 kernel has no way to map MMIO
    regions into its physical memory map.  This mode relies on the Xen dom0
    builder populating the physical memory map with all MMIO regions that dom0
    should access.  However Xen doesn't have a complete picture of the host
    memory map, due to not being able to process ACPI dynamic tables.

    The `pf-fixup` option allows Xen to attempt to add missing MMIO regions
    to the dom0 physical memory map in response to page-faults generated by
    dom0 trying to access unpopulated entries in the memory map.

Enables features on dom0 on Arm systems.

*   The `sve` integer parameter enables Arm SVE usage for Dom0 and sets the
    maximum SVE vector length.  The option is applicable only to Arm64 Dom0
    kernels.

    A value of 0 disables the feature; this is the default.  A value below 0
    makes the feature use the maximum SVE vector length supported by the
    hardware, if SVE is supported.  A value above 0 explicitly sets the
    maximum SVE vector length for Dom0; allowed values are multiples of 128,
    from 128 up to 2048.

    Please note that when a value is specified explicitly and it is above the
    maximum SVE vector length supported by the hardware, domain creation will
    fail and the system will stop.  The same occurs if a positive value is
    given but the platform doesn't support SVE.
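
The rules for `sve` values can be summarised with a small sketch.  It is
written in Python purely for illustration; the function name and the returned
strings are hypothetical and do not correspond to anything in Xen.  It only
checks the syntax rules given above and cannot tell whether the hardware
supports the requested length.

```python
def classify_sve_option(value: int) -> str:
    """Classify a `sve=` value using the rules described in the text.

    Illustrative only: the name and return strings are invented for this
    sketch and are not part of Xen.
    """
    if value == 0:
        return "disabled"          # the default
    if value < 0:
        return "hardware maximum"  # largest length the hardware supports
    if value % 128 == 0 and 128 <= value <= 2048:
        return "explicit"          # may still exceed the hardware maximum
    raise ValueError(f"invalid SVE vector length: {value}")


if __name__ == "__main__":
    for candidate in (0, -1, 256, 2048):
        print(candidate, classify_sve_option(candidate))
```

A value such as 100 or 4096 is rejected by the sketch because it is not a
multiple of 128 within the documented range.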

### dom0-cpuid
    = List of comma separated booleans

    Applicability: x86

This option allows for fine tuning of the facilities dom0 will use, after
accounting for hardware capabilities and Xen settings as enumerated via CPUID.

Options are accepted in positive and negative form, to enable or disable
specific features.  All selections via this mechanism are subject to normal
CPU Policy safety and dependency logic.

This option is intended for developers to opt dom0 into non-default features,
and is not intended for use in production circumstances.  If using this option
is necessary to fix an issue, please report a bug.

### dom0-iommu
    = List of [ passthrough=<bool>, strict=<bool>, map-inclusive=<bool>,
                map-reserved=<bool>, none ]

Controls for the dom0 IOMMU setup.

*   The `passthrough` boolean controls whether IOMMU translation functionality
    is disabled for devices in dom0 (`passthrough=1`) or whether the IOMMU is
    used to ensure that dom0 can only DMA to its permitted areas of RAM
    (`passthrough=0`).

    This option is only applicable to x86 PV dom0's, and defaults to false.

    Some older Intel VT-d hardware isn't capable of disabling translation
    functionality on a per-device basis, and will cause this option to be
    ignored and assumed to be 0.  Similar behaviour on such systems is only
    available by fully disabling all IOMMUs.

    This option is hardwired to false for x86 PVH dom0's (where a non-identity
    transform is required for dom0 to function), and is ignored for ARM.

*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
    false.  It controls whether dom0 can have IOMMU mappings for all domain
    RAM in the system, or only for its allocated RAM (and grant mappings etc.)

    This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
    other domains in the system doesn't live in a compatible address space),
    and is ignored for ARM.

*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up
    identity IOMMU mappings for all non-RAM regions below 4GB except for
    unusable ranges, and ranges belonging to Xen.

    Typically, some devices in a system use bits of RAM for communication, and
    these areas should be listed as reserved in the E820 table and identified
    via RMRR or IVMD entries in the ACPI tables, so Xen can ensure that they
    are identity-mapped in the IOMMU.  However, some firmware makes mistakes,
    and this option is a coarse-grain workaround for those errors.

    Where possible, finer grain corrections should be made with the `rmrr=`,
    `ivmd=`, `ivrs_hpet[]=`, or `ivrs_ioapic[]=` command line options.

    This option is disabled by default, is deprecated, and is intended for
    removal in future versions of Xen.  If specifying `map-inclusive` is the
    only way to make your system boot, please report a bug.

*   The `map-reserved` functionality is very similar to `map-inclusive`.

    The differences from `map-inclusive` are that `map-reserved` is applicable
    to both x86 PV and PVH dom0's, is enabled by default, and represents a
    subset of the correction by only mapping reserved memory regions rather
    than all non-RAM regions.

*   The `none` option is intended for development purposes only, and skips
    certain safety checks pertaining to the correct IOMMU configuration for
    dom0 to boot.

    Incorrect use of this option may result in a malfunctioning system.

### dom0_ioports_disable (x86)
> `= List of <hex>-<hex>`

Specify a list of IO ports to be excluded from dom0 access.

### dom0-llc-colors (arm64)
> `= List of [ <integer> | <integer>-<integer> ]`

> Default: `All available LLC colors`

Specify dom0 LLC color configuration. This option is available only when
`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
colors are used.

### dom0_max_vcpus

Either:

> `= <integer>`.

The number of VCPUs to give to dom0.  This number of VCPUs can be more
than the number of PCPUs on the host.  The default is the number of
PCPUs.

Or:

> `= <min>-<max>` where `<min>` and `<max>` are integers.

Gives dom0 a number of VCPUs equal to the number of PCPUs, but always
at least `<min>` and no more than `<max>`.  Using `<min>` may give
more VCPUs than PCPUs.  `<min>` or `<max>` may be omitted and the
defaults of 1 and unlimited respectively are used instead.

For example, with `dom0_max_vcpus=4-8`:

>        Number of
>     PCPUs | Dom0 VCPUs
>      2    |  4
>      4    |  4
>      6    |  6
>      8    |  8
>     10    |  8
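
The `<min>-<max>` form amounts to clamping the PCPU count.  The following
Python sketch reproduces the table above; the function name is hypothetical
and the code is a reading of the description, not Xen's implementation.

```python
from typing import Optional


def dom0_vcpus(pcpus: int,
               vmin: Optional[int] = None,
               vmax: Optional[int] = None) -> int:
    """Return the dom0 VCPU count for `dom0_max_vcpus=<min>-<max>`.

    An omitted <min> defaults to 1 and an omitted <max> means unlimited,
    as stated in the text.
    """
    lower = 1 if vmin is None else vmin
    count = max(pcpus, lower)      # at least <min>, even above the PCPU count
    if vmax is not None:
        count = min(count, vmax)   # never more than <max>
    return count


if __name__ == "__main__":
    # Same inputs as the dom0_max_vcpus=4-8 table.
    for pcpus in (2, 4, 6, 8, 10):
        print(pcpus, dom0_vcpus(pcpus, 4, 8))
```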

### dom0_mem (ARM)
> `= <size>`

Set the amount of memory for the initial domain (dom0). It must be
greater than zero. This parameter is required (and only used) when the initial
domain is not described in the Device-Tree.

### dom0_mem (x86)
> `= List of ( min:<sz> | max:<sz> | <sz> )`

Set the amount of memory for the initial domain (dom0). If a size is
positive, it represents an absolute value.  If a size is negative, it
is subtracted from the total available memory.

* `<sz>` specifies the exact amount of memory.
* `min:<sz>` specifies the minimum amount of memory.
* `max:<sz>` specifies the maximum amount of memory.

If `<sz>` is not specified, the default is all the available memory
minus some reserve.  The reserve is 1/16 of the available memory or
128 MB (whichever is smaller).
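
As a rough illustration of the default, the arithmetic can be sketched in
Python as below.  The function name is hypothetical, "128 MB" is assumed to
mean 128 MiB, and the sketch works in bytes and ignores any other adjustment
Xen may apply.

```python
MIB = 1 << 20
GIB = 1 << 30


def default_dom0_mem(available_bytes: int) -> int:
    """Available memory minus the reserve described in the text.

    Assumption: "128 MB" is treated as 128 MiB.
    """
    reserve = min(available_bytes // 16, 128 * MIB)
    return available_bytes - reserve


if __name__ == "__main__":
    # On a small host the 1/16 rule applies; on larger hosts the cap applies.
    print(default_dom0_mem(1 * GIB) // MIB, "MiB")
    print(default_dom0_mem(16 * GIB) // MIB, "MiB")
```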

The amount of memory will be at least the minimum but never more than
the maximum (i.e., `max` overrides the `min` option).  If there isn't
enough memory then as much as possible is allocated.

`max:<sz>` also sets the maximum reservation (the maximum amount of
memory dom0 can balloon up to).  If this is omitted then the maximum
reservation is unlimited.

For example, to set dom0's initial memory allocation to 512MB but
allow it to balloon up as far as 1GB use `dom0_mem=512M,max:1G`

> `<sz>` is: `<size> | [<size>+]<frac>%`
> `<frac>` is an integer < 100

* `<frac>` specifies a fraction of host memory size in percent.

So `<sz>` being `1G+25%` on a 256 GB host would result in 65 GB.
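
The `<sz>` grammar can be illustrated with a short Python sketch.  The
function names are hypothetical, only non-negative decimal sizes are handled,
and a `<frac>` of 0 to 99 is assumed to be acceptable; none of this is taken
from Xen's parser.

```python
SUFFIXES = {"t": 1 << 40, "g": 1 << 30, "m": 1 << 20, "k": 1 << 10, "b": 1}
GIB = 1 << 30


def parse_size(text: str) -> int:
    """Parse a decimal <size>; without a suffix the unit is KiB."""
    unit = SUFFIXES.get(text[-1].lower())
    if unit is None:
        return int(text) * 1024
    return int(text[:-1]) * unit


def resolve_sz(text: str, host_bytes: int) -> int:
    """Resolve `<size> | [<size>+]<frac>%` to a number of bytes."""
    if not text.endswith("%"):
        return parse_size(text)
    size_part, _, frac_part = text[:-1].rpartition("+")
    frac = int(frac_part)
    if not 0 <= frac < 100:
        raise ValueError("<frac> must be an integer below 100")
    base = parse_size(size_part) if size_part else 0
    return base + host_bytes * frac // 100


if __name__ == "__main__":
    # The example from the text: 1G+25% on a 256 GB host.
    print(resolve_sz("1G+25%", 256 * GIB) // GIB, "GiB")
```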

If you use this option then it is highly recommended that you disable
any dom0 autoballooning feature present in your toolstack. See the
_xl.conf(5)_ man page or [Xen Best
Practices](https://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning).

This option has no effect if pv-shim mode is enabled.

### dom0_nodes (x86)

> `= List of [ <integer> | relaxed | strict ]`

> Default: `strict`

Specify the NUMA nodes to place Dom0 on. Defaults for vCPU-s created
and memory assigned to Dom0 will be adjusted to match the node
restrictions set up here. Note that the values to be specified here are
ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
affinities to prefer, but not be limited to, the specified node(s).

### dom0_vcpus_pin
> `= <boolean>`

> Default: `false`

Pin dom0 vcpus to their respective pcpus.

### dtuart (ARM)
> `= path [:options]`

> Default: `""`

Specify the full path in the device tree for the UART.  If the path doesn't
start with `/`, it is assumed to be an alias.  The options are device specific.

### e820-mtrr-clip (x86)
> `= <boolean>`

Flag that specifies if RAM should be clipped to the highest cacheable
MTRR.

> Default: `true` on Intel CPUs, otherwise `false`

### e820-verbose (x86)
> `= <boolean>`

> Default: `false`

Flag that enables verbose output when processing e820 information and
applying clipping.

### edd (x86)
> `= off | on | skipmbr`

Control retrieval of Extended Disc Data (EDD) from the BIOS during
boot.

### edid (x86)
> `= no | force`

Either force retrieval of monitor EDID information via VESA DDC, or
disable it (edid=no). This option should not normally be required
except for debugging purposes.

### efi
    = List of [ rs=<bool>, attr=no|uc ]

Controls for interacting with the system Extensible Firmware Interface.

*   The `rs` boolean controls whether Runtime Services are used.  By default,
    Xen uses Runtime Services itself, and proxies certain calls on behalf of
    dom0.  Selecting `rs=0` prohibits all use of Runtime Services.

*   The `attr=` string exists to specify what to do with memory regions of
    unknown/unrecognised cacheability.  `attr=no` is the default and will
    leave the memory regions unmapped, while `attr=uc` will map them as fully
    uncacheable.

### ept
> `= List of [ ad=<bool>, pml=<bool>, exec-sp=<bool> ]`

> Applicability: Intel

Extended Page Tables are a feature of Intel's VT-x technology, whereby
hardware manages the virtualisation of HVM guest pagetables.  EPT was
introduced with the Nehalem architecture.

*   The `ad` boolean controls hardware tracking of Access and Dirty bits in the
    EPT pagetables, and was first introduced in Broadwell Server.

    By default, Xen will use A/D tracking when available in hardware, except
    on Avoton processors affected by erratum AVR41.  Explicitly choosing
    `ad=0` will disable the use of A/D tracking on capable hardware, whereas
    choosing `ad=1` will cause tracking to be used even on AVR41-affected
    hardware.

*   The `pml` boolean controls the use of Page Modification Logging, which
    was also introduced in Broadwell Server.

    PML is a feature whereby the processor generates a list of pages which
    have been dirtied.  This is necessary information for operations such as
    live migration, and having the processor maintain the list of dirtied
    pages is more efficient than traditional software implementations where
    all guest writes trap into Xen so the dirty bitmap can be maintained.

    By default, Xen will use PML when it is available in hardware.  PML
    functionally depends on A/D tracking, so choosing `ad=0` will implicitly
    disable PML.  `pml=0` can be used to prevent the use of PML on otherwise
    capable hardware.

*   The `exec-sp` boolean controls whether EPT superpages with execute
    permissions are permitted.  In general this is good for performance.

    However, on processors vulnerable to CVE-2018-12207, HVM guest kernels can
    use executable superpages to crash the host.  By default, executable
    superpages are disabled on affected hardware.

    If HVM guest kernels are trusted not to mount a DoS against the system,
    this option can be enabled to regain performance.

    This boolean may be modified at runtime using `xl set-parameters
    ept=[no-]exec-sp` to switch between fast and secure.

    *   When switching from secure to fast, preexisting HVM domains will run
        at their current performance until they are rebooted; new domains will
        run without any overhead.

    *   When switching from fast to secure, all HVM domains will immediately
        suffer a performance penalty.

    **Warning: No guarantee is made that this runtime option will be retained
      indefinitely, or that it will retain this exact behaviour.  It is
      intended as an emergency option for people who first chose fast, then
      change their minds to secure, and wish not to reboot.**

### extra_guest_irqs (x86)
> `= [<domU number>][,<dom0 number>]`

> Default: `32,<variable>`

Change the number of PIRQs available for guests.  The optional first number is
common for all domUs, while the optional second number (preceded by a comma)
is for dom0.  Changing the setting for domU has no impact on dom0 and vice
versa.  For example to change dom0 without changing domU, use
`extra_guest_irqs=,512`.  The default value for Dom0 and an eventual separate
hardware domain is architecture dependent.  The upper limit for both values on
x86 is such that the resulting total number of IRQs can't be higher than 32768.
Note that specifying zero as domU value means zero, while for dom0 it means
to use the default.  Note further that the Dom0 setting has no useful meaning
for the PVH case; use of the option may have an adverse effect there, though.
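
The two optional fields can be separated as in the Python sketch below.  The
function name is hypothetical, `None` stands for "use the default", and the
upper limit on the total number of IRQs is not checked.

```python
from typing import Optional, Tuple


def parse_extra_guest_irqs(value: str) -> Tuple[Optional[int], Optional[int]]:
    """Split `[<domU number>][,<dom0 number>]` into (domU, dom0).

    None means the field was omitted and the default applies.
    """
    domu_text, _, dom0_text = value.partition(",")
    domu = int(domu_text) if domu_text else None
    dom0 = int(dom0_text) if dom0_text else None
    if dom0 == 0:
        dom0 = None  # per the text, zero for dom0 means "use the default"
    return domu, dom0


if __name__ == "__main__":
    # Change dom0 only, as in the example from the text.
    print(parse_extra_guest_irqs(",512"))
```

Note the asymmetry described above: a domU value of zero is kept as zero,
whereas a dom0 value of zero falls back to the default.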

### ext_regions (Arm)
> `= <boolean>`

> Default: `true`

Flag to enable or disable support for extended regions for Dom0 and
Dom0less DomUs.

Extended regions are ranges of unused address space exposed to the guest
as "safe to use" for special memory mappings. Disable if your board
device tree is incomplete.

### flask
> `= permissive | enforcing | late | disabled`

> Default: `enforcing`

Specify how the FLASK security server should be configured.  This option is only
available if the hypervisor was compiled with FLASK support.  This can be
enabled by running either:

- `make -C xen config` and enabling XSM and FLASK.
- `make -C xen menuconfig` and enabling 'FLux Advanced Security Kernel support' and 'Xen Security Modules support'.

* `permissive`: This is intended for development and is not suitable for use
  with untrusted guests.  If a policy is provided by the bootloader, it will be
  loaded; errors will be reported to the ring buffer but will not prevent
  booting.  The policy can be changed to enforcing mode using "xl setenforce".
* `enforcing`: This will cause the security server to enter enforcing mode prior
  to the creation of domain 0.  If a valid policy is not provided by the
  bootloader and no built-in policy is present, the hypervisor will not continue
  booting.
* `late`: This disables loading of the built-in security policy or the policy
  provided by the bootloader.  FLASK will be enabled but will not enforce access
  controls until a policy is loaded by a domain using "xl loadpolicy".  Once a
  policy is loaded, FLASK will run in enforcing mode unless "xl setenforce" has
  changed that setting.
* `disabled`: This causes the XSM framework to revert to the dummy module.  The
  dummy module provides the same security policy as is used when compiling the
  hypervisor without support for XSM.  The xsm_op hypercall can also be used to
  switch to this mode after boot, but there is no way to re-enable FLASK once
  the dummy module is loaded.

### font
> `= <height>` where height is `8x8 | 8x14 | 8x16`

Specify the font size when using the VESA console driver.

### force-ept (Intel)
> `= <boolean>`

> Default: `false`

Allow EPT to be enabled when VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is not
present.

*Warning:*
Due to CVE-2013-2212, VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is by default
required as a prerequisite for using EPT.  If you are not using PCI Passthrough,
or trust the guest administrator who would be using passthrough, then the
requirement can be relaxed.  This option is particularly useful for nested
virtualization, to allow the L1 hypervisor to use EPT even if the L0 hypervisor
does not provide `VM_ENTRY_LOAD_GUEST_PAT`.

### gnttab
> `= List of [ max-ver:<integer>, transitive=<bool>, transfer=<bool> ]`

> Default (Arm): `gnttab=max-ver:1`
> Default (x86,PV): `gnttab=max-ver:2,transitive,transfer`
> Default (x86,HVM): `gnttab=max-ver:2,transitive`

Control various aspects of the grant table behaviour available to guests.

* `max-ver` Select the maximum grant table version to offer to guests.  Valid
versions are 1 and 2.
* `transitive` Permit or disallow the use of transitive grants.  Note that the
use of grant table v2 without transitive grants is an ABI breakage from the
guest's point of view.
* `transfer` Permit or disallow the GNTTABOP_transfer operation of the
grant table hypercall.  Note that disallowing GNTTABOP_transfer is an ABI
breakage from the guest's point of view.  This option is only available on
hypervisors configured to support PV guests.

The usage of gnttab v2 is not security supported on ARM platforms.

### gnttab_max_frames
> `= <integer>`

> Default: `64`

> Can be modified at runtime

Specify the default upper bound on the number of frames which any domain may
use as part of its grant table unless a different value is specified at domain
creation.

Note this value is the effective upper bound for dom0.

### gnttab_max_maptrack_frames
> `= <integer>`

> Default: `1024`

> Can be modified at runtime

Specify the default upper bound on the number of frames which any domain may
use as part of its maptrack array unless a different value is specified at
domain creation.

Note this value is the effective upper bound for dom0.

### global-pages
    = <boolean>

    Applicability: x86
    Default: true unless running virtualized on AMD or Hygon hardware

Control whether to use global pages for PV guests, and thus the need to
perform TLB flushes by writing to CR4.  This is a performance trade-off.

AMD SVM does not support selective trapping of CR4 writes, which means that a
global TLB flush (two CR4 writes) takes two VMExits, a cost which massively
outweighs the benefit of using global pages to begin with.  This case is easy
for Xen to spot, and is accounted for in the default setting.

Other cases where this option might be a benefit are on VT-x hardware when
selective CR4 writes are not supported/enabled by the hypervisor, or in any
virtualised case using shadow paging.  These are not easy for Xen to spot, so
are not accounted for in the default setting.

### guest_loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `guest_loglvl=none/warning`

> Can be modified at runtime

Set the logging level for Xen guests.  Any log message with equal or
more importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.

### hap (x86)
> `= <boolean>`

> Default: `true`

Flag to globally enable or disable support for Hardware Assisted
Paging (HAP).

### hap_1gb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 1 GB host page table support for Hardware Assisted
Paging (HAP).

### hap_2mb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 2 MB host page table support for Hardware Assisted
Paging (HAP).

### hardware_dom
> `= <domid>`

> Default: `0`

Enable late hardware domain creation using the specified domain ID.  This is
intended to be used when domain 0 is a stub domain which builds a disaggregated
system including a hardware domain with the specified domain ID.  This option is
supported only when compiled with XSM on x86.

### hest_disable
> `= <boolean>`

> Default: `false`

Control Xen's use of the APEI Hardware Error Source Table, should one be found.

### highmem-start (x86)
> `= <size>`

Specify the memory boundary past which memory will be treated as highmem (x86
debug hypervisor only).

### hmp-unsafe (arm)
> `= <boolean>`

> Default: `false`

Say yes at your own risk if you want to enable heterogeneous computing
(such as big.LITTLE). This may result in an unstable and insecure
platform, unless you manually specify the cpu affinity of all domains so
that all vcpus are scheduled on the same class of pcpus (big or LITTLE
but not both). vcpu migration between big cores and LITTLE cores is not
supported. See docs/misc/arm/big.LITTLE.txt for more information.

When the hmp-unsafe option is disabled (default), CPUs that are not
identical to the boot CPU will be parked and not used by Xen.

### hpet
    = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ]

    Applicability: x86

Controls Xen's use of the system's High Precision Event Timer.  By default,
Xen will use an HPET when available and not subject to errata.  Use of the
HPET can be disabled by specifying `hpet=0`.

 * The `broadcast` boolean is disabled by default, but forces Xen to keep
   using the broadcast for CPUs in deep C-states even when an RTC interrupt is
   enabled.  This then also affects raising of the RTC interrupt.

 * The `legacy-replacement` boolean allows for control over whether Legacy
   Replacement mode is enabled.

   Legacy Replacement mode is intended for hardware which does not have an
   8254 PIT, and allows the HPET to be configured into a compatible mode.
   Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for
   power saving reasons, and there is no platform-agnostic mechanism for
   discovering this.

   By default, Xen will not change hardware configuration, unless the PIT
   appears to be absent, at which point Xen will try to enable Legacy
   Replacement mode before falling back to pre-IO-APIC interrupt routing
   options.

   This behaviour can be inhibited by specifying `legacy-replacement=0`.
   Alternatively, this mode can be enabled unconditionally (if available) by
   specifying `legacy-replacement=1`.

### hpetbroadcast (x86)
> `= <boolean>`

Deprecated alternative of `hpet=broadcast`.

### hvm_debug (x86)
> `= <integer>`

The specified value is a bit mask with the individual bits having the
following meaning:

>     Bit  0 - debug level 0 (unused at present)
>     Bit  1 - debug level 1 (Control Register logging)
>     Bit  2 - debug level 2 (VMX logging of MSR restores when context switching)
>     Bit  3 - debug level 3 (unused at present)
>     Bit  4 - I/O operation logging
>     Bit  5 - vMMU logging
>     Bit  6 - vLAPIC general logging
>     Bit  7 - vLAPIC timer logging
>     Bit  8 - vLAPIC interrupt logging
>     Bit  9 - vIOAPIC logging
>     Bit 10 - hypercall logging
>     Bit 11 - MSR operation logging

Recognized in debug builds of the hypervisor only.
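
Since the value is a bit mask, several logging categories are combined by
OR-ing their bits.  The Python sketch below builds such a value; the bit
positions come from the table above, while the short category names and
function names are labels invented for this example.

```python
# Bit positions copied from the table above.  The short names are labels
# invented for this sketch, not identifiers used by Xen.
HVM_DEBUG_BITS = {
    "level0": 0,
    "control-registers": 1,
    "msr-restore": 2,
    "level3": 3,
    "io": 4,
    "vmmu": 5,
    "vlapic": 6,
    "vlapic-timer": 7,
    "vlapic-interrupt": 8,
    "vioapic": 9,
    "hypercall": 10,
    "msr": 11,
}


def hvm_debug_mask(*names: str) -> int:
    """OR together the bits for the requested logging categories."""
    mask = 0
    for name in names:
        mask |= 1 << HVM_DEBUG_BITS[name]
    return mask


def hvm_debug_option(*names: str) -> str:
    """Format the mask as a command line option in hexadecimal."""
    return f"hvm_debug={hvm_debug_mask(*names):#x}"


if __name__ == "__main__":
    # I/O operation logging (bit 4) plus MSR operation logging (bit 11).
    print(hvm_debug_option("io", "msr"))  # hvm_debug=0x810
```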

### hvm_fep (x86)
> `= <boolean>`

> Default: `false`

Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
arbitrary instructions.

This option is intended for development and testing purposes.

*Warning*
As this feature opens up the instruction emulator to arbitrary
instructions from an HVM guest, don't use this in production systems. No
security support is provided when this flag is set.

### hvm_port80 (x86)
> `= <boolean>`

> Default: `true`

Specify whether guests are to be given access to physical port 80
(often used for debugging purposes), to override the DMI based
detection of systems known to misbehave upon accesses to that port.

### idle_latency_factor (x86)
> `= <integer>`

### ioapic_ack (x86)
> `= old | new`

> Default: `new` unless directed-EOI is supported

### iommu
    = List of [ <bool>, verbose, debug, force, required,
                quarantine=<bool>|scratch-page,
                sharept, superpages, intremap, intpost, crash-disable,
                snoop, qinval, igfx, amd-iommu-perdev-intremap,
                dom0-{passthrough,strict} ]

    All sub-options are boolean in nature.

I/O Memory Management Units perform a function similar to the CPU MMU (hence the
name), but typically exist as a discrete device, integrated as part of a PCI
Root Complex.  The most common configuration is to have one IOMMU per package
(for on-die PCIe devices and directly attached PCIe lanes), and one IOMMU
covering the remaining I/O in the system.

The functionality in an IOMMU commonly falls into two orthogonal categories:

1.  DMA remapping which uses a pagetable-like hierarchical structure and maps
    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
    terminology).

2.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
    requests, including their routing to specific CPUs.

IOMMU functionality can be used to provide a translation which the hardware
device driver isn't aware of (e.g. PCI Passthrough and a native driver inside
the guest) and/or to enforce fine-grained control over the memory and
interrupts which a device is attempting to access.

By default, IOMMUs are configured for use if they are available.  An overall
boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.

*   The `verbose` and `debug` booleans can be used to print additional
    diagnostic information.  Neither is active by default.

*   The `force` and `required` booleans are synonymous and, when requested,
    will prevent Xen from booting if IOMMUs aren't discovered and enabled
    successfully.

*   The `quarantine` option can be used to control Xen's behavior when
    de-assigning devices from guests.  The default behaviour is chosen at
    compile time, and is one of `CONFIG_IOMMU_QUARANTINE_{NONE,BASIC,SCRATCH_PAGE}`.

    When a PCI device is assigned to an untrusted domain, it is possible
    for that domain to program the device to DMA to an arbitrary address.
    The IOMMU is used to protect the host from malicious DMA by making
    sure that the device addresses can only target memory assigned to the
    guest.  However, when the guest domain is torn down, assigning the
    device back to the hardware domain would allow any in-flight DMA to
    potentially target critical host data.  To avoid this, quarantining
    should be enabled.  Quarantining can be done in two ways: In its basic
    form, all in-flight DMA will simply be forced to encounter IOMMU
    faults.  Since there are systems where doing so can cause host lockup,
    an alternative form is available where accesses to memory will be directed
    to a scratch page. The implication here is that such accesses will go
    unnoticed, i.e. an admin may not become aware of the underlying problem.

    Therefore, if this option is set to true (the default), Xen always
    quarantines such devices; they must be explicitly assigned back to Dom0
    before they can be used there again.  If set to "scratch-page", still
    active DMA operations will additionally be directed to a "scratch" page.  If
    set to false, Xen will only quarantine devices the toolstack has arranged
    for getting quarantined, and only in the "basic" form.

    This option is only valid on builds supporting PCI.

*   The `sharept` boolean controls whether the IOMMU pagetables are shared
    with the CPU-side HAP pagetables, or allocated separately.  Sharing
    reduces the memory overhead, but doesn't work in combination with CPU-side
    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
    assigned.

    Due to implementation choices, sharing pagetables doesn't work on AMD
    hardware, and this option is ignored.  It is enabled by default on Intel
    systems.

    This option is ignored on ARM, and the pagetables are always shared.

*   The `superpages` boolean controls whether superpage mappings may be used
    in IOMMU page tables.  If using this option is necessary to fix an issue,
    please report a bug.

    This option is only valid on x86.

*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and
    is active by default on compatible hardware.  On x86 systems, the first
    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
    appeared in the second generation.

    This option is only valid on x86.

*   The `intpost` boolean controls the Posted Interrupt sub-feature.  In
    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
    be configured to deliver interrupts from assigned PCI devices directly
    into the guest, without trapping out into hypervisor context.

    This option depends on `intremap`, and is disabled by default due to some
    corner cases in the implementation which have yet to be resolved.

    This option is only valid on x86, and only builds of Xen with HVM support.

*   The `crash-disable` boolean controls disabling IOMMU functionality (DMAR/IR/QI)
    before switching to a crash kernel. This option is inactive by default and
    is for compatibility with older kdump kernels only. Modern kernels copy
    all the necessary tables from the previous one following kexec which makes
    the transition transparent for them with IOMMU functions still on.

The following options are specific to Intel VT-d hardware:

*   The `snoop` boolean controls the Snoop Control sub-feature, and is active
    by default on compatible hardware.

    An incoming DMA request may specify _Snooped_ (query the CPU caches for
    the appropriate lines) or _Non-Snooped_ (don't query the CPU caches).
    _Non-Snooped_ accesses incur less latency, but behind-the-scenes
    hypervisor activity can invalidate the expectations of the device driver,
    and Snoop Control allows the hypervisor to force DMA requests to be
    _Snooped_ when they would otherwise not be.

*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
    active by default on compatible hardware.  Queued Invalidation is a
    feature in second-generation IOMMUs and is a functional prerequisite for
    Interrupt Remapping. Note that Xen disregards this setting for Intel VT-d
    version 6 and greater as Register-Based Invalidation isn't supported
    by them.

*   The `igfx` boolean is active by default, and controls whether IOMMUs in
    front of solely graphics devices get enabled or not.

    It is intended as a debugging mechanism for graphics issues, and to be
    similar to Linux's `intel_iommu=igfx_off` option.  If specifying `no-igfx`
    fixes anything, please report the problem.

The following options are specific to AMD-Vi hardware:

*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
    remapping table is per device (the default), or a single global table for
    the entire system.

    Using a global table is not security supported as it allows all devices to
    impersonate each other as far as interrupts are concerned (see XSA-36), but
    it is a workaround for SP5100 Erratum 28.

**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
deprecated, and superseded by _dom0-iommu={passthrough,strict}_ respectively -
using both the old and new command line options in combination is undefined.**

### iommu_dev_iotlb_timeout
> `= <integer>`

> Default: `1000`

Specify the timeout of the device IOTLB invalidation in milliseconds.
By default, the timeout is 1000 ms. When you see error 'Queue invalidate
wait descriptor timed out', try increasing this value.

### iommu_inclusive_mapping
> `= <boolean>`

**WARNING: This command line option is deprecated, and superseded by
_dom0-iommu=map-inclusive_ - using both options in combination is undefined.**

### irq-max-guests (x86)
> `= <integer>`

> Default: `32`

Maximum number of guests any individual IRQ could be shared between,
i.e. a limit on the number of guests it is possible to start each having
assigned a device sharing a common interrupt line.  Accepts values between
1 and 255.

### irq_ratelimit (x86)
> `= <integer>`

### irq_vector_map (x86)

### ivmd (x86)
> `= <start>[-<end>][=<bdf1>[-<bdf1'>][,<bdf2>[-<bdf2'>][,...]]][;<start>...]`

Define IVMD-like ranges that are missing from ACPI tables along with the
device(s) they belong to, and use them for 1:1 mapping.  End addresses can be
omitted when exactly one page is meant.  The ranges are inclusive when start
and end are specified.  Note that only PCI segment 0 is supported at this time,
but it is fine to specify it explicitly.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Omitting the optional (range of) BDF specifiers signals that the range is to
be applied to all devices.

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and devices 0:0:1a.0...0:0:1a.3 collectively require three pages
(0xd5d46 thru 0xd5d48) to be reserved, one usage would be:

> `ivmd=d5d45=0:1d.0;0xd5d46-0xd5d48=0:1a.0-0:1a.3`

Note: grub2 requires special characters, such as ';', to be escaped or quoted
when multiple ranges are specified - refer to the grub2 documentation.
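
Because 'start' and 'end' are page numbers rather than byte addresses, a
physical address has to be shifted down by the page size before it is placed
on the command line.  The following Python sketch illustrates that conversion.
It assumes 4 KiB pages, and the helper names `to_page_number` and `ivmd_range`
are hypothetical, not part of Xen or its tools.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def to_page_number(phys_addr: int) -> int:
    """Convert a physical byte address to a page number."""
    return phys_addr >> PAGE_SHIFT


def ivmd_range(start_addr: int, end_addr: int, devices: str) -> str:
    """Format one inclusive range for the ivmd= option (illustrative only)."""
    start, end = to_page_number(start_addr), to_page_number(end_addr)
    # The end page may be omitted when exactly one page is meant.
    span = f"{start:x}" if start == end else f"{start:x}-{end:x}"
    return f"{span}={devices}"


# Rebuild the usage example above from byte addresses.
print("ivmd=" + ";".join([
    ivmd_range(0xd5d45000, 0xd5d45fff, "0:1d.0"),
    ivmd_range(0xd5d46000, 0xd5d48fff, "0:1a.0-0:1a.3"),
]))
```

The resulting string still has to be escaped or quoted for the boot loader.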

### ivrs_hpet[`<hpet>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of HPET
`<hpet>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### ivrs_ioapic[`<ioapic>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of IO-APIC
`<ioapic>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### lapic (x86)
> `= <boolean>`

Force the use of the local APIC on a uniprocessor system, even
if left disabled by the BIOS.

### lapic_timer_c2_ok (x86)
> `= <boolean>`

### ler (x86)
> `= <boolean>`

> Default: `false`

This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
in hypervisor context to be able to dump the Last Interrupt/Exception To/From
record with other registers.

### llc-coloring (arm64)
> `= <boolean>`

> Default: `false`

Flag to enable or disable LLC coloring support at runtime. This option is
available only when `CONFIG_LLC_COLORING` is enabled. See the general
cache coloring documentation for more info.

### llc-nr-ways (arm64)
> `= <integer>`

> Default: `Obtained from hardware`

Specify the number of ways of the Last Level Cache. This option is available
only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
to find the number of supported cache colors. By default the value is
automatically computed by probing the hardware, but in case of specific needs,
it can be manually set. Those include failing probing and debugging/testing
purposes so that it's possible to emulate platforms with different number of
supported colors. If set, also "llc-size" must be set, otherwise the default
will be used. Note that using both options implies "llc-coloring=on" unless an
earlier "llc-coloring=off" is there.

### llc-size (arm64)
> `= <size>`

> Default: `Obtained from hardware`

Specify the size of the Last Level Cache. This option is available only when
`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
the number of supported cache colors. By default the value is automatically
computed by probing the hardware, but in case of specific needs, it can be
manually set. Those include failing probing and debugging/testing purposes so
that it's possible to emulate platforms with different number of supported
colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
used. Note that using both options implies "llc-coloring=on" unless an
earlier "llc-coloring=off" is there.

### lock-depth-size
> `= <integer>`

> Default: `lock-depth-size=64`

Specifies the maximum number of nested locks tested for illegal recursions.
Higher nesting levels still work, but recursion testing is omitted for those
levels. In case an illegal recursion is detected the system will crash
immediately. Specifying `0` will disable all testing of illegal lock nesting.

This option is available for hypervisors built with CONFIG_DEBUG_LOCKS only.

### loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `loglvl=info`

> Can be modified at runtime

Set the logging level for Xen.  Any log message with equal or more
importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.
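
As an illustration of the `<level>[/<rate-limited level>]` grammar only, the
following Python sketch splits a value into its two parts.  The function name
`parse_loglvl` is hypothetical, and Xen's own parser may accept or reject
inputs differently.

```python
LEVELS = ("none", "error", "warning", "info", "debug", "all")


def parse_loglvl(value: str):
    """Split '<level>[/<rate-limited level>]' into its two parts.

    Returns (level, rate_limited_level); the second item is None when
    the optional part is absent.
    """
    level, sep, rate_limited = value.partition("/")
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if sep and rate_limited not in LEVELS:
        raise ValueError(f"unknown rate-limited level: {rate_limited!r}")
    return level, (rate_limited if sep else None)


print(parse_loglvl("info"))
print(parse_loglvl("all/warning"))
```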

### low_crashinfo
> `= none | min | all`

> Default: `none` if not specified at all, or `min` if **low_crashinfo** is present without qualification.

This option is only useful for hosts with a 32bit dom0 kernel, wishing
to use kexec functionality in the case of a crash.  It represents
which data structures should be deliberately allocated in low memory,
so the crash kernel may find them.  Should be used in combination
with **crashinfo_maxaddr**.

### low_mem_virq_limit
> `= <size>`

> Default: `64M`

Specify the threshold below which Xen will inform dom0 that the quantity of
free memory is getting low.  Specifying `0` will disable this notification.

### maxcpus
> `= <integer>`

Specify the maximum number of CPUs that should be brought up.

This option is ignored in **pv-shim** mode.

**WARNING: On Arm big.LITTLE systems, when the `hmp-unsafe` option is enabled,
this command line option does not guarantee which CPU types will be used.**

### max_cstate (x86)
> `= <integer>[,<integer>]`

Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used.  The latter only applies
to the highest permitted C-state.

### max_gsi_irqs (x86)
> `= <integer>`

Specifies the number of interrupts to be used for pin (IO-APIC or legacy PIC)
based interrupts. Any higher IRQs will be available for use via PCI MSI.

### max_lpi_bits (arm)
> `= <integer>`

Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
presented as the number of bits needed to encode it. This must be at least
14 and not exceed 32, and each LPI requires one byte (configuration) and
one pending bit to be allocated.
Defaults to 20 bits (to cover at most 1048576 interrupts).
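
The memory cost implied by the description above can be estimated with a
short calculation.  The Python sketch below takes the stated costs at face
value (one configuration byte and one pending bit per LPI) and ignores any
alignment or other overhead, so the figures are an estimate rather than what
Xen actually allocates.  The function name `lpi_table_sizes` is hypothetical.

```python
def lpi_table_sizes(bits: int):
    """Return (number of LPIs, configuration bytes, pending bytes)."""
    if not 14 <= bits <= 32:
        raise ValueError("max_lpi_bits must be between 14 and 32")
    nr_lpis = 1 << bits
    config_bytes = nr_lpis        # one byte of configuration per LPI
    pending_bytes = nr_lpis // 8  # one pending bit per LPI
    return nr_lpis, config_bytes, pending_bytes


nr, cfg, pend = lpi_table_sizes(20)
print(f"{nr} LPIs: {cfg // 1024} KiB configuration, {pend // 1024} KiB pending")
```

With the default of 20 bits this gives 1024 KiB of configuration data and
128 KiB of pending bits.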

### mce (x86)
> `= <boolean>`

> Default: `true`

Allows disabling the use of Machine Check Exceptions.  Note that doing
so may result in silent shutdown of the system in case an event occurs
which would have resulted in raising a Machine Check Exception.  Silent
here is as far as Xen is concerned; firmware may offer to retrieve some
collected data.

### mce_fb (Intel)
> `= <boolean>`

> Default: `false`

Force broadcasting of Machine Check Exceptions, suppressing the use of
Local MCE functionality available in newer Intel hardware.

### mce_verbosity (x86)
> `= verbose`

Specify verbose machine check output.

### mem (x86)
> `= <size>`

Specify the maximum address of physical RAM.  Any RAM beyond this
limit is ignored by Xen.

### memop-max-order
> `= [<domU>][,[<ctldom>][,[<hwdom>][,<ptdom>]]]`

> x86 default: `9,18,12,12`
> ARM default: `9,18,10,10`

Change the maximum order permitted for allocation (or allocation-like)
requests issued by the various kinds of domains (in this order:
ordinary DomU, control domain, hardware domain, and - when supported
by the platform - DomU with pass-through device assigned).
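
An order of `n` denotes a request covering 2^n contiguous pages.  To give a
feel for the defaults, the Python sketch below converts the x86 default
orders into sizes.  It assumes 4 KiB pages, and the function name
`order_to_bytes` is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def order_to_bytes(order: int) -> int:
    """Size in bytes of an allocation of 2**order pages."""
    return PAGE_SIZE << order


X86_DEFAULT = "9,18,12,12"
KINDS = ("domU", "ctldom", "hwdom", "ptdom")

for kind, order in zip(KINDS, X86_DEFAULT.split(",")):
    mib = order_to_bytes(int(order)) // (1 << 20)
    print(f"{kind}: order {order} = {mib} MiB")
```

Under that assumption the x86 defaults correspond to 2 MiB, 1 GiB, 16 MiB and
16 MiB respectively.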

### mmcfg (x86)
> `= <boolean>[,amd-fam10]`

> Default: `1`

Specify if the MMConfig space should be enabled.

### mmio-relax (x86)
> `= <boolean> | all`

> Default: `false`

By default, domains may not create cached mappings to MMIO regions.
This option relaxes the check for Domain 0 (or when using `all`, all PV
domains), to permit the use of cacheable MMIO mappings.

### msi (x86)
> `= <boolean>`

> Default: `true`

Force Xen to (not) use PCI-MSI, even if ACPI FADT says otherwise.

### mtrr.show (x86)
> `= <boolean>`

> Default: `false`

Print boot time MTRR state.

### mwait-idle (x86)
> `= <boolean>`

> Default: `true`

Use the MWAIT idle driver (with model specific C-state knowledge) instead
of the ACPI based one.

### nmi (x86)
> `= ignore | dom0 | fatal`

> Default: `fatal` for a debug build, or `dom0` for a non-debug build

Specify what Xen should do in the event of an NMI parity or I/O error.
`ignore` discards the error; `dom0` causes Xen to report the error to
dom0, while 'fatal' causes Xen to print diagnostics and then hang.

### noapic (x86)

Instruct Xen to ignore any IOAPICs that are present in the system, and
instead continue to use the legacy PIC. This is _not_ recommended with
pvops type kernels.

Because responsibility for APIC setup is shared between Xen and the
domain 0 kernel this option is automatically propagated to the domain
0 command line.

### invpcid (x86)
> `= <boolean>`

> Default: `true`

By default, Xen will use the INVPCID instruction for TLB management if
it is available.  This option can be used to cause Xen to fall back to
older mechanisms, which are generally slower.

### load-balance-ratelimit
> `= <integer>`

The minimum interval between load balancing events on a given pcpu, in
microseconds.  A value of '0' will disable rate limiting.  Maximum
value 1 second. At the moment only credit honors this parameter.
Default 1ms.

### noirqbalance (x86)
> `= <boolean>`

Disable software IRQ balancing and affinity. This can be used on
systems such as Dell 1850/2850 that have workarounds in hardware for
IRQ routing issues.

### nolapic (x86)
> `= <boolean>`

> Default: `false`

Ignore the local APIC on a uniprocessor system, even if enabled by the
BIOS.

### no-real-mode (x86)
> `= <boolean>`

Do not execute real-mode bootstrap code when booting Xen. This option
should not be used except for debugging. It will effectively disable
the **vga** option, which relies on real mode to set the video mode.

### noreboot
> `= <boolean>`

Do not automatically reboot after an error.  This is useful for
catching debug output.  Defaults to automatically reboot after 5
seconds.

### nosmp (x86)
> `= <boolean>`

Disable SMP support.  No secondary processors will be booted.
Defaults to booting secondary processors.

This option is ignored in **pv-shim** mode.

### nr_irqs (x86)
> `= <integer>`

### numa (x86)
> `= on | off | fake=<integer> | noacpi`

> Default: `on`

### partial-emulation (arm)
> `= <boolean>`

> Default: `false`

Flag to enable or disable partial emulation of system/coprocessor registers.
Only effective if CONFIG_PARTIAL_EMULATION is enabled.

**WARNING: Enabling this option might result in unwanted/non-spec compliant
behavior.**

### pci
    = List of [ serr=<bool>, perr=<bool> ]

    Default: Signaling left as set by firmware.

Override the firmware settings, and explicitly enable or disable the
signalling of PCI System and Parity errors.

### pci-phantom
> `=[<seg>:]<bus>:<device>,<stride>`

Mark a group of PCI devices as using phantom functions without actually
advertising so, so the IOMMU can create translation contexts for them.

All numbers specified must be hexadecimal ones.

This option can be specified more than once (up to 8 times at present).

### pci-passthrough (arm)
> `= <boolean>`

> Default: `false`

Flag to enable or disable support for PCI passthrough.

### pcid (x86)
> `= <boolean> | xpti=<bool>`

> Default: `xpti`

> Can be modified at runtime (change takes effect only for domains created
  afterwards)

If available, control usage of the PCID feature of the processor for
64-bit pv-domains. PCID can be used either for no domain at all (`false`),
for all of them (`true`), only for those subject to XPTI (`xpti`) or for
those not subject to XPTI (`no-xpti`). The feature is used only in case
INVPCID is supported and not disabled via `invpcid=false`.

### pdx-compress
> `= <boolean>`

> Default: `true` if CONFIG_PDX_NONE is unset

Only relevant when the hypervisor is built with PFN PDX compression. Controls
whether Xen will engage in PFN compression.  The algorithm used for PFN
compression is selected at build time from Kconfig.

### ple_gap
> `= <integer>`

### ple_window (Intel)
> `= <integer>`

### preferred-cstates (x86)
> `= ( <integer> | List of ( C1 | C1E | C2 | ... ) )`

This is a mask of C-states which are to be used preferably.  This option is
applicable only on hardware where certain C-states are exclusive of one another.

### probe-port-aliases (x86)
> `= <boolean>`

> Default: `true` outside of shim mode, `false` in shim mode

Certain devices accessible by I/O ports may be accessible also through "alias"
ports (originally a result of incomplete address decoding).  When such devices
are solely under Xen's control, Xen disallows even Dom0 access to the "primary"
ports.  When alias probing is active and aliases are detected, "alias" ports
would then be treated similarly to the "primary" ones.

### psr (Intel)
> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`

> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0`

Platform Shared Resource (PSR) Services.  Intel Haswell and later server
platforms offer information about the sharing of resources.

To use the PSR monitoring service for a certain domain, a Resource
Monitoring ID (RMID) is used to bind the domain to the corresponding shared
resource.  RMID is a hardware-provided layer of abstraction between software
and logical processors.

To use the PSR cache allocation service for a certain domain, a capacity
bitmask (CBM) is used to bind the domain to the corresponding shared resource.
CBM represents cache capacity and indicates the degree of overlap and isolation
between domains. In the hypervisor, a Class of Service (COS) ID is allocated
for each unique CBM.

The following resources are available:

* Cache Monitoring Technology (Haswell and later).  Information regarding the
  L3 cache occupancy.
  * `cmt` instructs Xen to enable/disable Cache Monitoring Technology.
  * `rmid_max` indicates the max value for rmid.
* Memory Bandwidth Monitoring (Broadwell and later). Information regarding the
  total/local memory bandwidth. Follow the same options with Cache Monitoring
  Technology.

* Cache Allocation Technology (Broadwell and later).  Information regarding
  the cache allocation.
  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
  * `cos_max` indicates the max value for COS ID.
* Code and Data Prioritization Technology (Broadwell and later). Information
  regarding the code cache and the data cache allocation. CDP is based on CAT.
  * `cdp` instructs Xen to enable/disable Code and Data Prioritization. Note
    that `cos_max` of CDP is a little different from `cos_max` of CAT. With
    CDP, one COS corresponds to two CBMs rather than one as with CAT. Because
    the total number of CBMs is fixed, the actual `cos_max` in use is
    automatically halved when CDP is enabled.

### pv
    = List of [ 32=<bool> ]

    Applicability: x86

Controls for aspects of PV guest support.

*   The `32` boolean controls whether 32bit PV guests can be created.  It
    defaults to `true`, and is ignored when `CONFIG_PV32` is compiled out.

    32bit PV guests are incompatible with CET Shadow Stacks.  If Xen is using
    shadow stacks, this option will be overridden to `false`.  Backwards
    compatibility can be maintained with the `pv-shim` mechanism.

### pv-linear-pt (x86)
> `= <boolean>`

> Default: `true`

Only available if Xen is compiled with `CONFIG_PV_LINEAR_PT` support
enabled.

Allow PV guests to have pagetable entries pointing to other pagetables
of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
This technique is often called "linear pagetables", and is sometimes
used to allow operating systems a simple way to consistently map the
current process's pagetables into its own virtual address space.

Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
do; there may be other custom operating systems which do.  If you're
certain you don't plan on having PV guests which use this feature,
turning it off can reduce the attack surface.

### pv-l1tf (x86)
> `= List of [ <bool>, dom0=<bool>, domu=<bool> ]`

> Default: `false` on believed-unaffected hardware, or in pv-shim mode.
>          `domu`  on believed-affected hardware.

Mitigations for L1TF / XSA-273 / CVE-2018-3620 for PV guests.

For backwards compatibility, we may not alter an architecturally-legitimate
pagetable entry a PV guest chooses to write.  We can however force such a
guest into shadow mode so that Xen controls the PTEs which are reachable by
the CPU pagewalk.

Shadowing is performed at the point where a PV guest first tries to write an
L1TF-vulnerable PTE.  Therefore, a PV guest kernel which has been updated with
its own L1TF mitigations will not trigger shadow mode if it is well behaved.

If `CONFIG_SHADOW_PAGING` is not compiled in, this mitigation instead crashes
the guest when an L1TF-vulnerable PTE is written, which still allows updated,
well-behaved PV guests to run, despite Shadow being compiled out.

In the pv-shim case, Shadow is expected to be compiled out, and a malicious
guest kernel can only leak data from the shim Xen, rather than the host Xen.

### pv-shim (x86)
> `= <boolean>`

> Default: `false`

This option is intended for use by a toolstack, when choosing to run a PV
guest compatibly inside an HVM container.

In this mode, the kernel and initrd passed as modules to the hypervisor are
constructed into a plain unprivileged PV domain.

### rcu-idle-timer-period-ms
> `= <integer>`

> Default: `10`

How frequently a CPU which has gone idle, but with pending RCU callbacks,
should be woken up to check if the grace period has completed, and the
callbacks are safe to be executed. Expressed in milliseconds; maximum is
100, and it can't be 0.

### reboot (x86)
> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`

> Default: system dependent

Specify the host reboot method.

`warm` instructs Xen to not set the cold reboot flag.

`cold` instructs Xen to set the cold reboot flag.

`no` instructs Xen to not automatically reboot after panics or crashes.

`triple` instructs Xen to reboot the host by causing a triple fault.

`kbd` instructs Xen to reboot the host via the keyboard controller.

`acpi` instructs Xen to reboot the host using RESET_REG in the ACPI FADT (this
is default mode if available).

`pci` instructs Xen to reboot the host using PCI reset register (port CF9).

`Power` instructs Xen to power-cycle the host using PCI reset register (port CF9).

`efi` instructs Xen to reboot using the EFI reboot call.

`xen` instructs Xen to reboot using Xen's SCHEDOP hypercall (this is the default
when running nested Xen).

### rmrr
> `= start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]`

Define RMRR units that are missing from the ACPI table along with the device(s)
they belong to, and use them for 1:1 mapping. End addresses can be omitted and one
page will be mapped. The ranges are inclusive when start and end are specified.
If segment of the first device is not specified, segment zero will be used.
If other segments are not specified, first device segment will be used.
If a segment is specified for other than the first device and it does not match
the one specified for the first one, an error will be reported.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and device 0:0:1a.0 requires three pages (0xd5d46 thru 0xd5d48)
to be reserved, one usage would be:

> `rmrr=d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0`

Note: grub2 requires special characters, namely ';', to be escaped or quoted
when multiple ranges are specified - refer to the grub2 documentation.
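
The segment defaulting rules described above can be sketched as follows.
This Python fragment is an illustration of those rules only, not Xen's
parser.  It assumes each device is written as `[seg:]bus:dev.func` with a
hexadecimal segment, and the function name `resolve_segments` is hypothetical.

```python
def resolve_segments(bdfs):
    """Apply the rmrr= segment defaulting rules to a list of BDF strings.

    Returns the devices with an explicit four-digit segment prefix.
    """
    resolved = []
    first_seg = None
    for index, bdf in enumerate(bdfs):
        parts = bdf.split(":")
        if len(parts) == 3:
            seg, rest = int(parts[0], 16), parts[1:]
        elif len(parts) == 2:
            seg, rest = None, parts
        else:
            raise ValueError(f"malformed device: {bdf!r}")

        if index == 0:
            # First device: a missing segment means segment zero.
            first_seg = 0 if seg is None else seg
        elif seg is not None and seg != first_seg:
            # Later devices must not name a different segment.
            raise ValueError(f"segment mismatch in {bdf!r}")

        # Later devices without a segment inherit the first device's one.
        resolved.append(f"{first_seg:04x}:{rest[0]}:{rest[1]}")
    return resolved


print(resolve_segments(["1:0:1d.0", "0:1a.0"]))
```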

### ro-hpet (x86)
> `= <boolean>`

> Default: `true`

Map the HPET page as read only in Dom0. If disabled the page will be mapped
with read and write permissions.

### sched
> `= credit | credit2 | arinc653 | rtds | null`

> Default: `sched=credit2`

Choose the default scheduler. Note the default scheduler is selectable via
Kconfig and depends on enabled schedulers. Check
`CONFIG_SCHED_DEFAULT` to see which scheduler is the default.

### sched_credit2_max_cpus_runqueue
> `= <integer>`

> Default: `16`

Defines how many CPUs will be put, at most, in each Credit2 runqueue.

Runqueues are still arranged according to the host topology (and following
what is indicated by the 'credit2_runqueue' parameter). But we also have a cap
on the number of CPUs that share each runqueue.

A value that is a submultiple of the number of online CPUs is recommended,
as that would likely produce a perfectly balanced runqueue configuration.
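
To find candidate values, one can list the submultiples (divisors) of the
online CPU count.  The Python sketch below does only that arithmetic; it does
not model how Xen arranges runqueues along the host topology, and the
function name `runqueue_size_candidates` and its `limit` parameter are
hypothetical.

```python
def runqueue_size_candidates(online_cpus: int, limit: int = 16):
    """List divisors of online_cpus that do not exceed limit."""
    upper = min(limit, online_cpus)
    return [n for n in range(1, upper + 1) if online_cpus % n == 0]


print(runqueue_size_candidates(48))
print(runqueue_size_candidates(40))
```

For a host with 40 online CPUs, for example, the default of 16 is not a
submultiple, whereas 8 or 10 would be.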

### sched_credit2_migrate_resist
> `= <integer>`

### sched_credit_tslice_ms
> `= <integer>`

Set the timeslice of the credit1 scheduler, in milliseconds.  The
default is 30ms.  Reasonable values may include 10, 5, or even 1 for
very latency-sensitive workloads.

### sched-gran (x86)
> `= cpu | core | socket`

> Default: `sched-gran=cpu`

Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
`core` on a SMT-enabled system, or `socket`) multiple vcpus are assigned
statically to a "scheduling unit" which will then be subject to scheduling.
This assignment of vcpus to scheduling units is fixed.

`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
hyperthread using x86/Intel terminology)

`core`: As many vcpus as there are cpus on a physical core are scheduled
together on a physical core.

`socket`: As many vcpus as there are cpus on a physical socket are scheduled
together on a physical socket.

Note: a value other than `cpu` will result in rejecting a runtime modification
attempt of the "smt" setting.

Note: for AMD x86 processors before Fam17 the terminology in the official data
sheets is different: a cpu is named "core" and multiple "cores" are running
in the same "compute unit". Because AMD has used the same names as Intel
("thread" and "core") from Fam17 onwards, the topology levels are named "cpu",
"core" and "socket" even on older AMD processors.

### sched_ratelimit_us
> `= <integer>`

In order to limit the rate of context switching, set the minimum
amount of time that a vcpu can be scheduled for before preempting it,
in microseconds.  The default is 1000us (1ms).  Setting this to 0
disables it altogether.

### sched_smt_power_savings
> `= <boolean>`

Normally Xen will try to maximize performance and cache utilization by
spreading out vcpus across as many different divisions as possible
(i.e., NUMA nodes, sockets, cores, threads, &c).  This often maximizes
throughput, but also maximizes energy usage, since it reduces the
depth to which a processor can sleep.

This option inverts the logic, so that the scheduler in effect tries
to keep the vcpus on the smallest amount of silicon possible; i.e.,
first fill up sibling threads, then sibling cores, then sibling
sockets, &c.  This will reduce performance somewhat, particularly on
systems with hyperthreading enabled, but should reduce power by
enabling more sockets and cores to go into deeper sleep states.

### scrub-domheap
> `= <boolean>`

> Default: `false`

Scrub domains' freed pages. This is a safety net against a (buggy) domain
accidentally leaking secrets by releasing pages without proper sanitization.

### serial_tx_buffer
> `= <size>`

> Default: `CONFIG_SERIAL_TX_BUFSIZE`

Set the serial transmit buffer size.

### serrors (ARM)
> `= diverse | panic`

> Default: `diverse`

This parameter is provided to administrators to determine how the hypervisor
handles SErrors.

* `diverse`:
  The hypervisor will distinguish guest SErrors from hypervisor SErrors:
    - The guest generated SErrors will be forwarded to the currently running
      guest.
    - The hypervisor generated SErrors will cause the whole system to crash

* `panic`:
  All SErrors will cause the whole system to crash. This option should only
  be used if you trust all your guests and/or they don't have a gadget (e.g.
  device) to generate SErrors in normal run.

### shim_mem (x86)
> `= List of ( min:<size> | max:<size> | <size> )`

Set the amount of memory that xen-shim uses. Only has effect if pv-shim mode is
enabled. Note that this value accounts for the memory used by the shim itself
plus the free memory slack given to the shim for runtime allocations.

* `min:<size>` specifies the minimum amount of memory. Ignored if greater
   than max.
* `max:<size>` specifies the maximum amount of memory.
* `<size>` specifies the exact amount of memory. Overrides both min and max.

By default, the amount of free memory slack given to the shim for runtime usage
is 1MB.

### smap (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Access Prevention.
Use `smap=hvm` to allow SMAP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false because
of the significant performance impact in some cases and the generally lower
security risk.

### smep (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Execution Protection.
Use `smep=hvm` to allow SMEP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false because
of the significant performance impact in some cases and the generally lower
security risk.

### smt (x86)
> `= <boolean>`

> Default: `true`

Control bring up of multiple hyper-threads per CPU core.

### snb_igd_quirk
> `= <boolean> | cap | <integer>`

A true boolean value enables legacy behavior (1s timeout), while `cap`
enforces the maximum theoretically necessary timeout of 670ms. Any number
is being interpreted as a custom timeout in milliseconds. Zero or boolean
false disable the quirk workaround, which is also the default.

### spec-ctrl (Arm)
> `= List of [ ssbd=force-disable|runtime|force-enable ]`

Controls for speculative execution sidechannel mitigations.

The option `ssbd=` is used to control the state of Speculative Store
Bypass Disable (SSBD) mitigation.

* `ssbd=force-disable` will keep the mitigation permanently off. The guest
will not be able to control the state of the mitigation.
* `ssbd=runtime` will always turn on the mitigation when running in the
hypervisor context. The guest will be able to turn on/off the mitigation for
itself by using the firmware interface `ARCH_WORKAROUND_2`.
* `ssbd=force-enable` will keep the mitigation permanently on. The guest will
not be able to control the state of the mitigation.

By default SSBD will be mitigated at runtime (i.e. `ssbd=runtime`).

### spec-ctrl (x86)
> `= List of [ <bool>, xen=<bool>, {pv,hvm}=<bool>,
>              {msr-sc,rsb,verw,{ibpb,bhb}-entry}=<bool>|{pv,hvm}=<bool>,
>              bti-thunk=retpoline|lfence|jmp,bhb-seq=short|tsx|long,
>              {ibrs,ibpb,ssbd,psfd,
>              eager-fpu,l1d-flush,branch-harden,srb-lock,
>              unpriv-mmio,gds-mit,div-scrub,lock-harden,
>              bhi-dis-s,bp-spec-reduce,ibpb-alt}=<bool> ]`

Controls for speculative execution sidechannel mitigations.  By default, Xen
will pick the most appropriate mitigations based on compiled in support,
loaded microcode, and hardware details, and will virtualise appropriate
mitigations for guests to use.

**WARNING: Any use of this option may interfere with heuristics.  Use with
extreme care.**

An overall boolean value, `spec-ctrl=no`, can be specified to turn off all
mitigations, including pieces of infrastructure used to virtualise certain
mitigation features for guests.  This also includes settings which `xpti`,
`smt`, `pv-l1tf`, `tsx` control, unless the respective option(s) have been
specified earlier on the command line.

Alternatively, a slightly more restricted `spec-ctrl=no-xen` can be used to
turn off all of Xen's mitigations, while leaving the virtualisation support
in place for guests to use.

Use of a positive boolean value for either of these options is invalid.

The `pv=`, `hvm=`, `msr-sc=`, `rsb=`, `verw=`, `ibpb-entry=` and `bhb-entry=`
options offer fine grained control over the primitives used by Xen.  These impact
Xen's ability to protect itself, and/or Xen's ability to virtualise support
for guests to use.

* `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
  respectively.
* Each other option can be used either as a plain boolean
  (e.g. `spec-ctrl=rsb` to control both the PV and HVM sub-options), or with
  `pv=` or `hvm=` subsuboptions (e.g. `spec-ctrl=rsb=no-hvm` to disable HVM
  RSB only).

* `msr-sc=` offers control over Xen's support for manipulating `MSR_SPEC_CTRL`
  on entry and exit.  These blocks are necessary to virtualise support for
  guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc.
* `rsb=` offers control over whether to overwrite the Return Stack Buffer /
  Return Address Stack on entry to Xen and on idle.
* `verw=` offers control over whether to use VERW for its scrubbing side
  effects at appropriate privilege transitions.  The exact side effects are
  microarchitecture and microcode specific.  *Note: `md-clear=` is accepted as
  a deprecated alias.  For compatibility with development versions of XSA-297,
  `mds=` is also accepted on Xen 4.12 and earlier as an alias.  Consult vendor
  documentation in preference to here.*
* `ibpb-entry=` offers control over whether IBPB (Indirect Branch Prediction
  Barrier) is used on entry to Xen.  This is used by default on hardware
  vulnerable to Branch Type Confusion, and hardware vulnerable to Speculative
  Return Stack Overflow if appropriate microcode has been loaded, but for
  performance reasons dom0 is unprotected by default.  If it is necessary to
  protect dom0 too, boot with `spec-ctrl=ibpb-entry`.
* `bhb-entry=` offers control over whether BHB-clearing (Branch History
  Buffer) sequences are used on entry to Xen.  This is used by default on
  hardware vulnerable to Branch History Injection, when the BHI_DIS_S control
  is not available (see `bhi-dis-s`).  The choice of scrubbing sequence can be
  selected using the `bhb-seq=` option.  If it is necessary to protect dom0
  too, boot with `spec-ctrl=bhb-entry`.

If Xen was compiled with `CONFIG_INDIRECT_THUNK` support, `bti-thunk=` can be
used to select which of the thunks gets patched into the
`__x86_indirect_thunk_%reg` locations.  The default thunk is `retpoline`
(generally preferred), with the alternatives being `jmp` (a `jmp *%reg` gadget,
minimal overhead), and `lfence` (an `lfence; jmp *%reg` gadget).

On all hardware, `bhb-seq=` can be used to select which of the BHB-clearing
sequences gets used.  This interacts with the `bhb-entry=` and `bhi-dis-s=`
options in order to mitigate Branch History Injection on affected hardware.
The default sequence is `short`, with `tsx` as an alternative available on
capable hardware, and `long` that can be opted in to.

On hardware supporting IBRS (Indirect Branch Restricted Speculation), the
`ibrs=` option can be used to force or prevent Xen using the feature itself.
If Xen is not using IBRS itself, functionality is still set up so IBRS can be
virtualised for guests.

On hardware supporting STIBP (Single Thread Indirect Branch Predictors), the
`stibp=` option can be used to force or prevent Xen using the feature itself.
By default, Xen will use STIBP when IBRS is in use (IBRS implies STIBP), and
when hardware hints recommend using it as a blanket setting.

On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
option can be used to force or prevent Xen using the feature itself.  The
feature is virtualised for guests, independently of Xen's choice of setting.
On AMD hardware, disabling Xen SSBD usage on the command line (`ssbd=0` which
is the default value) can lead to Xen running with the guest SSBD selection
depending on hardware support.  On the same hardware, setting `ssbd=1` will
result in SSBD always being enabled, regardless of guest choice.

On hardware supporting PSFD (Predictive Store Forwarding Disable), the `psfd=`
option can be used to force or prevent Xen using the feature itself.  By
default, Xen will not use PSFD.  PSFD is implied by SSBD, and SSBD is off by
default.

On hardware supporting BHI_DIS_S (Branch History Injection Disable
Supervisor), the `bhi-dis-s=` option can be used to force or prevent Xen using
the feature itself.  By default Xen will use BHI_DIS_S on hardware susceptible
to Branch History Injection.

On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
option can be used to force (the default) or prevent Xen from issuing branch
prediction barriers on vcpu context switches.

On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
from using fully eager FPU context switches.  This is currently implemented as
a global control.  By default, Xen will choose to use fully eager context
switches on hardware believed to speculate past #NM exceptions.

On hardware supporting L1D_FLUSH, the `l1d-flush=` option can be used to force
or prevent Xen from issuing an L1 data cache flush on each VMEntry.
Irrespective of Xen's setting, the feature is virtualised for HVM guests to
use.  By default, Xen will enable this mitigation on hardware believed to be
vulnerable to L1TF.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_BRANCH`, the
`branch-harden=` boolean can be used to force or prevent Xen from using
speculation barriers to protect selected conditional branches.  By default,
Xen will enable this mitigation.

On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
or prevent Xen from protecting the Special Register Buffer from leaking stale
data. By default, Xen will enable this mitigation, except on parts where MDS
is fixed and TAA is fixed/mitigated and there are no unprivileged MMIO
mappings (in which case, there is believed to be no way for an attacker to
obtain stale data).

The `unpriv-mmio=` boolean indicates whether the system has (or will have)
less than fully privileged domains granted access to MMIO devices.  By
default, this option is disabled.  If enabled, Xen will use the `FB_CLEAR`
and/or `SRBDS_CTRL` functionality available in the Intel May 2022 microcode
release to mitigate cross-domain leakage of data via the MMIO Stale Data
vulnerabilities.

On all hardware, the `gds-mit=` option can be used to force or prevent Xen
from mitigating the GDS (Gather Data Sampling) vulnerability.  By default, Xen
will mitigate GDS on hardware believed to be vulnerable.  On hardware
supporting GDS_CTRL (requires the August 2023 microcode), and where firmware
has elected not to lock the configuration, Xen will use GDS_CTRL to mitigate
GDS.  Otherwise, Xen will mitigate by disabling AVX, which blocks the use of
the AVX2 Gather instructions.

On all hardware, the `div-scrub=` option can be used to force or prevent Xen
from mitigating the DIV-leakage vulnerability.  By default, Xen will mitigate
DIV-leakage on hardware believed to be vulnerable.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_LOCK`, the `lock-harden=`
boolean can be used to force or prevent Xen from using speculation barriers to
protect lock critical regions.  This mitigation won't be engaged by default,
and needs to be explicitly enabled on the command line.

On hardware supporting SRSO_MSR_FIX, the `bp-spec-reduce=` option can be used
to force or prevent Xen from using MSR_BP_CFG.BP_SPEC_REDUCE to mitigate the
SRSO (Speculative Return Stack Overflow) vulnerability.  Xen will use
bp-spec-reduce when available, as it is preferable to using `ibpb-entry=hvm`
to mitigate SRSO for HVM guests, and because it is a prerequisite to advertise
SRSO_U/S_NO to PV guests.

On Sapphire and Emerald Rapids CPUs with May 2025 microcode or later, the
`ibpb-alt=` option can be used to switch to the alternative mitigation for
Intel SA-00982.  Intel suggest that some workloads will benefit from this.

### sync_console
> `= <boolean>`

> Default: `false`

Flag to force synchronous console output.  Useful for debugging, but
not suitable for production environments due to incurred overhead.

### tboot (x86)
> `= 0x<phys_addr>`

Specify the physical address of the trusted boot shared page.

### tbuf_size
> `= <integer>`

Specify the per-cpu trace buffer size in pages.

### tdt (x86)
> `= <boolean>`

> Default: `true`

Flag to enable TSC deadline as the APIC timer mode.

### tee (arm)
> `= <string>`

Specify the TEE mediator to be probed and used.

The default behaviour is to probe all TEEs supported by Xen and use
the first one successfully probed. When this parameter is passed, Xen will
probe only the TEE mediator passed as argument and boot will fail if this
mediator is not properly probed or if the requested TEE is not supported by
Xen.

This parameter can be set to `optee` or `ffa` if the corresponding mediators
are compiled in.

### tevt_mask
> `= <integer>`

Specify a mask for Xen event tracing. This allows Xen tracing to be
enabled at boot. Refer to the xentrace(8) documentation for a list of
valid event mask values. In order to enable tracing, a buffer size (in
pages) must also be specified via the tbuf_size parameter.

### tickle_one_idle_cpu
> `= <boolean>`

### timer_slop
> `= <integer>`

### tsc (x86)
> `= unstable | skewed | stable:socket`

### tsx
    = <bool>

    Applicability: x86 with CONFIG_INTEL active
    Default: false on parts vulnerable to TAA, true otherwise

Controls for the use of Transactional Synchronization eXtensions.

Several microcode updates are relevant:

 * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
   to date.  Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts.  The
   errata workaround uses Performance Counter 3, so the user can select
   between working TSX and working perfcounters.

 * November 2019, fixing the TSX Async Abort speculative vulnerability.
   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
   and formally retiring HLE from the architecture.  The user can disable TSX
   to mitigate TAA, and elect to hide the HLE/RTM CPUID bits.  Also causes
   VERW to once again flush the microarchitectural buffers in case a TAA
   mitigation is wanted along with TSX being enabled.

 * June 2021, removing the workaround for March 2019 on client CPUs and
   formally de-featuring TSX on SKL/KBL/WHL/CFL (Note: SKX still retains the
   March 2019 fix).  Introduced the ability to hide the HLE/RTM CPUID bits.
   PCR3 works fine, and TSX is disabled by default, but the user can re-enable
   TSX at their own risk, accepting that the memory order erratum is unfixed.

 * February 2022, removing the VERW flushing workaround from November 2019 on
   client CPUs and formally de-featuring TSX on WHL-R/CFL-R (Note: CLX still
   retains the VERW flushing workaround).  TSX defaults to disabled, and is
   locked off when SGX is enabled in the BIOS.  When SGX is not enabled, TSX
   can be re-enabled at the user's own risk, as it reintroduces the TSX Async
   Abort speculative vulnerability.

On systems with the ability to configure TSX, this boolean offers system wide
control of whether TSX is enabled or disabled.

When TSX is disabled, transactions unconditionally abort.  This is compatible
with the TSX spec, which requires software to have a non-transactional path as
a fallback.  The RTM and HLE CPUID bits are hidden from VMs by default, but
can be re-enabled if required.  This allows VMs which previously saw RTM/HLE
to be migrated in, although any TSX-enabled software will run with reduced
performance.

 * When TSX is locked off by firmware, `tsx=` is ignored and treated as
   `false`.

 * An explicit `tsx=` choice is honoured, even if it is `true` and would
   result in a vulnerable system.

 * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
   mitigated by disabling TSX, as this is the lowest overhead option.

 * When no explicit `tsx=` option is given, parts susceptible to the memory
   ordering errata default to `true` to enable working TSX.  Alternatively,
   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.

   SKX and SKL/KBL/WHL/CFL on pre-June 2021 microcode default to `true`.
   Alternatively, selecting `tsx=0` will disable TSX and restore PCR3 to a
   working state.

   SKL/KBL/WHL/CFL on the June 2021 microcode or later default to `false`.
   Alternatively, selecting `tsx=1` will re-enable TSX at the user's own risk.
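
The precedence among the rules above can be pictured with a small Python
sketch.  It is illustrative only and is not Xen's implementation:
`effective_tsx` and its parameter names are hypothetical, and the caller is
assumed to already know which hardware category applies.  The sketch also
assumes that a part is not simultaneously treated as TAA-vulnerable and as
belonging to the pre-June 2021 errata category.

```python
from typing import Optional


def effective_tsx(explicit: Optional[bool], *,
                  locked_off: bool = False,
                  taa_vulnerable: bool = False,
                  defeatured_by_ucode: bool = False) -> bool:
    """Return whether TSX ends up enabled (hypothetical model).

    explicit            -- value of `tsx=` if given, else None
    locked_off          -- firmware has locked TSX off
    taa_vulnerable      -- part is considered vulnerable to TAA
    defeatured_by_ucode -- SKL/KBL/WHL/CFL on June 2021 microcode or later
    """
    if locked_off:
        return False          # `tsx=` is ignored and treated as false
    if explicit is not None:
        return explicit       # honoured, even if it leaves the system vulnerable
    if taa_vulnerable or defeatured_by_ucode:
        return False          # both categories default to false
    return True               # includes errata parts on older microcode
```

For example, `effective_tsx(True, taa_vulnerable=True)` yields `True`, since
an explicit choice wins over the default, while
`effective_tsx(True, locked_off=True)` yields `False`.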

### ucode
> `= List of [ <integer> | scan=<bool>, nmi=<bool>, digest-check=<bool> ]`

    Applicability: x86
    Default: `scan` is selectable via Kconfig, `nmi,digest-check`

Controls for CPU microcode loading. For early loading, this parameter can
specify how and where to find the microcode update blob. For late loading,
this parameter specifies if the update happens within a NMI handler.

'integer' specifies the CPU microcode update blob module index. When positive,
this specifies the n-th module (in the GrUB entry, zero based) to be used
for updating CPU microcode. When negative, counting starts at the end of
the modules in the GrUB entry (so with the blob commonly being last,
one could specify `ucode=-1`). Note that the value of zero is not valid
here (entry zero, i.e. the first module, is always the Dom0 kernel
image). Note further that use of this option has an unspecified effect
when used with xen.efi (there the concept of modules doesn't exist, and
the blob gets specified via the `ucode=<filename>` config file/section
entry; see [EFI configuration file description](efi.html)).
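
The indexing rule can be sketched in Python as follows.  This is a minimal
illustration of the description above, not code taken from Xen:
`resolve_ucode_module` is a hypothetical helper, and rejecting out-of-range
values with an error is an assumption, as the text does not say how Xen
reacts to them.

```python
def resolve_ucode_module(index: int, nr_modules: int) -> int:
    """Map a `ucode=<integer>` value to a zero-based module position.

    Hypothetical helper modelling only the indexing rules in the text.
    """
    if index == 0:
        # Module 0 is always the Dom0 kernel image, so 0 is not valid.
        raise ValueError("0 is not a valid microcode module index")

    # Positive values count from the start, negative ones from the end.
    pos = index if index > 0 else nr_modules + index

    # Assumption: anything resolving to module 0 or outside the list is
    # rejected; the documentation does not define this case.
    if not 0 < pos < nr_modules:
        raise ValueError("index does not name a module after the Dom0 kernel")
    return pos
```

With three modules (kernel, initrd, microcode blob), both
`resolve_ucode_module(2, 3)` and `resolve_ucode_module(-1, 3)` select
position 2, the last module.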

'scan' instructs the hypervisor to scan the multiboot images for a cpio
image that contains microcode. Depending on the platform the blob with the
microcode in the cpio name space must be:

  - on Intel: kernel/x86/microcode/GenuineIntel.bin
  - on AMD  : kernel/x86/microcode/AuthenticAMD.bin

When using xen.efi, the `ucode=<filename>` config file setting takes
precedence over `scan`. The default value for `scan` is set with
`CONFIG_UCODE_SCAN_DEFAULT`.

'nmi' determines whether late loading is performed in an NMI handler or just
in stop_machine context. In an NMI handler, even NMIs are blocked, which is
considered safer. The default value is `true`.

The `digest-check=` option is active by default and controls whether to
perform additional authenticity checks.  Collisions in the signature algorithm
used by AMD Fam17h/19h processors have been found.  Xen contains a table of
digests of microcode patches with known-good provenance, and will block
loading of patches that do not match.

### unrestricted_guest (Intel)
> `= <boolean>`

### vcpu_migration_delay
> `= <integer>`

> Default: `0`

Specify a delay, in microseconds, between migrations of a VCPU between
PCPUs when using the credit1 scheduler. This prevents rapid fluttering
of a VCPU between CPUs, and reduces the implicit overheads such as
cache-warming. 1ms (1000) has been measured as a good value.

### vesa-ram
> `= <integer>`

> Default: `0`

Override the amount of video RAM, in MiB, determined to be present.

### vga
> `= ( ask | current | text-80x<rows> | gfx-<width>x<height>x<depth> | mode-<mode> )[,keep]`

`ask` causes Xen to display a menu of available modes and request the
user to choose one of them.

`current` causes Xen to use the graphics adapter in its current state,
without further setup.

`text-80x<rows>` instructs Xen to set up text mode.  Valid values for
`<rows>` are `25, 28, 30, 34, 43, 50, 80`

`gfx-<width>x<height>x<depth>` instructs Xen to set up graphics mode
with the specified width, height and depth.

`mode-<mode>` instructs Xen to use a specific mode, as shown with the
`ask` option.  (N.B. menu modes are displayed in hex, so `<mode>`
should be a hexadecimal number.)

The optional `keep` parameter causes Xen to continue using the vga
console even after dom0 has been started.  The default behaviour is to
relinquish control to dom0.
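
The syntax above can be illustrated with a short Python sketch.  It is not
Xen's parser: `parse_vga` and the dictionary it returns are hypothetical, and
treating `<mode>` as hexadecimal with an optional `0x` prefix is an assumption
based on the note above.

```python
import re

# Row counts listed as valid for `text-80x<rows>`.
VALID_TEXT_ROWS = {25, 28, 30, 34, 43, 50, 80}


def parse_vga(value: str) -> dict:
    """Split a `vga=` value into its mode and the optional `keep` flag."""
    mode, sep, flag = value.partition(",")
    if sep and flag != "keep":
        raise ValueError(f"unknown suffix: {flag!r}")
    result = {"keep": flag == "keep"}

    if mode in ("ask", "current"):
        result["kind"] = mode
    elif m := re.fullmatch(r"text-80x(\d+)", mode):
        rows = int(m.group(1))
        if rows not in VALID_TEXT_ROWS:
            raise ValueError(f"unsupported row count: {rows}")
        result.update(kind="text", rows=rows)
    elif m := re.fullmatch(r"gfx-(\d+)x(\d+)x(\d+)", mode):
        width, height, depth = map(int, m.groups())
        result.update(kind="gfx", width=width, height=height, depth=depth)
    elif m := re.fullmatch(r"mode-(?:0x)?([0-9a-fA-F]+)", mode):
        # Assumption: the number is read as hexadecimal.
        result.update(kind="mode", mode=int(m.group(1), 16))
    else:
        raise ValueError(f"unrecognised vga mode: {mode!r}")
    return result
```

For instance, `parse_vga("gfx-1024x768x32,keep")` describes a 1024x768
graphics mode with 32 bits of depth and the console kept after dom0 starts.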

### viridian-spinlock-retry-count (x86)
> `= <integer>`

> Default: `2047`

Specify the maximum number of retries before an enlightened Windows
guest will notify Xen that it has failed to acquire a spinlock.

### viridian-version (x86)
> `= [<major>],[<minor>],[<build>]`

> Default: `6,0,0x1772`

`<major>`, `<minor>` and `<build>` must be integers. The values will be
encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.
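
As a rough illustration of the syntax, the Python sketch below parses such a
value.  It is not Xen's parser: `parse_viridian_version` is a hypothetical
name, and the sketch assumes that exactly two commas are present and that an
omitted field keeps its default.  Python's `int(text, 0)` accepts decimal and
`0x`-prefixed hexadecimal, but not the leading-`0` octal form described under
Integer.

```python
# Defaults as documented above: major 6, minor 0, build 0x1772.
DEFAULT_VERSION = (6, 0, 0x1772)


def parse_viridian_version(value: str) -> tuple:
    """Return (major, minor, build), keeping defaults for empty fields."""
    fields = value.split(",")
    if len(fields) != 3:
        # Assumption: the sketch insists on the two-comma form shown above.
        raise ValueError("expected [<major>],[<minor>],[<build>]")
    return tuple(
        int(text, 0) if text else default
        for text, default in zip(fields, DEFAULT_VERSION)
    )
```

Under these assumptions, `parse_viridian_version("10,0,")` overrides only the
major version and leaves the build number at `0x1772`.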

### vm-notify-window (Intel)
> `= <integer>`

> Default: `0`

Specify the value of the VM Notify window used to detect locked VMs. Set to -1
to disable the feature.  Value is in units of crystal clock cycles.

Note the hardware might add a threshold to the provided value in order to make
it safe, and hence using 0 is fine.

### vpid (Intel)
> `= <boolean>`

> Default: `true`

Use Virtual Processor ID support if available.  This prevents the need for TLB
flushes on VM entry and exit, increasing performance.

### vpmu (x86)
    = List of [ <bool>, bts, ipc, arch, rtm-abort=<bool> ]

    Applicability: x86.  Default: false

Controls for Performance Monitoring Unit virtualisation.

Performance monitoring facilities tend to be very hardware specific, and
provide access to a wealth of low level processor information.

*   An overall boolean can be used to enable or disable vPMU support.  vPMU is
    disabled by default.

    When enabled, guests have full access to all performance counter settings,
    including model specific functionality.  This is a superset of the
    functionality offered by `ipc` and/or `arch`, but a subset of the
    functionality offered by `bts`.

    Xen's watchdog functionality is implemented using performance counters.
    As a result, use of the **watchdog** option will override and disable
    vPMU.

*   The `bts` option enables performance monitoring, and permits additional
    access to the Branch Trace Store controls.  BTS is an Intel feature where
    the processor can write data into a buffer whenever a branch occurs.
    However, as this feature isn't virtualised, a misconfiguration by the
    guest can lock the entire system up.

*   The `ipc` option allows access to the most minimal set of counters
    possible: instructions, cycles, and reference cycles.  These can be used
    to calculate instructions per cycle (IPC).

*   The `arch` option allows access to the pre-defined architectural events.

*   The `rtm-abort` boolean has been superseded.  Use `tsx=0` instead.

*Warning:*
As the virtualisation is not 100% safe, don't use the vpmu flag on
production systems (see https://xenbits.xen.org/xsa/advisory-163.html)!

### vwfi (arm)
> `= trap | native`

> Default: `trap`

WFI is the ARM instruction to "wait for interrupt". WFE is similar and
means "wait for event". This option, which is ARM specific, changes the
way guest WFI and WFE are implemented in Xen. By default, Xen traps both
instructions. In the case of WFI, Xen blocks the guest vcpu; in the case
of WFE, Xen yields the guest vcpu. When setting vwfi to `native`, Xen
doesn't trap either instruction, running them in guest context. Setting
vwfi to `native` reduces irq latency significantly. It can also lead to
suboptimal scheduling decisions, but only when the system is
oversubscribed (i.e., in total there are more vCPUs than pCPUs).

### wallclock (x86)
> `= auto | xen | cmos | efi`

> Default: `auto`

Allow forcing the usage of a specific wallclock source.

 * `auto` lets the hypervisor select the clocksource based on internal
   heuristics.

 * `xen` forces usage of the Xen shared_info wallclock when booted as a Xen
   guest.  This option is only available if the hypervisor was compiled with
   `CONFIG_XEN_GUEST` enabled.

 * `cmos` forces usage of the CMOS RTC wallclock.

 * `efi` forces usage of the EFI_GET_TIME run-time method when booted from EFI
   firmware.

If the selected option is invalid or not available Xen will default to `auto`.

### watchdog (x86)
> `= force | <boolean>`

> Default: `false`

Run an NMI watchdog on each processor.  If a processor is stuck for
longer than the **watchdog_timeout**, a panic occurs.  When `force` is
specified, in addition to running an NMI watchdog on each processor,
unknown NMIs will still be processed.

### watchdog_timeout (x86)
> `= <integer>`

> Default: `5`

Set the NMI watchdog timeout in seconds.  Specifying `0` will turn off
the watchdog.

### x2apic (x86)
> `= <boolean>`

> Default: `true`

Permit use of x2apic setup for SMP environments.

### x2apic-mode (x86)
> `= physical | mixed`

> Default: `physical` if **FADT** mandates physical mode, otherwise set at
>          build time by CONFIG_X2APIC_{PHYSICAL,MIXED}.

In the case that x2apic is in use, this option switches between modes to
address APICs in the system as interrupt destinations.

### x2apic_phys (x86)
> `= <boolean>`

> Default: `true` if **FADT** mandates physical mode or if interrupt remapping
>          is not available, `false` otherwise.

In the case that x2apic is in use, this option switches between physical and
clustered mode.  The default, given no hint from the **FADT**, is cluster
mode.

**WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
The latter takes precedence if both are set.**

### xen-llc-colors (arm64)
> `= List of [ <integer> | <integer>-<integer> ]`

> Default: `0: the lowermost color`

Specify Xen LLC color configuration. This option is available only when
`CONFIG_LLC_COLORING` is enabled.
Two colors are most likely needed on platforms where private caches are
physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
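
The list syntax can be expanded as in the Python sketch below.  This is only
an illustration of the `<integer> | <integer>-<integer>` form:
`parse_llc_colors` is a hypothetical helper, and it does not model any
ordering, uniqueness or range limits that Xen may impose on the colors.

```python
def parse_llc_colors(value: str) -> list:
    """Expand a comma separated list of colors and color ranges."""
    colors = []
    for item in value.split(","):
        start, dash, end = item.partition("-")
        first = int(start)
        # A bare integer is treated as a one-element range.
        last = int(end) if dash else first
        if last < first:
            raise ValueError(f"descending range: {item!r}")
        colors.extend(range(first, last + 1))
    return colors
```

For example, `xen-llc-colors=0-1` expands to colors 0 and 1, which matches the
two-color case mentioned above.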

### xenheap_megabytes (arm32)
> `= <size>`

> Default: `0` (1/32 of RAM)

Amount of RAM to set aside for the Xenheap. Must be an integer multiple of 32.

By default, Xen will use 1/32 of the RAM up to a maximum of 1GB and with a
minimum of 32M, subject to a suitably aligned and sized contiguous
region of memory being available.
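
The default sizing rule amounts to a clamp, sketched here in Python.  The
function name is hypothetical, 1GB is taken to mean 1024 MiB, and rounding as
well as the availability of a suitable contiguous region are not modelled.

```python
def default_xenheap_mb(ram_mb: int) -> int:
    """Default Xenheap size in MiB: 1/32 of RAM, clamped to [32, 1024]."""
    return max(32, min(1024, ram_mb // 32))
```

A system with 4096 MiB of RAM would therefore get a 128 MiB Xenheap by
default, while a system with 512 MiB would get the 32 MiB minimum.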

### xpti (x86)
> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`

> Default: `false` on hardware known not to be vulnerable to Meltdown (e.g. AMD)
> Default: `true` everywhere else

Override default selection of whether to isolate 64-bit PV guest page
tables.

`true` activates page table isolation for all domains, even on hardware not
vulnerable to Meltdown.

`false` deactivates page table isolation on all systems for all domains.

`default` sets the default behaviour.

With `dom0` and `domu` it is possible to control page table isolation
for dom0 or guest domains only.
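
How the list items combine can be pictured with the Python sketch below.  It
is a simplified model and not Xen's parser: `parse_xpti` is a hypothetical
name, the hardware default is passed in by the caller, and processing the
items left to right with later items overriding earlier ones is an assumption.
The `no-` prefix form of booleans is not handled.

```python
TRUE_WORDS = {"yes", "on", "true", "enable", "1"}
FALSE_WORDS = {"no", "off", "false", "disable", "0"}


def parse_bool(text: str) -> bool:
    """Parse one of the documented boolean spellings."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    raise ValueError(f"not a boolean: {text!r}")


def parse_xpti(value: str, meltdown_vulnerable: bool) -> dict:
    """Return the isolation setting for dom0 and for other guests."""
    # Start from the hardware-dependent default described above.
    state = {"dom0": meltdown_vulnerable, "domu": meltdown_vulnerable}
    for item in value.split(","):
        key, eq, val = item.partition("=")
        if item == "default":
            state["dom0"] = state["domu"] = meltdown_vulnerable
        elif eq and key in state:
            state[key] = parse_bool(val)      # dom0=<bool> or domu=<bool>
        elif eq:
            raise ValueError(f"unknown sub-option: {key!r}")
        else:
            state["dom0"] = state["domu"] = parse_bool(item)
    return state
```

Under these assumptions, `xpti=dom0=false` on vulnerable hardware leaves
isolation active for guests only, and `xpti=false,domu=on` has the same
effect.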

### xsave (x86)
> `= <boolean>`

> Default: `true`

Permit use of the `xsave/xrstor` instructions.

### xsm
> `= dummy | flask | silo`

> Default: selectable via Kconfig.  Depends on enabled XSM modules.

Specify which XSM module should be enabled.  This option is only available if
the hypervisor was compiled with `CONFIG_XSM` enabled.

* `dummy`: this is the default choice.  Basic restriction for common deployment
  (the dummy module) will be applied.  It's also used when XSM is compiled out.
* `flask`: this is the policy based access control.  To choose this, the
  separate option in Kconfig must also be enabled.
* `silo`: this will deny any unmediated communication channels between
  unprivileged VMs.  To choose this, the separate option in Kconfig must also
  be enabled.