.. SPDX-License-Identifier: GPL-2.0

======================
BIOS/EFI Configuration
======================

BIOS and EFI are largely responsible for configuring static information about
devices (or potential future devices) such that Linux can build the appropriate
logical representations of these devices.

At a high level, this is what occurs during this phase of configuration.

* The bootloader starts the BIOS/EFI.

* BIOS/EFI does an early device probe to determine static configuration.

* BIOS/EFI creates ACPI Tables that describe static config for the OS.

* BIOS/EFI creates the system memory map (EFI Memory Map, E820, etc).

* BIOS/EFI calls :code:`start_kernel` and begins the Linux Early Boot process.

Much of what this section is concerned with is ACPI Table production and
static memory map configuration. More detail on these tables can be found
at :doc:`ACPI Tables <acpi>`.

.. note::
   Platform Vendors should read carefully, as this section has recommendations
   on physical memory region size and alignment, memory holes, HDM interleave,
   and what Linux expects of HDM decoders trying to work with these features.

UEFI Settings
=============
If your platform supports it, the :code:`uefisettings` command can be used to
read/write EFI settings. Changes will be reflected on the next reboot. Kexec
is not a sufficient reboot.

One notable configuration here is the EFI_MEMORY_SP (Specific Purpose) bit.
When this is enabled, this bit tells Linux to defer management of a memory
region to a driver (in this case, the CXL driver). Otherwise, the memory is
treated as "normal memory", and is exposed to the page allocator during
:code:`__init`.

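
The effect of the Specific Purpose bit can be sketched in a few lines of
Python. This is only an illustration of the decision described above, not the
kernel's code path: the bit values are the UEFI attribute definitions for
EFI_MEMORY_WB and EFI_MEMORY_SP, while the helper name and the returned labels
are hypothetical.

```python
# UEFI memory attribute bits (values from the UEFI memory descriptor definition).
EFI_MEMORY_WB = 0x0000000000000008  # write-back cacheable
EFI_MEMORY_SP = 0x0000000000040000  # specific purpose


def classify_region(attributes: int) -> str:
    """Return who is expected to manage a region with these attributes.

    Hypothetical helper: the labels are illustrative, not kernel identifiers.
    """
    if attributes & EFI_MEMORY_SP:
        # Management is deferred to a driver (for CXL memory, the CXL driver).
        return "driver-managed"
    # Treated as normal memory and handed to the page allocator at init.
    return "page-allocator"


print(classify_region(EFI_MEMORY_WB))
print(classify_region(EFI_MEMORY_WB | EFI_MEMORY_SP))
```

The point of the sketch is that the attribute is a single bit set by firmware;
the operating system only observes it, which is why a change requires a full
reboot.
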

uefisettings examples
---------------------

:code:`uefisettings identify` ::

    uefisettings identify

    bios_vendor: xxx
    bios_version: xxx
    bios_release: xxx
    bios_date: xxx
    product_name: xxx
    product_family: xxx
    product_version: xxx

On some AMD platforms, the :code:`EFI_MEMORY_SP` bit is set via the :code:`CXL
Memory Attribute` field. This may be called something else on your platform.

:code:`uefisettings get "CXL Memory Attribute"` ::

    selector: xxx
    ...
    question: Question {
        name: "CXL Memory Attribute",
        answer: "Enabled",
        ...
    }

Physical Memory Map
===================

Physical Address Region Alignment
---------------------------------

As of Linux v6.14, the hotplug memory system requires memory regions to be
uniform in size and alignment. While the CXL specification allows for memory
regions as small as 256MB, the supported memory block size and alignment for
hotplugged memory is architecture-defined.

Linux memory blocks may be as small as 128MB and increase in powers of two.

* On ARM, the default block size and alignment is either 128MB or 256MB.

* On x86, the default block size is 256MB, and increases to 2GB as the
  capacity of the system increases up to 64GB.

For best support across versions, platform vendors should place CXL memory at
a 2GB aligned base address, and region sizes should be a multiple of 2GB. This
also helps prevent the creation of thousands of memory devices (one per block).

Memory Holes
------------

Holes in the memory map are tricky. Consider a 4GB device located at base
address 0x100000000, but with the following memory map ::

    ---------------------
    |    0x100000000    |
    |        CXL        |
    |    0x1BFFFFFFF    |
    ---------------------
    |    0x1C0000000    |
    |    MEMORY HOLE    |
    |    0x1FFFFFFFF    |
    ---------------------
    |    0x200000000    |
    |     CXL CONT.     |
    |    0x23FFFFFFF    |
    ---------------------

There are two issues to consider:

* decoder programming, and
* memory block alignment.

If your architecture requires 2GB uniform size and aligned memory blocks, the
only capacity Linux is capable of mapping (as of v6.14) would be the capacity
from :code:`0x100000000-0x180000000`. The remaining capacity will be stranded,
as the remaining chunks do not fill a complete 2GB aligned block.

Assuming your architecture and memory configuration allow 1GB memory blocks,
this memory map is supported, and it should be presented as multiple CFMWS
entries in the CEDT that describe each side of the memory hole separately,
along with matching decoders.

Multiple decoders can (and should) be used to manage such a memory hole (see
below), but each chunk of a memory hole should be aligned to a reasonable block
size (larger alignment is always better). If you intend to have memory holes
in the memory map, expect to use one decoder per contiguous chunk of host
physical memory.

As of v6.14, Linux does not provide support for memory hotplug of multiple
physical memory regions separated by a memory hole described by a single
HDM decoder.


Decoder Programming
===================
If BIOS/EFI intends to program the decoders to be statically configured,
there are a few things to consider to avoid major pitfalls that will
prevent Linux compatibility. Some of these recommendations are not
required "per the specification", but Linux makes no guarantees of support
otherwise.


Translation Point
-----------------
Per the specification, the only decoders which **TRANSLATE** Host Physical
Address (HPA) to Device Physical Address (DPA) are the **Endpoint Decoders**.
All other decoders in the fabric are intended to route accesses without
translating the addresses.

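
The split between routing and translating decoders can be modelled with a
small Python sketch. It assumes the simplest possible case (a single,
non-interleaved device) and uses hypothetical names (:code:`Decoder`,
:code:`route`, :code:`translate`, :code:`dpa_base`) that do not correspond to
kernel structures; real endpoint decoders also account for interleave ways and
granularity.

```python
from dataclasses import dataclass


@dataclass
class Decoder:
    base: int          # first HPA claimed by the decoder
    size: int          # length of the HPA window in bytes
    dpa_base: int = 0  # first DPA backing the window (endpoint decoders only)

    def claims(self, hpa: int) -> bool:
        return self.base <= hpa < self.base + self.size


def route(decoder: Decoder, hpa: int) -> int:
    """Root, host bridge, and switch decoders forward the HPA unchanged."""
    if not decoder.claims(hpa):
        raise ValueError("HPA outside decoder window")
    return hpa


def translate(endpoint: Decoder, hpa: int) -> int:
    """Only the endpoint decoder converts HPA to DPA (no interleave assumed)."""
    if not endpoint.claims(hpa):
        raise ValueError("HPA outside decoder window")
    return hpa - endpoint.base + endpoint.dpa_base


root = Decoder(base=0x100000000, size=0xC0000000)
host_bridge = Decoder(base=0x100000000, size=0xC0000000)
endpoint = Decoder(base=0x100000000, size=0xC0000000, dpa_base=0)

hpa = 0x140000000
dpa = translate(endpoint, route(host_bridge, route(root, hpa)))
print(hex(hpa), "->", hex(dpa))
```

Because :code:`route` returns its input untouched, every decoder above the
endpoint must claim a window that contains the address it forwards, which is
the subset relationship Linux relies on.
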

The endpoint-only translation model is heavily implied by the specification,
see: ::

  CXL Specification 3.1
  8.2.4.20: CXL HDM Decoder Capability Structure
  - Implementation Note: CXL Host Bridge and Upstream Switch Port Decoder Flow
  - Implementation Note: Device Decoder Logic

Given this, Linux makes a strong assumption that decoders between CPU and
endpoint will all be programmed with address ranges that are subsets of
their parent decoder.

Due to some ambiguity in how Architecture, ACPI, PCI, and CXL specifications
"hand off" responsibility between domains, some early adopting platforms
attempted to do translation at the originating memory controller or host
bridge. This configuration requires a platform specific extension to the
driver and is not officially endorsed, despite being supported.

It is *highly recommended* **NOT** to do this; otherwise, you are on your own
to implement driver support for your platform.

Interleave and Configuration Flexibility
----------------------------------------
If providing cross-host-bridge interleave, a CFMWS entry in the :doc:`CEDT
<acpi/cedt>` must be presented with target host-bridges for the interleaved
device sets (there may be multiple behind each host bridge).

If providing intra-host-bridge interleaving, only one CFMWS entry in the CEDT
is required for that host bridge, provided it covers the entire capacity of the
devices behind the host bridge.

If intending to provide users flexibility in programming decoders beyond the
root, you may want to provide multiple CFMWS entries in the CEDT intended for
different purposes. For example, you may want to consider adding:

1) A CFMWS entry to cover all interleavable host bridges.
2) A CFMWS entry to cover all devices on a single host bridge.
3) A CFMWS entry to cover each device.

A platform may choose to add all of these, or change the mode based on a BIOS
setting.
For each CFMWS entry, Linux expects descriptions of the described
memory regions in the :doc:`SRAT <acpi/srat>` to determine the number of
NUMA nodes it should reserve during early boot / init.

As of v6.14, Linux will create a NUMA node for each CEDT CFMWS entry, even if
a matching SRAT entry does not exist; however, this is not guaranteed in the
future and such a configuration should be avoided.

Memory Holes
------------
If your platform includes memory holes interspersed between your CXL memory, it
is recommended to utilize multiple decoders to cover these regions of memory,
rather than try to program the decoders to accept the entire range and expect
Linux to manage the overlap.

For example, consider the Memory Hole described above ::

    ---------------------
    |    0x100000000    |
    |        CXL        |
    |    0x1BFFFFFFF    |
    ---------------------
    |    0x1C0000000    |
    |    MEMORY HOLE    |
    |    0x1FFFFFFFF    |
    ---------------------
    |    0x200000000    |
    |     CXL CONT.     |
    |    0x23FFFFFFF    |
    ---------------------

Assuming this is provided by a single device attached directly to a host bridge,
Linux would expect the following decoder programming ::

     -----------------------           -----------------------
     | root-decoder-0      |           | root-decoder-1      |
     |   base: 0x100000000 |           |   base: 0x200000000 |
     |   size:  0xC0000000 |           |   size:  0x40000000 |
     -----------------------           -----------------------
                |                                 |
     -----------------------           -----------------------
     | HB-decoder-0        |           | HB-decoder-1        |
     |   base: 0x100000000 |           |   base: 0x200000000 |
     |   size:  0xC0000000 |           |   size:  0x40000000 |
     -----------------------           -----------------------
                |                                 |
     -----------------------           -----------------------
     | ep-decoder-0        |           | ep-decoder-1        |
     |   base: 0x100000000 |           |   base: 0x200000000 |
     |   size:  0xC0000000 |           |   size:  0x40000000 |
     -----------------------           -----------------------

This corresponds to a CEDT configuration with two CFMWS entries, one describing
each of the above root decoders.

Linux makes no guarantee of support for strange memory hole situations.

Multi-Media Devices
-------------------
Each CFMWS entry in the CEDT has special restriction bits which describe whether
the described memory region allows volatile or persistent memory (or both). If
the platform intends to support either:

1) A device with multiple medias, or
2) Using a persistent memory device as normal memory

then the platform may wish to create multiple CEDT CFMWS entries to describe
the same memory, with the intent of allowing the end user flexibility in how
that memory is configured. Linux does not presently have strong requirements in
this area.
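
As a rough illustration of how the restriction bits distinguish such entries,
the Python sketch below decodes the volatile and persistent bits of a CFMWS
Window Restrictions value. The bit positions (bit 2 for volatile, bit 3 for
persistent) are taken from the CFMWS definition as mirrored in Linux's ACPI
headers and should be checked against the specification revision you target;
the constant and function names are hypothetical.

```python
# Assumed CFMWS Window Restrictions bit positions; verify against the
# CXL specification revision in use.
RESTRICT_VOLATILE = 1 << 2
RESTRICT_PMEM = 1 << 3


def allowed_media(restrictions: int) -> list[str]:
    """List the media types a window with these restriction bits permits."""
    media = []
    if restrictions & RESTRICT_VOLATILE:
        media.append("volatile")
    if restrictions & RESTRICT_PMEM:
        media.append("persistent")
    return media


# Three hypothetical windows: volatile only, persistent only, and both.
for bits in (RESTRICT_VOLATILE, RESTRICT_PMEM, RESTRICT_VOLATILE | RESTRICT_PMEM):
    print(hex(bits), allowed_media(bits))
```

A platform that wants to let a persistent memory device be used as normal
memory would need a window whose restriction bits permit volatile use for that
capacity.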