.. SPDX-License-Identifier: CC-BY-4.0

Xen cache coloring user guide
=============================

The cache coloring support in Xen allows reserving Last Level Cache (LLC)
partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
Cache coloring realizes per-set cache partitioning in software and is applicable
to shared LLCs as implemented in Cortex-A53, Cortex-A72 and similar CPUs.

To compile LLC coloring support, set ``CONFIG_LLC_COLORING=y``.

If needed, change the maximum number of colors with
``CONFIG_LLC_COLORS_ORDER=<n>``.

If needed, change the buddy allocator reserved size with
``CONFIG_BUDDY_ALLOCATOR_SIZE=<n>``.

Runtime configuration is done via `Command line parameters`_.
For DomUs follow `DomUs configuration`_.

Background
**********

The cache hierarchy of a modern multi-core CPU typically has its first levels
dedicated to each core (hence using multiple cache units), while the last level
is shared among all of them. Such a configuration implies that memory operations
on one core (e.g. running a DomU) can generate interference on another core
(e.g. hosting another DomU). Cache coloring realizes per-set cache partitioning
in software and mitigates this, guaranteeing more predictable performance for
memory accesses.
Software-based cache coloring is particularly useful in those situations where
no hardware mechanisms (e.g., DSU-based way partitioning) are available to
partition caches. This is the case for e.g., Cortex-A53, A57 and A72 CPUs that
feature an L2 LLC shared among all cores.

The key concept underlying cache coloring is a fragmentation of the memory
space into a set of sub-spaces called colors that are mapped to disjoint cache
partitions. Technically, the whole memory space is first divided into a number
of consecutive regions. Then each region is in turn divided into a number of
consecutive sub-colors. The generic i-th color is then formed by the i-th
sub-colors of all regions.

::

                            Region j            Region j+1
                .....................   ............
                .                     . .
                .                       .
            _ _ _______________ _ _____________________ _ _
                |     |     |     |     |     |     |
                | c_0 | c_1 |     | c_n | c_0 | c_1 |
           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
                    :                       :
                    :                       :...         ... .
                    :                            color 0
                    :...........................         ... .
                                                :
          . . ..................................:

How colors are actually defined depends on the function that maps memory to
cache lines. In the case of physically-indexed, physically-tagged caches with
linear mapping, the set index is found by extracting some contiguous bits from
the physical address. This allows colors to be defined as shown in the figure:
they appear in memory as consecutive blocks of equal size and repeat themselves
after ``n`` different colors, where ``n`` is the total number of colors.

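As an illustration of the linear mapping described above, the following minimal
sketch derives the color of a page from its physical address. It assumes 4 KiB
pages and a power-of-two number of colors; the function name
``addr_to_color`` is hypothetical and is not part of Xen.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def addr_to_color(paddr: int, nr_colors: int, page_shift: int = PAGE_SHIFT) -> int:
    """Return the color of the page that contains physical address `paddr`.

    Assumes a linear set-index mapping, so the color is given by the
    low-order bits of the page frame number, and that `nr_colors` is a
    power of two.
    """
    if nr_colors <= 0 or nr_colors & (nr_colors - 1):
        raise ValueError("nr_colors must be a positive power of two")
    return (paddr >> page_shift) & (nr_colors - 1)


if __name__ == "__main__":
    # With 16 colors, the pattern repeats every 16 pages (64 KiB).
    for paddr in (0x0000, 0x1000, 0xF000, 0x10000, 0x11000):
        print(hex(paddr), "-> color", addr_to_color(paddr, 16))
```

Consecutive pages get consecutive colors, and the sequence wraps around after
``nr_colors`` pages, which matches the layout in the figure.
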
If some kind of bit shuffling appears in the mapping function, then colors
assume a different layout in memory. Such caches aren't supported by the
current implementation.

**Note**: Finding the exact cache mapping function can be a really difficult
task since it's not always documented in the CPU manual. As mentioned above,
Cortex-A53, A57 and A72 are known to work with the current implementation.

How to compute the number of colors
###################################

Taking the linear mapping from physical memory to cache lines for granted, the
number of available colors for a specific platform is computed using three
parameters:

- the size of the LLC.
- the number of the LLC ways.
- the page size used by Xen.

The first two parameters can be found in the processor manual, while the third
one is the minimum mapping granularity. Dividing the cache size by the number of
its ways gives the size of a way. Dividing the way size by the page size gives
the total number of cache colors. For example, an Arm Cortex-A53 with a 16-way
set-associative 1 MiB LLC has a 64 KiB way and can therefore isolate up to 16
colors when pages are 4 KiB in size.

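The computation above can be sketched as a small helper. This is only an
illustration of the arithmetic; the function name ``llc_nr_colors`` is
hypothetical and the 4 KiB default page size is an assumption.

```python
def llc_nr_colors(llc_size: int, nr_ways: int, page_size: int = 4096) -> int:
    """Number of colors = (LLC size / number of ways) / page size.

    All sizes are in bytes. Raises ValueError if the sizes do not divide
    evenly, since the formula assumes they do.
    """
    if llc_size <= 0 or nr_ways <= 0 or page_size <= 0:
        raise ValueError("all parameters must be positive")
    if llc_size % nr_ways:
        raise ValueError("LLC size must be a multiple of the number of ways")
    way_size = llc_size // nr_ways
    if way_size % page_size:
        raise ValueError("way size must be a multiple of the page size")
    return way_size // page_size


if __name__ == "__main__":
    # Cortex-A53 example from the text: 1 MiB, 16 ways, 4 KiB pages.
    print(llc_nr_colors(1 << 20, 16))
```

For the Cortex-A53 example, the way size is 64 KiB and the result is 16 colors.
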
Effective colors assignment
###########################

When assigning colors, if one wants to avoid cache interference between two
domains, different colors need to be used for their memory.

Command line parameters
***********************

Specific documentation is available at `docs/misc/xen-command-line.pandoc`.

+----------------------+-------------------------------+
| **Parameter**        | **Description**               |
+----------------------+-------------------------------+
| ``llc-coloring``     | Enable coloring at runtime    |
+----------------------+-------------------------------+
| ``llc-size``         | Set the LLC size              |
+----------------------+-------------------------------+
| ``llc-nr-ways``      | Set the LLC number of ways    |
+----------------------+-------------------------------+
| ``dom0-llc-colors``  | Dom0 color configuration      |
+----------------------+-------------------------------+
| ``buddy-alloc-size`` | Buddy allocator reserved size |
+----------------------+-------------------------------+
| ``xen-llc-colors``   | Xen color configuration       |
+----------------------+-------------------------------+

Colors selection format
***********************

Regardless of the memory pool that has to be colored (Xen, Dom0/DomUs),
the color selection is expressed using the same syntax: a comma-separated list
of colors or ranges of colors.
Ranges are hyphen-separated intervals (such as `0-4`) and are inclusive on both
sides.

Note that:

- no spaces are allowed between values.
- no overlapping ranges or duplicated colors are allowed.
- values must be written in ascending order.

Examples:

+-------------------+-----------------------------+
| **Configuration** | **Actual selection**        |
+-------------------+-----------------------------+
| 1-2,5-8           | [1, 2, 5, 6, 7, 8]          |
+-------------------+-----------------------------+
| 4-8,10,11,12      | [4, 5, 6, 7, 8, 10, 11, 12] |
+-------------------+-----------------------------+
| 0                 | [0]                         |
+-------------------+-----------------------------+

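To make the syntax rules above concrete, here is a minimal sketch of a parser
that expands a selection string into a list of colors and rejects the cases
listed in the notes. It only mirrors the rules as stated in this section; it is
not Xen's parser, and ``parse_colors`` is a hypothetical name.

```python
def parse_colors(selection: str) -> list[int]:
    """Expand a selection such as "1-2,5-8" into [1, 2, 5, 6, 7, 8].

    Raises ValueError on spaces, malformed items, descending ranges,
    overlapping ranges, duplicates or values not in ascending order.
    """
    colors: list[int] = []
    for item in selection.split(","):
        start_s, sep, end_s = item.partition("-")
        # Each side must be a plain decimal number (no spaces, no signs).
        for part in (start_s, end_s) if sep else (start_s,):
            if not (part.isascii() and part.isdigit()):
                raise ValueError(f"malformed item: {item!r}")
        start = int(start_s)
        end = int(end_s) if sep else start
        if end < start:
            raise ValueError(f"descending range: {item!r}")
        # Ascending order without overlap or duplicates means each item
        # must start strictly after the last color collected so far.
        if colors and start <= colors[-1]:
            raise ValueError(f"not in ascending order or overlapping: {item!r}")
        colors.extend(range(start, end + 1))
    return colors


if __name__ == "__main__":
    for text in ("1-2,5-8", "4-8,10,11,12", "0"):
        print(text, "->", parse_colors(text))
```

Two domains avoid interfering with each other in the LLC when the sets
returned for their selections are disjoint.
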
Auto-probing of LLC specs
#########################

LLC size and number of ways are probed automatically by default.

In the Arm implementation, this is done by inspecting the CLIDR_EL1 register.
This means that other system caches that aren't visible there are ignored.

LLC specs can be manually set via the above command line parameters. This
bypasses any auto-probing and is meant to work around probing failures, such as
flawed probing logic, or to be used for debugging/testing purposes.

DomUs configuration
*******************

DomUs colors can be set either in the ``xl`` configuration file (documentation
at `docs/man/xl.cfg.pod.5.in`) or via Device Tree (documentation at
`docs/misc/arm/device-tree/booting.txt`) using the ``llc-colors`` option.
For example, via Device Tree:

::

    xen,xen-bootargs = "console=dtuart dtuart=serial0 dom0_mem=1G dom0_max_vcpus=1 sched=null llc-coloring=on dom0-llc-colors=2-6";
    xen,dom0-bootargs = "console=hvc0 earlycon=xen earlyprintk=xen root=/dev/ram0";

    dom0 {
        compatible = "xen,linux-zimage", "xen,multiboot-module";
        reg = <0x0 0x1000000 0x0 15858176>;
    };

    dom0-ramdisk {
        compatible = "xen,linux-initrd", "xen,multiboot-module";
        reg = <0x0 0x2000000 0x0 20638062>;
    };

    domU0 {
        #address-cells = <0x1>;
        #size-cells = <0x1>;
        compatible = "xen,domain";
        memory = <0x0 0x40000>;
        /* LLC colors assigned to this DomU */
        llc-colors = "4-8,10,11,12";
        cpus = <0x1>;
        vpl011 = <0x1>;

        module@2000000 {
            compatible = "multiboot,kernel", "multiboot,module";
            reg = <0x2000000 0xffffff>;
            bootargs = "console=ttyAMA0";
        };

        module@3000000 {
            compatible = "multiboot,ramdisk", "multiboot,module";
            reg = <0x3000000 0xffffff>;
        };
    };

**Note:** If no color configuration is provided for a domain, the default one,
which corresponds to all available colors, is used instead.

Colored allocator and buddy allocator
*************************************

The colored allocator distributes pages based on the color configurations of
domains so that each domain only gets pages of its own colors.
The colored allocator is meant as an alternative to the buddy allocator because
its allocation policy is by definition incompatible with the generic one. Since
the Xen heap is not colored yet, the two allocators must coexist and some memory
must be left for the buddy one. Buddy memory reservation is configured via
Kconfig or via command line.

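The idea of handing out only pages of a domain's own colors can be illustrated
with the toy model below. It is a deliberately simplified sketch and not Xen's
allocator: the class ``ColoredPool``, its methods and the round-robin policy are
all assumptions made for illustration, and pages are identified by page frame
number with the color taken as ``pfn % nr_colors`` (linear mapping).

```python
from collections import defaultdict, deque


class ColoredPool:
    """Toy pool of order-0 pages kept in one free list per color."""

    def __init__(self, first_pfn: int, nr_pages: int, nr_colors: int):
        self.nr_colors = nr_colors
        self.free = defaultdict(deque)  # color -> free page frame numbers
        for pfn in range(first_pfn, first_pfn + nr_pages):
            self.free[pfn % nr_colors].append(pfn)

    def alloc(self, colors: list[int], nr_pages: int) -> list[int]:
        """Return `nr_pages` pages taken round-robin from `colors` only."""
        pages: list[int] = []
        while len(pages) < nr_pages:
            progressed = False
            for color in colors:
                if len(pages) == nr_pages:
                    break
                if self.free[color]:
                    pages.append(self.free[color].popleft())
                    progressed = True
            if not progressed:
                # Out of pages of the requested colors: undo and fail.
                for pfn in pages:
                    self.free[pfn % self.nr_colors].append(pfn)
                raise MemoryError("no free pages left for the given colors")
        return pages


if __name__ == "__main__":
    pool = ColoredPool(first_pfn=0, nr_pages=64, nr_colors=16)
    print("domain A:", pool.alloc([0, 1], 4))
    print("domain B:", pool.alloc([2, 3], 4))
```

Because each domain draws only from the free lists of its own colors, two
domains with disjoint color sets never receive pages of the same color.
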
Known issues and limitations
****************************

"xen,static-mem" isn't supported when coloring is enabled
#########################################################

In the domain configuration, "xen,static-mem" allows memory to be statically
allocated to the domain. This isn't possible when LLC coloring is enabled,
because that memory can't be guaranteed to use only colors assigned to the
domain.

Cache coloring is intended only for embedded systems
####################################################

The current implementation aims to satisfy the need for predictability in
embedded systems with a small amount of memory to be managed in a colored way.
Given that, some shortcuts are taken in the development. Expect worse
performance on larger systems.

Colored allocator can only make use of order-0 pages
####################################################

The cache coloring technique relies on memory mappings and on the smallest
mapping granularity to achieve the maximum number of colors (cache partitions)
possible. This granularity is what is normally called a page and, in Xen
terminology, the order-0 page is the smallest one. The fairly simple
colored allocator currently implemented makes use only of such pages.
A more complex one could, in theory, adopt higher-order pages if the color
selection contained adjacent colors. Two consecutive colors, for example, can
be represented by an order-1 page, four colors correspond to an order-2 page,
etc.

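The relation between page order and adjacent colors can be checked with a short
sketch. It assumes the linear mapping discussed earlier and naturally aligned
blocks; ``block_colors`` and ``block_fits`` are hypothetical helpers, not
functions of the current allocator.

```python
def block_colors(pfn: int, order: int, nr_colors: int) -> list[int]:
    """Colors covered by the order-`order` block starting at frame `pfn`.

    A block of order k spans 2**k consecutive pages, hence 2**k
    consecutive colors (wrapping around after `nr_colors`).
    """
    if pfn % (1 << order):
        raise ValueError("block is not naturally aligned to its order")
    return [(pfn + i) % nr_colors for i in range(1 << order)]


def block_fits(pfn: int, order: int, nr_colors: int, allowed: list[int]) -> bool:
    """True if every page of the block has a color in `allowed`."""
    return set(block_colors(pfn, order, nr_colors)) <= set(allowed)


if __name__ == "__main__":
    selection = [4, 5, 6, 7, 8]
    print(block_colors(0, 1, 16))            # order-1 block: two colors
    print(block_fits(4, 2, 16, selection))   # colors 4..7, all allowed
    print(block_fits(8, 2, 16, selection))   # colors 8..11, not all allowed
```

Under these assumptions an order-2 block is usable only when all four of its
consecutive colors belong to the domain's selection.
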