.. SPDX-License-Identifier: CC-BY-4.0

=============================================
C Dialect and Translation Assumptions for Xen
=============================================

This document specifies the C language dialect used by Xen and
the assumptions Xen makes on the translation toolchain.
It covers, in particular:

1. the language extensions used;
2. the translation limits that the translation toolchains must be able
   to accommodate;
3. the implementation-defined behaviors upon which Xen may depend.

All points are of course relevant for portability.  In addition,
programming in C is impossible without a detailed knowledge of the
implementation-defined behaviors.  For this reason, it is recommended
that Xen developers have familiarity with this document and the
documentation referenced therein.

This document needs maintenance and adaptation in the following
circumstances:

- whenever the compiler is changed or updated;
- whenever the use of a certain language extension is added or removed;
- whenever code modifications cause the stated translation limits to be
  exceeded.


Applicable C Language Standard
______________________________

Xen is written in C99 with extensions.  The relevant ISO standard is

    *ISO/IEC 9899:1999/Cor 3:2007*: Programming Languages - C,
    Technical Corrigendum 3.
    ISO/IEC, Geneva, Switzerland, 2007.
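
As a minimal sketch (not taken from the Xen sources), a translation unit
can check at compile time that it is being translated in a C99-or-later
mode by a compiler that defines __GNUC__, which is the usual indication
that the GNU C extensions are available.  The function name
language_mode_is_c99_or_later is hypothetical and is only there to make
the check observable at run time:

```c
/* Illustrative only: reject translation modes older than C99. */
#if !defined(__STDC_VERSION__) || (__STDC_VERSION__ < 199901L)
#error "A C99 (or later) translation mode is required"
#endif

/* GCC and GCC-compatible compilers define __GNUC__. */
#if !defined(__GNUC__)
#error "A compiler supporting the GNU C extensions is required"
#endif

/*
 * Hypothetical helper (not part of Xen): returns 1 when
 * __STDC_VERSION__ denotes C99 or a later revision of the standard.
 */
int language_mode_is_c99_or_later(void)
{
    return __STDC_VERSION__ >= 199901L;
}
```

A guard of this kind only detects a grossly unsuitable toolchain; it does
not replace the per-extension references given in this document.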


Reference Documentation
_______________________

The following documents are referred to in the sequel:

GCC_MANUAL:
  https://gcc.gnu.org/onlinedocs/gcc-12.1.0/gcc.pdf
CPP_MANUAL:
  https://gcc.gnu.org/onlinedocs/gcc-12.1.0/cpp.pdf
ARM64_ABI_MANUAL:
  https://github.com/ARM-software/abi-aa/blob/60a8eb8c55e999d74dac5e368fc9d7e36e38dda4/aapcs64/aapcs64.rst
X86_64_ABI_MANUAL:
  https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build


C Language Extensions
_____________________


The following table lists the extensions currently used in Xen.
The table columns are as follows:

   Extension
      a terse description of the extension;
   Architectures
      a set of Xen architectures making use of the extension;
   References
      when available, references to the documentation explaining
      the syntax and semantics of (each instance of) the extension.


.. list-table::
   :widths: 30 15 55
   :header-rows: 1

   * - Extension
     - Architectures
     - References

   * - Non-standard tokens
     - ARM64, X86_64
     - _Static_assert:
          see Section "2.1 C Language" of GCC_MANUAL.
       asm, __asm__:
          see Sections "6.48 Alternate Keywords" and "6.47 How to Use Inline Assembly Language in C Code" of GCC_MANUAL.
       __volatile__:
          see Sections "6.48 Alternate Keywords" and "6.47.2.1 Volatile" of GCC_MANUAL.
       __const__:
          see Section "6.48 Alternate Keywords" of GCC_MANUAL.
       __inline, __inline__:
          see Section "6.48 Alternate Keywords" of GCC_MANUAL.
       typeof, __typeof__:
          see Section "6.7 Referring to a Type with typeof" of GCC_MANUAL.
       __alignof__, __alignof:
          see Sections "6.48 Alternate Keywords" and "6.44 Determining the Alignment of Functions, Types or Variables" of GCC_MANUAL.
       __attribute__:
          see Section "6.39 Attribute Syntax" of GCC_MANUAL.
       __builtin_types_compatible_p:
          see Section "6.59 Other Built-in Functions Provided by GCC" of GCC_MANUAL.
       __builtin_va_arg:
          non-documented GCC extension.
       __builtin_offsetof:
          see Section "6.53 Support for offsetof" of GCC_MANUAL.

   * - Empty initialization list
     - ARM64, X86_64
     - Non-documented GCC extension.

   * - Arithmetic operator on pointer to void
     - ARM64, X86_64
     - See Section "6.24 Arithmetic on void- and Function-Pointers" of GCC_MANUAL.

   * - Statements and declarations in expressions
     - ARM64, X86_64
     - See Section "6.1 Statements and Declarations in Expressions" of GCC_MANUAL.

   * - Structure or union definition with no members
     - ARM64, X86_64
     - See Section "6.19 Structures with No Members" of GCC_MANUAL.

   * - Zero size array type
     - ARM64, X86_64
     - See Section "6.18 Arrays of Length Zero" of GCC_MANUAL.

   * - Binary conditional expression
     - ARM64, X86_64
     - See Section "6.8 Conditionals with Omitted Operands" of GCC_MANUAL.

   * - 'Case' label with upper/lower values
     - ARM64, X86_64
     - See Section "6.30 Case Ranges" of GCC_MANUAL.

   * - Unnamed field that is not a bit-field
     - ARM64, X86_64
     - See Section "6.63 Unnamed Structure and Union Fields" of GCC_MANUAL.

   * - Empty declaration
     - ARM64, X86_64
     - Non-documented GCC extension.
       Note: an empty declaration is caused by a semicolon at file scope
       with nothing before it (not to be confused with an empty statement).

   * - Incomplete enum declaration
     - ARM64
     - See Section "6.49 Incomplete enum Types" of GCC_MANUAL.

   * - Implicit conversion from a pointer to an incompatible pointer
     - ARM64, X86_64
     - Non-documented GCC extension.  The documentation for option
       -Wincompatible-pointer-types in Section
       "3.8 Options to Request or Suppress Warnings" of GCC_MANUAL
       is possibly relevant.

   * - Pointer to a function is converted to a pointer to an object or a pointer to an object is converted to a pointer to a function
     - X86_64
     - Non-documented GCC extension.  The information provided in
       https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83584
       is possibly relevant.

   * - Token pasting of ',' and __VA_ARGS__
     - ARM64, X86_64
     - See Section "6.21 Macros with a Variable Number of Arguments" of GCC_MANUAL.

   * - Named variadic macro arguments
     - ARM64, X86_64
     - See Section "6.21 Macros with a Variable Number of Arguments" of GCC_MANUAL.

   * - No arguments for '...' parameter of variadic macro
     - ARM64, X86_64
     - See Section "6.21 Macros with a Variable Number of Arguments" of GCC_MANUAL.

   * - void function returning void expression
     - ARM64, X86_64
     - See the documentation for -Wreturn-type in Section "3.8 Options to Request or Suppress Warnings" of GCC_MANUAL.

   * - GNU statement expressions from macro expansion
     - ARM64, X86_64
     - See Section "6.1 Statements and Declarations in Expressions" of GCC_MANUAL.

   * - Invalid application of sizeof to a void type
     - ARM64, X86_64
     - See Section "6.24 Arithmetic on void- and Function-Pointers" of GCC_MANUAL.

   * - Redeclaration of already-defined enum
     - ARM64, X86_64
     - See Section "6.49 Incomplete enum Types" of GCC_MANUAL.

   * - struct with flexible array member nested in a struct
     - ARM64, X86_64
     - See Section "6.18 Arrays of Length Zero" of GCC_MANUAL.

   * - struct with flexible array member used as an array element
     - ARM64, X86_64
     - See Section "6.18 Arrays of Length Zero" of GCC_MANUAL.

   * - enumerator value outside the range of int
     - ARM64, X86_64
     - Non-documented GCC extension.

   * - Extended integer types
     - X86_64
     - See Section "6.9 128-bit Integers" of GCC_MANUAL.

   * - Designated initializer for a range of elements
     - ARM64, X86_64
     - See Section "6.29 Designated Initializers" of GCC_MANUAL.

   * - Signed << compiler-defined behavior
     - All architectures
     - See Section "4.5 Integers" of GCC_MANUAL. As an extension to the
       C language, GCC does not use the latitude given in C99 and C11
       only to treat certain aspects of signed << as undefined.

   * - Signed >> acts on negative numbers by sign extension
     - All architectures
     - See Section "4.5 Integers" of GCC_MANUAL.

   * - Taking the address of a label
     - All architectures
     - See Section "6.3 Labels as Values" of GCC_MANUAL.


Translation Limits
__________________

The following table lists the translation limits that a toolchain has
to satisfy in order to translate Xen.  The numbers given are a
compromise: on the one hand, many modern compilers have very generous
limits (in several cases, the only limitation is the amount of
available memory); on the other hand, we prefer setting limits that are
not too high, because compilers are under no obligation to diagnose
when a limit has been exceeded, and not too low, so as to avoid
frequently updating this document.  In the table, only the limits that
go beyond the minima specified by the relevant C Standard are listed.

The table columns are as follows:

   Limit
      a terse description of the translation limit;
   Architectures
      a set of relevant Xen architectures;
   Threshold
      a value that the Xen project does not wish to exceed for that limit
      (this is typically below, often much below what the translation
      toolchain supports);
   References
      when available, references to the documentation providing evidence
      that the translation toolchain honors the threshold (and more).

.. list-table::
   :widths: 30 15 10 45
   :header-rows: 1

   * - Limit
     - Architectures
     - Threshold
     - References

   * - Size of an object
     - ARM64, X86_64
     - 8388608
     - The maximum size of an object is defined by the MAX_SIZE macro and, for a 32-bit architecture, is 8MB.
       The maximum size of an array is defined by PTRDIFF_MAX and, for a 32-bit architecture, is 2^30-1.
       See occurrences of these macros in GCC_MANUAL.

   * - Characters in one logical source line
     - ARM64
     - 5000
     - See Section "11.2 Implementation limits" of CPP_MANUAL.

   * - Characters in one logical source line
     - X86_64
     - 12000
     - See Section "11.2 Implementation limits" of CPP_MANUAL.

   * - Nesting levels for #include files
     - ARM64
     - 24
     - See Section "11.2 Implementation limits" of CPP_MANUAL.

   * - Nesting levels for #include files
     - X86_64
     - 32
     - See Section "11.2 Implementation limits" of CPP_MANUAL.

   * - case labels for a switch statement (excluding those for any nested switch statements)
     - X86_64
     - 1500
     - See Section "4.12 Statements" of GCC_MANUAL.

   * - Number of significant initial characters in an external identifier
     - ARM64, X86_64
     - 63
     - See Section "4.3 Identifiers" of GCC_MANUAL.


Implementation-Defined Behaviors
________________________________

The following table lists the C language implementation-defined behaviors
relevant for MISRA C:2012 Dir 1.1 upon which Xen may possibly depend.

The table columns are as follows:

   I.-D.B.
      a terse description of the implementation-defined behavior;
   Architectures
      a set of relevant Xen architectures;
   Value(s)
      for i.-d.b.'s with values, the values allowed;
   References
      when available, references to the documentation providing details
      about how the i.-d.b. is resolved by the translation toolchain.

.. list-table::
   :widths: 30 15 10 45
   :header-rows: 1

   * - I.-D.B.
     - Architectures
     - Value(s)
     - References

   * - Allowable bit-field types other than _Bool, signed int, and unsigned int
     - ARM64, X86_64
     - All explicitly signed integer types, all unsigned integer types,
       and enumerations.
     - See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL.

   * - #pragma preprocessing directive that is documented as causing translation failure or some other form of undefined behavior is encountered
     - ARM64, X86_64
     - pack, GCC visibility
     - #pragma pack:
          see Section "6.62.11 Structure-Layout Pragmas" of GCC_MANUAL.
       #pragma GCC visibility:
          see Section "6.62.14 Visibility Pragmas" of GCC_MANUAL.

   * - The number of bits in a byte
     - ARM64
     - 8
     - See Section "4.4 Characters" of GCC_MANUAL and Section "8.1 Data types" of ARM64_ABI_MANUAL.

   * - The number of bits in a byte
     - X86_64
     - 8
     - See Section "4.4 Characters" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - Whether signed integer types are represented using sign and magnitude, two's complement, or one's complement, and whether the extraordinary value is a trap representation or an ordinary value
     - ARM64, X86_64
     - Two's complement
     - See Section "4.5 Integers" of GCC_MANUAL.

   * - Any extended integer types that exist in the implementation
     - X86_64
     - __uint128_t
     - See Section "6.9 128-bit Integers" of GCC_MANUAL.

   * - The number, order, and encoding of bytes in any object
     - ARM64
     -
     - See Section "4.15 Architecture" of GCC_MANUAL and Chapter 5 "Data types and alignment" of ARM64_ABI_MANUAL.

   * - The number, order, and encoding of bytes in any object
     - X86_64
     -
     - See Section "4.15 Architecture" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - Whether a bit-field can straddle a storage-unit boundary
     - ARM64
     -
     - See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "8.1.8 Bit-fields" of ARM64_ABI_MANUAL.

   * - Whether a bit-field can straddle a storage-unit boundary
     - X86_64
     -
     - See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - The order of allocation of bit-fields within a unit
     - ARM64
     -
     - See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "8.1.8 Bit-fields" of ARM64_ABI_MANUAL.

   * - The order of allocation of bit-fields within a unit
     - X86_64
     -
     - See Section "4.9 Structures, Unions, Enumerations, and Bit-Fields" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - What constitutes an access to an object that has volatile-qualified type
     - ARM64, X86_64
     -
     - See Section "4.10 Qualifiers" of GCC_MANUAL.

   * - The values or expressions assigned to the macros specified in the headers <float.h>, <limits.h>, and <stdint.h>
     - ARM64
     -
     - See Section "4.15 Architecture" of GCC_MANUAL and Chapter 5 "Data types and alignment" of ARM64_ABI_MANUAL.

   * - The values or expressions assigned to the macros specified in the headers <float.h>, <limits.h>, and <stdint.h>
     - X86_64
     -
     - See Section "4.15 Architecture" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - Character not in the basic source character set is encountered in a source file, except in an identifier, a character constant, a string literal, a header name, a comment, or a preprocessing token that is never converted to a token
     - ARM64
     - UTF-8
     - See Section "1.1 Character sets" of CPP_MANUAL.
       We assume the locale does not prevent any UTF-8 character from being part of the source character set.

   * - The value of a char object into which has been stored any character other than a member of the basic execution character set
     - ARM64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and Section "8.1 Data types" of ARM64_ABI_MANUAL.

   * - The value of a char object into which has been stored any character other than a member of the basic execution character set
     - X86_64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - The value of an integer character constant containing more than one character or containing a character or escape sequence that does not map to a single-byte execution character
     - ARM64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and Section "8.1 Data types" of ARM64_ABI_MANUAL.

   * - The value of an integer character constant containing more than one character or containing a character or escape sequence that does not map to a single-byte execution character
     - X86_64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and Section "3.1.2 Data Representation" of X86_64_ABI_MANUAL.

   * - The mapping of members of the source character set
     - ARM64, X86_64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and the documentation for -finput-charset=charset in the same manual.

   * - The members of the source and execution character sets, except as explicitly specified in the Standard
     - ARM64, X86_64
     - UTF-8
     - See Section "4.4 Characters" of GCC_MANUAL.

   * - The values of the members of the execution character set
     - ARM64, X86_64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and the documentation for -fexec-charset=charset in the same manual.

   * - How a diagnostic is identified
     - ARM64, X86_64
     -
     - See Section "4.1 Translation" of GCC_MANUAL.

   * - The places that are searched for an included < > delimited header, and how the places are specified or the header is identified
     - ARM64, X86_64
     -
     - See Chapter "2 Header Files" of CPP_MANUAL.

   * - How the named source file is searched for in an included " " delimited header
     - ARM64, X86_64
     -
     - See Chapter "2 Header Files" of CPP_MANUAL.

   * - How sequences in both forms of header names are mapped to headers or external source file names
     - ARM64, X86_64
     -
     - See Chapter "2 Header Files" of CPP_MANUAL.

   * - Whether the # operator inserts a \ character before the \ character that begins a universal character name in a character constant or string literal
     - ARM64, X86_64
     -
     - See Section "3.4 Stringizing" of CPP_MANUAL.

   * - The current locale used to convert a wide string literal into corresponding wide character codes
     - ARM64, X86_64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL.

   * - The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set
     - X86_64
     -
     - See Section "4.4 Characters" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL.

   * - The behavior on each recognized #pragma directive
     - ARM64, X86_64
     - pack, GCC visibility
     - See Section "4.13 Preprocessing Directives" of GCC_MANUAL and Section "7 Pragmas" of CPP_MANUAL.

   * - The method by which preprocessing tokens (possibly resulting from macro expansion) in a #include directive are combined into a header name
     - X86_64
     -
     - See Section "4.13 Preprocessing Directives" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL.
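
A few of the choices listed in this table can be checked at compile time.
The following is a minimal sketch, not taken from the Xen sources, that
assumes a GCC-compatible compiler accepting _Static_assert at file scope
and evaluating signed right shifts in constant expressions.  The helper
name shift_right_signed is hypothetical:

```c
#include <limits.h>

/* The number of bits in a byte is expected to be 8. */
_Static_assert(CHAR_BIT == 8, "a byte is expected to have 8 bits");

/*
 * Two's complement: -1 has all value bits set, so its two lowest bits
 * are both 1.  Sign and magnitude would give 1, one's complement 2.
 */
_Static_assert((-1 & 3) == 3, "two's complement representation expected");

/* Signed >> acts on negative numbers by sign extension. */
_Static_assert((-8 >> 1) == -4, "arithmetic right shift expected");

/*
 * Hypothetical helper (not part of Xen): the same right shift performed
 * at run time; with sign extension, negative values stay negative.
 */
int shift_right_signed(int value, unsigned int count)
{
    return value >> count;
}
```

If one of the assumptions does not hold, translation stops with the
corresponding message instead of producing code that silently misbehaves.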


Sizes of Integer types
______________________

Xen expects System V ABI on x86_64:
  https://gitlab.com/x86-psABIs/x86-64-ABI

Xen expects AAPCS32 on ARMv8-A AArch32 and ARMv7-A:
  https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst

Xen expects AAPCS64 LP64 on ARMv8-A AArch64:
  https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst

A summary table of data types, sizes and alignment is below:

.. list-table::
   :widths: 10 10 10 45
   :header-rows: 1

   * - Type
     - Size
     - Alignment
     - Architectures

   * - char
     - 8 bits
     - 8 bits
     - x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A, x86_64,
       ARMv8-A AArch64, RV64, PPC64

   * - short
     - 16 bits
     - 16 bits
     - x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A, x86_64,
       ARMv8-A AArch64, RV64, PPC64

   * - int
     - 32 bits
     - 32 bits
     - x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A, x86_64,
       ARMv8-A AArch64, RV64, PPC64

   * - long
     - 32 bits
     - 32 bits
     - x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A

   * - long
     - 64 bits
     - 64 bits
     - x86_64, ARMv8-A AArch64, RV64, PPC64

   * - long long
     - 64 bits
     - 32 bits
     - x86_32

   * - long long
     - 64 bits
     - 64 bits
     - x86_64, ARMv8-A AArch64, RV64, PPC64, ARMv8-A AArch32, ARMv8-R
       AArch32, ARMv7-A

   * - pointer
     - 32 bits
     - 32 bits
     - x86_32, ARMv8-A AArch32, ARMv8-R AArch32, ARMv7-A

   * - pointer
     - 64 bits
     - 64 bits
     - x86_64, ARMv8-A AArch64, RV64, PPC64


END OF DOCUMENT.