# Zircon program loading and dynamic linking

In Zircon, the kernel is not directly involved in normal program loading.
(The one necessary exception is bootstrapping the userspace environment at
system startup; see [`userboot`](userboot.md).) Instead, the kernel merely
provides the building blocks
([VMO](objects/vm_object.md), [process](objects/process.md),
[VMAR](objects/vm_address_region.md), [thread](objects/thread.md)) from
which userspace program loading is built.

[TOC]

## ELF and the system ABI

The standard Zircon userspace environment uses
the [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format)
format for machine-code executable files, and provides a dynamic linker and
C/C++ execution environment that are based on ELF. Zircon processes can
use [system calls](syscalls.md) only via the [vDSO](vdso.md), which is
provided by the kernel in ELF format and uses the C/C++ calling conventions
common to ELF-based systems for the machine. Userspace code (given the
appropriate capabilities) can use the [system call](syscalls.md) building
blocks directly to create processes and load programs into them without
using ELF. But Zircon's standard ABI for machine code uses ELF as
described here.

## Background: traditional ELF program loading

[ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) was
introduced with Unix System V Release 4 and became the common standard
executable file format for most Unix-like systems, today including Linux and
all the BSD variants as well as Solaris and many others. In these systems,
the kernel integrates program loading with filesystem access via the POSIX
`execve` API. There are some variations in how they load ELF programs, but
most follow a pattern close to this:

 1. The kernel loads the file by name, and checks whether it's ELF or some
    other kind of file that the system supports.
    This is where `#!` script
    handling is done, as well as non-ELF format support when present.
 2. The kernel maps the ELF image according to its `PT_LOAD` program
    headers. For an `ET_EXEC` file, this places the program's segments at
    fixed addresses in memory specified in `p_vaddr`. For an `ET_DYN`
    file, the system chooses the base address where the program's first
    `PT_LOAD` gets loaded, and following segments are placed according to
    their `p_vaddr` relative to the first segment's `p_vaddr`. Usually the
    base address is chosen randomly (ASLR).
 3. If there was a `PT_INTERP` program header, its contents (a range of
    bytes in the ELF file given by `p_offset` and `p_filesz`) are looked up
    as a file name to find another ELF file called the *ELF interpreter*.
    This must be an `ET_DYN` file; the kernel loads it in the same way as it
    loaded the executable, but always at a location of its own choosing.
    The interpreter program is usually the ELF dynamic linker with a name
    like `/lib/ld.so.1` or `/lib/ld-linux.so.2`, but the kernel loads
    whatever file is named.
 4. The kernel sets up the stack and registers for the initial thread, and
    starts the thread running with the PC at the chosen entry point address.

    * The entry point is the `e_entry` value from the ELF file header,
      adjusted by base address. When there was a `PT_INTERP`, the entry
      point is that of the interpreter rather than the main executable.
    * There is an assembly-level protocol of register and stack contents
      that the kernel sets up for the program to receive its argument and
      environment strings and an *auxiliary vector* of useful values. When
      there was a `PT_INTERP`, these include the base address, entry point,
      and program header table address from the main executable's ELF
      headers. This information allows the dynamic linker to find the main
      executable's ELF dynamic linking metadata in memory and do its work.
      When dynamic linking startup is complete, the dynamic linker jumps to
      the main executable's entry point address.

Zircon program loading is inspired by this tradition, but does it somewhat
differently. A key reason for the traditional pattern of loading the
executable before loading the dynamic linker is that the dynamic linker's
randomly-chosen base address must not intersect with the fixed addresses
used by an `ET_EXEC` executable file. Zircon does not support
fixed-address program loading (ELF `ET_EXEC` files) at all, only
position-independent executables or *PIE*s, which are ELF `ET_DYN` files.

## The **launchpad** library

The main implementation of program loading resides in
the [`launchpad` library](../system/ulib/launchpad/). It has a C API
in
[`<launchpad/launchpad.h>`](../system/ulib/launchpad/include/launchpad/launchpad.h) but
is not formally documented. The `launchpad` API is not described here. Its
treatment of executable files and process startup forms the Zircon system
ABI for program loading.
The [lowest userspace layers of the system](userboot.md) implement the same
protocols. It's anticipated that in the future most process launching in
the system will be done by a system service that uses `launchpad` in its
implementation, rather than by direct use of the library.

Filesystems are not part of the lower layers of Zircon API. Instead,
program loading is based on [VMOs](objects/vm_object.md) and on IPC
protocols used via [channels](objects/channel.md).
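
The `ET_DYN` placement rule from the background section (every segment keeps
its `p_vaddr` offset from the first segment, and `e_entry` is adjusted by the
same amount) can be sketched in C. This is an illustration only, not code
from `launchpad`: the `load_segment` struct and the function names are
hypothetical, and the sketch assumes the first segment's `p_vaddr` is already
page-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a PT_LOAD program header. */
struct load_segment {
    uint64_t p_vaddr; /* link-time virtual address */
    uint64_t p_memsz; /* size of the segment in memory */
};

/* The "load bias" is the difference between where the image was linked
 * and where the loader chose to put its first PT_LOAD segment. */
uint64_t load_bias(const struct load_segment* segs, size_t count,
                   uint64_t chosen_base) {
    assert(segs != NULL && count > 0);
    return chosen_base - segs[0].p_vaddr;
}

/* Run-time address of a segment: link-time address plus the bias. */
uint64_t segment_address(const struct load_segment* seg, uint64_t bias) {
    return seg->p_vaddr + bias;
}

/* Run-time entry point: e_entry adjusted by the same bias. */
uint64_t entry_address(uint64_t e_entry, uint64_t bias) {
    return e_entry + bias;
}
```

Because only the bias varies from one load to the next, the same three
calculations apply to the main executable, the ELF interpreter, and the vDSO
image.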

A program loading request starts with:

 * a handle to a VMO containing the executable file (`ZX_RIGHT_READ` and
   `ZX_RIGHT_EXECUTE` rights are required)
 * a list of argument strings (to become `argv[]` in a C/C++ program)
 * a list of environment strings (to become `environ[]` in a C/C++ program)
 * a list of initial [handles](handles.md), each with
   a [*handle info entry*](#handle-info-entry)

Three types of file are handled:

{#hashbang}
* a script file starting with `#!`

  The first line of the file starts with `#!` and must be no more than 127
  characters long. The first non-whitespace word following `#!` is the
  *script interpreter name*. If there's anything after that, it all
  together becomes the *script interpreter argument*.

  * The script interpreter name is prepended to the original argument
    list (to become `argv[0]`).
  * If there was a script interpreter argument, it's inserted between the
    interpreter name and the original argument list (to become `argv[1]`,
    with the original `argv[0]` becoming `argv[2]`).
  * The program loader looks up the script interpreter name via
    the [loader service](#the-loader-service) to get a new VMO.
  * Program loading restarts on that script interpreter VMO with the
    modified argument list but everything else the same. The VMO handle
    for the original executable is just closed; the script interpreter only
    gets the original `argv[0]` string to work with, not the original VMO.
    There is a maximum nesting limit (currently 5) constraining how many
    such restarts will be allowed before program loading just fails.

* an ELF `ET_DYN` file with no `PT_INTERP`

  * The system chooses a random base address for the first `PT_LOAD` segment
    and then maps in each `PT_LOAD` segment relative to that base address.
    This is done by creating a [VMAR](objects/vm_address_region.md) covering
    the whole range from the first page of the first segment to the last
    page of the last segment.
  * A VMO is created and mapped at another random address to hold the stack
    for the initial thread. If there was a `PT_GNU_STACK` program header
    with a nonzero `p_memsz`, that determines the size of the stack (rounded
    up to whole pages). Otherwise, a reasonable default stack size is used.
  * The [vDSO](vdso.md) is mapped into the process
    (another VMO containing an ELF image), also at a random base address.
  * A new thread is created in the process with [**thread_create**()](syscalls/thread_create.md).
  * A new [channel](objects/channel.md) is created, called the *bootstrap
    channel*. The program loader writes into this channel a message
    in [the `processargs` protocol](#the-processargs-protocol) format. This
    *bootstrap message* includes the argument and environment strings and
    the initial handles from the original request. That list is augmented
    with handles for:

    * the new [process](objects/process.md) itself
    * its root [VMAR](objects/vm_address_region.md)
    * its initial [thread](objects/thread.md)
    * the VMAR covering where the executable was loaded
    * the VMO just created for the stack
    * optionally, a default [job](objects/job.md) so the new
      process itself can create more processes
    * optionally, the vDSO VMO so the new process can let the processes
      it creates make system calls themselves

    The program loader then closes its end of the channel.
  * The initial thread is launched with
    the [**process_start**() system call](syscalls/process_start.md):

    * `entry` sets the new thread's PC to `e_entry` from the executable's
      ELF header, adjusted by base address.
    * `stack` sets the new thread's stack pointer to the top of the
      stack mapping.
    * `arg1` transfers the handle to the *bootstrap channel* into the
      first argument register in the C ABI.
    * `arg2` passes the base address of the vDSO into the second argument
      register in the C ABI.

    Thus, the program entry point can be written as a C function:
    ```c
    noreturn void _start(zx_handle_t bootstrap_channel, uintptr_t vdso_base);
    ```

{#PT_INTERP}
* an ELF `ET_DYN` file with a `PT_INTERP`

  In this case, the program loader does not directly use the VMO containing
  the ELF executable after reading its `PT_INTERP` header. Instead, it
  uses the `PT_INTERP` contents as the name of an *ELF interpreter*. This
  name is used in a request to the [loader service](#the-loader-service) to
  get a new VMO containing the ELF interpreter, which is another ELF
  `ET_DYN` file. Then that VMO is loaded instead of the main executable's
  VMO. Startup is as described above, with these differences:

  * An extra message
    in [the `processargs` protocol](#the-processargs-protocol) is written
    to the *bootstrap channel*, preceding the main bootstrap message. The
    ELF interpreter is expected to consume this *loader bootstrap message*
    itself so that it can do its work, but then leave the second bootstrap
    message in the channel and hand off the bootstrap channel handle to
    the main program's entry point. The *loader bootstrap message*
    includes only the necessary handles added by the program loader, not
    the full set that go into the main *bootstrap message*, plus these:

    * the original VMO handle for the main ELF executable
    * a channel handle to the [loader service](#the-loader-service)

    These allow the ELF interpreter to do its own loading of the
    executable from the VMO and to use the loader service to get
    additional VMOs for shared libraries to load.
    The message also
    includes the argument and environment strings, which lets the ELF
    interpreter use `argv[0]` in its log messages, and check for
    environment variables like `LD_DEBUG`.

  * `PT_GNU_STACK` program headers are ignored. Instead, the program
    loader chooses a minimal stack size that is just large enough to
    contain the *loader bootstrap message* plus some breathing room for
    the ELF interpreter's startup code to use as call frames. This
    "breathing room" size is `PTHREAD_STACK_MIN` in the source, and is
    tuned such that with a small bootstrap message size the whole stack is
    only a single page, but a careful dynamic linker implementation has
    enough space to work in. The dynamic linker is expected to read the
    main executable's `PT_GNU_STACK` and switch to a stack of reasonable
    size for normal use before it jumps to the main executable's entry
    point.

*** aside

The program loader chooses three randomly-placed chunks of the new
process's address space before the program (or dynamic linker) gets
control: the vDSO, the stack, and the dynamic linker itself. To make it
possible for the program's own startup to control its address space more
fully, the program loader currently ensures that these random placements
are always somewhere in the **upper half of the address space**. This is
for the convenience of sanitizer runtimes, which need to reserve some lower
fraction of the address space. This behavior will change in the future so
there is some way to support the sanitizer cases but other processes will
get fully random placement to maximize the benefits of ASLR.

***

## The **processargs** protocol

[`<zircon/processargs.h>`](../system/public/zircon/processargs.h) defines
the protocol for the *bootstrap message* sent on the *bootstrap channel* by
the program loader.
When a process starts up, it has a handle to this
bootstrap channel and it has access to [system calls](syscalls.md) via
the [vDSO](vdso.md). The process has only this one handle and so it can
see only global system information and its own memory until it gets more
information and handles via the bootstrap channel.

The `processargs` protocol is a one-way protocol for messages sent on the
bootstrap channel. The new process is never expected to write back onto
the channel. The program loader usually sends its messages and then closes
its end of the channel before the new process has even started. These
messages must communicate everything a new process will ever need, but the
code that receives and decodes messages in this format must run in a very
constrained environment. Heap allocation is impossible and nontrivial
library facilities may not be available.

See the [header file](../system/public/zircon/processargs.h) for full
details of the message format. It's anticipated that this ad hoc protocol
will be replaced with a formal IDL-based protocol eventually, but the
format will be kept simple enough to be decoded by simple hand-written
code.
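
To give a feel for what "simple hand-written code" without a heap looks
like, here is a minimal sketch of decoding a string table in place. It does
not reproduce the real message layout from the header file. It assumes, for
illustration only, that the strings are packed back to back and each is
NUL-terminated, and `collect_strings` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Collects pointers to NUL-terminated strings packed back to back in
 * buf[0..len). The pointers refer directly into the message buffer, so
 * nothing is allocated or copied. Returns the number of strings found,
 * or -1 if there are more than max strings or the last string is not
 * terminated inside the buffer. */
int collect_strings(const char* buf, size_t len,
                    const char** out, size_t max) {
    assert(buf != NULL && out != NULL);
    size_t count = 0;
    size_t start = 0; /* offset where the current string begins */
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != '\0')
            continue;
        if (count == max)
            return -1; /* caller's fixed-size array is full */
        out[count++] = &buf[start];
        start = i + 1;
    }
    if (start != len)
        return -1; /* trailing bytes with no terminator */
    return (int)count;
}
```

The caller supplies a fixed-size array (typically on the stack), which is
the kind of trade-off startup code has to make when it cannot allocate.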

A bootstrap message conveys:

 * a list of initial [handles](handles.md)
 * a 32-bit *handle info entry* corresponding to each handle
 * a list of name strings that a *handle info entry* can refer to
 * a list of argument strings (to become `argv[]` in a C/C++ program)
 * a list of environment strings (to become `environ[]` in a C/C++ program)

{#handle-info-entry}
The handles serve many purposes, indicated by the *handle info entry* type:

 * essential handles for the process to make [system calls](syscalls.md):
   [process](objects/process.md), [VMAR](objects/vm_address_region.md),
   [thread](objects/thread.md), [job](objects/job.md)
 * [channel](objects/channel.md) to the [loader service](#the-loader-service)
 * [vDSO](vdso.md) [VMO](objects/vm_object.md)
 * filesystem-related handles: current directory, file descriptors, name
   space bindings (these encode an index into the list of name strings)
 * special handles for system processes:
   [resource](objects/resource.md), [VMO](objects/vm_object.md)
 * other types used for higher-layer or private protocol purposes

Most of these are just passed through by the program loader,
which does not need to know what they're for.

## The **loader service**

In dynamic linking systems, an executable file refers to and uses at
runtime additional files containing shared libraries and plugins. The
dynamic linker is loaded as an [*ELF interpreter*](#PT_INTERP) and is
responsible for getting access to all these additional files to complete
dynamic linking before the main program's entry point gets control.

All of Zircon's standard userspace uses dynamic linking, down to the very
first process loaded by [`userboot`](userboot.md). Device drivers and
filesystems are implemented by userspace programs loaded this way.
So
program loading cannot be defined in terms of higher-layer abstractions
such as a filesystem paradigm, as
[traditional systems have done](#background_traditional-elf-program-loading).
Instead, program loading is based only on [VMOs](objects/vm_object.md) and
a simple [channel](objects/channel.md)-based protocol.

This *loader service* protocol is how a dynamic linker acquires VMOs
representing the additional files it needs to load as shared libraries.

This is a simple RPC protocol, defined in
[`<zircon/processargs.h>`](../system/public/zircon/processargs.h).
As with [the `processargs` protocol](#the-processargs-protocol),
it's anticipated that this ad hoc protocol will be replaced with a formal
IDL-based protocol eventually, but the format will be kept simple enough to
be decoded by simple hand-written code. The code sending loader service
requests and receiving their replies during dynamic linker startup may
not have access to nontrivial library facilities.

An ELF interpreter receives a channel handle for its loader service in its
`processargs` bootstrap message, identified by the *handle info entry*
`PA_HND(PA_LDSVC_LOADER, 0)`. All requests are synchronous RPCs made
with [**channel_call**()](syscalls/channel_call.md). Both requests and
replies start with the `zx_loader_svc_msg_t` header; some contain
additional data; some contain a VMO handle. Request opcodes are:

 * `LOADER_SVC_OP_LOAD_SCRIPT_INTERP`: *string* -> *VMO handle*

   The program loader sends the *script interpreter name* from
   a [`#!` script](#hashbang) and gets back a VMO to execute in place of
   the script.

 * `LOADER_SVC_OP_LOAD_OBJECT`: *string* -> *VMO handle*

   The dynamic linker sends the name of an *object* (shared library or
   plugin) and gets back a VMO handle containing the file.

 * `LOADER_SVC_OP_CONFIG`: *string* -> `reply ignored`

   The dynamic linker sends a string identifying its *load configuration*.
   This is intended to affect how later `LOADER_SVC_OP_LOAD_OBJECT`
   requests decide what particular implementation file to supply for a
   given name.

 * `LOADER_SVC_OP_DEBUG_PRINT`: *string* -> `reply ignored`

   This is a simple ad hoc logging facility intended for debugging the
   dynamic linker and early program startup issues. It's convenient
   because the early startup code is using the loader service but doesn't
   have access to many other handles or complex facilities yet. This will
   be replaced in the future with some simple-to-use logging facility that
   does not go through the loader service.

 * `LOADER_SVC_OP_LOAD_DEBUG_CONFIG`: *string* -> *VMO handle*

   **This is intended to be a developer-oriented feature and might not
   ordinarily be available in production runs.**

   The program runtime sends a string naming a *debug configuration* of
   some kind and gets back a VMO to read configuration data from. The
   sanitizer runtimes use this to allow large options text to be stored in
   a file rather than passed directly in environment strings.

 * `LOADER_SVC_OP_PUBLISH_DATA_SINK`: *string*, *VMO handle* -> `reply ignored`

   **This is intended to be a developer-oriented feature and might not
   ordinarily be available in production runs.**

   The program runtime sends a string naming a *data sink* and transfers
   the sole handle to a VMO it wants published there. The *data sink*
   string identifies a type of data, and the VMO's object name can
   specifically identify the data set in this VMO. The client must
   transfer the only handle to the VMO (which prevents the VMO being
   resized without the receiver's knowledge), but it might still have the
   VMO mapped in and continue to write data to it.
   Code instrumentation
   runtimes use this to deliver large binary trace results.

## Zircon's standard ELF dynamic linker

The ELF conventions described above and
the [`processargs`](#the-processargs-protocol)
and [loader service](#the-loader-service) protocols are the permanent
system ABI for program loading. Programs can use any implementation of a
machine code executable that meets the basic ELF format conventions. The
implementation can use the [vDSO](vdso.md) [system call](syscalls.md)
ABI, the `processargs` data, and the loader service facilities as it sees
fit. The exact details of what handles and data they will receive via
these protocols depend on the higher-layer program environment. Zircon's
system processes use an ELF interpreter that implements basic ELF dynamic
linking, and a simple implementation of the loader service.

Zircon's standard C library and dynamic linker have
a [unified implementation](../third_party/ulib/musl/) originally derived
from [`musl`](http://www.musl-libc.org/). It's identified by the
`PT_INTERP` string `ld.so.1`. It uses the `DT_NEEDED` strings naming
shared libraries as [loader service](#the-loader-service) *object* names.

The simple loader service maps requests into filesystem access:
 * *script interpreter* and *debug configuration* names must start with `/`
   and are used as absolute file names.
 * *data sink* names become subdirectories in `/tmp`, and each VMO
   published becomes a file in that subdirectory with the VMO's object name.
 * *object* names are searched for as files in system `lib/` directories.
 * *load configuration* strings are taken as a subdirectory name,
   optionally preceded by a `!` character. Subdirectories by that name in
   system `lib/` directories are searched before `lib/` itself.
   If there was a `!` prefix, *only* those subdirectories are searched.
   For example, sanitizer runtimes use `asan` because that instrumentation
   is compatible with uninstrumented library code, but `!dfsan` because
   that instrumentation requires that all code in the process be
   instrumented.

A version of the standard runtime instrumented with
LLVM [AddressSanitizer](https://clang.llvm.org/docs/AddressSanitizer.html)
is identified by the `PT_INTERP` string `asan/ld.so.1`. This version sends
the *load configuration* string `asan` before loading shared libraries.
When [SanitizerCoverage](https://clang.llvm.org/docs/SanitizerCoverage.html)
is enabled, it publishes a VMO to the *data sink* name `sancov` and uses a
VMO name including the process KOID.
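
The *load configuration* rule described above (a subdirectory name with an
optional `!` prefix that disables the fallback to `lib/` itself) can be
sketched as follows. This is not the loader service's actual code: the
`search_plan` struct, the `parse_load_config` function, and the 64-byte
buffer size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type: where to look for an *object* name. */
struct search_plan {
    char subdir[64];   /* searched first, e.g. "lib/asan" */
    bool fallback_lib; /* whether plain "lib/" is searched afterwards */
};

/* Parses a load configuration string such as "asan" or "!dfsan".
 * Returns false if the name is empty or too long for the buffer. */
bool parse_load_config(const char* config, struct search_plan* plan) {
    assert(config != NULL && plan != NULL);
    bool exclusive = (config[0] == '!');
    const char* name = exclusive ? config + 1 : config;
    size_t len = strlen(name);
    /* sizeof("lib/") counts the prefix plus the terminating NUL. */
    if (len == 0 || len + sizeof("lib/") > sizeof(plan->subdir))
        return false;
    snprintf(plan->subdir, sizeof(plan->subdir), "lib/%s", name);
    plan->fallback_lib = !exclusive;
    return true;
}
```

With this sketch, `asan` yields `lib/asan` followed by a fallback to `lib/`,
while `!dfsan` yields `lib/dfsan` with no fallback, matching the two
sanitizer cases described above.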