# Zircon Kernel Concepts

## Introduction

The kernel manages a number of different types of Objects. Those which are
accessible directly via system calls are C++ classes which implement the
Dispatcher interface. These are implemented in
[kernel/object](../kernel/object). Many are self-contained higher-level Objects.
Some wrap lower-level [lk](../../docs/glossary.md#lk) primitives.

## [System Calls](syscalls.md)

Userspace code interacts with kernel objects via system calls, and almost
exclusively via [Handles](handles.md). In userspace, a Handle is represented as
a 32-bit integer (type zx_handle_t). When syscalls are executed, the kernel checks
that Handle parameters refer to an actual handle that exists within the calling
process's handle table. The kernel further checks that the Handle is of the
correct type (passing a Thread Handle to a syscall requiring an event handle
will result in an error), and that the Handle has the required Rights for the
requested operation.

System calls fall into three broad categories, from an access standpoint:

1. Calls which have no limitations, of which there are only a very few. For
example, [*zx_clock_get()*](syscalls/clock_get.md)
and [*zx_nanosleep()*](syscalls/nanosleep.md) may be called by any thread.
2. Calls which take a Handle as the first parameter, denoting the Object they act upon,
which are the vast majority, for example [*zx_channel_write()*](syscalls/channel_write.md)
and [*zx_port_queue()*](syscalls/port_queue.md).
3. Calls which create new Objects but do not take a Handle, such as
[*zx_event_create()*](syscalls/event_create.md) and
[*zx_channel_create()*](syscalls/channel_create.md). Access to these (and limitations
upon them) is controlled by the Job in which the calling Process is contained.
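
The three checks applied to a Handle parameter, as described at the start of this section (the handle exists, it has the correct type, and it carries the required Rights), can be sketched as a toy model. This is only an illustration, not the kernel's implementation: every name below (`toy_check_handle`, the `TOY_*` constants, the fixed-size table) is hypothetical, and using the handle value as an array index is a simplification made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name below is hypothetical, not part of Zircon. */
typedef enum { TOY_OBJ_NONE = 0, TOY_OBJ_THREAD, TOY_OBJ_EVENT } toy_obj_type_t;

typedef enum {
    TOY_OK = 0,
    TOY_ERR_BAD_HANDLE,   /* handle is not present in the caller's table */
    TOY_ERR_WRONG_TYPE,   /* handle refers to a different kind of object */
    TOY_ERR_ACCESS_DENIED /* handle lacks a right the operation requires */
} toy_status_t;

#define TOY_RIGHT_READ   (1u << 0)
#define TOY_RIGHT_WRITE  (1u << 1)
#define TOY_RIGHT_SIGNAL (1u << 2)

typedef struct {
    toy_obj_type_t type; /* TOY_OBJ_NONE marks a free slot */
    uint32_t rights;     /* bitmask of TOY_RIGHT_* values */
} toy_handle_entry_t;

#define TOY_TABLE_SIZE 8
typedef struct {
    toy_handle_entry_t slots[TOY_TABLE_SIZE];
} toy_handle_table_t;

/* Apply the three checks in order: existence, type, rights. */
static toy_status_t toy_check_handle(const toy_handle_table_t *table,
                                     uint32_t handle,
                                     toy_obj_type_t want_type,
                                     uint32_t want_rights) {
    if (handle >= TOY_TABLE_SIZE || table->slots[handle].type == TOY_OBJ_NONE)
        return TOY_ERR_BAD_HANDLE;
    const toy_handle_entry_t *entry = &table->slots[handle];
    if (entry->type != want_type)
        return TOY_ERR_WRONG_TYPE;
    if ((entry->rights & want_rights) != want_rights)
        return TOY_ERR_ACCESS_DENIED;
    return TOY_OK;
}
```

In this model, a hypothetical "signal an event" operation would call `toy_check_handle(&table, handle, TOY_OBJ_EVENT, TOY_RIGHT_SIGNAL)` and proceed only on `TOY_OK`. Passing a slot that holds a thread yields `TOY_ERR_WRONG_TYPE`, mirroring the Thread-versus-event case mentioned in the text.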

System calls are provided by libzircon.so, which is a "virtual" shared
library that the Zircon kernel provides to userspace, better known as the
[*virtual Dynamic Shared Object* or vDSO](vdso.md).
They are C ELF ABI functions of the form *zx_noun_verb()* or
*zx_noun_verb_direct-object()*.

The system calls are defined by [syscalls.abigen](../system/public/zircon/syscalls.abigen)
and processed by the [abigen](../system/host/abigen/) tool into include files and glue
code in libzircon and the kernel's libsyscalls.


## [Handles](handles.md) and [Rights](rights.md)

Objects may have multiple Handles (in one or more Processes) that refer to them.

For almost all Objects, when the last open Handle that refers to an Object is closed,
the Object is either destroyed, or put into a final state that may not be undone.

Handles may be moved from one Process to another by writing them into a Channel
(using [*zx_channel_write()*](syscalls/channel_write.md)), or by using
[*zx_process_start()*](syscalls/process_start.md) to pass a Handle as the argument
of the first thread in a new Process.

The actions which may be taken on a Handle or the Object it refers to are governed
by the Rights associated with that Handle. Two Handles that refer to the same Object
may have different Rights.

The [*zx_handle_duplicate()*](syscalls/handle_duplicate.md) and
[*zx_handle_replace()*](syscalls/handle_replace.md) system calls may be used to
obtain a new Handle referring to the same Object as the Handle passed in,
optionally with reduced Rights; *zx_handle_replace()* invalidates the original
Handle in the process. The [*zx_handle_close()*](syscalls/handle_close.md)
system call closes a Handle, releasing the Object it refers to if that Handle is
the last one for that Object. The [*zx_handle_close_many()*](syscalls/handle_close_many.md)
system call similarly closes an array of handles.


## Kernel Object IDs

Every object in the kernel has a "kernel object id" or "koid" for short.
It is a 64-bit unsigned integer that can be used to identify the object
and is unique for the lifetime of the running system.
This means in particular that koids are never reused.

There are two special koid values:

*ZX_KOID_INVALID* Has the value zero and is used as a "null" sentinel.

*ZX_KOID_KERNEL* There is only one kernel, and it has its own koid.

Kernel-generated koids only use 63 bits (which is plenty).
This leaves space for artificially allocated koids by having the most
significant bit set.
Artificial koids exist to support things like identifying artificial objects,
like virtual threads in tracing, for consumption by tools.
How artificial koids are allocated is left to each program;
this document does not impose any rules or conventions.


## Running Code: Jobs, Processes, and Threads

Threads represent threads of execution (CPU registers, stack, etc) within an
address space which is owned by the Process in which they exist. Processes are
owned by Jobs, which define various resource limitations. Jobs are owned by
parent Jobs, all the way up to the Root Job which was created by the kernel at
boot and passed to [`userboot`, the first userspace Process to begin execution](userboot.md).

Without a Job Handle, it is not possible for a Thread within a Process to create another
Process or another Job.

[Program loading](program_loading.md) is provided by userspace facilities and
protocols above the kernel layer.

See: [process_create](syscalls/process_create.md),
[process_start](syscalls/process_start.md),
[thread_create](syscalls/thread_create.md),
and [thread_start](syscalls/thread_start.md).


## Message Passing: Sockets and Channels

Both Sockets and Channels are IPC Objects which are bi-directional and two-ended.
Creating a Socket or a Channel will return two Handles, one referring to each endpoint
of the Object.

Sockets are stream-oriented and data may be written into or read out of them in units
of one or more bytes. Short writes (if the Socket's buffers are full) and short reads
(if more data is requested than in the buffers) are possible.

Channels are datagram-oriented and have a maximum message size given by **ZX_CHANNEL_MAX_MSG_BYTES**,
and may also have up to **ZX_CHANNEL_MAX_MSG_HANDLES** Handles attached to a message.
They do not support short reads or writes -- either a message fits or it does not.

When Handles are written into a Channel, they are removed from the sending Process.
When a message with Handles is read from a Channel, the Handles are added to the receiving
Process. Between these two events, the Handles continue to exist (ensuring the Objects
they refer to continue to exist), unless the end of the Channel which they have been written
towards is closed -- at which point messages in flight to that endpoint are discarded and
any Handles they contained are closed.

See: [channel_create](syscalls/channel_create.md),
[channel_read](syscalls/channel_read.md),
[channel_write](syscalls/channel_write.md),
[channel_call](syscalls/channel_call.md),
[socket_create](syscalls/socket_create.md),
[socket_read](syscalls/socket_read.md),
and [socket_write](syscalls/socket_write.md).

## Objects and Signals

Objects may have up to 32 signals (represented by the zx_signals_t type and the `ZX_*_SIGNAL_*`
defines), each of which represents a piece of information about their current state. Channels and Sockets,
for example, may be READABLE or WRITABLE. Processes or Threads may be TERMINATED. And so on.

Threads may wait for signals to become active on one or more Objects.

See [signals](signals.md) for more information.
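
Because an Object has at most 32 signals, its signal state fits in a single 32-bit mask with one bit per signal. The sketch below models that idea in isolation. It is a toy illustration under stated assumptions: the `toy_*` names and the specific bit positions are hypothetical and are not the real `ZX_*_SIGNAL_*` values.

```c
#include <stdint.h>

/* Toy model only: names and bit positions are hypothetical; only the
 * 32-bit width comes from the text above. */
typedef uint32_t toy_signals_t;

#define TOY_SIGNAL_READABLE   ((toy_signals_t)1u << 0)
#define TOY_SIGNAL_WRITABLE   ((toy_signals_t)1u << 1)
#define TOY_SIGNAL_TERMINATED ((toy_signals_t)1u << 2)

/* Return the new signal state after clearing the bits in clear_mask and
 * then setting the bits in set_mask. */
static toy_signals_t toy_signals_update(toy_signals_t current,
                                        toy_signals_t clear_mask,
                                        toy_signals_t set_mask) {
    return (current & ~clear_mask) | set_mask;
}

/* Return the subset of waited-for signals that are currently active.
 * A nonzero result means a waiter on those signals would be satisfied. */
static toy_signals_t toy_signals_pending(toy_signals_t active,
                                         toy_signals_t waited_for) {
    return active & waited_for;
}
```

In this model, a thread waiting for `TOY_SIGNAL_READABLE | TOY_SIGNAL_TERMINATED` is satisfied as soon as either bit becomes active, and `toy_signals_pending()` reports which one it was.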

## Waiting: Wait One, Wait Many, and Ports

A Thread may use [*zx_object_wait_one()*](syscalls/object_wait_one.md)
to wait for a signal to be active on a single handle or
[*zx_object_wait_many()*](syscalls/object_wait_many.md) to wait for
signals on multiple handles. Both calls allow for a timeout after
which they'll return even if no signals are pending.

If a Thread is going to wait on a large set of handles, it is more efficient to use
a Port, which is an Object that other Objects may be bound to such that when signals
are asserted on them, the Port receives a packet containing information about the
pending Signals.

See: [port_create](syscalls/port_create.md),
[port_queue](syscalls/port_queue.md),
[port_wait](syscalls/port_wait.md),
and [port_cancel](syscalls/port_cancel.md).


## Events, Event Pairs

An Event is the simplest Object, having no other state than its collection of active Signals.

An Event Pair is one of a pair of Events that may signal each other. A useful property of
Event Pairs is that when one side of a pair goes away (all Handles to it have been
closed), the PEER_CLOSED signal is asserted on the other side.

See: [event_create](syscalls/event_create.md),
and [eventpair_create](syscalls/eventpair_create.md).


## Shared Memory: Virtual Memory Objects (VMOs)

Virtual Memory Objects represent a set of physical pages of memory, or the *potential*
for pages (which will be created/filled lazily, on-demand).

They may be mapped into the address space of a Process with
[*zx_vmar_map()*](syscalls/vmar_map.md) and unmapped with
[*zx_vmar_unmap()*](syscalls/vmar_unmap.md). Permissions of
mapped pages may be adjusted with [*zx_vmar_protect()*](syscalls/vmar_protect.md).

VMOs may also be read from and written to directly with
[*zx_vmo_read()*](syscalls/vmo_read.md) and [*zx_vmo_write()*](syscalls/vmo_write.md).
Thus the cost of mapping them into an address space may be avoided for one-shot operations
like "create a VMO, write a dataset into it, and hand it to another Process to use."

## Address Space Management

Virtual Memory Address Regions (VMARs) provide an abstraction for managing a
process's address space. At process creation time, a handle to the root VMAR
is given to the process creator. That handle refers to a VMAR that spans the
entire address space. This space can be carved up via the
[*zx_vmar_map()*](syscalls/vmar_map.md) and
[*zx_vmar_allocate()*](syscalls/vmar_allocate.md) interfaces.
[*zx_vmar_allocate()*](syscalls/vmar_allocate.md) can be used to generate new
VMARs (called subregions or children) which can be used to group together
parts of the address space.

See: [vmar_map](syscalls/vmar_map.md),
[vmar_allocate](syscalls/vmar_allocate.md),
[vmar_protect](syscalls/vmar_protect.md),
[vmar_unmap](syscalls/vmar_unmap.md),
and [vmar_destroy](syscalls/vmar_destroy.md).

## Futexes

Futexes are kernel primitives used with userspace atomic operations to implement
efficient synchronization primitives -- for example, Mutexes which only need to make
a syscall in the contended case. Usually they are only of interest to implementers of
standard libraries. Zircon's libc and libc++ provide C11, C++, and pthread APIs for
mutexes, condition variables, etc, implemented in terms of Futexes.

See: [futex_wait](syscalls/futex_wait.md),
[futex_wake](syscalls/futex_wake.md),
and [futex_requeue](syscalls/futex_requeue.md).
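
The claim that a futex-based Mutex only makes a syscall in the contended case can be illustrated with a small sketch. It follows the well-known three-state mutex design (unlocked, locked, locked with possible waiters) and is not taken from Zircon's libc. The names `toy_futex_wait` and `toy_futex_wake` are hypothetical stubs that stand in for the futex syscalls and only count how often they are called.

```c
#include <stdatomic.h>

/* Toy model only: these stubs stand in for futex wait/wake syscalls and
 * merely count calls. All names here are hypothetical. */
static atomic_int toy_wait_calls;
static atomic_int toy_wake_calls;

static void toy_futex_wait(atomic_int *addr, int expected) {
    (void)addr;
    (void)expected;
    /* A real wait would block while *addr == expected; the stub returns. */
    atomic_fetch_add(&toy_wait_calls, 1);
}

static void toy_futex_wake(atomic_int *addr, int count) {
    (void)addr;
    (void)count;
    atomic_fetch_add(&toy_wake_calls, 1);
}

/* state: 0 = unlocked, 1 = locked, 2 = locked with possible waiters */
typedef struct {
    atomic_int state;
} toy_mutex_t;

static void toy_mutex_lock(toy_mutex_t *m) {
    int seen = 0;
    /* Fast path: 0 -> 1 with a single atomic operation, no syscall. */
    if (atomic_compare_exchange_strong(&m->state, &seen, 1))
        return;
    /* Slow path: mark the lock as contended, then wait until it is free. */
    if (seen != 2)
        seen = atomic_exchange(&m->state, 2);
    while (seen != 0) {
        toy_futex_wait(&m->state, 2);
        seen = atomic_exchange(&m->state, 2);
    }
}

static void toy_mutex_unlock(toy_mutex_t *m) {
    /* Only wake a waiter (a syscall) if contention was recorded. */
    if (atomic_exchange(&m->state, 0) == 2)
        toy_futex_wake(&m->state, 1);
}
```

Locking and unlocking an uncontended `toy_mutex_t` leaves both counters at zero, while unlocking a mutex whose state was set to 2 by a waiter performs exactly one wake. Because the wait stub does not block, a genuinely contended `toy_mutex_lock()` would spin in this model, so it is suitable only for illustrating the fast path.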