# Zircon Kernel Concepts

## Introduction

The kernel manages a number of different types of Objects. Those which are
accessible directly via system calls are C++ classes which implement the
Dispatcher interface. These are implemented in
[kernel/object](../kernel/object). Many are self-contained higher-level Objects.
Some wrap lower-level [lk](../../docs/glossary.md#lk) primitives.

## [System Calls](syscalls.md)

Userspace code interacts with kernel objects via system calls, and almost
exclusively via [Handles](handles.md).  In userspace, a Handle is represented as
a 32-bit integer (type `zx_handle_t`).  When syscalls are executed, the kernel checks
that Handle parameters refer to an actual Handle that exists within the calling
process's handle table.  The kernel further checks that the Handle is of the
correct type (passing a Thread Handle to a syscall requiring an Event Handle
will result in an error), and that the Handle has the required Rights for the
requested operation.

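The three checks described above (existence, type, Rights) can be pictured with a
small, self-contained model in plain C.  This is only a sketch: it does not call the
Zircon API, and the table layout, the `obj_type_t` and `check_result_t` enums, and the
`RIGHT_*` bits are hypothetical names chosen for illustration.  In particular, real
Handle values are not simple table indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; these are not Zircon definitions. */
typedef enum { OBJ_NONE = 0, OBJ_THREAD, OBJ_EVENT } obj_type_t;
typedef enum {
    CHECK_OK = 0,
    CHECK_BAD_HANDLE,    /* no such entry in the caller's table */
    CHECK_WRONG_TYPE,    /* entry exists but refers to another kind of Object */
    CHECK_ACCESS_DENIED  /* entry lacks a Right the operation needs */
} check_result_t;

#define RIGHT_READ  (1u << 0)
#define RIGHT_WRITE (1u << 1)
#define TABLE_SIZE  8u

typedef struct {
    obj_type_t type;  /* OBJ_NONE marks an unused slot */
    uint32_t rights;  /* bitmask of RIGHT_* values */
} handle_entry_t;

/* Applies the checks in the order the text lists them. */
static check_result_t check_handle(const handle_entry_t table[TABLE_SIZE],
                                   uint32_t handle,
                                   obj_type_t wanted_type,
                                   uint32_t needed_rights) {
    assert(table != NULL);
    if (handle >= TABLE_SIZE || table[handle].type == OBJ_NONE)
        return CHECK_BAD_HANDLE;
    if (table[handle].type != wanted_type)
        return CHECK_WRONG_TYPE;
    if ((table[handle].rights & needed_rights) != needed_rights)
        return CHECK_ACCESS_DENIED;
    return CHECK_OK;
}
```
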
System calls fall into three broad categories, from an access standpoint:

1. Calls which have no limitations, of which there are only a very few.  For
example, [*zx_clock_get()*](syscalls/clock_get.md)
and [*zx_nanosleep()*](syscalls/nanosleep.md) may be called by any thread.
2. Calls which take a Handle as the first parameter, denoting the Object they act upon.
These are the vast majority, for example [*zx_channel_write()*](syscalls/channel_write.md)
and [*zx_port_queue()*](syscalls/port_queue.md).
3. Calls which create new Objects but do not take a Handle, such as
[*zx_event_create()*](syscalls/event_create.md) and
[*zx_channel_create()*](syscalls/channel_create.md).  Access to these (and limitations
upon them) is controlled by the Job in which the calling Process is contained.

System calls are provided by libzircon.so, which is a "virtual" shared
library that the Zircon kernel provides to userspace, better known as the
[*virtual Dynamic Shared Object* or vDSO](vdso.md).
They are C ELF ABI functions of the form *zx_noun_verb()* or
*zx_noun_verb_direct-object()*.

The system calls are defined by [syscalls.abigen](../system/public/zircon/syscalls.abigen)
and processed by the [abigen](../system/host/abigen/) tool into include files and glue
code in libzircon and the kernel's libsyscalls.


## [Handles](handles.md) and [Rights](rights.md)

Objects may have multiple Handles (in one or more Processes) that refer to them.

For almost all Objects, when the last open Handle that refers to an Object is closed,
the Object is either destroyed, or put into a final state that may not be undone.

Handles may be moved from one Process to another by writing them into a Channel
(using [*zx_channel_write()*](syscalls/channel_write.md)), or by using
[*zx_process_start()*](syscalls/process_start.md) to pass a Handle as the argument
to the first thread in a new Process.

The actions which may be taken on a Handle or the Object it refers to are governed
by the Rights associated with that Handle.  Two Handles that refer to the same Object
may have different Rights.

The [*zx_handle_duplicate()*](syscalls/handle_duplicate.md) and
[*zx_handle_replace()*](syscalls/handle_replace.md) system calls may be used to
obtain additional Handles referring to the same Object as the Handle passed in,
optionally with reduced Rights.  The [*zx_handle_close()*](syscalls/handle_close.md)
system call closes a Handle, releasing the Object it refers to, if that Handle is
the last one for that Object. The [*zx_handle_close_many()*](syscalls/handle_close_many.md)
system call similarly closes an array of Handles.

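The phrase "optionally with reduced Rights" amounts to a subset check on a bitmask:
a new Handle may carry the same Rights as the original or fewer, never more.  The
sketch below models only that rule in plain C.  It does not call the Zircon API, and
the `RIGHT_*` bits and the `duplicate_rights()` helper are hypothetical names used
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical right bits; not the real Zircon values. */
#define RIGHT_READ  (1u << 0)
#define RIGHT_WRITE (1u << 1)

/*
 * Computes the Rights for a duplicated Handle.  Returns false, leaving *out
 * untouched, if the request asks for any Right the original does not hold.
 */
static bool duplicate_rights(uint32_t current, uint32_t requested, uint32_t *out) {
    assert(out != NULL);
    if ((requested & ~current) != 0)
        return false;  /* would add Rights, which is never allowed */
    *out = requested;
    return true;
}
```

With this model, a Handle holding `RIGHT_READ | RIGHT_WRITE` can yield a read-only
duplicate, while a read-only Handle cannot yield a writable one.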

## Kernel Object IDs

Every object in the kernel has a "kernel object id" or "koid" for short.
It is a 64-bit unsigned integer that can be used to identify the object
and is unique for the lifetime of the running system.
This means in particular that koids are never reused.

There are two special koid values:

*ZX_KOID_INVALID* Has the value zero and is used as a "null" sentinel.

*ZX_KOID_KERNEL* There is only one kernel, and it has its own koid.

Kernel-generated koids only use 63 bits (which is plenty).
This leaves space for artificially allocated koids by having the most
significant bit set.
Artificial koids exist to support things like identifying artificial objects,
like virtual threads in tracing, for consumption by tools.
How artificial koids are allocated is left to each program;
this document does not impose any rules or conventions.

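The split between kernel-generated and artificial koids is a test of the most
significant bit of the 64-bit value.  A minimal sketch in plain C follows; the macro
and function names are hypothetical and are not taken from the Zircon headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Most significant bit of a 64-bit koid (hypothetical name). */
#define KOID_ARTIFICIAL_BIT (UINT64_C(1) << 63)

/* True if the koid lies in the range reserved for artificial koids. */
static bool koid_is_artificial(uint64_t koid) {
    return (koid & KOID_ARTIFICIAL_BIT) != 0;
}

/* Turns a program-chosen 63-bit identifier into an artificial koid. */
static uint64_t make_artificial_koid(uint64_t local_id) {
    assert((local_id & KOID_ARTIFICIAL_BIT) == 0);  /* must fit in 63 bits */
    return local_id | KOID_ARTIFICIAL_BIT;
}
```

How `local_id` is chosen is up to the program, matching the text above: the only
fixed point in this sketch is the top bit.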

## Running Code: Jobs, Processes, and Threads

Threads represent threads of execution (CPU registers, stack, etc.) within an
address space which is owned by the Process in which they exist.  Processes are
owned by Jobs, which define various resource limitations.  Jobs are owned by
parent Jobs, all the way up to the Root Job, which is created by the kernel at
boot and passed to [`userboot`, the first userspace Process to begin execution](userboot.md).

Without a Job Handle, it is not possible for a Thread within a Process to create another
Process or another Job.

[Program loading](program_loading.md) is provided by userspace facilities and
protocols above the kernel layer.

See: [process_create](syscalls/process_create.md),
[process_start](syscalls/process_start.md),
[thread_create](syscalls/thread_create.md),
and [thread_start](syscalls/thread_start.md).


## Message Passing: Sockets and Channels

Both Sockets and Channels are IPC Objects which are bi-directional and two-ended.
Creating a Socket or a Channel will return two Handles, one referring to each endpoint
of the Object.

Sockets are stream-oriented and data may be written into or read out of them in units
of one or more bytes.  Short writes (if the Socket's buffers are full) and short reads
(if more data is requested than is in the buffers) are possible.

Channels are datagram-oriented and have a maximum message size given by **ZX_CHANNEL_MAX_MSG_BYTES**,
and may also have up to **ZX_CHANNEL_MAX_MSG_HANDLES** Handles attached to a message.
They do not support short reads or writes -- either a message fits or it does not.

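The difference between the two write behaviours can be sketched with a toy model in
plain C.  Nothing here calls the Zircon API: the capacities, the `toy_socket_t` type,
and both functions are hypothetical, and the return conventions are simplified for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits, far smaller than any real ones. */
#define TOY_SOCKET_CAPACITY       8u
#define TOY_CHANNEL_MAX_MSG_BYTES 8u

typedef struct {
    size_t used;  /* bytes currently buffered */
} toy_socket_t;

/* Stream semantics: accept as many bytes as fit and report the count,
 * which may be less than len (a short write). */
static size_t toy_socket_write(toy_socket_t *s, size_t len) {
    assert(s != NULL && s->used <= TOY_SOCKET_CAPACITY);
    size_t room = TOY_SOCKET_CAPACITY - s->used;
    size_t accepted = len < room ? len : room;
    s->used += accepted;
    return accepted;
}

/* Datagram semantics: all or nothing.  Returns 0 if the whole message
 * fits within the limit, -1 otherwise; nothing is partially written. */
static int toy_channel_write(size_t len) {
    return len <= TOY_CHANNEL_MAX_MSG_BYTES ? 0 : -1;
}
```
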
When Handles are written into a Channel, they are removed from the sending Process.
When a message with Handles is read from a Channel, the Handles are added to the receiving
Process.  Between these two events, the Handles continue to exist (ensuring the Objects
they refer to continue to exist), unless the end of the Channel which they have been written
towards is closed -- at which point messages in flight to that endpoint are discarded and
any Handles they contained are closed.

See: [channel_create](syscalls/channel_create.md),
[channel_read](syscalls/channel_read.md),
[channel_write](syscalls/channel_write.md),
[channel_call](syscalls/channel_call.md),
[socket_create](syscalls/socket_create.md),
[socket_read](syscalls/socket_read.md),
and [socket_write](syscalls/socket_write.md).

## Objects and Signals

Objects may have up to 32 signals (represented by the `zx_signals_t` type and the `ZX_*_SIGNAL_*`
defines), each of which represents a piece of information about their current state.  Channels and Sockets,
for example, may be READABLE or WRITABLE.  Processes or Threads may be TERMINATED.  And so on.

Threads may wait for signals to become active on one or more Objects.

See [signals](signals.md) for more information.

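Since there are at most 32 signals, a set of signals fits in a 32-bit bitmask with
one bit per signal.  The sketch below models that in plain C, assuming that a wait
is satisfied as soon as any one of the awaited signals is active.  The `toy_*` type,
constants, and functions are hypothetical and are not the Zircon definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_signals_t;  /* one bit per signal */
static_assert(sizeof(toy_signals_t) * CHAR_BIT == 32, "room for 32 signals");

/* Hypothetical signal bits. */
#define TOY_SIGNAL_READABLE   ((toy_signals_t)1u << 0)
#define TOY_SIGNAL_WRITABLE   ((toy_signals_t)1u << 1)
#define TOY_SIGNAL_TERMINATED ((toy_signals_t)1u << 2)

/* A wait is satisfied when any awaited signal is currently active. */
static bool toy_wait_satisfied(toy_signals_t active, toy_signals_t awaited) {
    return (active & awaited) != 0;
}

/* State change on an Object: clear some signals, then set others. */
static toy_signals_t toy_signal_update(toy_signals_t active,
                                       toy_signals_t clear_mask,
                                       toy_signals_t set_mask) {
    return (active & ~clear_mask) | set_mask;
}
```

For example, an endpoint that is writable but has nothing to read would not satisfy
a wait on `TOY_SIGNAL_READABLE` until an update sets that bit.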
## Waiting: Wait One, Wait Many, and Ports

A Thread may use [*zx_object_wait_one()*](syscalls/object_wait_one.md)
to wait for a signal to be active on a single handle or
[*zx_object_wait_many()*](syscalls/object_wait_many.md) to wait for
signals on multiple handles.  Both calls allow for a timeout after
which they'll return even if no signals are pending.

If a Thread is going to wait on a large set of handles, it is more efficient to use
a Port, which is an Object that other Objects may be bound to such that when signals
are asserted on them, the Port receives a packet containing information about the
pending Signals.

See: [port_create](syscalls/port_create.md),
[port_queue](syscalls/port_queue.md),
[port_wait](syscalls/port_wait.md),
and [port_cancel](syscalls/port_cancel.md).


## Events, Event Pairs

An Event is the simplest Object, having no other state than its collection of active Signals.

An Event Pair is one of a pair of Events that may signal each other.  A useful property of
Event Pairs is that when one side of a pair goes away (all Handles to it have been
closed), the PEER_CLOSED signal is asserted on the other side.

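The PEER_CLOSED behaviour can be modelled with two linked records and a Handle
count per side.  This is a toy sketch in plain C, not the kernel's implementation:
the `toy_eventpair_side_t` type and both functions are hypothetical names chosen
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct toy_eventpair_side {
    unsigned handle_count;            /* open Handles to this side */
    bool peer_closed;                 /* models the PEER_CLOSED signal */
    struct toy_eventpair_side *peer;  /* other side, NULL once it is gone */
} toy_eventpair_side_t;

/* Creates a linked pair with one Handle per side. */
static void toy_eventpair_create(toy_eventpair_side_t *a, toy_eventpair_side_t *b) {
    assert(a != NULL && b != NULL);
    a->handle_count = 1; a->peer_closed = false; a->peer = b;
    b->handle_count = 1; b->peer_closed = false; b->peer = a;
}

/* Closes one Handle.  When the last Handle to a side is closed, the
 * surviving side (if any) observes PEER_CLOSED. */
static void toy_handle_close(toy_eventpair_side_t *side) {
    assert(side != NULL && side->handle_count > 0);
    if (--side->handle_count == 0 && side->peer != NULL) {
        side->peer->peer_closed = true;
        side->peer->peer = NULL;
    }
}
```
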
See: [event_create](syscalls/event_create.md)
and [eventpair_create](syscalls/eventpair_create.md).


## Shared Memory: Virtual Memory Objects (VMOs)

Virtual Memory Objects represent a set of physical pages of memory, or the *potential*
for pages (which will be created/filled lazily, on-demand).

They may be mapped into the address space of a Process with
[*zx_vmar_map()*](syscalls/vmar_map.md) and unmapped with
[*zx_vmar_unmap()*](syscalls/vmar_unmap.md).  Permissions of
mapped pages may be adjusted with [*zx_vmar_protect()*](syscalls/vmar_protect.md).

VMOs may also be read from and written to directly with
[*zx_vmo_read()*](syscalls/vmo_read.md) and [*zx_vmo_write()*](syscalls/vmo_write.md).
Thus the cost of mapping them into an address space may be avoided for one-shot operations
like "create a VMO, write a dataset into it, and hand it to another Process to use."

## Address Space Management

Virtual Memory Address Regions (VMARs) provide an abstraction for managing a
process's address space.  At process creation time, a handle to the root VMAR
is given to the process creator.  That handle refers to a VMAR that spans the
entire address space.  This space can be carved up via the
[*zx_vmar_map()*](syscalls/vmar_map.md) and
[*zx_vmar_allocate()*](syscalls/vmar_allocate.md) interfaces.
[*zx_vmar_allocate()*](syscalls/vmar_allocate.md) can be used to generate new
VMARs (called subregions or children) which can be used to group together
parts of the address space.

See: [vmar_map](syscalls/vmar_map.md),
[vmar_allocate](syscalls/vmar_allocate.md),
[vmar_protect](syscalls/vmar_protect.md),
[vmar_unmap](syscalls/vmar_unmap.md),
and [vmar_destroy](syscalls/vmar_destroy.md).

## Futexes

Futexes are kernel primitives used with userspace atomic operations to implement
efficient synchronization primitives -- for example, Mutexes which only need to make
a syscall in the contended case.  Usually they are only of interest to implementers of
standard libraries.  Zircon's libc and libc++ provide C11, C++, and pthread APIs for
mutexes, condition variables, etc., implemented in terms of Futexes.

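To illustrate "a syscall only in the contended case", here is a sketch of a common
three-state mutex design built on C11 atomics.  It is not Zircon's libc
implementation.  The `toy_futex_wait()` and `toy_futex_wake()` functions are
hypothetical stand-ins for the kernel calls: they only count invocations and do not
block, so the slow path would spin here where a real wait would sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Counters showing how often the stand-in "syscalls" are reached. */
static int wait_calls, wake_calls;

/* Hypothetical stand-ins; a real wait would sleep while *addr == expected. */
static void toy_futex_wait(atomic_int *addr, int expected) {
    (void)addr; (void)expected;
    wait_calls++;
}
static void toy_futex_wake(atomic_int *addr, int count) {
    (void)addr; (void)count;
    wake_calls++;
}

/* state: 0 = unlocked, 1 = locked, 2 = locked and possibly contended */
typedef struct { atomic_int state; } toy_mutex_t;

static void toy_mutex_lock(toy_mutex_t *m) {
    assert(m != NULL);
    int expected = 0;
    /* Fast path: uncontended acquire uses one atomic operation, no syscall. */
    if (atomic_compare_exchange_strong(&m->state, &expected, 1))
        return;
    /* Slow path: mark the lock contended and wait until we acquire it. */
    while (atomic_exchange(&m->state, 2) != 0)
        toy_futex_wait(&m->state, 2);
}

static void toy_mutex_unlock(toy_mutex_t *m) {
    assert(m != NULL);
    /* Only wake someone if a waiter may exist (state was 2). */
    if (atomic_exchange(&m->state, 0) == 2)
        toy_futex_wake(&m->state, 1);
}
```
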
See: [futex_wait](syscalls/futex_wait.md),
[futex_wake](syscalls/futex_wake.md),
and [futex_requeue](syscalls/futex_requeue.md).
