1# Futex
2
3## NAME
4
5futex - A primitive for creating userspace synchronization tools.
6
7## SYNOPSIS
8
9A **futex** is a Fast Userspace muTEX. It is a low level
10synchronization primitive which is a building block for higher level
11APIs such as `pthread_mutex_t` and `pthread_cond_t`.
12
13Futexes are designed to not enter the kernel or allocate kernel
14resources in the uncontested case.
15
16## DESCRIPTION
17
18The zircon futex implementation currently supports three operations distributed
19over 6 syscalls:
20
21```C
22    zx_status_t zx_futex_wait(const zx_futex_t* value_ptr,
23                              int current_value,
24                              zx_handle_t new_futex_owner,
25                              zx_time_t deadline);
26    zx_status_t zx_futex_wake(const zx_futex_t* value_ptr, uint32_t wake_count);
27    zx_status_t zx_futex_wake_single_owner(const zx_futex_t* value_ptr);
28    zx_status_t zx_futex_requeue(const zx_futex_t* value_ptr,
29                                 uint32_t wake_count,
30                                 int current_value,
31                                 const zx_futex_t* requeue_ptr,
32                                 uint32_t requeue_count,
33                                 zx_handle_t new_requeue_owner);
34    zx_status_t zx_futex_requeue_single_owner(const zx_futex_t* value_ptr,
35                                              int current_value,
36                                              const zx_futex_t* requeue_ptr,
37                                              uint32_t requeue_count,
38                                              zx_handle_t new_requeue_owner);
39    zx_status_t zx_futex_get_owner(const zx_futex_t* value_ptr, uint64_t* koid);
40```
41
42All of these share a `value_ptr` parameter, which is the virtual
43address of an aligned userspace integer. This virtual address is the
44information used in kernel to track what futex given threads are
45waiting on. The kernel does not currently modify the value of
46`*value_ptr` (but see below for future operations which might do
47so). It is up to userspace code to correctly atomically modify this
48value across threads in order to build mutexes and so on.
49
50See the [futex_wait](../syscalls/futex_wait.md),
51[futex_wake](../syscalls/futex_wake.md),
52[futex_requeue](../syscalls/futex_requeue.md), and
53[futex_get_owner](../syscalls/futex_get_owner.md) man pages for more details.
54
55## RIGHTS
56
57Futex objects do not have any rights associated with them.
58
59There are only 2 primitive operations which userspace code can perform on a
60futex: waiting and waking (requeue is a combination of the two).  Because
61futexes are strictly a process local concept, revoking access to either of these
62operations would make the futex functionally worthless.
63
64Additionally, from the kernel's perspective, futexes are ephemeral objects whose
65state only exists while the futex has waiters.  Without a more durable state
66present in the kernel, it is more or less impossible to have a persisted concept
67of rights for a futex.
68
69### Differences from Linux futexes
70
71Note that all of the zircon futex operations key off of the virtual
72address of an userspace pointer. This differs from the Linux
73implementation, which distinguishes private futex operations (which
74correspond to our in-process-only ones) from ones shared across
75address spaces.
76
77As noted above, all of our futex operations leave the value of the
78futex unmodified from the kernel. Other potential operations, such as
79Linux's `FUTEX_WAKE_OP`, requires atomic manipulation of the value
80from the kernel, which our current implementation does not require.
81
82### Ownership and Priority Inheritance
83
84#### Overview
85
86Some runtimes may need to implement synchronization primitives based on futexes
87which exhibit priority inheritance behavior.  In order to support these users,
88zircon futexes have a concept of 'ownership' which can be used to implement such
89primitives.  Use of this feature is optional.
90
91At any point in time, a futex may be either unowned, or owned by a single
92thread.  When a thread owns one or more futexes, its effective priority becomes
93the maximum of its base priority, and the priorities of all of the current
94waiters of all of the futexes currently owned by it.  As soon a thread no longer
95owns a futex, the pressure of the priorities of the futex's waiters disappears
96from the relationship above.  Once the thread no longer owns any futexes, its
97priority will relax back to its base priority.
98
99Signaling of the owner of a futex is the responsibility of the userspace code,
100as is applying the ownership concept properly when constructing a specific type
101of synchronization object which needs priority inheritance behavior.
102
103Zircon futexes have at most a single owner.  Multiple ownership of futexes for
104the purpose of priority inheritance is not supported.  The owner of a futex may
105never simultaneously be a waiter for the same futex.
106
107#### Assigning Ownership
108
109Ownership of a futex is assigned via each 'wait' or 'requeue' operation.  In the
110case of a requeue operation, the target futex is the requeue futex, not the
111wake_futex.  Users pass a handle to a thread indicating who the current owner of
112the futex should be, or **ZX_HANDLE_INVALID** if there should be no owner.
113
114+ Passing a valid handle to a thread to indicate the futex owner is the
115  responsibility of the userspace code.  Passing an invalid handle, or a handle
116  to a non-thread object will result in the wait/requeue operation failing.
117+ If the wait/requeue operation succeeds, the owner of the target futex will
118  _always_ be set to either the thread specified, or nothing if
119  **ZX_HANDLE_INVALID** is passed.
120+ In particular, if the wait/requeue operation fails because of a mismatch
121  between the expected futex value and the actual futex value, the owner of the
122  futex will remain unchanged.
123
124#### Transferring Ownership
125
126Ownership of a futex may be transferred by the kernel on behalf of the user
127during a wake operation or a requeue operation.  In the case of a requeue
128operation, the target of the transfer is the wake_futex, not the requeue_futex.
129Ownership transfer only takes place when using the zx_futex_wake_single_owner or
130zx_futex_requeue_single_owner variants of the wake/requeue operations.  The
131single_owner variants of these operations will release exactly one waiter, and
132assign ownership of the futex to the released thread.
133
134+ If there are _no_ waiters during the wake operation, then there is already no
135  owner.  This will remain unchanged.
136+ If a requeue operation fails because of a mismatch between the expected futex
137  value and the actual futex value, the owner of the futex will remain
138  unchanged.
139+ A successful call to either of the non-single_owner variants of the
140  wake/requeue operation will cause the target futex's owner to be set to
141  nothing.
142
143### Papers about futexes
144
145- [Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux](https://www.kernel.org/doc/ols/2002/ols2002-pages-479-495.pdf), Hubertus Franke and Rusty Russell
146
147    This is the original white paper describing the Linux futex. It
148    documents the history and design of the original implementation,
149    prior (failed) attempts at creating a fast userspace
150    synchronization primitive, and performance measurements.
151
152- [Futexes Are Tricky](https://www.akkadia.org/drepper/futex.pdf), Ulrich Drepper
153
154    This paper describes some gotchas and implementation details of
155    futexes in Linux. It discusses the kernel implementation, and goes
156    into more detail about correct and efficient userspace
157    implementations of mutexes, condition variables, and so on.
158
159- [Mutexes and Condition Variables using Futexes](http://locklessinc.com/articles/mutex_cv_futex/)
160
161    Further commentary on "Futexes are tricky", outlining a simple
162    implementation that avoids the need for `FUTEX_CMP_REQUEUE`
163
164- [Locking in WebKit](https://webkit.org/blog/6161/locking-in-webkit/), Filip Pizlo
165
166    An in-depth tour of the locking primitives in WebKit, complete with
167    benchmarks and analysis. Contains a detailed explanation of the "parking
168    lot" concept, which allows very compact representation of userspace
169    mutexes.
170
171## SYSCALLS
172
173+ [futex_wait](../syscalls/futex_wait.md)
174+ [futex_wake](../syscalls/futex_wake.md)
175+ [futex_requeue](../syscalls/futex_requeue.md)
176+ [futex_get_owner](../syscalls/futex_get_owner.md)
177