1% 2% Copyright 2014, General Dynamics C4 Systems 3% 4% SPDX-License-Identifier: GPL-2.0-only 5% 6 7\chapter{\label{ch:ipc}Message Passing (IPC)} 8 9The seL4 microkernel provides a message-passing IPC mechanism for communication 10between threads. The same mechanism is also used for communication with 11kernel-provided services. Messages are sent by invoking a capability to a 12kernel object. Messages sent to \obj{Endpoint}s are destined for other 13threads, while messages sent to other objects are processed by the kernel. This 14chapter describes the common message format, endpoints, 15and how they can be used for communication between applications. 16 17\section{Message Registers} 18\label{sec:messageinfo} 19 20Each message contains a number of message words and optionally a number of 21capabilities. 22The message words are sent to or received from a thread by placing them in its \emph{message registers}. 23The message registers are numbered and the first few message registers are implemented 24using physical CPU registers, while the rest are backed by a fixed region of 25memory called the \emph{IPC buffer}. 26The reason for this design is efficiency: 27very short messages need not use the memory. 28The IPC buffer is assigned to the calling thread (see \autoref{sec:threads} and \autoref{api:tcb_setipcbuffer}). 29%FIXME: seL4_TCB_SetIPCBuffer is only mentioned in the API reference! 30 31Every IPC message also has a tag (structure \texttt{seL4\_MessageInfo\_t}). The 32tag consists of four fields: the label, message length, number of capabilities 33(the \texttt{extraCaps} field) and the \texttt{capsUnwrapped} field. The 34message length and number of capabilities determine either the number of 35message registers and capabilities that the sending thread wishes to transfer, 36or the number of message registers and capabilities that were actually 37transferred. The label is not interpreted by the 38kernel and is passed unmodified as the first data payload of the message. The 39label may, for example, be used to specify a requested operation. The 40\texttt{capsUnwrapped} field is used only on the receive side, to indicate the 41manner in which capabilities were received. It is described in 42\autoref{sec:cap-transfer}. 43 44% FIXME: a little too low-level, perhaps? 45 46\newcommand{\ipcparam}[4]{\texttt{#1} \emph{#2}&\texttt{#3}\\ } 47\begin{table}[htb] 48 \begin{center} 49 \begin{tabularx}{\textwidth}{p{0.28\textwidth}p{0.18\textwidth}X} 50 \toprule 51 \textbf{Type} & \textbf{Name} & \textbf{Description} \\ 52 \midrule 53 \ipcparam{seL4\_MessageInfo\_t}{}{tag}{Message tag} 54 \ipcparam{seL4\_Word[]}{}{msg}{Message contents} 55 \ipcparam{seL4\_Word}{}{userData}{Base address of the structure, used by 56 supporting user libraries} 57 \ipcparam{seL4\_CPtr[]}{(in)}{caps}{Capabilities to transfer} 58 \ipcparam{seL4\_CapData\_t[]}{(out)}{badges}{Badges for 59 endpoint capabilities received} 60 \ipcparam{seL4\_CPtr}{}{receiveCNode}{CPTR to a CNode from which to 61 find 62 the receive slot} 63 \ipcparam{seL4\_CPtr}{}{receiveIndex}{CPTR to the receive slot 64 relative to \texttt{receiveCNode}} 65 \ipcparam{seL4\_Word}{}{receiveDepth}{Number of bits of 66 \texttt{receiveIndex} to 67 use} 68 \bottomrule 69 \end{tabularx} 70 \caption{\label{tbl:ipcbuffer}Fields of the 71 \texttt{seL4\_IPCBuffer} structure. Note that 72 \texttt{badges} and \texttt{caps} use the same area of memory in 73 the structure.} 74 \end{center} 75\end{table} 76 77The kernel assumes that the IPC buffer contains a structure of type 78\texttt{seL4\_IPCBuffer} as defined in \autoref{tbl:ipcbuffer}. The 79kernel uses as many physical registers as possible to transfer IPC 80messages. When more arguments are transferred than physical message 81registers are available, the kernel begins using the IPC buffer's 82\texttt{msg} field to transfer arguments. However, it leaves room in 83this array for the physical message registers. For example, if an IPC 84transfer or kernel object invocation required 854 message registers (and there are only 2 physical message registers 86available on this architecture) then arguments 1 and 2 would be 87transferred via message registers and arguments 3 and 4 would be in 88\texttt{msg[2]} and \texttt{msg[3]}. 89This allows the user-level object-invocation stubs to copy the arguments passed in physical registers to 90the space left in the \texttt{msg} array if desired. 91The situation is similar for the tag field. 92There is space for this field in the \texttt{seL4\_IPCBuffer} structure, which the kernel ignores. 93User level stubs 94may wish to copy the message tag from its CPU register to this field, although 95the user level stubs provided with the kernel do not do this. 96 97\section{Endpoints} 98 99\obj{Endpoint}s allow a small amount 100of data and capabilities (namely the IPC buffer) to be transferred between two 101threads. \obj{Endpoint} objects are invoked directly using the seL4 system calls 102described in \autoref{sec:syscalls}. 103 104IPC \obj{Endpoints} uses a rendezvous model and as such is 105synchronous and blocking. An \obj{Endpoint} object may queue 106threads either to send or to receive. If no receiver is ready, threads 107performing the \apifunc{seL4\_Send}{sel4_send} or \apifunc{seL4\_Call}{sel4_call} 108system calls will wait in a queue for the first available receiver. Likewise, if 109no sender is ready, threads performing the \apifunc{seL4\_Recv}{sel4_recv} 110system call or the second half of \apifunc{seL4\_ReplyRecv}{sel4_replyrecv} 111will wait for the first available sender. 112 113Trying to Send or Call without the Write right will fail and return an error. In 114the case of Send the error is ignored (The kernel isn't allowed to reply). Thus 115there is no way of knowing that a send has failed because of missing right. 116On the other hand calling \apifunc{seL4\_Recv}{sel4_recv} with a endpoint capability that 117does not have the Read right will raise a fault, see \autoref{sec:faults}. This 118because otherwise the error message would be indistinguishable from a normal 119message received from another thread via the endpoint. 120 121\subsection{Endpoint Badges\label{s:ep-badge}} 122\label{sec:ep-badges} 123 124Endpoint capabilities may be \emph{minted} to 125create a new endpoint capability with a \emph{badge} attached to it, a data 126word chosen by the invoker of the \emph{mint} operation. When a message is sent to an endpoint using a badged 127capability, the badge is transferred to the receiving thread's 128\texttt{badge} register. 129 130An endpoint capability with a zero badge is said to be \emph{unbadged}. 131Such a capability can be badged with the \apifunc{seL4\_CNode\_Mint}{cnode_mint} 132invocation on the \obj{CNode} containing the capability. Endpoint 133capabilities with badges cannot be unbadged, rebadged or used to create 134child capabilities with different badges. 135 136On 32-bit platforms, only the low 28 bits of the badge are available for use. 137The kernel will silently ignore any usage of the high 4 bits. 138On 64-bit platforms, 64 bits are available for badges. 139 140\subsection{Capability Transfer} 141\label{sec:cap-transfer} 142 143Messages may contain capabilities, which will be copied to the 144receiver, provided that the endpoint capability 145invoked by the sending thread has Grant rights. An attempt to send 146capabilities using an endpoint capability without the Grant right will 147result in transfer of the raw message, without any capability transfer. 148 149Capabilities to be sent in a message are specified in the sending thread's 150IPC buffer in the \texttt{caps} field. Each entry in that array is interpreted 151as a CPTR in the sending thread's capability space. The number of capabilities 152to send is specified in the \texttt{extraCaps} field of the message tag. 153 154The receiver specifies the slot 155in which it is willing to receive a capability, with three fields within the IPC buffer: \texttt{receiveCNode}, \texttt{receiveIndex} and \texttt{receiveDepth}. 156These fields specify the root CNode, capability address and number of bits to resolve, respectively, to find 157the slot in which to put the capability. Capability 158addressing is described in \autoref{sec:cap_addressing}. 159 160Note that receiving threads may specify only one receive slot, whereas a 161sending thread may include multiple capabilities in the message. Messages 162containing more than one capability may be interpreted by kernel objects. They 163may also be sent to receiving threads in the case where some of the extra 164capabilities in the message can be \emph{unwrapped}. 165 166If the n-th capability in the message refers to the endpoint through 167which the message is sent, the capability is \emph{unwrapped}: its badge is placed into 168the n-th 169position of the receiver's badges array, and the kernel sets the n-th bit (counting from the 170least significant) in the \texttt{capsUnwrapped} field of the message 171tag. The capability itself is not transferred, so the receive slot may be used 172for another capability. 173 174A capability that is not unwrapped is transferred by copying it from the 175sender's CNode slot to the receiver's CNode slot. The sender retains access 176to the sent capability. 177 178If a receiver gets a message whose tag has an \texttt{extraCaps} of 2 and a 179\texttt{capsUnwrapped} of 2, then the first capability in the message was 180transferred to the specified receive slot and the second capability was 181unwrapped, placing its badge in \texttt{badges[1]}. There may have been a 182third capability in the sender's message which could not be unwrapped. 183 184\subsection{Errors} 185 186Errors in capability transfers can occur at two places: in the send 187phase or in the receive phase. In the send phase, all capabilities that 188the caller is attempting to send are looked up to ensure that they exist 189before the send is initiated in the kernel. If the lookup fails for any 190reason, \apifunc{seL4\_Send}{sel4_send} and \apifunc{seL4\_Call}{sel4_call} system calls immediately abort and 191no IPC or capability transfer takes place. The system call will return 192a lookup failure error as described in \autoref{sec:errors}. 193 194In the receive phase, seL4 transfers capabilities in the order that they 195are found in the sending thread's IPC buffer \texttt{caps} array 196and terminates as soon as an error is encountered. Possible error 197conditions are: 198 199\begin{itemize} 200 \item A source capability cannot be looked up. Although the presence 201 of the source capabilities is checked when the sending thread 202 performs the send system call, this error may still occur. The sending 203 thread may have been blocked on the endpoint for some time before it 204 was paired with a receiving thread. During this time, its 205 CSpace may have changed and the source capability pointers may 206 no longer be valid. 207 208 \item The destination slot cannot be looked up. Unlike the send 209 system call, the \apifunc{seL4\_Recv}{sel4_recv} system call does not check that the 210 destination slot exists and is empty before it initiates the receive. 211 Hence, the \apifunc{seL4\_Recv}{sel4_recv} system call will not fail with an error if the 212 destination slot is invalid and will instead transfer badged 213 capabilities until an attempt to save a capability to the 214 destination slot is made. 215 216 \item The capability being transferred cannot be derived. See 217 \autoref{sec:cap_derivation} for details. 218\end{itemize} 219 220An error will not void the entire transfer, it will just end it 221prematurely. The capabilities processed before the failure are still 222transferred and the \texttt{extraCaps} field in the receiver's IPC 223buffer is set to the number of capabilities transferred up to failure. 224No error message will be returned to the receiving thread in any of the 225above cases. 226 227\subsection{Calling and Replying} 228\label{sec:ep-cal} 229 230As explained in \autoref{sec:sys_call}, when the user calls 231\apifunc{seL4\_Call}{sel4_call} on an endpoint capability, 232some specific actions are taken. First a call will do exactly the same action as 233a normal \apifunc{seL4\_Send}{sel4_send}. Then after the rendezvous and all the 234normal IPC procedure happened, instead of returning directly to the caller, 235\apifunc{seL4\_Call}{sel4_call} will check if either Grant or GrantReply are 236present on the invoked endpoint capability: 237 238\begin{itemize} 239\item If this is not the case, the caller thread is suspended as if 240 \apifunc{seL4\_TCB\_Suspend}{tcb_suspend} was called on it. The send part of 241 the call would still have been performed as usual. 242\item If this is the case. A reply capability is set in a specific slot of the 243 receiver TCB. The Grant right of that reply capability is set by copying the Grant 244 right of the endpoint capability invoked by the receiver in 245 \apifunc{seL4\_Recv}{sel4_recv}. 246 Then, the caller thread is blocked waiting for the reply. 247 \end{itemize} 248 249A reply capability points directly to the caller thread and once the call has 250been performed is completely unrelated to the original \obj{Endpoint}. Even if 251the latter was destroyed, the reply capability would still exist and point to 252the caller who would still be waiting for a reply. 253 254The generated reply capability can then be either invoked in place (in the 255specific TCB slot) with the \apifunc{seL4\_Reply}{sel4_reply} or saved to an 256addressable slot using \apifunc{seL4\_CNode\_SaveCaller}{cnode_savecaller} to be 257invoked later with \apifunc{seL4\_Send}{sel4_send}. The specific slot cannot be 258directly addressed with any CPtr as it is not part of any CSpace. 259 260A reply capability is invoked in the same way as a normal send on a 261\obj{Endpoint}. A reply capability has implicitly the Write right, so the 262message will always go through. Transferring caps in the reply can only happen 263if the reply capability has the Grant right and is done in exactly the same way 264as in a normal IPC transfer as described in \autoref{sec:cap-transfer}. 265 266The main difference with a normal endpoint transfer is that the kernel guarantees 267that invoking a reply capability cannot block: If you own a reply capability, 268then the thread it points to is waiting for a reply. However a reply capability 269is a non-owning reference, contrary to all the other capabilities. That means that 270if the caller thread is destroyed or modified in any way that would render 271a reply impossible (for example being suspended with 272\apifunc{seL4\_TCB\_Suspend}{tcb_suspend}), the kernel would immediately destroy 273the reply capability. 274 275Once the reply capability has been invoked, the caller receives the message as if 276it has been performing a \apifunc{seL4\_Recv}{sel4_recv} and just received the 277message. In particular, it starts running again. 278 279The \apifunc{seL4\_Call}{sel4_call} operation exists not only for 280efficiency reasons (combining two operations into a single system 281call). It differs from 282\apifunc{seL4\_Send}{sel4_send} immediately followed by 283\apifunc{seL4\_Recv}{sel4_recv} in ways that allow certain system setup to work 284much more efficiently with much less setup that with a traditional setup. 285In particular, it is guaranteed that the reply received by the caller comes from 286the thread that received the call without having to check any kind of badge. 287