Xenstore protocol specification
-------------------------------
3
4Xenstore implements a database which maps filename-like pathnames
5(also known as `keys') to values.  Clients may read and write values,
6watch for changes, and set permissions to allow or deny access.  There
7is a rudimentary transaction system.
8
9While xenstore and most tools and APIs are capable of dealing with
10arbitrary binary data as values, this should generally be avoided.
11Data should generally be human-readable for ease of management and
12debugging; xenstore is not a high-performance facility and should be
13used only for small amounts of control plane data.  Therefore xenstore
14values should normally be 7-bit ASCII text strings containing bytes
150x20..0x7f only, and should not contain a trailing nul byte.  (The
16APIs used for accessing xenstore generally add a nul when reading, for
17the caller's convenience.)
18
19A separate specification will detail the keys and values which are
20used in the Xen system and what their meanings are.  (Sadly that
21specification currently exists only in multiple out-of-date versions.)
22
23
Paths are /-separated and start with a /, just like Unix filenames.
25
26We can speak of two paths being <child> and <parent>, which is the
27case if they're identical, or if <parent> is /, or if <parent>/ is an
28initial substring of <child>.  (This includes <path> being a child of
29itself.)
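
The relation above can be written as a small predicate.  This is an
illustrative sketch only; the name is_child is made up for this
example and is not part of any Xen API.

```python
def is_child(parent: str, child: str) -> bool:
    """Return True if `child` is a child of `parent` as defined above."""
    if parent == child:
        # Every path counts as a child of itself.
        return True
    if parent == "/":
        # The root is a parent of every path.
        return True
    # Otherwise "<parent>/" must be an initial substring of <child>.
    return child.startswith(parent + "/")
```

Note that /ab is not a child of /a: the comparison is made against
"<parent>/", not against <parent> alone.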
30
31If a particular path exists, all of its parents do too.  Every
32existing path maps to a possibly empty value, and may also have zero
33or more immediate children.  There is thus no particular distinction
34between directories and leaf nodes.  However, it is conventional not
35to store nonempty values at nodes which also have children.
36
The permitted character set for paths is ASCII alphanumerics plus
the four punctuation characters -/_@ (hyphen slash underscore atsign).
39@ should be avoided except to specify special watches (see below).
40Doubled slashes and trailing slashes (except to specify the root) are
41forbidden.  The empty path is also forbidden.  Paths longer than 3072
42bytes are forbidden; clients specifying relative paths should keep
43them to within 2048 bytes.  (See XENSTORE_*_PATH_MAX in xs_wire.h.)
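
A client might check paths against these rules before sending them.
The following is a minimal sketch under stated assumptions: the
function and constant names are invented for this example, and the
2048-byte figure for relative paths, which the text above only
recommends, is treated here as a hard limit.

```python
import string

ALLOWED = set(string.ascii_letters + string.digits + "-/_@")
ABS_PATH_MAX = 3072  # limit for any path, from the text above
REL_PATH_MAX = 2048  # recommended limit for relative paths (assumed hard here)

def is_valid_path(path: str) -> bool:
    """Check a path against the syntax rules described above."""
    if path == "":
        return False                      # the empty path is forbidden
    if not set(path) <= ALLOWED:
        return False                      # character outside permitted set
    if "//" in path:
        return False                      # doubled slash
    if path != "/" and path.endswith("/"):
        return False                      # trailing slash (root excepted)
    # All permitted characters are ASCII, so len() counts bytes.
    limit = ABS_PATH_MAX if path.startswith("/") else REL_PATH_MAX
    return len(path) <= limit
```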
44
45
Each node has one or more permission entries.  Permissions are
granted by domain-id.  The first permission entry of each node
specifies the owner of the node, who always has full access to the
node (read and write permission).  The access rights of the first
entry specify the allowed access for all domains not having a
dedicated permission entry (the default is "n", removing access for
all domains not explicitly added via additional permission entries).
Permissions of a node can be changed by the owner of the node; the
owner can only be modified by the control domain (usually domain id
0).  Other permissions can be set up to allow read and/or write
access for other domains.  When a domain is removed from Xenstore,
nodes owned by that domain are removed together with all of those
nodes' children.
58
59
60Communication with xenstore is via either sockets, or event channel
61and shared memory, as specified in io/xs_wire.h: each message in
62either direction is a header formatted as a struct xsd_sockmsg
63followed by xsd_sockmsg.len bytes of payload.
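
As a rough illustration of this framing, the sketch below packs and
unpacks one message.  It assumes that struct xsd_sockmsg consists of
four 32-bit unsigned fields in the order type, req_id, tx_id, len, in
native byte order; io/xs_wire.h remains the authoritative definition.
The function names are invented for this example.

```python
import struct

# Assumed layout: uint32 type, req_id, tx_id, len (native byte order).
HEADER = struct.Struct("=IIII")

def pack_message(msg_type: int, req_id: int, tx_id: int,
                 payload: bytes) -> bytes:
    """Prefix `payload` with a header whose len field is its length."""
    return HEADER.pack(msg_type, req_id, tx_id, len(payload)) + payload

def unpack_message(data: bytes):
    """Split one message into (type, req_id, tx_id, payload)."""
    msg_type, req_id, tx_id, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return msg_type, req_id, tx_id, payload
```

The numeric values for the type field are to be taken from
io/xs_wire.h; the sketch does not define them.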
64
65The payload syntax varies according to the type field.  Generally
66requests each generate a reply with an identical type, req_id and
67tx_id.  However, if an error occurs, a reply will be returned with
68type ERROR, and only req_id and tx_id copied from the request.
69
70A caller who sends several requests may receive the replies in any
71order and must use req_id (and tx_id, if applicable) to match up
72replies to requests.  (The current implementation always replies to
73requests in the order received but this should not be relied on.)
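
One way for a client to cope with out-of-order replies is to keep a
table of outstanding requests keyed by (req_id, tx_id).  This is a
sketch of that idea, not a description of any existing client; the
class name and its methods are hypothetical.

```python
class PendingRequests:
    """Track outstanding requests so replies can arrive in any order."""

    def __init__(self):
        self._next_req_id = 1
        self._pending = {}

    def register(self, tx_id: int, context):
        """Allocate a req_id for a new request and remember `context`."""
        req_id = self._next_req_id
        # Wrap at 32 bits; skip 0, which WATCH_EVENT messages use.
        self._next_req_id = (self._next_req_id + 1) & 0xFFFFFFFF or 1
        self._pending[(req_id, tx_id)] = context
        return req_id

    def complete(self, req_id: int, tx_id: int):
        """Return and forget the context matching a received reply."""
        return self._pending.pop((req_id, tx_id))
```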
74
75The payload length (len field of the header) is limited to 4096
76(XENSTORE_PAYLOAD_MAX) in both directions.  If a client exceeds the
77limit, its xenstored connection will be immediately killed by
78xenstored, which is usually catastrophic from the client's point of
79view.  Clients (particularly domains, which cannot just reconnect)
80should avoid this.
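
Since exceeding the limit costs the client its connection, it seems
prudent to check the length before anything is written to the
transport.  A minimal guard, with an invented function name, might
look like this:

```python
XENSTORE_PAYLOAD_MAX = 4096  # limit quoted in the text above

def checked_payload(payload: bytes) -> bytes:
    """Return `payload` unchanged, or raise before it can be sent."""
    if len(payload) > XENSTORE_PAYLOAD_MAX:
        raise ValueError(
            "payload of %d bytes exceeds XENSTORE_PAYLOAD_MAX" % len(payload))
    return payload
```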
81
82Existing clients do not always contain defences against overly long
83payloads.  Increasing xenstored's limit is therefore difficult; it
84would require negotiation with the client, and obviously would make
85parts of xenstore inaccessible to some clients.  In any case passing
86bulk data through xenstore is not recommended as the performance
87properties are poor.
88
89
90---------- Xenstore protocol details - introduction ----------
91
92The payload syntax and semantics of the requests and replies are
93described below.  In the payload syntax specifications we use the
94following notations:
95
96 |		A nul (zero) byte.
97 <foo>		A string guaranteed not to contain any nul bytes.
98 <foo|>		Binary data (which may contain zero or more nul bytes)
99 <foo>|*	Zero or more strings each followed by a trailing nul
100 <foo>|+	One or more strings each followed by a trailing nul
101 ?		Reserved value (may not contain nuls)
102 ??		Reserved value (may contain nuls)
103
104Except as otherwise noted, reserved values are believed to be sent as
105empty strings by all current clients.  Clients should not send
106nonempty strings for reserved values; those parts of the protocol may
107be used for extension in the future.
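
To make the notation concrete, the sketch below encodes a sequence of
strings in the form <foo>|<bar>|... and decodes a payload of the form
<foo>|*.  The helper names are invented for this example, and ASCII
is assumed for all strings.

```python
def encode_strings(*fields: str) -> bytes:
    """Encode each field followed by a nul byte: <foo>|<bar>|..."""
    out = b""
    for field in fields:
        raw = field.encode("ascii")
        if b"\0" in raw:
            raise ValueError("a <foo> string must not contain nul bytes")
        out += raw + b"\0"
    return out

def decode_strings(payload: bytes) -> list:
    """Decode <foo>|*: zero or more strings, each with a trailing nul."""
    if payload == b"":
        return []
    if not payload.endswith(b"\0"):
        raise ValueError("missing trailing nul")
    # Drop the final nul, then split on the separators that remain.
    return [part.decode("ascii") for part in payload[:-1].split(b"\0")]
```

For example, a request written below as <path>|<token>| would be
produced by encode_strings(path, token).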
108
109
110Error replies are as follows:
111
112ERROR						E<something>|
113	Where E<something> is the name of an errno value
114	listed in io/xs_wire.h.  Note that the string name
115	is transmitted, not a numeric value.
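
Because the name rather than the number is transmitted, a client has
to map it back itself.  The sketch below does so using the local
platform's errno table, which is an assumption of this example: the
local numeric value need not equal the one in io/xs_wire.h, so only
the name should be treated as meaningful.  The function name is
invented.

```python
import errno

def parse_error(payload: bytes) -> OSError:
    """Turn an ERROR payload such as b"ENOENT\\0" into an OSError."""
    name = payload.rstrip(b"\0").decode("ascii")
    # Look the name up locally; unknown names yield no numeric code.
    code = getattr(errno, name, None) if name.startswith("E") else None
    if isinstance(code, int):
        return OSError(code, name)
    return OSError(name)
```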
116
117
118Where no reply payload format is specified below, success responses
119have the following payload:
120						OK|
121
122Values commonly included in payloads include:
123
124    <path>
125	Specifies a path in the hierarchical key structure.
126	If <path> starts with a / it simply represents that path.
127
128	<path> is allowed not to start with /, in which case the
129	caller must be a domain (rather than connected via a socket)
130	and the path is taken to be relative to /local/domain/<domid>
131	(eg, `x/y' sent by domain 3 would mean `/local/domain/3/x/y').
132
133    <domid>
134	Integer domid, represented as decimal number 0..65535.
135	Parsing errors and values out of range generally go
136	undetected.  The special DOMID_... values (see xen.h) are
137	represented as integers; unless otherwise specified it
138	is an error not to specify a real domain id.
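
The handling of these two values can be sketched as follows.  Both
function names are invented for this example.  Since parsing errors
are said above to go generally undetected, the domid parser here is
deliberately strict; that strictness is a choice made for the sketch,
not a description of xenstored.

```python
def resolve_path(path: str, domid: int) -> str:
    """Return the absolute form of <path> as sent by domain `domid`."""
    if path.startswith("/"):
        return path
    # Relative paths are taken relative to the domain's base path.
    return "/local/domain/%d/%s" % (domid, path)

def parse_domid(text: str) -> int:
    """Parse a decimal <domid> in the range 0..65535, rejecting junk."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("not a decimal number: %r" % text)
    value = int(text)
    if value > 65535:
        raise ValueError("domid out of range: %d" % value)
    return value
```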
139
140
141
142The following are the actual type values, including the request and
143reply payloads as applicable:
144
145
146---------- Database read, write and permissions operations ----------
147
148READ			<path>|			<value|>
149WRITE			<path>|<value|>
150	Store and read the octet string <value> at <path>.
151	WRITE creates any missing parent paths, with empty values.
152
153MKDIR			<path>|
	Ensures that the <path> exists, if necessary by creating
155	it and any missing parents with empty values.  If <path>
156	or any parent already exists, its value is left unchanged.
157
158RM			<path>|
159	Ensures that the <path> does not exist, by deleting
160	it and all of its children.  It is not an error if <path> does
161	not exist, but it _is_ an error if <path>'s immediate parent
162	does not exist either.
163
164DIRECTORY		<path>|			<child-leaf-name>|*
165	Gives a list of the immediate children of <path>, as only the
166	leafnames.  The resulting children are each named
167	<path>/<child-leaf-name>.
168
169DIRECTORY_PART		<path>|<offset>		<gencnt>|<child-leaf-name>|*
170	Same as DIRECTORY, but to be used for children lists longer than
171	XENSTORE_PAYLOAD_MAX. Input are <path> and the byte offset into
172	the list of children to return. Return values are the generation
173	count <gencnt> of the node (to be used to ensure the node hasn't
174	changed between two reads: <gencnt> being the same for multiple
175	reads guarantees the node hasn't changed) and the list of children
176	starting at the specified <offset> of the complete list.
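
A client reading a long child list with DIRECTORY_PART might restart
whenever <gencnt> changes between two reads.  The sketch below shows
only that control flow.  It assumes a caller-supplied function
fetch(offset), hypothetical and not part of any Xen library, which
performs one request and returns (gencnt, names, done); how the end
of the list is recognised on the wire is not covered by the text
above and is left to that function.

```python
def read_all_children(fetch, max_restarts: int = 5) -> list:
    """Collect a complete child list from repeated partial reads."""
    for _ in range(max_restarts):
        names, offset, expected = [], 0, None
        while True:
            gencnt, chunk, done = fetch(offset)
            if expected is None:
                expected = gencnt
            if gencnt != expected:
                break  # node changed between reads: start over
            names.extend(chunk)
            # <offset> is a byte offset; each name is followed by a nul.
            offset += sum(len(name) + 1 for name in chunk)
            if done:
                return names
    raise RuntimeError("node kept changing while it was being listed")
```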
177
178GET_PERMS	 	<path>|			<perm-as-string>|+
179SET_PERMS		<path>|<perm-as-string>|+?
180	<perm-as-string> is one of the following
181		w<domid>	write only
182		r<domid>	read only
183		b<domid>	both read and write
184		n<domid>	no access
185	See https://wiki.xen.org/wiki/XenBus section
186	`Permissions' for details of the permissions system.
187	It is possible to set permissions for the special watch paths
188	"@introduceDomain" and "@releaseDomain" to enable receiving those
189	watches in unprivileged domains.
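
The following sketch parses <perm-as-string> entries and evaluates
them as described in the introduction: the first entry names the
owner and carries the default for all other domains.  The names are
invented for this example, and the extra rights of privileged domains
are deliberately not modelled.

```python
RIGHTS = {
    "n": frozenset(),
    "r": frozenset("r"),
    "w": frozenset("w"),
    "b": frozenset("rw"),
}

def parse_perm(entry: str):
    """Split a <perm-as-string> such as 'r3' into (letter, domid)."""
    letter, digits = entry[:1], entry[1:]
    if letter not in RIGHTS or not digits.isdigit():
        raise ValueError("bad permission entry: %r" % entry)
    return letter, int(digits)

def allowed_access(perms: list, domid: int) -> frozenset:
    """Return the rights ('r', 'w') that `domid` has under `perms`."""
    parsed = [parse_perm(p) for p in perms]
    default_letter, owner = parsed[0]
    if domid == owner:
        return RIGHTS["b"]           # the owner always has full access
    for letter, entry_domid in parsed[1:]:
        if entry_domid == domid:
            return RIGHTS[letter]    # dedicated entry for this domain
    return RIGHTS[default_letter]    # default taken from the first entry
```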
190
191---------- Watches ----------
192
193WATCH			<wpath>|<token>|[<depth>|]?
194	Adds a watch.
195
196	When a <path> is modified (including path creation, removal,
197	contents change or permissions change) this generates an event
198	on the changed <path>.  Changes made in transactions cause an
199	event only if and when committed.  Each occurring event is
200	matched against all the watches currently set up, and each
201	matching watch results in a WATCH_EVENT message (see below).
202
	The event's path matches the watch's <wpath> if it is a child
204	of <wpath>. This match can be limited by specifying <depth> (a
205	decimal value of 0 or larger): it denotes the directory levels
206	below <wpath> to consider for a match ("0" would not match for
207	a child of <wpath>, "1" would match only for a direct child,
208	etc.).
209
210	<wpath> can be a <path> to watch or @<wspecial>.  In the
211	latter case <wspecial> may have any syntax but it matches
212	(according to the rules above) only the following special
213	events which are invented by xenstored:
214	    @introduceDomain	occurs on INTRODUCE
215	    @releaseDomain 	occurs on any domain crash or
216				shutdown, and also on RELEASE
217				and domain destruction
	<wspecial> events are sent only to privileged callers and to
	domains explicitly enabled via SET_PERMS.  The semantics for a
220	specification of <depth> differ for generating <wspecial>
221	events: specifying "1" will report the related domid by using
222	@<wspecial>/<domid> for the reported path. Other <depth>
223	values are not supported.
224	For @releaseDomain it is possible to watch only for a specific
225	domain by specifying @releaseDomain/<domid> for the path.
226
227	When a watch is first set up it is triggered once straight
228	away, with <path> equal to <wpath>.  Watches may be triggered
229	spuriously.  The tx_id in a WATCH request is ignored.
230
231	Watches are supposed to be restricted by the permissions
232	system but in practice the implementation is imperfect.
233	Applications should not rely on being sent a notification for
234	paths that they cannot read; however, an application may rely
235	on being sent a watch when a path which it _is_ able to read
236	is deleted even if that leaves only a nonexistent unreadable
	parent.  A notification may be omitted if a node's permissions
238	are changed so as to make it unreadable, in which case future
239	notifications may be suppressed (and if the node is later made
240	readable, some notifications may have been lost).
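
The matching rule for ordinary (non-@) watch paths, including the
optional <depth>, can be sketched as below.  The function name is
invented, and the sketch assumes that an event on <wpath> itself
matches at every depth, which the text above implies but does not
state outright.

```python
def watch_matches(wpath: str, epath: str, depth=None) -> bool:
    """Decide whether an event on `epath` matches a watch on `wpath`."""
    if epath == wpath:
        levels = 0
    elif wpath == "/":
        levels = epath.count("/")
    elif epath.startswith(wpath + "/"):
        # Count directory levels between <wpath> and the event path.
        levels = epath[len(wpath):].count("/")
    else:
        return False                 # not a child of <wpath>
    return depth is None or levels <= depth
```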
241
242WATCH_EVENT					<epath>|<token>|
243	Unsolicited `reply' generated for matching modification events
244	as described above.  req_id and tx_id are both 0.
245
	<epath> is the event's path, ie the actual path that was
	modified; however if the event was the recursive removal of a
	parent of <wpath>, <epath> is just <wpath> (rather than the
	actual path which was removed).  So <epath> is a child of
	<wpath>, regardless.
251
252	Iff <wpath> for the watch was specified as a relative pathname,
253	the <epath> path will also be relative (with the same base,
254	obviously).
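
A receiver might split a WATCH_EVENT payload as follows.  This is a
sketch with an invented function name; it keeps <token> as bytes
because the text above only promises that it contains no nul bytes.

```python
def parse_watch_event(payload: bytes):
    """Split <epath>|<token>| into (epath, token)."""
    parts = payload.split(b"\0")
    # Two nul-terminated fields leave an empty element after the last nul.
    if len(parts) < 3 or parts[-1] != b"":
        raise ValueError("malformed WATCH_EVENT payload")
    return parts[0].decode("ascii"), parts[1]
```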
255
256UNWATCH			<wpath>|<token>|?
257
258RESET_WATCHES		|
259	Reset all watches and transactions of the caller.
260
261---------- Transactions ----------
262
263TRANSACTION_START	|			<transid>|
	<transid> is an opaque uint32_t allocated by xenstored,
	represented as unsigned decimal.  After this, the transaction
	may be referenced by using <transid> (as 32-bit binary) in the
	tx_id request header field.  When a transaction is started the
	whole db is copied; reads and writes happen on the copy.
	It is not legal to send a non-0 tx_id in TRANSACTION_START.
270
271TRANSACTION_END		T|
272TRANSACTION_END		F|
	tx_id must refer to an existing transaction.  After this
	request the tx_id is no longer valid and may be reused by
	xenstore.  If F, the transaction is discarded.  If T,
	it is committed: if there were any other intervening writes
	then our END gets EAGAIN.
278
279	The plan is that in the future only intervening `conflicting'
280	writes cause EAGAIN, meaning only writes or other commits
281	which changed paths which were read or written in the
282	transaction at hand.
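
Since a commit can fail with EAGAIN, callers usually wrap the whole
transaction in a retry loop.  The sketch below shows one possible
shape.  The callables start, end and body and the exception class are
hypothetical stand-ins for whatever client library is in use: start()
is assumed to issue TRANSACTION_START and return the tx_id, and
end(tx_id, commit) to issue TRANSACTION_END with T or F, raising
TransactionConflict when the reply is EAGAIN.

```python
class TransactionConflict(Exception):
    """Raised by end() when TRANSACTION_END T is answered with EAGAIN."""

def run_transaction(start, end, body, max_attempts: int = 10):
    """Run body(tx_id) in a transaction, retrying on EAGAIN."""
    for _ in range(max_attempts):
        tx_id = start()
        try:
            result = body(tx_id)
        except Exception:
            end(tx_id, commit=False)   # discard the transaction (F)
            raise
        try:
            end(tx_id, commit=True)    # try to commit (T)
        except TransactionConflict:
            continue                   # tx_id is now invalid: start afresh
        return result
    raise RuntimeError("transaction did not commit after retries")
```

The body is re-run from the start on each attempt, so it should not
have side effects outside the transaction.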
283
284---------- Domain management and xenstored communications ----------
285
286INTRODUCE		<domid>|<gfn>|<evtchn>|?
287	Notifies xenstored to communicate with this domain.
288
289	INTRODUCE is currently only used by xend (during domain
290	startup and various forms of restore and resume), and
291	xenstored prevents its use other than by dom0.
292
	<domid> must be a real domain id (not 0 and not a special
	DOMID_... value).  <gfn> must be a page in that domain
	represented in signed decimal (!).  <evtchn> must be an
	unbound event channel in <domid> (likewise in decimal), on
	which xenstored will call bind_interdomain.
298	Violations of these rules may result in undefined behaviour;
299	for example passing a high-bit-set 32-bit gfn as an unsigned
300	decimal will attempt to use 0x7fffffff instead (!).
301
302RELEASE			<domid>|
303	Manually requests that xenstored disconnect from the domain.
304	The event channel is unbound at the xenstored end and the page
305	unmapped.  If the domain is still running it won't be able to
306	communicate with xenstored.  NB that xenstored will in any
307	case detect domain destruction and disconnect by itself.
308	xenstored prevents the use of RELEASE other than by dom0.
309
310GET_DOMAIN_PATH		<domid>|		<path>|
	Returns the domain's base path, as is used for relative
	paths: ie, /local/domain/<domid> (with <domid>
313	normalised).  The answer will be useless unless <domid> is a
314	real domain id.
315
316IS_DOMAIN_INTRODUCED	<domid>|		T| or F|
317	Returns T if xenstored is in communication with the domain:
318	ie, if INTRODUCE for the domain has not yet been followed by
319	domain destruction or explicit RELEASE.
320
321RESUME			<domid>|
322
323	Arranges that @releaseDomain events will once more be
324	generated when the domain becomes shut down.  This might have
325	to be used if a domain were to be shut down (generating one
326	@releaseDomain) and then subsequently restarted, since the
327	state-sensitive algorithm in xenstored will not otherwise send
328	further watch event notifications if the domain were to be
329	shut down again.
330
	This command will be issued in places such as resume because
332	Xen will "shutdown" the domain on suspend.
333
334	xenstored prevents the use of RESUME other than by dom0.
335
336
337SET_TARGET		<domid>|<tdomid>|
338	Notifies xenstored that domain <domid> is targeting domain
339	<tdomid>. This grants domain <domid> full access to paths
340	owned by <tdomid>. Domain <domid> also inherits all
341	permissions granted to <tdomid> on all other paths. This
342	allows <domid> to behave as if it were dom0 when modifying
343	paths related to <tdomid>.
344
345	xenstored prevents the use of SET_TARGET other than by dom0.
346
347GET_FEATURE		[<domid>|]		<value>|
348SET_FEATURE		<domid>|<value>|
349	Returns or sets the contents of the "feature" field copied to
350	offset 2064 of the Xenstore ring page of the domain specified by
351	<domid>. <value> is a decimal number being a logical or of the
352	feature bits as defined in docs/misc/xenstore-ring.txt. Trying
353	to set a bit for a feature not being supported by the running
354	Xenstore will be denied. Providing no <domid> with the
355	GET_FEATURE command will return the features which are supported
356	by Xenstore.
357
358	SET_FEATURE for a domain will be rejected after the INTRODUCE
359	command for this domain has been sent to xenstored.
360
361	xenstored prevents the use of GET_FEATURE and SET_FEATURE other
362	than by dom0.
363
364GET_QUOTA		[[<domid>|]<quota>|]	<value>|
365SET_QUOTA		[<domid>|]<quota>|<value>|
366	Returns or sets a quota value for the domain being specified by
367	<domid>. Omitting <domid> will return or set the global quota
368	values, which are the default values for new domains. <quota> is
	one of "nodes", "watches", "transactions", "node-size",
370	"permissions", or any other implementation defined value. For
371	GET_QUOTA it is possible to omit the <quota> parameter together
372	with the <domid> parameter, which will return a single string of
373	all supported <quota> values separated by blanks. <value> is a
374	decimal number specifying the quota value, with "0" having the
375	special meaning of quota checks being disabled. The initial quota
376	settings for a domain are the global ones of Xenstore.
377
378	xenstored prevents the use of GET_QUOTA and SET_QUOTA other
379	than by dom0.
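
The optional parts of the GET_QUOTA request can be confusing, so the
sketch below builds the three permitted forms and parses the reply
that lists the supported quota names.  The function names are
invented for this example, and the reply is assumed to carry a
trailing nul as shown in the syntax line above.

```python
def get_quota_request(quota=None, domid=None) -> bytes:
    """Build a GET_QUOTA payload: [[<domid>|]<quota>|]"""
    fields = []
    if quota is None:
        if domid is not None:
            raise ValueError("<domid> may only be given with <quota>")
    else:
        if domid is not None:
            fields.append(str(domid))
        fields.append(quota)
    return b"".join(f.encode("ascii") + b"\0" for f in fields)

def parse_quota_names(reply: bytes) -> list:
    """Split the blank-separated list returned when <quota> is omitted."""
    return reply.rstrip(b"\0").decode("ascii").split()
```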
380
381---------- Miscellaneous ----------
382
383CONTROL			<command>|[<parameters>|]
384	Send a control command <command> with optional parameters
385	(<parameters>) to Xenstore daemon.
386
387	The set of commands and their semantics is implementation
388	specific and is likely to change from one Xen version to the
389	next.  Out-of-tree users will encounter compatibility issues.
390
391	Current commands are:
392	check
393		checks xenstored innards
394	live-update|<params>|+
395		perform a live-update of the Xenstore daemon, only to
396		be used via xenstore-control command.
397		<params> are implementation specific and are used for
398		different steps of the live-update processing. Currently
399		supported <params> are:
400		-f <file>  specify new daemon binary
401		-b <size>  specify size of new stubdom binary
402		-d <chunk-size> <binary-chunk>  transfer chunk of new
403			stubdom binary
404		-c <pars>  specify new command line to use
405		-s [-t <sec>] [-F]  start live update process (-t specifies
406			timeout in seconds to wait for active transactions
407			to finish, default is 60 seconds; -F will force
408			live update to happen even with running transactions
409			after timeout elapsed)
410		-a  abort live update handling
411		All sub-options will return "OK" in case of success or an
412		error string in case of failure. -s can return "BUSY" in case
413		of an active transaction, a retry of -s can be done in that
414		case.
415	log|[on|off|+<switch>|-<switch>]
416		without parameters: show possible log switches
417		on: turn xenstore logging on
418		off: turn xenstore logging off
419		+<switch>: activates log entries for <switch>,
420		-<switch>: deactivates log entries for <switch>
421	logfile|<file-name>
422		log to specified file
423	memreport|[<file-name>]
424		print memory statistics to logfile (no <file-name>
425		specified) or to specific file
426	print|<string>
427		print <string> to syslog (xenstore runs as daemon) or
428		to console (xenstore runs as stubdom)
429	quota|[set <name> <val>|<domid>|max [-r]]
430		without parameters: print the current quota settings
431		with "set <name> <val>": set the quota <name> to new value
432		<val> (The admin should make sure all the domain usage is
433		below the quota. If it is not, then Xenstored may continue to
434		handle requests from the domain as long as the resource
435		violating the new quota setting isn't increased further)
436		with "<domid>": print quota related accounting data for
437		the domain <domid>
438		with "max [-r]": show global per-domain maximum values of all
439		unprivileged domains, optionally reset the values by adding
440		"-r"
441	quota-soft|[set <name> <val>]
442		like the "quota" command, but for soft-quota.
443	help			<supported-commands>
444		return list of supported commands for CONTROL
445
446DEBUG
447	Deprecated, now named CONTROL
448
449