Xenstore protocol specification
-------------------------------

Xenstore implements a database which maps filename-like pathnames
(also known as `keys') to values.  Clients may read and write values,
watch for changes, and set permissions to allow or deny access.  There
is a rudimentary transaction system.

While xenstore and most tools and APIs are capable of dealing with
arbitrary binary data as values, this should generally be avoided.
Data should generally be human-readable for ease of management and
debugging; xenstore is not a high-performance facility and should be
used only for small amounts of control plane data.  Therefore xenstore
values should normally be 7-bit ASCII text strings containing bytes
0x20..0x7f only, and should not contain a trailing nul byte.  (The
APIs used for accessing xenstore generally add a nul when reading, for
the caller's convenience.)

A separate specification will detail the keys and values which are
used in the Xen system and what their meanings are.  (Sadly that
specification currently exists only in multiple out-of-date versions.)


Paths are /-separated and start with a /, just like Unix filenames.

We can speak of two paths being <child> and <parent>, which is the
case if they're identical, or if <parent> is /, or if <parent>/ is an
initial substring of <child>.  (This includes <path> being a child of
itself.)
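
The child/parent relation above can be written down directly.  The
following is a minimal sketch; the function name is_child is chosen
here for illustration and is not part of any xenstore API.

```python
def is_child(parent: str, child: str) -> bool:
    """Return True if `child` is a child of `parent` as defined above."""
    if parent == child:
        return True   # every path is a child of itself
    if parent == "/":
        return True   # the root is a parent of every path
    # <parent>/ must be an initial substring of <child>
    return child.startswith(parent + "/")
```

Note that a plain prefix test would be wrong: /a/bc is not a child of
/a/b, which is why the separator is appended before comparing.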

If a particular path exists, all of its parents do too.  Every
existing path maps to a possibly empty value, and may also have zero
or more immediate children.  There is thus no particular distinction
between directories and leaf nodes.  However, it is conventional not
to store nonempty values at nodes which also have children.

The permitted character set for paths is ASCII alphanumerics plus
the four punctuation characters -/_@ (hyphen slash underscore atsign).
@ should be avoided except to specify special watches (see below).
Doubled slashes and trailing slashes (except to specify the root) are
forbidden.  The empty path is also forbidden.  Paths longer than 3072
bytes are forbidden; clients specifying relative paths should keep
them to within 2048 bytes.  (See XENSTORE_*_PATH_MAX in xs_wire.h.)
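
A client-side check of these path rules might look as follows.  This
is only a sketch: the names is_valid_path, ABS_PATH_MAX and
REL_PATH_MAX are invented here, the limits are the ones quoted in the
text, and a real xenstored may apply further checks.

```python
import re

ABS_PATH_MAX = 3072  # limit for absolute paths, as stated above
REL_PATH_MAX = 2048  # recommended limit for relative paths, as stated above

# ASCII alphanumerics plus the four punctuation characters - / _ @
_ALLOWED = re.compile(r"[A-Za-z0-9/_@-]+")


def is_valid_path(path: str) -> bool:
    """Check a path against the syntax rules described above."""
    if path == "":
        return False                      # the empty path is forbidden
    if not _ALLOWED.fullmatch(path):
        return False                      # character outside permitted set
    if "//" in path:
        return False                      # doubled slashes are forbidden
    if path != "/" and path.endswith("/"):
        return False                      # trailing slash only for the root
    limit = ABS_PATH_MAX if path.startswith("/") else REL_PATH_MAX
    return len(path) <= limit             # all characters are 1 byte each
```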


Each node has one or more permission entries.  Permissions are
granted by domain-id.  The first permission entry of each node
specifies the owner of the node, who always has full access to the
node (read and write permission).  The access rights of the first
entry specify the allowed access for all domains not having a
dedicated permission entry (the default is "n", removing access for
all domains not explicitly added via additional permission entries).
Permissions of a node can be changed by the owner of the node; the
owner can only be modified by the control domain (usually domain id
0).  Other permissions can be set up to allow read and/or write access
for other domains.  When a domain is being removed from Xenstore,
nodes owned by that domain will be removed together with all of those
nodes' children.


Communication with xenstore is via either sockets, or event channel
and shared memory, as specified in io/xs_wire.h: each message in
either direction is a header formatted as a struct xsd_sockmsg
followed by xsd_sockmsg.len bytes of payload.

The payload syntax varies according to the type field.  Generally
requests each generate a reply with an identical type, req_id and
tx_id.  However, if an error occurs, a reply will be returned with
type ERROR, and only req_id and tx_id copied from the request.

A caller who sends several requests may receive the replies in any
order and must use req_id (and tx_id, if applicable) to match up
replies to requests.  (The current implementation always replies to
requests in the order received but this should not be relied on.)

The payload length (len field of the header) is limited to 4096
(XENSTORE_PAYLOAD_MAX) in both directions.  If a client exceeds the
limit, its xenstored connection will be immediately killed by
xenstored, which is usually catastrophic from the client's point of
view.  Clients (particularly domains, which cannot just reconnect)
should avoid this.

Existing clients do not always contain defences against overly long
payloads.  Increasing xenstored's limit is therefore difficult; it
would require negotiation with the client, and obviously would make
parts of xenstore inaccessible to some clients.  In any case passing
bulk data through xenstore is not recommended as the performance
properties are poor.
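
The framing and the payload limit described above can be sketched as
follows.  The sketch assumes that struct xsd_sockmsg consists of four
32-bit unsigned fields in the order type, req_id, tx_id, len, in host
byte order; this should be checked against io/xs_wire.h.  The helper
names pack_message and unpack_message are illustrative only.

```python
import struct

XENSTORE_PAYLOAD_MAX = 4096

# Assumed header layout: type, req_id, tx_id, len (four uint32_t values
# in host byte order, without padding).
HEADER = struct.Struct("=IIII")


def pack_message(msg_type: int, req_id: int, tx_id: int,
                 payload: bytes) -> bytes:
    """Build header plus payload, refusing payloads xenstored would reject."""
    if len(payload) > XENSTORE_PAYLOAD_MAX:
        # Sending this would get the connection killed, so fail locally.
        raise ValueError("payload exceeds XENSTORE_PAYLOAD_MAX")
    return HEADER.pack(msg_type, req_id, tx_id, len(payload)) + payload


def unpack_message(data: bytes):
    """Split one complete message into (type, req_id, tx_id, payload)."""
    msg_type, req_id, tx_id, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return msg_type, req_id, tx_id, payload
```

Since replies may arrive in any order, a caller would typically keep a
table of outstanding requests keyed by req_id and look up each reply
returned by unpack_message in that table.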

---------- Defined Xenstore message types ----------

Below is a table with all defined Xenstore message types (type name
and its associated numerical value).

Support for some types is optional for a Xenstore implementation.  If
an optional type is not supported by a Xenstore implementation, Xen
tools will continue to work, maybe with slightly reduced
functionality.  A mandatory type not being supported will result in
severely reduced functionality, like inability to create domains.  If
a type is optional, this is stated in the table, together with the
functionality lost if Xenstore doesn't support that type.  Any
unsupported type sent to Xenstore will result in an error response
with the "ENOSYS" error.

CONTROL               0    optional
    If not supported, xenstore-control command will not work.
    DEBUG is a deprecated alias of CONTROL.
DIRECTORY             1
READ                  2
GET_PERMS             3
WATCH                 4
UNWATCH               5
TRANSACTION_START     6
TRANSACTION_END       7
INTRODUCE             8
RELEASE               9
GET_DOMAIN_PATH      10
WRITE                11
MKDIR                12
RM                   13
SET_PERMS            14
WATCH_EVENT          15
    Not valid in client sent messages.
    Only valid in Xenstore replies.
ERROR                16
    Not valid in client sent messages.
    Only valid in Xenstore replies.
IS_DOMAIN_INTRODUCED 17
RESUME               18
SET_TARGET           19
RESTRICT             20    no longer supported
    RESTRICT has been removed, the type value 20 is invalid.
RESET_WATCHES        21
DIRECTORY_PART       22    optional
    If not supported, the output of xenstore-ls might be incomplete
    with a node's sub-node list exceeding the maximum payload size
    (e.g. the "/local/domain" node with more than ca. 1000 domains
    active).
GET_FEATURE          23    optional
SET_FEATURE          24    optional
    SET_FEATURE requires GET_FEATURE to be supported.
    If unsupported, setting availability of Xenstore features per
    domain is not possible.
GET_QUOTA            25    optional
SET_QUOTA            26    optional
    SET_QUOTA requires GET_QUOTA to be supported.
    If unsupported, setting of Xenstore quota per domain is not
    possible.
INVALID           65535
    Guaranteed invalid type (never supported).
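
For client code it can be convenient to have the table above as an
enumeration.  The following sketch transcribes the values from the
table; the class name XsType is invented here.  The removed RESTRICT
type (20) is deliberately left out so that it is treated as invalid.

```python
from enum import IntEnum


class XsType(IntEnum):
    """Message type values transcribed from the table above."""
    CONTROL = 0
    DEBUG = 0                 # deprecated alias of CONTROL
    DIRECTORY = 1
    READ = 2
    GET_PERMS = 3
    WATCH = 4
    UNWATCH = 5
    TRANSACTION_START = 6
    TRANSACTION_END = 7
    INTRODUCE = 8
    RELEASE = 9
    GET_DOMAIN_PATH = 10
    WRITE = 11
    MKDIR = 12
    RM = 13
    SET_PERMS = 14
    WATCH_EVENT = 15          # only valid in Xenstore replies
    ERROR = 16                # only valid in Xenstore replies
    IS_DOMAIN_INTRODUCED = 17
    RESUME = 18
    SET_TARGET = 19
    RESET_WATCHES = 21
    DIRECTORY_PART = 22       # optional
    GET_FEATURE = 23          # optional
    SET_FEATURE = 24          # optional
    GET_QUOTA = 25            # optional
    SET_QUOTA = 26            # optional
    INVALID = 65535           # guaranteed invalid
```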

---------- Xenstore protocol details - introduction ----------

The payload syntax and semantics of the requests and replies are
described below.  In the payload syntax specifications we use the
following notations:

 |		A nul (zero) byte.
 <foo>		A string guaranteed not to contain any nul bytes.
 <foo|>		Binary data (which may contain zero or more nul bytes)
 <foo>|*	Zero or more strings each followed by a trailing nul
 <foo>|+	One or more strings each followed by a trailing nul
 ?		Reserved value (may not contain nuls)
 ??		Reserved value (may contain nuls)

Except as otherwise noted, reserved values are believed to be sent as
empty strings by all current clients.  Clients should not send
nonempty strings for reserved values; those parts of the protocol may
be used for extension in the future.


Error replies are as follows:

ERROR						E<something>|
	Where E<something> is the name of an errno value
	listed in io/xs_wire.h.  Note that the string name
	is transmitted, not a numeric value.


Where no reply payload format is specified below, success responses
have the following payload:
						OK|
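
A client would usually turn an ERROR reply into an exception carrying
the transmitted errno name.  The following is a minimal sketch; the
names XenstoreError and check_reply are hypothetical.

```python
ERROR = 16  # message type value of ERROR replies


class XenstoreError(Exception):
    """Carries the errno name sent in an ERROR reply, e.g. "ENOENT"."""


def check_reply(msg_type: int, payload: bytes) -> bytes:
    """Return the payload of a successful reply, raise on ERROR."""
    if msg_type == ERROR:
        # The payload is E<something>| : a name followed by a nul byte.
        name = payload.rstrip(b"\0").decode("ascii")
        raise XenstoreError(name)
    return payload
```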

Values commonly included in payloads include:

    <path>
	Specifies a path in the hierarchical key structure.
	If <path> starts with a / it simply represents that path.

	<path> is allowed not to start with /, in which case the
	caller must be a domain (rather than connected via a socket)
	and the path is taken to be relative to /local/domain/<domid>
	(eg, `x/y' sent by domain 3 would mean `/local/domain/3/x/y').

    <domid>
	Integer domid, represented as decimal number 0..65535.
	Parsing errors and values out of range generally go
	undetected.  The special DOMID_... values (see xen.h) are
	represented as integers; unless otherwise specified it
	is an error not to specify a real domain id.
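
The resolution of relative paths described above amounts to the
following (a sketch; the function name absolute_path is illustrative):

```python
def absolute_path(path: str, domid: int) -> str:
    """Resolve <path> as sent by domain `domid` to an absolute path."""
    if path.startswith("/"):
        return path                          # already absolute
    return f"/local/domain/{domid}/{path}"   # relative to the domain's base
```

For example, absolute_path("x/y", 3) gives "/local/domain/3/x/y", as in
the example in the text.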



The following are the actual type values, including the request and
reply payloads as applicable:


---------- Database read, write and permissions operations ----------

READ			<path>|			<value|>
WRITE			<path>|<value|>
	Store and read the octet string <value> at <path>.
	WRITE creates any missing parent paths, with empty values.
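
As an illustration of the payload syntax, the request payloads for
READ and WRITE could be built as below.  The helper names are
invented for this sketch.  Note that the value is appended as is,
without a trailing nul.

```python
def read_payload(path: str) -> bytes:
    """READ request payload: <path>|"""
    return path.encode("ascii") + b"\0"


def write_payload(path: str, value: bytes) -> bytes:
    """WRITE request payload: <path>|<value|> (no nul after the value)."""
    return path.encode("ascii") + b"\0" + value
```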

MKDIR			<path>|
	Ensures that the <path> exists, if necessary by creating
	it and any missing parents with empty values.  If <path>
	or any parent already exists, its value is left unchanged.

RM			<path>|
	Ensures that the <path> does not exist, by deleting
	it and all of its children.  It is not an error if <path> does
	not exist, but it _is_ an error if <path>'s immediate parent
	does not exist either.

DIRECTORY		<path>|			<child-leaf-name>|*
	Gives a list of the immediate children of <path>, as only the
	leafnames.  The resulting children are each named
	<path>/<child-leaf-name>.

DIRECTORY_PART		<path>|<offset>		<gencnt>|<child-leaf-name>|*
	Same as DIRECTORY, but to be used for children lists longer than
	XENSTORE_PAYLOAD_MAX. Inputs are <path> and the byte offset into
	the list of children to return. Return values are the generation
	count <gencnt> of the node (to be used to ensure the node hasn't
	changed between two reads: <gencnt> being the same for multiple
	reads guarantees the node hasn't changed) and the list of children
	starting at the specified <offset> of the complete list.
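
The reply payloads of DIRECTORY and DIRECTORY_PART can be split as in
the following sketch.  It follows only the syntax given above; in
particular <gencnt> is kept as an opaque string to be compared for
equality, and how the end of a partial listing is signalled is not
covered here.  The function names are illustrative.

```python
def parse_directory(payload: bytes) -> list:
    """Split <child-leaf-name>|* into a list of leaf names."""
    if not payload:
        return []                       # no children at all
    # Every name is followed by a nul, so the last split item is empty.
    return [name.decode("ascii") for name in payload.split(b"\0")[:-1]]


def parse_directory_part(payload: bytes):
    """Split <gencnt>|<child-leaf-name>|* into (gencnt, names)."""
    gencnt, _, rest = payload.partition(b"\0")
    return gencnt.decode("ascii"), parse_directory(rest)
```

A caller reading a long list in several parts would compare the
<gencnt> of each part with that of the first one and start over if
they differ.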

GET_PERMS		<path>|			<perm-as-string>|+
SET_PERMS		<path>|<perm-as-string>|+?
	<perm-as-string> is one of the following:
		w<domid>	write only
		r<domid>	read only
		b<domid>	both read and write
		n<domid>	no access
	See https://wiki.xen.org/wiki/XenBus section
	`Permissions' for details of the permissions system.
	It is possible to set permissions for the special watch paths
	"@introduceDomain" and "@releaseDomain" to enable receiving those
	watches in unprivileged domains.
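
The following sketch parses a list of <perm-as-string> entries and
evaluates it according to the rules given earlier in this document
(first entry names the owner and the default access of all other
domains).  The function names are hypothetical, and any extra
privileges of the control domain are ignored for simplicity.

```python
def parse_perms(payload: bytes) -> list:
    """Turn <perm-as-string>|+ into a list of (letter, domid) tuples."""
    entries = []
    for item in payload.split(b"\0"):
        if not item:
            continue                      # skip the empty trailing field
        text = item.decode("ascii")
        if text[0] not in "wrbn":
            raise ValueError("bad permission letter: " + text[0])
        entries.append((text[0], int(text[1:])))
    return entries


def access_for(entries: list, domid: int) -> str:
    """Return 'w', 'r', 'b' or 'n' for `domid` (control domain ignored)."""
    default_letter, owner = entries[0]
    if domid == owner:
        return "b"                        # the owner always has full access
    for letter, entry_domid in entries[1:]:
        if entry_domid == domid:
            return letter                 # dedicated entry for this domain
    return default_letter                 # access for all other domains
```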

---------- Watches ----------

WATCH			<wpath>|<token>|[<depth>|]?
	Adds a watch.

	When a <path> is modified (including path creation, removal,
	contents change or permissions change) this generates an event
	on the changed <path>.  Changes made in transactions cause an
	event only if and when committed.  Each occurring event is
	matched against all the watches currently set up, and each
	matching watch results in a WATCH_EVENT message (see below).

	The event's path matches the watch's <wpath> if it is a child
	of <wpath>. This match can be limited by specifying <depth> (a
	decimal value of 0 or larger): it denotes the directory levels
	below <wpath> to consider for a match ("0" would not match for
	a child of <wpath>, "1" would match only for a direct child,
	etc.).
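
The matching rule for ordinary (non-special) watches might be
sketched as below.  This is one reading of the text above: it assumes
that an event on <wpath> itself matches for every <depth>, which the
text does not state explicitly.  The function name watch_matches is
invented here and both paths are taken to be absolute.

```python
def watch_matches(wpath: str, epath: str, depth=None) -> bool:
    """Return True if an event on `epath` matches a watch on `wpath`."""
    if epath == wpath:
        return True                       # assumed to match at any depth
    if wpath == "/":
        rest = epath[1:]                  # everything below the root
    elif epath.startswith(wpath + "/"):
        rest = epath[len(wpath) + 1:]     # part of epath below wpath
    else:
        return False                      # epath is not a child of wpath
    levels = rest.count("/") + 1          # directory levels below wpath
    return depth is None or levels <= depth
```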

	<wpath> can be a <path> to watch or @<wspecial>.  In the
	latter case <wspecial> may have any syntax but it matches
	(according to the rules above) only the following special
	events which are invented by xenstored:
	    @introduceDomain	occurs on INTRODUCE
	    @releaseDomain 	occurs on any domain crash or
				shutdown, and also on RELEASE
				and domain destruction
	<wspecial> events are sent only to privileged callers or to
	domains explicitly enabled via SET_PERMS.  The semantics for a
	specification of <depth> differ for generating <wspecial>
	events: specifying "1" will report the related domid by using
	@<wspecial>/<domid> for the reported path. Other <depth>
	values are not supported.
	For @releaseDomain it is possible to watch only for a specific
	domain by specifying @releaseDomain/<domid> for the path.

	When a watch is first set up it is triggered once straight
	away, with <path> equal to <wpath>.  Watches may be triggered
	spuriously.  The tx_id in a WATCH request is ignored.

	Watches are supposed to be restricted by the permissions
	system but in practice the implementation is imperfect.
	Applications should not rely on being sent a notification for
	paths that they cannot read; however, an application may rely
	on being sent a watch when a path which it _is_ able to read
	is deleted even if that leaves only a nonexistent unreadable
	parent.  A notification may be omitted if a node's permissions
	are changed so as to make it unreadable, in which case future
	notifications may be suppressed (and if the node is later made
	readable, some notifications may have been lost).

WATCH_EVENT					<epath>|<token>|
	Unsolicited `reply' generated for matching modification events
	as described above.  req_id and tx_id are both 0.

	<epath> is the event's path, ie the actual path that was
	modified; however if the event was the recursive removal of a
	parent of <wpath>, <epath> is just <wpath> (rather than the
	actual path which was removed).  So <epath> is a child of
	<wpath>, regardless.

	Iff <wpath> for the watch was specified as a relative pathname,
	the <epath> path will also be relative (with the same base,
	obviously).
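
Splitting a WATCH_EVENT payload into its two fields can be sketched as
follows (the function name is illustrative; the token is returned as
bytes since its contents are chosen by the client):

```python
def parse_watch_event(payload: bytes):
    """Split <epath>|<token>| into (epath, token)."""
    # Two nul-terminated fields; anything after the second nul is ignored.
    epath, token, _rest = payload.split(b"\0", 2)
    return epath.decode("ascii"), token
```

A client would normally use the token, not <epath>, to find out which
of its watches fired, since several watches can match the same event.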

UNWATCH			<wpath>|<token>|?

RESET_WATCHES		|
	Reset all watches and transactions of the caller.

---------- Transactions ----------

TRANSACTION_START	|			<transid>|
	<transid> is an opaque uint32_t allocated by xenstored
	represented as unsigned decimal.  After this, the transaction
	may be referenced by using <transid> (as 32-bit binary) in the
	tx_id request header field.  When a transaction is started the
	whole db is copied; reads and writes happen on the copy.
	It is not legal to send non-0 tx_id in TRANSACTION_START.

TRANSACTION_END		T|
TRANSACTION_END		F|
	tx_id must refer to an existing transaction.  After this
	request the tx_id is no longer valid and may be reused by
	xenstore.  If F, the transaction is discarded.  If T,
	it is committed: if there were any other intervening writes
	then our END gets EAGAIN.

	The plan is that in the future only intervening `conflicting'
	writes cause EAGAIN, meaning only writes or other commits
	which changed paths which were read or written in the
	transaction at hand.
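
Because a commit can fail with EAGAIN, callers normally wrap a
transaction in a retry loop.  The sketch below assumes a hypothetical
client object with methods transaction_start() returning a tx_id and
transaction_end(tx_id, commit) raising XenstoreError with the errno
name on failure; none of these names belong to an existing library.

```python
class XenstoreError(Exception):
    """Carries the errno name from an ERROR reply, e.g. "EAGAIN"."""


def run_transaction(client, body, max_attempts=10):
    """Run body(client, tx_id) in a transaction, retrying on EAGAIN."""
    for _ in range(max_attempts):
        tx_id = client.transaction_start()
        try:
            result = body(client, tx_id)
        except Exception:
            # Discard the transaction (TRANSACTION_END F|) and give up.
            client.transaction_end(tx_id, commit=False)
            raise
        try:
            # Commit (TRANSACTION_END T|); the tx_id is invalid afterwards.
            client.transaction_end(tx_id, commit=True)
            return result
        except XenstoreError as err:
            if str(err) != "EAGAIN":
                raise
            # Intervening write: start over with a fresh transaction.
    raise XenstoreError("EAGAIN")
```

The body is re-run from scratch on every attempt, so it should only
depend on values it reads inside the transaction.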

---------- Domain management and xenstored communications ----------

INTRODUCE		<domid>|<gfn>|<evtchn>|?
	Notifies xenstored to communicate with this domain.

	INTRODUCE is currently only used by xen tools (during domain
	startup and various forms of restore and resume), and
	xenstored prevents its use other than by dom0.

	<domid> must be a real domain id (not 0 and not a special
	DOMID_... value).  <gfn> must be a page in that domain
	represented in signed decimal (!).  <evtchn> is an unbound
	event channel in <domid> (likewise in decimal), on which
	xenstored will call bind_interdomain.

	Violations of these rules may result in undefined behaviour;
	for example passing a high-bit-set 32-bit gfn as an unsigned
	decimal will attempt to use 0x7fffffff instead (!).

	The <gfn> field is used by xenstoreds which use foreign
	mapping to access the ring page.

	Alternatively, Grant 1 (GNTTAB_RESERVED_XENSTORE) is reserved
	for the same purpose, and is populated by the domain builder
	on behalf of the guest.  This mechanism is preferred because
	it reduces the permissions that xenstored needs in order to
	function.

	Both <gfn> and Grant 1 need to agree, because implementations
	of xenstored will use one and ignore the other.

RELEASE			<domid>|
	Manually requests that xenstored disconnect from the domain.
	The event channel is unbound at the xenstored end and the page
	unmapped.  If the domain is still running it won't be able to
	communicate with xenstored.  NB that xenstored will in any
	case detect domain destruction and disconnect by itself.
	xenstored prevents the use of RELEASE other than by dom0.

GET_DOMAIN_PATH		<domid>|		<path>|
	Returns the domain's base path, as is used for relative
	paths: ie, /local/domain/<domid> (with <domid>
	normalised).  The answer will be useless unless <domid> is a
	real domain id.

IS_DOMAIN_INTRODUCED	<domid>|		T| or F|
	Returns T if xenstored is in communication with the domain:
	ie, if INTRODUCE for the domain has not yet been followed by
	domain destruction or explicit RELEASE.

RESUME			<domid>|

	Arranges that @releaseDomain events will once more be
	generated when the domain becomes shut down.  This might have
	to be used if a domain were to be shut down (generating one
	@releaseDomain) and then subsequently restarted, since the
	state-sensitive algorithm in xenstored will not otherwise send
	further watch event notifications if the domain were to be
	shut down again.

	This command will be issued in places such as resume, because
	Xen will "shutdown" the domain on suspend.

	xenstored prevents the use of RESUME other than by dom0.


SET_TARGET		<domid>|<tdomid>|
	Notifies xenstored that domain <domid> is targeting domain
	<tdomid>. This grants domain <domid> full access to paths
	owned by <tdomid>. Domain <domid> also inherits all
	permissions granted to <tdomid> on all other paths. This
	allows <domid> to behave as if it were dom0 when modifying
	paths related to <tdomid>.

	xenstored prevents the use of SET_TARGET other than by dom0.

GET_FEATURE		[<domid>|]		<value>|
SET_FEATURE		<domid>|<value>|
	Returns or sets the contents of the "feature" field copied to
	offset 2064 of the Xenstore ring page of the domain specified by
	<domid>. <value> is a decimal number being the logical OR of the
	feature bits as defined in docs/misc/xenstore-ring.txt. Trying
	to set a bit for a feature not being supported by the running
	Xenstore will be denied. Providing no <domid> with the
	GET_FEATURE command will return the features which are supported
	by Xenstore.

	SET_FEATURE for a domain will be rejected after the INTRODUCE
	command for this domain has been sent to xenstored.

	xenstored prevents the use of GET_FEATURE and SET_FEATURE other
	than by dom0.

GET_QUOTA		[[<domid>|]<quota>|]	<value>|
SET_QUOTA		[<domid>|]<quota>|<value>|
	Returns or sets a quota value for the domain specified by
	<domid>. Omitting <domid> will return or set the global quota
	values, which are the default values for new domains. <quota> is
	one of "nodes", "watches", "transactions", "node-size",
	"permissions", or any other implementation defined value. For
	GET_QUOTA it is possible to omit the <quota> parameter together
	with the <domid> parameter, which will return a single string of
	all supported <quota> values separated by blanks. <value> is a
	decimal number specifying the quota value, with "0" having the
	special meaning of quota checks being disabled. The initial quota
	settings for a domain are the global ones of Xenstore.

	xenstored prevents the use of GET_QUOTA and SET_QUOTA other
	than by dom0.

---------- Miscellaneous ----------

CONTROL			<command>|[<parameters>|]
	Send a control command <command> with optional parameters
	(<parameters>) to Xenstore daemon.

	The set of commands and their semantics is implementation
	specific and is likely to change from one Xen version to the
	next.  Out-of-tree users will encounter compatibility issues.

	Current commands are:
	check
		checks xenstored innards
	live-update|<params>|+
		perform a live-update of the Xenstore daemon, only to
		be used via xenstore-control command.
		<params> are implementation specific and are used for
		different steps of the live-update processing. Currently
		supported <params> are:
		-f <file>  specify new daemon binary
		-b <size>  specify size of new stubdom binary
		-d <chunk-size> <binary-chunk>  transfer chunk of new
			stubdom binary
		-c <pars>  specify new command line to use
		-s [-t <sec>] [-F]  start live update process (-t specifies
			timeout in seconds to wait for active transactions
			to finish, default is 60 seconds; -F will force
			live update to happen even with running transactions
			after timeout elapsed)
		-a  abort live update handling
		All sub-options will return "OK" in case of success or an
		error string in case of failure. -s can return "BUSY" in case
		of an active transaction, a retry of -s can be done in that
		case.
	log|[on|off|+<switch>|-<switch>]
		without parameters: show possible log switches
		on: turn xenstore logging on
		off: turn xenstore logging off
		+<switch>: activates log entries for <switch>,
		-<switch>: deactivates log entries for <switch>
	logfile|<file-name>
		log to specified file
	memreport|[<file-name>]
		print memory statistics to logfile (no <file-name>
		specified) or to specific file
	print|<string>
		print <string> to syslog (xenstore runs as daemon) or
		to console (xenstore runs as stubdom)
	quota|[set <name> <val>|<domid>|max [-r]]
		without parameters: print the current quota settings
		with "set <name> <val>": set the quota <name> to new value
		<val> (The admin should make sure all the domain usage is
		below the quota. If it is not, then Xenstored may continue to
		handle requests from the domain as long as the resource
		violating the new quota setting isn't increased further)
		with "<domid>": print quota related accounting data for
		the domain <domid>
		with "max [-r]": show global per-domain maximum values of all
		unprivileged domains, optionally reset the values by adding
		"-r"
	quota-soft|[set <name> <val>]
		like the "quota" command, but for soft-quota.
	help			<supported-commands>
		return list of supported commands for CONTROL

DEBUG
	Deprecated, now named CONTROL