********************************************************************************
 A Rough Introduction to Using Grant Tables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Christopher Clark, March, 2005.

Grant tables are a mechanism for sharing and transferring frames between
domains, without requiring the participating domains to be privileged.

The first mode of use allows domA to grant domB access to a specific frame,
whilst retaining ownership. The block front driver uses this to grant memory
access to the block back driver, so that it may read or write as requested.

 1. domA creates a grant access reference, and transmits the ref id to domB.
 2. domB uses the reference to map the granted frame.
 3. domB performs the memory access.
 4. domB unmaps the granted frame.
 5. domA removes its grant.


The second mode allows domA to accept a transfer of ownership of a frame from
domB. The net front and back driver will use this for packet tx/rx. This
mechanism is still being implemented, though the xen<->guest interface design
is complete.

 1. domA creates an accept transfer grant reference, and transmits it to domB.
 2. domB uses the ref to hand over a frame it owns.
 3. domA accepts the transfer.
 4. domA clears the used reference.


********************************************************************************
 Data structures
 ~~~~~~~~~~~~~~~

 The following data structures are used by Xen and the guests to implement
 grant tables:

 1. Shared grant entries
 2. Active grant entries
 3. Map tracking

 These are not the user's primary interface to grant tables, but are discussed
 because an understanding of how they work may be useful. Each of these is a
 finite resource.

 Shared grant entries
 ~~~~~~~~~~~~~~~~~~~~

 A set of pages are shared between Xen and a guest, holding the shared grant
 entries. The guest writes into these entries to create grant references.
 The index of the entry is transmitted to the remote domain: this is the
 reference used to activate an entry. Xen will write into a shared entry to
 indicate to a guest that its grant is in use.
  sha->domid : remote domain being granted rights
  sha->frame : machine frame being granted
  sha->flags : allow access, allow transfer, remote is reading/writing, etc.

 Active grant entries
 ~~~~~~~~~~~~~~~~~~~~

 Xen maintains a set of private frames per domain, holding the active grant
 entries for safety, and to reference count mappings.
  act->domid : remote domain being granted rights
  act->frame : machine frame being granted
  act->pin   : used to hold reference counts
  act->lock  : spinlock used to serialize access to active entry state

 Map tracking
 ~~~~~~~~~~~~

 Every time a frame is mapped, a map track entry is stored in the metadata of
 the mapping domain. The index of this entry is returned from the map call,
 and is used to unmap the frame. Map track entries are also searched whenever
 a page table entry containing a foreign frame number is overwritten: the
 first matching map track entry is then removed, as if unmap had been invoked.
 These are not used by the transfer mechanism.
  map->domid : owner of the mapped frame
  map->ref   : grant reference
  map->flags : ro/rw, mapped for host or device access

********************************************************************************
 Locking
 ~~~~~~~
 Xen uses several locks to serialize access to the internal grant table state.

  grant_table->lock          : rwlock used to prevent readers from accessing
                               inconsistent grant table state such as current
                               version, partially initialized active table
                               pages, etc.
  grant_table->maptrack_lock : spinlock used to protect the maptrack limit
  v->maptrack_freelist_lock  : spinlock used to protect the maptrack free list
  active_grant_entry->lock   : spinlock used to serialize modifications to
                               active entries

 The primary lock for the grant table is a read/write spinlock. All
 functions that access members of struct grant_table must acquire a
 read lock around critical sections. Any modification to the members
 of struct grant_table (e.g., nr_status_frames, nr_grant_frames,
 active frames, etc.) must only be made if the write lock is
 held. These elements are read-mostly, and read critical sections can
 be large, which makes a rwlock a good choice.

 The maptrack free list is protected by its own spinlock. The maptrack
 lock may be locked while holding the grant table lock.

 The maptrack_freelist_lock is an innermost lock. It may be locked
 while holding other locks, but no other locks may be acquired within
 it.

 Active entries are obtained by calling active_entry_acquire(gt, ref).
 This function returns a pointer to the active entry after locking its
 spinlock. The caller must hold the grant table read lock before
 calling active_entry_acquire(). This is because the grant table can
 be dynamically extended via gnttab_grow_table() while a domain is
 running and must be fully initialized. Once all access to the active
 entry is complete, release the lock by calling active_entry_release(act).

 Summary of rules for locking:
  active_entry_acquire() and active_entry_release() can only be
  called when holding the relevant grant table's read lock. I.e.:
   read_lock(&gt->lock);
   act = active_entry_acquire(gt, ref);
   ...
   active_entry_release(act);
   read_unlock(&gt->lock);

 Active entries cannot be acquired while holding the maptrack lock.
 Multiple active entries can be acquired while holding the grant table
 _write_ lock.

 Maptrack entries are protected by the corresponding active entry
 lock. As an exception, new maptrack entries may be populated without
 holding the lock, provided the flags field is written last. This
 requires that any maptrack entry user validate the flags field as
 non-zero before using the entry.

********************************************************************************

 Granting a foreign domain access to frames
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 domA [frame]--> domB


 domA: #include <asm-xen/gnttab.h>
       grant_ref_t gref[BATCH_SIZE];

       for ( i = 0; i < BATCH_SIZE; i++ )
           gref[i] = gnttab_grant_foreign_access( domBid, mfn,
                                                  (readonly ? 1 : 0) );


 .. gref is then somehow transmitted to domB for use.


 Mapping foreign frames
 ~~~~~~~~~~~~~~~~~~~~~~

 domB: #include <asm-xen/hypervisor.h>
       unsigned long mmap_vstart;
       gnttab_op_t aop[BATCH_SIZE];
       grant_ref_t mapped_handle[BATCH_SIZE];

       if ( (mmap_vstart = allocate_empty_lowmem_region(BATCH_SIZE)) == 0 )
           BUG();

       for ( i = 0; i < BATCH_SIZE; i++ )
       {
           aop[i].u.map_grant_ref.host_virt_addr =
               mmap_vstart + (i * PAGE_SIZE);
           aop[i].u.map_grant_ref.dom   = domAid;
           aop[i].u.map_grant_ref.ref   = gref[i];
           aop[i].u.map_grant_ref.flags = ( GNTMAP_host_map | GNTMAP_readonly );
       }

       if ( unlikely(HYPERVISOR_grant_table_op(
                         GNTTABOP_map_grant_ref, aop, BATCH_SIZE)) )
           BUG();

       for ( i = 0; i < BATCH_SIZE; i++ )
       {
           if ( unlikely(aop[i].u.map_grant_ref.handle < 0) )
           {
               tidyup_all(aop, i);
               goto panic;
           }

           phys_to_machine_mapping[
               __pa(mmap_vstart + (i * PAGE_SIZE)) >> PAGE_SHIFT] =
                   FOREIGN_FRAME(aop[i].u.map_grant_ref.dev_bus_addr);

           mapped_handle[i] = aop[i].u.map_grant_ref.handle;
       }



 Unmapping foreign frames
 ~~~~~~~~~~~~~~~~~~~~~~~~

 domB:
       for ( i = 0; i < BATCH_SIZE; i++ )
       {
           aop[i].u.unmap_grant_ref.host_virt_addr =
               mmap_vstart + (i * PAGE_SIZE);
           aop[i].u.unmap_grant_ref.dev_bus_addr = 0;
           aop[i].u.unmap_grant_ref.handle = mapped_handle[i];
       }
       if ( unlikely(HYPERVISOR_grant_table_op(
                         GNTTABOP_unmap_grant_ref, aop, BATCH_SIZE)) )
           BUG();


 Ending foreign access
 ~~~~~~~~~~~~~~~~~~~~~

 Note that this only prevents further mappings; it does _not_ revoke access.
 It should _only_ be used when the remote domain has unmapped the frame.
 gnttab_query_foreign_access( gref ) will indicate the state of any mapping.

 domA:
       if ( gnttab_query_foreign_access( gref[i] ) == 0 )
           gnttab_end_foreign_access( gref[i], readonly );

 TODO: readonly yet to be implemented.


********************************************************************************

 Transferring ownership of a frame to another domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 [ XXX: Transfer mechanism is alpha-calibre code, untested, use at own risk XXX ]
 [ XXX: show use of batch operations below, rather than single frame XXX ]
 [ XXX: linux internal interface could/should be wrapped to be tidier XXX ]


 Prepare to accept a frame from a foreign domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 domA:
       if ( (p = alloc_page(GFP_HIGHUSER)) == NULL )
       {
           printk("Cannot alloc a frame to surrender\n");
           break;
       }
       pfn = p - mem_map;
       mfn = phys_to_machine_mapping[pfn];

       if ( !PageHighMem(p) )
       {
           v = phys_to_virt(pfn << PAGE_SHIFT);
           scrub_pages(v, 1);
           queue_l1_entry_update(get_ptep((unsigned long)v), 0);
       }

       /* Ensure that ballooned highmem pages don't have cached mappings. */
       kmap_flush_unused();

       /* Flush updates through and flush the TLB.
        */
       xen_tlb_flush();

       phys_to_machine_mapping[pfn] = INVALID_P2M_ENTRY;

       if ( HYPERVISOR_dom_mem_op(
                MEMOP_decrease_reservation, &mfn, 1, 0) != 1 )
       {
           printk("MEMOP_decrease_reservation failed\n");
           /* er... ok. free the page then */
           __free_page(p);
           break;
       }

       accepting_pfn = pfn;
       ref = gnttab_grant_foreign_transfer( (domid_t) args.arg[0], pfn );
       printk("Accepting dom %lu frame at ref (%d)\n", args.arg[0], ref);


 Transfer a frame to a foreign domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 domB:
       mmu_update_t update;
       domid_t domid;
       grant_ref_t gref;
       unsigned long pfn, mfn, *v;
       struct page *transfer_page = 0;

       /* Alloc a page and grant access.
        * alloc_page() returns a page struct. */
       if ( (transfer_page = alloc_page(GFP_HIGHUSER)) == NULL )
           return -ENOMEM;

       pfn = transfer_page - mem_map;
       mfn = phys_to_machine_mapping[pfn];

       /* Need to remove all references to this page. */
       if ( !PageHighMem(transfer_page) )
       {
           v = phys_to_virt(pfn << PAGE_SHIFT);
           scrub_pages(v, 1);
           sprintf((char *)v, "This page (%lx) was transferred.\n", mfn);
           queue_l1_entry_update(get_ptep((unsigned long)v), 0);
       }
#ifdef CONFIG_XEN_SCRUB_PAGES
       else
       {
           v = kmap(transfer_page);
           scrub_pages(v, 1);
           sprintf((char *)v, "This page (%lx) was transferred.\n", mfn);
           kunmap(transfer_page);
       }
#endif
       /* Delete any cached kmappings. */
       kmap_flush_unused();

       /* Flush updates through and flush the TLB. */
       xen_tlb_flush();

       /* Invalidate in P2M. */
       phys_to_machine_mapping[pfn] = INVALID_P2M_ENTRY;

       domid = (domid_t)args.arg[0];
       gref  = (grant_ref_t)args.arg[1];

       update.ptr  = MMU_EXTENDED_COMMAND;
       update.ptr |= ((gref & 0x00FF) << 2);
       update.ptr |= mfn << PAGE_SHIFT;

       update.val  = MMUEXT_TRANSFER_PAGE;
       update.val |= (domid << 16);
       update.val |= (gref & 0xFF00);

       ret = HYPERVISOR_mmu_update(&update, 1, NULL);


 Map a transferred frame
 ~~~~~~~~~~~~~~~~~~~~~~~

 TODO:


 Clear the used transfer reference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 TODO:


********************************************************************************

 Using a private reserve of grant references
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Where it is known in advance how many grant references are required, and
failure to allocate them on demand would cause difficulty, a batch can be
allocated and held in a private reserve.

To reserve a private batch:

    /* Housekeeping data - treat as opaque: */
    grant_ref_t gref_head, gref_terminal;

    if ( 0 > gnttab_alloc_grant_references( number_to_reserve,
                                            &gref_head, &gref_terminal ) )
        return -ENOSPC;


To release a batch back to the shared pool:

    gnttab_free_grant_references( number_reserved, gref_head );


To claim a reserved reference:

    ref = gnttab_claim_grant_reference( &gref_head, gref_terminal );


To release a claimed reference back to the reserve pool:

    gnttab_release_grant_reference( &gref_head, gref );


To use a claimed reference to grant access, use these alternative functions
that take an additional parameter of the grant reference to use:

    gnttab_grant_foreign_access_ref
    gnttab_grant_foreign_transfer_ref