<!--
(C) Copyright 2018 The Fuchsia Authors. All rights reserved.
Use of this source code is governed by a BSD-style license that can be
found in the LICENSE file.
-->

# RAMdisk Device

This document is part of the [Driver Development Kit tutorial](ddk-tutorial.md) documentation.

## Overview

In this section, we'll examine a simplified RAM-disk driver.

This driver introduces:

*   the block protocol's **query()** and **queue()** ops
*   Virtual Memory Address Regions ([VMAR](../objects/vm_address_region.md)s)
    and Virtual Memory Objects ([VMO](../objects/vm_object.md)s)

The source is in `//zircon/system/dev/sample/ramdisk/demo-ramdisk.c`.

As with all drivers, the first thing to look at is how the driver initializes itself:

```c
static zx_status_t ramdisk_driver_bind(void* ctx, zx_device_t* parent) {
    zx_status_t status = ZX_OK;

    // (1) create the device context block
    ramdisk_device_t* ramdev = calloc(1, sizeof(*ramdev));
    if (ramdev == NULL) {
        return ZX_ERR_NO_MEMORY;
    }

    // (2) create a VMO
    status = zx_vmo_create(RAMDISK_SIZE, 0, &ramdev->vmo);
    if (status != ZX_OK) {
        goto cleanup;
    }

    // (3) map the VMO into our address space
    status = zx_vmar_map(zx_vmar_root_self(), 0, ramdev->vmo, 0, RAMDISK_SIZE,
                         ZX_VM_FLAG_PERM_READ | ZX_VM_FLAG_PERM_WRITE, &ramdev->mapped_addr);
    if (status != ZX_OK) {
        goto cleanup;
    }

    // (4) add the device
    device_add_args_t args = {
        .version = DEVICE_ADD_ARGS_VERSION,
        .name = "demo-ramdisk",
        .ctx = ramdev,
        .ops = &ramdisk_proto,
        .proto_id = ZX_PROTOCOL_BLOCK_IMPL,
        .proto_ops = &block_ops,
    };

    if ((status = device_add(parent, &args, &ramdev->zxdev)) != ZX_OK) {
        ramdisk_release(ramdev);
    }
    return status;

    // (5) clean up after ourselves
cleanup:
    zx_handle_close(ramdev->vmo);
    free(ramdev);
    return status;
}

static zx_driver_ops_t ramdisk_driver_ops = {
    .version = DRIVER_OPS_VERSION,
    .bind = ramdisk_driver_bind,
};

ZIRCON_DRIVER_BEGIN(ramdisk, ramdisk_driver_ops, "zircon", "0.1", 1)
    BI_MATCH_IF(EQ, BIND_PROTOCOL, ZX_PROTOCOL_MISC_PARENT),
ZIRCON_DRIVER_END(ramdisk)
```

At the bottom, you can see that this driver binds to a `ZX_PROTOCOL_MISC_PARENT` type of
protocol, and provides `ramdisk_driver_ops` as the list of operations supported.
This is no different from any of the other drivers we've seen so far.

The binding function, **ramdisk_driver_bind()**, does the following:

1.  Allocates the device context block.
2.  Creates a [VMO](../objects/vm_object.md).
    The [VMO](../objects/vm_object.md)
    is a kernel object that represents a chunk of memory.
    In this simplified RAM-disk driver, we're creating a
    [VMO](../objects/vm_object.md) that's `RAMDISK_SIZE`
    bytes long.
    This chunk of memory **is** the RAM-disk; that's where the data is stored.
    The [VMO](../objects/vm_object.md)
    creation call, [**zx_vmo_create()**](../syscalls/vmo_create.md),
    returns the [VMO](../objects/vm_object.md) handle through
    its third argument, which is a member in our context block.
3.  Maps the [VMO](../objects/vm_object.md) into our address space via
    [**zx_vmar_map()**](../syscalls/vmar_map.md).
    This function returns the address of a mapping in our root
    [VMAR](../objects/vm_address_region.md)
    that covers the entire
    [VMO](../objects/vm_object.md) (because
    we specified `RAMDISK_SIZE` as the mapping size argument) and gives us read and
    write access (because of the `ZX_VM_FLAG_PERM_*` flags).
    The address is stored in our context block's `mapped_addr` member.
4.  Adds our device via **device_add()**,
    just like all the examples we've seen so far.
    The difference here, though, is that we see two new members: `proto_id` and
    `proto_ops`.
    These are defined as "optional custom protocol" members.
    As usual, we store the newly created device in the `zxdev` member of our
    context block.
5.  Cleans up resources if there were any problems along the way.

For completeness, here's the context block:

```c
typedef struct ramdisk_device {
    zx_device_t* zxdev;
    uintptr_t mapped_addr;
    uint32_t flags;
    zx_handle_t vmo;
    bool dead;
} ramdisk_device_t;
```

The fields are:

Type            | Field         | Description
----------------|---------------|----------------
`zx_device_t*`  | zxdev         | the ramdisk device
`uintptr_t`     | mapped_addr   | address of the mapping in our [VMAR](../objects/vm_address_region.md)
`uint32_t`      | flags         | device flags
`zx_handle_t`   | vmo           | a handle to our [VMO](../objects/vm_object.md)
`bool`          | dead          | set to `true` once the device is shutting down

### Operations

What makes this device different from the others that we've seen, though,
is that the **device_add()**
function adds two sets of operations: the "regular" one, and an
optional "protocol specific" one.

```c
static zx_protocol_device_t ramdisk_proto = {
    .version = DEVICE_OPS_VERSION,
    .ioctl = ramdisk_ioctl,
    .get_size = ramdisk_getsize,
    .unbind = ramdisk_unbind,
    .release = ramdisk_release,
};

static block_protocol_ops_t block_ops = {
    .query = ramdisk_query,
    .queue = ramdisk_queue,
};
```

The `zx_protocol_device_t` one handles control ops (**ramdisk_ioctl()**), device size
queries (**ramdisk_getsize()**), and device cleanups (**ramdisk_unbind()** and
**ramdisk_release()**).

> @@@ should I discuss the ioctls, or were they to have been removed as part of the simplification?

The `block_protocol_ops_t` one contains protocol operations particular to the
block protocol.
We bound these to the device in the `device_add_args_t` structure (step (4) above) via
the `.proto_ops` field.
We also set the `.proto_id` field to `ZX_PROTOCOL_BLOCK_IMPL`; this is what
identifies this driver as being able to handle block protocol operations.

Let's tackle the trivial functions first:

```c
static zx_off_t ramdisk_getsize(void* ctx) {
    return RAMDISK_SIZE;
}

static void ramdisk_unbind(void* ctx) {
    ramdisk_device_t* ramdev = ctx;
    ramdev->dead = true;
    device_remove(ramdev->zxdev);
}

static void ramdisk_release(void* ctx) {
    ramdisk_device_t* ramdev = ctx;

    if (ramdev->vmo != ZX_HANDLE_INVALID) {
        zx_vmar_unmap(zx_vmar_root_self(), ramdev->mapped_addr, RAMDISK_SIZE);
        zx_handle_close(ramdev->vmo);
    }
    free(ramdev);
}

static void ramdisk_query(void* ctx, block_info_t* bi, size_t* bopsz) {
    ramdisk_get_info(ctx, bi);
    *bopsz = sizeof(block_op_t);
}
```

**ramdisk_getsize()** is the easiest: it simply returns the size of the resource, in bytes.
In our simplified RAM-disk driver, this is hardcoded as a `#define` near the top of the file.

Next, **ramdisk_unbind()** and **ramdisk_release()** work together.
When the driver is being shut down, the **ramdisk_unbind()** hook is called.
It sets the `dead` flag to indicate that the driver is shutting down (this is checked
in the **ramdisk_queue()** handler, below).
The hook is expected to finish up any I/O operations that are in progress (there
won't be any in our RAM-disk), and then call
**device_remove()**
to remove the device from its parent.

After **device_remove()** is called,
the driver's **ramdisk_release()** will be called.
Here we remove the mapping from our [VMAR](../objects/vm_address_region.md),
via [**zx_vmar_unmap()**](../syscalls/vmar_unmap.md), and close the
[VMO](../objects/vm_object.md),
via [**zx_handle_close()**](../syscalls/handle_close.md).
As our final act, we release the device context block.
At this point, the device is finished.
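
To make the ordering of the two cleanup hooks concrete, here's a minimal sketch in portable C.
It is not the driver's code: `mock_ramdisk_t`, `mock_create()`, `mock_unbind()`, and
`mock_release()` are hypothetical names, a heap buffer stands in for the mapped VMO, and a
`removed` flag stands in for the call to **device_remove()**.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical stand-in for ramdisk_device_t. `storage` replaces the
// VMO handle and its mapping; `removed` replaces device_remove().
typedef struct mock_ramdisk {
    unsigned char* storage;
    size_t size;
    bool dead;
    bool removed;
} mock_ramdisk_t;

// Allocate the context block and its backing storage (both zeroed).
static mock_ramdisk_t* mock_create(size_t size) {
    mock_ramdisk_t* dev = calloc(1, sizeof(*dev));
    if (dev == NULL) {
        return NULL;
    }
    dev->storage = calloc(1, size);
    if (dev->storage == NULL) {
        free(dev);
        return NULL;
    }
    dev->size = size;
    return dev;
}

// Mirrors ramdisk_unbind(): mark the device dead first, so that no new
// requests are accepted, and only then ask for removal.
static void mock_unbind(mock_ramdisk_t* dev) {
    assert(dev != NULL);
    dev->dead = true;
    dev->removed = true;
}

// Mirrors ramdisk_release(): give back the storage, then the context
// block itself. Nothing may touch `dev` after this returns.
static void mock_release(mock_ramdisk_t* dev) {
    assert(dev != NULL);
    free(dev->storage);
    free(dev);
}
```

The point of the sketch is the order: the flag is set before removal is requested, and the
context block is the last thing to be freed.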

### Block Operations

The **ramdisk_query()** function is called by the block protocol in order to get
information about the device.
There's a data structure (the `block_info_t`) that's filled out by the driver:

```c
// from .../system/public/zircon/device/block.h:
typedef struct {
    uint64_t block_count;       // The number of blocks in this block device
    uint32_t block_size;        // The size of a single block
    uint32_t max_transfer_size; // Max size in bytes per transfer.
                                // May be BLOCK_MAX_TRANSFER_UNBOUNDED if there
                                // is no restriction.
    uint32_t flags;
    uint32_t reserved;
} block_info_t;

// our helper function
static void ramdisk_get_info(void* ctx, block_info_t* info) {
    ramdisk_device_t* ramdev = ctx;
    memset(info, 0, sizeof(*info));
    info->block_size = BLOCK_SIZE;
    info->block_count = BLOCK_COUNT;
    // Arbitrarily set, but matches the SATA driver for testing
    info->max_transfer_size = BLOCK_MAX_TRANSFER_UNBOUNDED;
    info->flags = ramdev->flags;
}
```

In this simplified driver, the `block_size`, `block_count`, and `max_transfer_size`
fields are hardcoded numbers.

The `flags` member identifies whether the device is read-only (`BLOCK_FLAG_READONLY`,
otherwise it's read/write), removable (`BLOCK_FLAG_REMOVABLE`, otherwise it's not
removable), or has a bootable partition (`BLOCK_FLAG_BOOTPART`, otherwise it doesn't).

The final value that **ramdisk_query()** returns is the "block operation size" value
through the pointer to `bopsz`.
This is the size of a host-maintained block that's big enough to contain the `block_op_t` *plus*
any additional data the driver wants (appended to the `block_op_t`), like an
extended context block.
Our driver needs no additional data, so it reports `sizeof(block_op_t)`.

### Reading and writing

Finally, it's time to discuss the actual "block" data transfers; that is, how does
data get read from and written to the device?

The second block protocol handler, **ramdisk_queue()**, performs that function.

As you might suspect from the name, it's intended that this hook starts whatever
transfer operation (a read or a write) is requested, but doesn't require that
the operation completes before the hook returns.
This is a little like what we saw in earlier chapters
in the **read()** and **write()** handlers
for devices like `/dev/misc/demo-fifo`. There, we could either return
data immediately, or put the client to sleep, waking it up later when data (or room
for data) became available.

**ramdisk_queue()** is passed a block operation structure that indicates
the expected operation: `BLOCK_OP_READ`, `BLOCK_OP_WRITE`, or `BLOCK_OP_FLUSH`.
The structure also contains additional fields telling us the offset and size of
the transfer (from `//zircon/system/ulib/ddk/include/ddk/protocol/block.h`):

```c
// simplified from original
struct block_op {
    struct {
        uint32_t command;       // command and flags
        uint32_t extra;         // available for temporary use
        zx_handle_t vmo;        // vmo of data to read or write
        uint32_t length;        // transfer length in blocks (0 is invalid)
        uint64_t offset_dev;    // device offset in blocks
        uint64_t offset_vmo;    // vmo offset in blocks
        uint64_t* pages;        // optional physical page list
    } rw;

    void (*completion_cb)(block_op_t* block, zx_status_t status);
};
```

The transfer takes place to or from the `vmo` in the structure. In the case of
a read, we transfer data to the [VMO](../objects/vm_object.md),
and vice versa for a write.
The `length` indicates the number of *blocks* (not bytes) to transfer, and the
two offset fields, `offset_dev` and `offset_vmo`, indicate the relative offsets (again,
in blocks, not bytes) into the device and the [VMO](../objects/vm_object.md)
of where the transfer should take place.
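
Because every field is expressed in blocks, the driver has to convert to bytes before it
touches memory, and it has to make sure that the requested range fits on the device.
Here's a small sketch of both calculations.
The names `DEMO_BLOCK_SIZE`, `DEMO_BLOCK_COUNT`, `demo_blocks_to_bytes()`, and
`demo_range_ok()` are made up for illustration, and the geometry is arbitrary; the driver
uses its own `BLOCK_SIZE` and `BLOCK_COUNT`.
Rejecting a zero `length` is an assumption taken from the "(0 is invalid)" comment in the
structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical geometry, chosen only for this example.
#define DEMO_BLOCK_SIZE 512u
#define DEMO_BLOCK_COUNT 1024u

// Convert a count (or offset) in blocks to bytes.
static uint64_t demo_blocks_to_bytes(uint64_t blocks) {
    return blocks * DEMO_BLOCK_SIZE;
}

// Returns true if the blocks [offset_dev, offset_dev + length) all lie
// on the device. The comparison subtracts instead of adding, so that
// `offset_dev + length` can never overflow.
static bool demo_range_ok(uint64_t offset_dev, uint32_t length) {
    assert(DEMO_BLOCK_COUNT > 0);
    if (length == 0) {
        return false;  // assumption: treat the invalid length as an error
    }
    if (offset_dev >= DEMO_BLOCK_COUNT) {
        return false;  // starts at or past the end of the device
    }
    return (DEMO_BLOCK_COUNT - offset_dev) >= length;
}
```

With this geometry, an offset of 3 blocks corresponds to byte 1536, and a one-block transfer
at block 1023 is the last one that fits.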

The implementation is straightforward:

```c
static void ramdisk_queue(void* ctx, block_op_t* bop) {
    ramdisk_device_t* ramdev = ctx;

    // (1) see if we should still be handling requests
    if (ramdev->dead) {
        bop->completion_cb(bop, ZX_ERR_IO_NOT_PRESENT);
        return;
    }

    // (2) what operation are we performing?
    switch ((bop->command &= BLOCK_OP_MASK)) {
    case BLOCK_OP_READ:
    case BLOCK_OP_WRITE: {
        // (3) perform validation common for both
        if ((bop->rw.offset_dev >= BLOCK_COUNT)
            || ((BLOCK_COUNT - bop->rw.offset_dev) < bop->rw.length)
            || bop->rw.length * BLOCK_SIZE > MAX_TRANSFER_BYTES) {
            bop->completion_cb(bop, ZX_ERR_OUT_OF_RANGE);
            return;
        }

        // (4) compute address
        void* addr = (void*) ramdev->mapped_addr + bop->rw.offset_dev * BLOCK_SIZE;
        zx_status_t status;

        // (5) now perform actions specific to each
        if (bop->command == BLOCK_OP_READ) {
            status = zx_vmo_write(bop->rw.vmo, addr, bop->rw.offset_vmo * BLOCK_SIZE,
                                  bop->rw.length * BLOCK_SIZE);
        } else {
            status = zx_vmo_read(bop->rw.vmo, addr, bop->rw.offset_vmo * BLOCK_SIZE,
                                 bop->rw.length * BLOCK_SIZE);
        }

        // (6) indicate completion
        bop->completion_cb(bop, status);
        break;
    }

    case BLOCK_OP_FLUSH:
        bop->completion_cb(bop, ZX_OK);
        break;

    default:
        bop->completion_cb(bop, ZX_ERR_NOT_SUPPORTED);
        break;
    }
}
```

As usual, we establish a context block at the top by casting the `ctx` argument.
The `bop` argument is the "block operation" structure we saw above.
The `command` field indicates what the **ramdisk_queue()** function should do.

In step (1), we check to see if we've set the `dead` flag (**ramdisk_unbind()**
sets it when required).
If so, it means that our device is no longer accepting new requests, so we return
`ZX_ERR_IO_NOT_PRESENT` in order to encourage clients to close the device.
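
Note that the error isn't handed back as a return value (the function returns `void`); it
reaches the client through the completion callback.
The following sketch shows that pattern in isolation, using portable C.
Everything in it is hypothetical: `demo_op_t`, `demo_dev_t`, `demo_queue()`, and
`demo_record()` are stand-ins for illustration, and the two `DEMO_*` status values are
placeholders, not Zircon's actual status codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Placeholder status values for this sketch only.
#define DEMO_OK 0
#define DEMO_ERR_IO_NOT_PRESENT (-1)

typedef struct demo_op demo_op_t;

// Hypothetical stand-in for block_op_t: just the callback, plus two
// fields that the example callback uses to record what happened.
struct demo_op {
    void (*completion_cb)(demo_op_t* op, int status);
    int result;
    bool completed;
};

// Hypothetical stand-in for the device context block.
typedef struct demo_dev {
    bool dead;
} demo_dev_t;

// Example callback: remember the status that the "driver" reported.
static void demo_record(demo_op_t* op, int status) {
    op->result = status;
    op->completed = true;
}

// Mirrors the shape of ramdisk_queue(): every path ends by calling the
// completion callback exactly once, and a dead device fails early.
static void demo_queue(demo_dev_t* dev, demo_op_t* op) {
    assert(op != NULL && op->completion_cb != NULL);
    if (dev->dead) {
        op->completion_cb(op, DEMO_ERR_IO_NOT_PRESENT);
        return;
    }
    // ... the data transfer would happen here ...
    op->completion_cb(op, DEMO_OK);
}
```

A caller fills in `completion_cb`, calls `demo_queue()`, and then looks at what the callback
recorded; once `dead` is set, every request completes with the error status.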

In step (2), we mask the `command` field with `BLOCK_OP_MASK` to isolate the operation.
The `&=` stores the masked value back into the field, which is why step (5) can later
compare `bop->command` directly against `BLOCK_OP_READ`.

In step (3), we handle some common validation for both read and write:
neither may start at or run past the end of the device, and neither may transfer
more than the maximum transfer size.

In step (4), we compute the device address (that is, we establish a
pointer into our [VMAR](../objects/vm_address_region.md)
that's offset by the appropriate number of blocks as per the request).

In step (5) we perform either a [**zx_vmo_read()**](../syscalls/vmo_read.md)
or a [**zx_vmo_write()**](../syscalls/vmo_write.md), depending
on the command.
This is what transfers data between a pointer within our
[VMAR](../objects/vm_address_region.md) (`addr`)
and the client's [VMO](../objects/vm_object.md) (`bop->rw.vmo`).
Notice that in the read case, we *write* to the [VMO](../objects/vm_object.md),
and in the write case, we *read* from the [VMO](../objects/vm_object.md).

Finally, in step (6) (and the other two cases), we signal completion via the
`completion_cb` callback in the block ops structure.

The interesting thing about completion is that:

*   it doesn't have to happen right away: we could have queued this
    operation and signalled completion some time later,
*   it is allowed to be called before this function returns (like we did).

The last point simply means that we are not *forced* to defer completion until
after the queuing function returns.
This allows us to complete the operation directly in the function.
For our trivial RAM-disk example, this makes sense: we can
do the data transfer to or from media instantly, so there's no need to defer.

## How is the real one more complicated?

The RAM-disk presented above is somewhat simplified from the "real" RAM-disk
device (present at `//zircon/system/dev/block/ramdisk/ramdisk.c`).

The real one adds the following functionality:

*   dynamic device creation via new VMO
*   ability to use an existing VMO
*   background thread
*   sleep mode

> @@@ how much, if anything, do we want to say about this one? I found the
> dynamic device creation of interest, for example...