.. _inter-vm_communication:

ACRN Inter-VM Communication
###########################

ACRN supports three kinds of Inter-VM communication: Inter-VM vUART,
Inter-VM network communication, and the Inter-VM shared memory device
(``ivshmem``). Each communication method has its pros and cons, as
described below.

Inter-VM vUART
**************

Inter-VM vUART communication is based on the vUART implementation. It is
used to transfer data between two VMs at low speed (< 120 kbps). (Refer to
:ref:`vuart_virtualization` and :ref:`vuart_config`.)

:numref:`Inter-VM vUART communication` shows the Inter-VM vUART communication overview:

.. figure:: images/Inter-VM_vUART_communication_overview.png
   :align: center
   :name: Inter-VM vUART communication

   Inter-VM vUART Communication

- Pros:

  - POSIX APIs; development-friendly (easily used programmatically
    through standard POSIX APIs; see the sketch after this list).
  - UART drivers are available in most operating systems.

- Cons:

  - The communication rate is low.
  - The communication is only between two VMs.
  - The communication vUART must be configured in the hypervisor scenario.
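As a minimal sketch of that POSIX-level usage, the program below opens a
vUART port, sends a few bytes, and waits for a reply. The device node name
(``/dev/ttyS1``) and the baud rate are assumptions; use the node that
matches the vUART configured for your VM.

.. code-block:: c

   #include <fcntl.h>
   #include <stdio.h>
   #include <termios.h>
   #include <unistd.h>

   /* Minimal vUART usage sketch. /dev/ttyS1 is an assumption; use the
    * device node that matches the vUART configured for your VM. */
   int main(void)
   {
       struct termios tio;
       char buf[64];
       ssize_t n;
       int fd = open("/dev/ttyS1", O_RDWR | O_NOCTTY);

       if (fd < 0)
           return 1;

       /* raw 8N1 mode; the vUART is not a physical UART, but the
        * standard termios calls still apply */
       tcgetattr(fd, &tio);
       cfmakeraw(&tio);
       cfsetispeed(&tio, B115200);
       cfsetospeed(&tio, B115200);
       tcsetattr(fd, TCSANOW, &tio);

       write(fd, "hello", 5);           /* send to the peer VM */
       n = read(fd, buf, sizeof(buf));  /* block until the peer replies */
       if (n > 0)
           printf("received %zd bytes\n", n);

       close(fd);
       return 0;
   }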
Inter-VM network communication
******************************

Inter-VM network communication is based on the network stack. ACRN supports
both passthrough NICs to VMs and Virtio-Net solutions. (Refer to
:ref:`virtio-net` for background on the ACRN Virtio-Net architecture and
design.)

:numref:`Inter-VM network communication` shows the Inter-VM network communication overview:

.. figure:: images/Inter-VM_network_communication_overview.png
   :align: center
   :name: Inter-VM network communication

   Inter-VM Network Communication

- Pros:

  - Socket-based APIs; development-friendly (easily used programmatically
    through socket-based APIs).
  - Orders of magnitude faster than vUART.
  - Multiple VMs can communicate together.

- Cons:

  - Multiple layers are involved across the data path, which introduces
    additional computation and latency.
  - Potentially more CPU overhead in the Service VM if using ``virtio-net``.

Inter-VM shared memory communication (ivshmem)
**********************************************

Inter-VM shared memory communication is based on a shared memory mechanism
to transfer data between VMs. The ACRN Device Model or hypervisor emulates
a virtual PCI device (called an ``ivshmem`` device) to expose this shared
memory's base address and size. (Refer to :ref:`ivshmem-hld` and
:ref:`enable_ivshmem` for background.)

:numref:`Inter-VM shared memory communication` shows the Inter-VM shared memory communication overview:

.. figure:: images/Inter-VM_shared_memory_communication_overview.png
   :align: center
   :name: Inter-VM shared memory communication

   Inter-VM Shared Memory Communication

- Pros:

  - Shared memory is exposed to VMs via a PCI MMIO BAR and is mapped and
    accessed directly.
  - No Service VM emulation is involved if using hv-land ivshmem. The data
    path is short, with high bandwidth and low latency.
  - Multiple VMs can communicate together.

- Cons:

  - Applications talk to the device directly (via `UIO
    <https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#uio>`_
    on Linux or the `Ivshmem driver
    <https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem/>`_
    on Windows). Because applications need to operate the device resources
    directly, additional development work is required.
  - Applications need to implement protocols such as a handshake, data
    transfer, and data integrity.

.. _inter-vm_communication_ivshmem_app:

How to implement an Ivshmem application on ACRN
***********************************************

Because ivshmem is a PCI device that exposes shared memory to a virtual
machine, an application running in the VM can access it as a standard PCI
device. The application needs to implement a data transfer notification
mechanism between the VMs.

**The following reference code is used for Inter-VM communication using the ivshmem device:**

- Initialize Device:

  Enable and bind to the UIO PCI driver:

  .. code-block:: none

     sudo modprobe uio
     sudo modprobe uio_pci_generic
     sudo sh -c 'echo "1af4 1110" > /sys/bus/pci/drivers/uio_pci_generic/new_id'

  .. note::
     - ``1af4`` is the Vendor ID and ``1110`` is the Device ID of the
       ivshmem device.
     - For Linux-based User VMs, we recommend using the standard UIO and
       UIO_PCI_GENERIC drivers through the device node (for example,
       ``/dev/uioX``). The UIO device number can also be discovered
       programmatically, as shown in the sketch below.
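  The binding step creates a ``/dev/uioX`` node whose number is what the
  reference code below calls ``uio_nr``. As a hedged sketch (the helper
  name ``find_ivsh_uio_nr`` and the sysfs walk are our own, not part of
  the reference code), the UIO number can be found by matching the
  ivshmem vendor/device IDs under ``/sys/class/uio``:

  .. code-block:: c

     #include <dirent.h>
     #include <stdio.h>
     #include <string.h>

     /* Hypothetical helper: scan /sys/class/uio and return the uio number
      * whose underlying PCI device matches the ivshmem IDs (1af4:1110).
      * Returns -1 if no matching device is found. */
     long find_ivsh_uio_nr(void)
     {
         char path[288], vendor[16] = {0}, device[16] = {0};
         struct dirent *entry;
         DIR *dir = opendir("/sys/class/uio");
         long uio_nr = -1;
         FILE *f;

         if (!dir)
             return -1;
         while ((entry = readdir(dir)) != NULL) {
             if (sscanf(entry->d_name, "uio%ld", &uio_nr) != 1)
                 continue;
             snprintf(path, sizeof(path), "/sys/class/uio/%s/device/vendor", entry->d_name);
             if ((f = fopen(path, "r"))) {
                 fgets(vendor, sizeof(vendor), f);
                 fclose(f);
             }
             snprintf(path, sizeof(path), "/sys/class/uio/%s/device/device", entry->d_name);
             if ((f = fopen(path, "r"))) {
                 fgets(device, sizeof(device), f);
                 fclose(f);
             }
             if (!strncmp(vendor, "0x1af4", 6) && !strncmp(device, "0x1110", 6))
                 break;              /* found the ivshmem device */
             uio_nr = -1;
         }
         closedir(dir);
         return uio_nr;
     }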
- UIO IRQ data struct:

  .. code-block:: c

     struct uio_irq_data
     {
         int fd;
         int vector;
     };

- Ivshmem Device context struct:

  .. code-block:: c

     struct ivsh_dev_context
     {
         long uio_nr;
         int bar0_fd;
         uint32_t *p_reg;

         int bar2_fd;
         void *p_shmem;
         long shmem_size;

         /* used for doorbell mode */
         int uio_dev_fd;
         int epfds_irq[IVSH_MAX_IRQ_NUM];
         struct uio_irq_data irq_data[IVSH_MAX_IRQ_NUM];
         bool opened;
     };

- Init Ivshmem Device context:

  .. code-block:: c

     int ivsh_init_dev_ctx(struct ivsh_dev_context *p_ivsh_dev_ctx, long uio_nr)
     {
         int i;

         memset(p_ivsh_dev_ctx, 0, sizeof(*p_ivsh_dev_ctx));
         p_ivsh_dev_ctx->uio_nr = uio_nr;
         p_ivsh_dev_ctx->bar0_fd = -1;
         p_ivsh_dev_ctx->bar2_fd = -1;
         p_ivsh_dev_ctx->uio_dev_fd = -1;

         for (i = 0; i < IVSH_MAX_IRQ_NUM; i++) {
             p_ivsh_dev_ctx->epfds_irq[i] = -1;
             p_ivsh_dev_ctx->irq_data[i].fd = -1;
         }
         p_ivsh_dev_ctx->opened = false;
         return 0;
     }

- Get Ivshmem Device shared memory size:

  .. code-block:: c

     uint32_t ivsh_get_shmem_size(long uio_nr)
     {
         char config_node[PATH_MAX] = {0};
         uint64_t shm_size, tmp, probe = ~0UL;
         int cfg_fd;

         snprintf(config_node, sizeof(config_node),
                  "/sys/class/uio/uio%ld/device/config", uio_nr);
         cfg_fd = open(config_node, O_RDWR);

         /* Size BAR2 (offset 0x18 in the configuration space): save the
          * original value, write all 1s, read back the size mask, then
          * restore the original value. */
         pread(cfg_fd, &tmp, 8, 0x18);
         pwrite(cfg_fd, &probe, 8, 0x18);
         pread(cfg_fd, &shm_size, 8, 0x18);
         pwrite(cfg_fd, &tmp, 8, 0x18);

         shm_size &= ~0xfUL;                    /* clear the BAR flag bits */
         shm_size = shm_size & ~(shm_size - 1); /* isolate the size bit */
         close(cfg_fd);

         return (uint32_t)shm_size;
     }

- Open Ivshmem Device:

  .. code-block:: c

     /* prepare the data struct that records the ivshmem device status */
     ret = ivsh_init_dev_ctx(&dev_ctx, ctrl_ctx.uio_nr);

     int open_ivsh_dev(struct ivsh_dev_context *p_ivsh_dev_ctx)
     {
         char node_path[PATH_MAX];
         struct epoll_event events;
         int evt_fd, i;

         /* mmap the register MMIO space from BAR0;
          * BAR0 emulates the interrupt-related registers */
         snprintf(node_path, sizeof(node_path),
                  "/sys/class/uio/uio%ld/device/resource0", p_ivsh_dev_ctx->uio_nr);
         p_ivsh_dev_ctx->bar0_fd = open(node_path, O_RDWR);
         p_ivsh_dev_ctx->p_reg = (uint32_t *)mmap(NULL, IVSH_BAR0_SIZE,
             PROT_READ | PROT_WRITE, MAP_SHARED, p_ivsh_dev_ctx->bar0_fd, 0);

         /* get the shared memory size from the config space
          * (see ivsh_get_shmem_size above) before mapping BAR2 */
         p_ivsh_dev_ctx->shmem_size = ivsh_get_shmem_size(p_ivsh_dev_ctx->uio_nr);

         /* mmap the shared memory from BAR2;
          * BAR2 exposes the shared memory region */
         snprintf(node_path, sizeof(node_path),
                  "/sys/class/uio/uio%ld/device/resource2_wc", p_ivsh_dev_ctx->uio_nr);
         p_ivsh_dev_ctx->bar2_fd = open(node_path, O_RDWR);
         p_ivsh_dev_ctx->p_shmem = mmap(NULL, p_ivsh_dev_ctx->shmem_size,
             PROT_READ | PROT_WRITE, MAP_SHARED, p_ivsh_dev_ctx->bar2_fd, 0);

         /* used for doorbell mode */
         snprintf(node_path, sizeof(node_path), "/dev/uio%ld", p_ivsh_dev_ctx->uio_nr);
         p_ivsh_dev_ctx->uio_dev_fd = open(node_path, O_RDWR);
         for (i = 0; i < IVSH_MAX_IRQ_NUM; i++) {
             /* create an eventfd for each MSI-X vector */
             evt_fd = eventfd(0, 0);

             /* pass the eventfd of each MSI-X vector to the kernel driver by ioctl */
             p_ivsh_dev_ctx->irq_data[i].vector = i;
             p_ivsh_dev_ctx->irq_data[i].fd = evt_fd;
             ioctl(p_ivsh_dev_ctx->uio_dev_fd, UIO_IRQ_DATA, &p_ivsh_dev_ctx->irq_data[i]);

             /* create an epoll instance per vector */
             p_ivsh_dev_ctx->epfds_irq[i] = epoll_create1(0);

             /* add the eventfd of each MSI-X vector to its epoll instance */
             events.events = EPOLLIN;
             events.data.ptr = &p_ivsh_dev_ctx->irq_data[i];
             epoll_ctl(p_ivsh_dev_ctx->epfds_irq[i], EPOLL_CTL_ADD, evt_fd, &events);
         }
         p_ivsh_dev_ctx->opened = true;
         return 0;
     }

- Close Ivshmem Device:

  .. code-block:: c

     void ivsh_close_dev(struct ivsh_dev_context *p_ivsh_dev_ctx)
     {
         int i;

         /* unmap the register MMIO space from BAR0 */
         munmap(p_ivsh_dev_ctx->p_reg, IVSH_BAR0_SIZE);
         p_ivsh_dev_ctx->p_reg = NULL;
         close(p_ivsh_dev_ctx->bar0_fd);
         p_ivsh_dev_ctx->bar0_fd = -1;

         /* unmap the shared memory from BAR2 */
         munmap(p_ivsh_dev_ctx->p_shmem, p_ivsh_dev_ctx->shmem_size);
         p_ivsh_dev_ctx->p_shmem = NULL;
         close(p_ivsh_dev_ctx->bar2_fd);
         p_ivsh_dev_ctx->bar2_fd = -1;

         /* used for doorbell mode */
         for (i = 0; i < IVSH_MAX_IRQ_NUM; i++) {
             close(p_ivsh_dev_ctx->irq_data[i].fd);
             p_ivsh_dev_ctx->irq_data[i].fd = -1;
             close(p_ivsh_dev_ctx->epfds_irq[i]);
             p_ivsh_dev_ctx->epfds_irq[i] = -1;
         }
         close(p_ivsh_dev_ctx->uio_dev_fd);
         p_ivsh_dev_ctx->uio_dev_fd = -1;
     }

**The following reference code is used for Inter-VM communication based on doorbell mode (a combined usage sketch follows the list):**

- Trigger Ivshmem Doorbell:

  .. code-block:: c

     void ivsh_trigger_doorbell(struct ivsh_dev_context *p_ivsh_dev_ctx,
                                uint16_t peer_id, uint16_t vector_id)
     {
         /* write the peer ID and vector ID to the doorbell register */
         p_ivsh_dev_ctx->p_reg[IVSH_REG_DOORBELL >> 2] = (peer_id << 16) | vector_id;
     }

- Wait for an Ivshmem Device IRQ:

  .. code-block:: c

     static inline int ivsh_wait_irq(struct ivsh_dev_context *p_ivsh_dev_ctx, unsigned int idx)
     {
         struct epoll_event ev = {0};
         struct uio_irq_data *irq_data = NULL;
         eventfd_t val;
         int n;

         while (1) {
             n = epoll_wait(p_ivsh_dev_ctx->epfds_irq[idx], &ev, 1, -1);
             if (n == 1) {
                 irq_data = ev.data.ptr;
                 /* consume the eventfd counter so the next IRQ can be seen */
                 eventfd_read(irq_data->fd, &val);
                 break;
             }
         }
         return 0;
     }
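Putting the two doorbell primitives together, here is a minimal, hedged
usage sketch. The peer ID and vector ID values are assumptions that depend
on how the ivshmem device and its MSI-X vectors are configured:

.. code-block:: c

   /* Hedged usage sketch: ring vector 0 of peer 1, then block on our own
    * vector 0 until the peer rings back. The peer ID (1) and vector ID (0)
    * are assumptions that depend on the ivshmem setup. */
   void notify_and_wait(struct ivsh_dev_context *p_ivsh_dev_ctx)
   {
       ivsh_trigger_doorbell(p_ivsh_dev_ctx, 1 /* peer_id */, 0 /* vector_id */);
       ivsh_wait_irq(p_ivsh_dev_ctx, 0 /* local vector to wait on */);
   }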
Data Transfer State Machine
===========================

A state machine is introduced as the communication mechanism between the two
VMs, which use the same ivshmem PCI device for the data transfer.

It includes three states: RESET, INIT, and READY. RESET is the initial state
after the ivshmem device is initialized.

- When both VMs are in the RESET state, the sender VM prepares the data to
  be sent and then sets its state to INIT, while the receiver VM prepares
  the receiving buffer and sets its state to INIT.

- When both VMs are in the INIT state, the sender VM sets its state to READY
  after sending all of the data, and the receiver VM sets its state to READY
  after receiving all of the transferred data.

- Both VMs then change their state from READY to INIT to start the next
  round of data transfer.

:numref:`Inter-VM ivshmem data transfer state machine` shows the state machine relationship:

.. figure:: images/Inter-VM_data_transfer_state_machine.png
   :align: center
   :name: Inter-VM ivshmem data transfer state machine

   Inter-VM Ivshmem Data Transfer State Machine

:numref:`Inter-VM ivshmem handshake communication` shows the handshake communication between the two machines:

.. figure:: images/Inter-VM_handshake_communication_two_machine.png
   :align: center
   :name: Inter-VM ivshmem handshake communication

   Inter-VM Ivshmem Handshake Communication
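The sample code below relies on status helpers such as ``set_p0_status``
and ``is_p1_reset`` that are not shown in this document. A minimal sketch
of them, assuming the ``pay_load_header`` used by the samples sits at
offset 0 of the shared memory (the ``SHMEM_STATUS_*`` values are our
assumptions), could look like this:

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hedged sketch of the status helpers used by the samples below; the
    * status values are assumptions. Only the P0 variants are shown; the
    * P1 variants mirror them on the p1_status field. */
   #define SHMEM_STATUS_RESET 0UL
   #define SHMEM_STATUS_INIT  1UL
   #define SHMEM_STATUS_READY 2UL

   struct pay_load_header
   {
       uint64_t p0_status;
       uint64_t p1_status;
   };

   static inline void set_p0_status(volatile struct pay_load_header *p_hdr,
                                    uint64_t status)
   {
       p_hdr->p0_status = status;  /* visible to the peer via shared memory */
   }

   static inline bool is_p0_reset(volatile struct pay_load_header *p_hdr)
   {
       return p_hdr->p0_status == SHMEM_STATUS_RESET;
   }

   static inline bool is_p0_initialized(volatile struct pay_load_header *p_hdr)
   {
       return p_hdr->p0_status == SHMEM_STATUS_INIT;
   }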
Reference Sender and Receiver Sample Code Based on Doorbell Mode
=================================================================

.. code-block:: c

   struct pay_load_header
   {
       uint64_t p0_status;
       uint64_t p1_status;
   };

   void ivsh_test_sender(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_tx_context tx_ctx;
       volatile struct pay_load_header *p_hdr;

       /* initialize the sender-related data */
       ivsh_test_tx_init(&tx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = tx_ctx.p_hdr;

       /* set the P0 status to RESET and wait until P1 is also in RESET */
       set_p0_status(p_hdr, SHMEM_STATUS_RESET);
       while (!is_p1_reset(p_hdr))
           usleep(10000);

       /* prepare the data to be sent (i is the iteration index from the
        * caller's test loop, not shown here) */
       ivsh_test_tx_pre_send(&tx_ctx, i);

       /* set the P0 status to INIT and wait until P1 is also in INIT */
       set_p0_status(p_hdr, SHMEM_STATUS_INIT);
       while (!is_p1_initialized(p_hdr)) {
       }

       /* acknowledge the peer by setting the P1 status to READY */
       set_p1_status(p_hdr, SHMEM_STATUS_READY);
       usleep(2000);

       /* send the data, ring the peer's doorbell, and wait for its answer */
       ivsh_test_tx_send(&tx_ctx);
       ivsh_trigger_doorbell(p_ivsh_dev_ctx, p_ivsh_ctrl_ctx->peer_id, IVSH_TEST_VECTOR_ID);
       ivsh_wait_irq(p_ivsh_dev_ctx, IVSH_TEST_VECTOR_ID);

       ivsh_test_tx_deinit(&tx_ctx);
   }

   void ivsh_test_receiver(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_rx_context rx_ctx;
       volatile struct pay_load_header *p_hdr;

       /* initialize the receiver-related data */
       ivsh_test_rx_init(&rx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = rx_ctx.p_hdr;

       /* wait until P0 is in RESET, then enter RESET as well */
       while (!is_p0_reset(p_hdr))
           usleep(10000);
       set_p1_status(p_hdr, SHMEM_STATUS_RESET);

       /* wait until P0 is in INIT, enter INIT, and acknowledge the peer
        * by setting the P0 status to READY */
       while (!is_p0_initialized(p_hdr))
           usleep(100);
       set_p1_status(p_hdr, SHMEM_STATUS_INIT);
       set_p0_status(p_hdr, SHMEM_STATUS_READY);

       /* wait until P0 has finished writing */
       ivsh_wait_irq(p_ivsh_dev_ctx, IVSH_TEST_VECTOR_ID);
       ivsh_test_rx_recv(&rx_ctx);
       usleep(100);

       ivsh_test_rx_deinit(&rx_ctx);
   }
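The ``ivsh_test_tx_send``/``ivsh_test_rx_recv`` helpers are not shown in
this document. A minimal sketch of what they might do, assuming the payload
is placed directly after the ``pay_load_header`` at the start of the shared
memory (the layout and the 256-byte size are our assumptions):

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   /* Hedged sketch: the payload is assumed to live right after the header
    * at the start of the shared memory; the 256-byte size is arbitrary. */
   #define PAYLOAD_SIZE 256

   static void shmem_write_payload(void *p_shmem, const void *data, size_t len)
   {
       uint8_t *payload = (uint8_t *)p_shmem + sizeof(struct pay_load_header);

       memcpy(payload, data, len < PAYLOAD_SIZE ? len : PAYLOAD_SIZE);
   }

   static void shmem_read_payload(const void *p_shmem, void *out, size_t len)
   {
       const uint8_t *payload = (const uint8_t *)p_shmem + sizeof(struct pay_load_header);

       memcpy(out, payload, len < PAYLOAD_SIZE ? len : PAYLOAD_SIZE);
   }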
Reference Sender and Receiver Sample Code Based on Polling Mode
================================================================

.. code-block:: c

   /* open_ivshmem_device: create a polling timer and add it to an epoll instance */
   p_ivsh_dev_ctx->tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
   p_ivsh_dev_ctx->epfd_timer = epoll_create1(0);
   events.events = EPOLLIN;
   epoll_ctl(p_ivsh_dev_ctx->epfd_timer, EPOLL_CTL_ADD, p_ivsh_dev_ctx->tfd, &events);

   /* close_ivshmem_device */
   close(p_ivsh_dev_ctx->tfd);
   p_ivsh_dev_ctx->tfd = -1;
   close(p_ivsh_dev_ctx->epfd_timer);
   p_ivsh_dev_ctx->epfd_timer = -1;

   struct pay_load_header
   {
       uint64_t p0_status;
       uint64_t p1_status;
   };

   void ivsh_test_sender(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_tx_context tx_ctx;
       volatile struct pay_load_header *p_hdr;

       ivsh_test_tx_init(&tx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = tx_ctx.p_hdr;

       /* set the P0 status to RESET and wait until P1 is also in RESET */
       set_p0_status(p_hdr, SHMEM_STATUS_RESET);
       while (!is_p1_reset(p_hdr))
           usleep(10000);

       /* prepare the data to be sent (i is the iteration index from the
        * caller's test loop, not shown here) */
       ivsh_test_tx_pre_send(&tx_ctx, i);

       /* set the P0 status to INIT and wait until P1 is also in INIT */
       set_p0_status(p_hdr, SHMEM_STATUS_INIT);
       while (!is_p1_initialized(p_hdr)) {
       }

       /* acknowledge the peer by setting the P1 status to READY */
       set_p1_status(p_hdr, SHMEM_STATUS_READY);
       usleep(2000);

       /* send the data, then poll instead of waiting for a doorbell IRQ */
       ivsh_test_tx_send(&tx_ctx);
       ivsh_poll(p_ivsh_dev_ctx);
       ivsh_test_tx_deinit(&tx_ctx);
   }

   void ivsh_test_receiver(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_rx_context rx_ctx;
       volatile struct pay_load_header *p_hdr;

       ivsh_test_rx_init(&rx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = rx_ctx.p_hdr;

       /* wait until P0 is in RESET, then enter RESET as well */
       while (!is_p0_reset(p_hdr))
           usleep(10000);
       set_p1_status(p_hdr, SHMEM_STATUS_RESET);

       /* wait until P0 is in INIT, enter INIT, and acknowledge the peer
        * by setting the P0 status to READY */
       while (!is_p0_initialized(p_hdr))
           usleep(100);
       set_p1_status(p_hdr, SHMEM_STATUS_INIT);
       set_p0_status(p_hdr, SHMEM_STATUS_READY);

       /* poll until P0 has finished writing */
       ivsh_poll(p_ivsh_dev_ctx);
       ivsh_test_rx_recv(&rx_ctx);
       usleep(100);

       ivsh_test_rx_deinit(&rx_ctx);
   }

   int ivsh_poll(struct ivsh_dev_context *p_ivsh_dev_ctx)
   {
       struct epoll_event ev = {0};
       uint64_t res;
       int n;

       /* cb/param, tfd, and epfd_timer are additional ivsh_dev_context
        * fields used only in polling mode */
       assert(p_ivsh_dev_ctx->cb);

       while (1) {
           if (p_ivsh_dev_ctx->epfd_timer < 0) {
               /* no timer: poll the callback directly */
               if (p_ivsh_dev_ctx->cb(p_ivsh_dev_ctx->param))
                   break;
           } else {
               /* wait for the timer to expire, then consume its counter */
               n = epoll_wait(p_ivsh_dev_ctx->epfd_timer, &ev, 1, -1);
               if (n == 1) {
                   read(p_ivsh_dev_ctx->tfd, &res, sizeof(res));
                   break;
               }
               if (n < 0 && errno != EINTR)
                   printf("epoll wait error %s\n", strerror(errno));
           }
       }
       return 0;
   }
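The open snippet above creates the polling timer but does not show how it
is armed. A minimal sketch using the standard ``timerfd_settime`` call,
assuming a 100-microsecond polling period (the period value is our
assumption):

.. code-block:: c

   #include <sys/timerfd.h>

   /* Hedged sketch: arm the polling timer with a periodic expiration.
    * The 100 us period is an assumption; tune it for your workload. */
   static int ivsh_arm_poll_timer(int tfd)
   {
       struct itimerspec its = {
           .it_interval = { .tv_sec = 0, .tv_nsec = 100 * 1000 }, /* period */
           .it_value    = { .tv_sec = 0, .tv_nsec = 100 * 1000 }, /* first shot */
       };

       /* relative timer against CLOCK_MONOTONIC, matching timerfd_create above */
       return timerfd_settime(tfd, 0, &its, NULL);
   }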