.. _inter-vm_communication:

ACRN Inter-VM Communication
###########################

ACRN supports three kinds of Inter-VM communication: Inter-VM vUART,
Inter-VM network communication, and the Inter-VM shared memory device
(``ivshmem``). Each communication method has its pros and cons, as
described below.

Inter-VM vUART
**************

Inter-VM vUART communication is based on the vUART implementation. It is
used to transfer data between two VMs at low speed (under 120 kbps). (Refer
to :ref:`vuart_virtualization` and :ref:`vuart_config`.)

:numref:`Inter-VM vUART communication` shows an overview of Inter-VM vUART
communication:

.. figure:: images/Inter-VM_vUART_communication_overview.png
   :align: center
   :name: Inter-VM vUART communication

   Inter-VM vUART Communication

- Pros:
   - Development-friendly: easily used programmatically through standard
     POSIX APIs (see the sketch after this list).
   - UART drivers are available in most operating systems.

- Cons:
   - The communication rate is low.
   - The communication is only between two VMs.
   - The communication vUART must be configured in the hypervisor scenario
     configuration.
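
A minimal sketch of opening a vUART from a Linux User VM, assuming the
communication vUART appears as ``/dev/ttyS1`` (the actual device node
depends on the scenario configuration):

.. code-block:: c

   #include <fcntl.h>
   #include <termios.h>
   #include <unistd.h>

   /* open a vUART node in raw mode; returns an fd usable with
    * read()/write(), or -1 on error */
   int open_vuart(const char *node)
   {
       struct termios tio;
       int fd = open(node, O_RDWR | O_NOCTTY);

       if (fd < 0)
           return -1;
       tcgetattr(fd, &tio);
       cfmakeraw(&tio);              /* raw mode: no echo, no line buffering */
       cfsetispeed(&tio, B115200);
       cfsetospeed(&tio, B115200);
       tcsetattr(fd, TCSANOW, &tio);
       return fd;
   }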

Inter-VM network communication
******************************

Inter-VM network communication is based on the network stack. ACRN supports
both pass-through NICs assigned to VMs and Virtio-Net solutions. (Refer to
:ref:`virtio-net` for background on the ACRN Virtio-Net architecture and
design.)

:numref:`Inter-VM network communication` shows an overview of Inter-VM
network communication:

.. figure:: images/Inter-VM_network_communication_overview.png
   :align: center
   :name: Inter-VM network communication

   Inter-VM Network Communication

- Pros:
   - Development-friendly: easily used programmatically through standard
     socket APIs (see the sketch after this list).
   - Orders of magnitude faster than vUART.
   - Multiple VMs can communicate with each other.

- Cons:
   - Multiple layers are involved across the data path, which introduces
     additional computation and latency.
   - Potentially more CPU overhead in the Service VM if using ``virtio-net``.
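
A minimal sketch of the client side over TCP, assuming the peer VM is
reachable at ``192.168.1.2:5000`` (the address and port are examples only):

.. code-block:: c

   #include <arpa/inet.h>
   #include <sys/socket.h>
   #include <unistd.h>

   /* connect to a service running in the peer VM; returns a socket fd
    * usable with send()/recv(), or -1 on error */
   int connect_peer_vm(void)
   {
       struct sockaddr_in peer = {
           .sin_family = AF_INET,
           .sin_port   = htons(5000),
       };
       int fd = socket(AF_INET, SOCK_STREAM, 0);

       if (fd < 0)
           return -1;
       inet_pton(AF_INET, "192.168.1.2", &peer.sin_addr);
       if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
           close(fd);
           return -1;
       }
       return fd;
   }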

Inter-VM shared memory communication (ivshmem)
**********************************************

Inter-VM shared memory communication is based on a shared memory mechanism
for transferring data between VMs. The ACRN Device Model or the hypervisor
emulates a virtual PCI device (called an ``ivshmem`` device) that exposes
the shared memory's base address and size. (Refer to :ref:`ivshmem-hld` and
:ref:`enable_ivshmem` for background.)

:numref:`Inter-VM shared memory communication` shows an overview of
Inter-VM shared memory communication:

.. figure:: images/Inter-VM_shared_memory_communication_overview.png
   :align: center
   :name: Inter-VM shared memory communication

   Inter-VM Shared Memory Communication

- Pros:
   - Shared memory is exposed to VMs via a PCI MMIO BAR and is mapped and
     accessed directly.
   - No Service VM emulation is involved when using hv-land ivshmem. The
     data path is short, with high bandwidth and low latency.
   - Multiple VMs can communicate with each other.

- Cons:
   - Applications talk to the device directly (via `UIO <https://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#uio>`_
     on Linux or the `Ivshmem driver <https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem/>`_
     on Windows). Because applications must operate the device resources
     directly, additional development work is required.
   - Applications must implement their own protocols for handshaking, data
     transfer, and data integrity.

.. _inter-vm_communication_ivshmem_app:

How to implement an Ivshmem application on ACRN
***********************************************

Because ivshmem is a PCI device that exposes shared memory to a virtual
machine, an application running in the VM can access it as a standard PCI
device. The application must design a data transfer notification mechanism
between the VMs.

**The following reference code is used for Inter-VM communication using an
ivshmem device:**

- Initialize Device:

  Enable and bind to the UIO PCI driver:

  .. code-block:: none

     sudo modprobe uio
     sudo modprobe uio_pci_generic
     sudo sh -c 'echo "1af4 1110" > /sys/bus/pci/drivers/uio_pci_generic/new_id'

  .. note::
     - "1af4" is the Vendor ID and "1110" is the Device ID of the ivshmem
       device.
     - For Linux-based User VMs, we recommend using the standard UIO and
       UIO_PCI_GENERIC drivers through the device node (for example,
       ``/dev/uioX``).
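
  One way to find which ``/dev/uioX`` node corresponds to the ivshmem
  device (assuming the standard sysfs layout for PCI UIO devices) is to
  check the vendor and device IDs under ``/sys/class/uio``:

  .. code-block:: none

     for d in /sys/class/uio/uio*; do
         echo "$d: $(cat $d/device/vendor) $(cat $d/device/device)"
     done
     # the ivshmem device reports 0x1af4 0x1110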

- UIO IRQ data struct:

  .. code-block:: c

     struct uio_irq_data
     {
         int fd;
         int vector;
     };

- Ivshmem Device context struct:

  .. code-block:: c

     struct ivsh_dev_context
     {
         long uio_nr;            /* uio device number (uio<N>) */
         int bar0_fd;            /* BAR0: register MMIO space */
         uint32_t *p_reg;

         int bar2_fd;            /* BAR2: shared memory region */
         void *p_shmem;
         long shmem_size;

         /* used for doorbell mode */
         int uio_dev_fd;
         int epfds_irq[IVSH_MAX_IRQ_NUM];
         struct uio_irq_data irq_data[IVSH_MAX_IRQ_NUM];
         bool opened;
     };
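
  The polling-mode sample later in this document also dereferences ``tfd``,
  ``epfd_timer``, ``cb``, and ``param`` through this context. A sketch of
  the extra fields that sample assumes:

  .. code-block:: c

     /* additional fields assumed by the polling-mode sample */
     int tfd;                   /* timerfd used as the poll tick */
     int epfd_timer;            /* epoll instance watching tfd */
     int (*cb)(void *param);    /* polling callback; nonzero means done */
     void *param;               /* opaque argument passed to cb */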

- Init Ivshmem Device context:

  .. code-block:: c

     int ivsh_init_dev_ctx(struct ivsh_dev_context *p_ivsh_dev_ctx, long uio_nr)
     {
         int i;

         memset(p_ivsh_dev_ctx, 0, sizeof(*p_ivsh_dev_ctx));
         p_ivsh_dev_ctx->uio_nr = uio_nr;
         p_ivsh_dev_ctx->bar0_fd = -1;
         p_ivsh_dev_ctx->bar2_fd = -1;
         p_ivsh_dev_ctx->uio_dev_fd = -1;

         for (i = 0; i < IVSH_MAX_IRQ_NUM; i++) {
             p_ivsh_dev_ctx->epfds_irq[i] = -1;
             p_ivsh_dev_ctx->irq_data[i].fd = -1;
         }
         p_ivsh_dev_ctx->opened = false;
         return 0;
     }

- Get Ivshmem Device shared memory size:

  .. code-block:: c

     uint64_t ivsh_get_shmem_size(long uio_nr)
     {
         char config_node[PATH_MAX] = {0};
         uint64_t shm_size;
         uint64_t tmp;
         int cfg_fd;

         snprintf(config_node, sizeof(config_node),
                  "/sys/class/uio/uio%ld/device/config", uio_nr);
         cfg_fd = open(config_node, O_RDWR);

         /* Size BAR2 (offset 0x18 in PCI configuration space): save the
          * original value, write all 1s, read back the size mask, then
          * restore the original value.
          */
         pread(cfg_fd, &tmp, 8, 0x18);
         shm_size = ~0UL;
         pwrite(cfg_fd, &shm_size, 8, 0x18);
         pread(cfg_fd, &shm_size, 8, 0x18);
         pwrite(cfg_fd, &tmp, 8, 0x18);
         shm_size &= ~0xfUL;                      /* clear BAR flag bits */
         shm_size = shm_size & ~(shm_size - 1);   /* lowest set bit == size */
         close(cfg_fd);

         return shm_size;
     }
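
  For example, if the value read back after clearing the flag bits is
  ``0xffffffffffe00000``, its lowest set bit is ``0x200000``, so the shared
  memory region is 2 MB.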

- Open Ivshmem Device:

  .. code-block:: c

     /* prepare data struct to record the ivshmem device status */
     ret = ivsh_init_dev_ctx(&dev_ctx, ctrl_ctx.uio_nr);

     int open_ivsh_dev(struct ivsh_dev_context *p_ivsh_dev_ctx)
     {
         char node_path[PATH_MAX];
         struct epoll_event events;
         int evt_fd, i;

         /* mmap reg MMIO space from BAR0, which emulates the
          * interrupt-related registers */
         sprintf(node_path, "/sys/class/uio/uio%ld/device/resource0", p_ivsh_dev_ctx->uio_nr);
         p_ivsh_dev_ctx->bar0_fd = open(node_path, O_RDWR);
         p_ivsh_dev_ctx->p_reg = (uint32_t *)mmap(NULL, IVSH_BAR0_SIZE, PROT_READ|PROT_WRITE,
                                                  MAP_SHARED, p_ivsh_dev_ctx->bar0_fd, 0);

         /* get the shared memory size from config space first
          * (see ivsh_get_shmem_size() above) */
         p_ivsh_dev_ctx->shmem_size = ivsh_get_shmem_size(p_ivsh_dev_ctx->uio_nr);

         /* mmap shared memory from BAR2, which exposes the shared memory region */
         sprintf(node_path, "/sys/class/uio/uio%ld/device/resource2_wc", p_ivsh_dev_ctx->uio_nr);
         p_ivsh_dev_ctx->bar2_fd = open(node_path, O_RDWR);
         p_ivsh_dev_ctx->p_shmem = mmap(NULL, p_ivsh_dev_ctx->shmem_size, PROT_READ|PROT_WRITE,
                                        MAP_SHARED, p_ivsh_dev_ctx->bar2_fd, 0);

         /* used for doorbell mode */
         sprintf(node_path, "/dev/uio%ld", p_ivsh_dev_ctx->uio_nr);
         p_ivsh_dev_ctx->uio_dev_fd = open(node_path, O_RDWR);
         for (i = 0; i < IVSH_MAX_IRQ_NUM; i++) {
             /* create an eventfd for each MSI-X vector */
             evt_fd = eventfd(0, 0);

             /* pass the eventfd of each MSI-X vector to the kernel driver by ioctl */
             p_ivsh_dev_ctx->irq_data[i].vector = i;
             p_ivsh_dev_ctx->irq_data[i].fd = evt_fd;
             ioctl(p_ivsh_dev_ctx->uio_dev_fd, UIO_IRQ_DATA, &p_ivsh_dev_ctx->irq_data[i]);

             /* create an epoll instance */
             p_ivsh_dev_ctx->epfds_irq[i] = epoll_create1(0);

             /* add the eventfd of each MSI-X vector to its epoll instance */
             events.events = EPOLLIN;
             events.data.ptr = &p_ivsh_dev_ctx->irq_data[i];
             epoll_ctl(p_ivsh_dev_ctx->epfds_irq[i], EPOLL_CTL_ADD, evt_fd, &events);
         }
         return 0;
     }

- Close Ivshmem Device:

  .. code-block:: c

     void ivsh_close_dev(struct ivsh_dev_context *p_ivsh_dev_ctx)
     {
         int i;

         /* unmap reg MMIO space from BAR0 */
         munmap(p_ivsh_dev_ctx->p_reg, IVSH_BAR0_SIZE);
         p_ivsh_dev_ctx->p_reg = NULL;
         close(p_ivsh_dev_ctx->bar0_fd);
         p_ivsh_dev_ctx->bar0_fd = -1;

         /* unmap shared memory from BAR2 */
         munmap(p_ivsh_dev_ctx->p_shmem, p_ivsh_dev_ctx->shmem_size);
         p_ivsh_dev_ctx->p_shmem = NULL;
         close(p_ivsh_dev_ctx->bar2_fd);
         p_ivsh_dev_ctx->bar2_fd = -1;

         /* used for doorbell mode */
         for (i = 0; i < IVSH_MAX_IRQ_NUM; i++) {
             close(p_ivsh_dev_ctx->irq_data[i].fd);
             p_ivsh_dev_ctx->irq_data[i].fd = -1;
             close(p_ivsh_dev_ctx->epfds_irq[i]);
             p_ivsh_dev_ctx->epfds_irq[i] = -1;
         }
         close(p_ivsh_dev_ctx->uio_dev_fd);
         p_ivsh_dev_ctx->uio_dev_fd = -1;
     }

**The following reference code is used for Inter-VM communication based on
doorbell mode:**

- Trigger Ivshmem Doorbell:

  .. code-block:: c

     void ivsh_trigger_doorbell(struct ivsh_dev_context *p_ivsh_dev_ctx, uint16_t peer_id, uint16_t vector_id)
     {
         /* the doorbell register takes the peer ID in the upper 16 bits
          * and the MSI-X vector ID in the lower 16 bits */
         p_ivsh_dev_ctx->p_reg[IVSH_REG_DOORBELL >> 2] = (peer_id << 16) | vector_id;
     }
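
  For example, assuming the peer's ivshmem ID is 1 (the actual ID depends
  on your scenario configuration), the following raises MSI-X vector 0 in
  that peer:

  .. code-block:: c

     ivsh_trigger_doorbell(&dev_ctx, 1, 0);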


- Wait Ivshmem Device irq:

  .. code-block:: c

     static inline int ivsh_wait_irq(struct ivsh_dev_context *p_ivsh_dev_ctx, unsigned int idx)
     {
         struct epoll_event ev = {0};
         struct uio_irq_data *irq_data = NULL;
         eventfd_t val;
         int n;

         while (1) {
             n = epoll_wait(p_ivsh_dev_ctx->epfds_irq[idx], &ev, 1, -1);
             if (n == 1) {
                 irq_data = ev.data.ptr;
                 /* consume the eventfd counter so the next irq can be seen */
                 eventfd_read(irq_data->fd, &val);
                 break;
             }
         }
         return 0;
     }

Data Transfer State Machine
===========================

A state machine is introduced as the communication mechanism between two
VMs that use the same ivshmem PCI device for data transfer.

It includes three states: RESET, INIT, and READY. RESET is the initial
state after the ivshmem device is initialized.

- When both VMs are in the RESET state, the sender VM prepares the data to
  send and then sets its state to INIT, and the receiver VM prepares its
  receive buffer and sets its state to INIT.

- When both VMs are in the INIT state, the sender VM sets its state to
  READY after sending all of the data, and the receiver VM sets its state
  to READY after receiving all of the transferred data.

- Both VMs then change their state from READY to INIT to start the next
  round of data transfer.

:numref:`Inter-VM ivshmem data transfer state machine` shows the state
machine relationship:

.. figure:: images/Inter-VM_data_transfer_state_machine.png
   :align: center
   :name: Inter-VM ivshmem data transfer state machine

   Inter-VM Ivshmem Data Transfer State Machine

:numref:`Inter-VM ivshmem handshake communication` shows the handshake
communication between the two VMs:

.. figure:: images/Inter-VM_handshake_communication_two_machine.png
   :align: center
   :name: Inter-VM ivshmem handshake communication

   Inter-VM Ivshmem Handshake Communication


Reference Sender and Receiver Sample Code Based on Doorbell Mode
================================================================

.. code-block:: c

   struct pay_load_header
   {
       uint64_t p0_status;
       uint64_t p1_status;
   };
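
   /* A minimal sketch of the status helpers used below. These are
    * assumptions for illustration; the original sample does not show
    * them. The header sits at the start of the shared memory region,
    * and P0/P1 are the two peers. */
   #define SHMEM_STATUS_RESET  0UL
   #define SHMEM_STATUS_INIT   1UL
   #define SHMEM_STATUS_READY  2UL

   static void set_p0_status(volatile struct pay_load_header *h, uint64_t s) { h->p0_status = s; }
   static void set_p1_status(volatile struct pay_load_header *h, uint64_t s) { h->p1_status = s; }
   static bool is_p0_reset(volatile struct pay_load_header *h)       { return h->p0_status == SHMEM_STATUS_RESET; }
   static bool is_p1_reset(volatile struct pay_load_header *h)       { return h->p1_status == SHMEM_STATUS_RESET; }
   static bool is_p0_initialized(volatile struct pay_load_header *h) { return h->p0_status == SHMEM_STATUS_INIT; }
   static bool is_p1_initialized(volatile struct pay_load_header *h) { return h->p1_status == SHMEM_STATUS_INIT; }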

   void ivsh_test_sender(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_tx_context tx_ctx;
       volatile struct pay_load_header *p_hdr;
       int i = 0;   /* iteration index of the test run */

       /* Initialize the sender-related data */
       ivsh_test_tx_init(&tx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = tx_ctx.p_hdr;
       /* Set P0 status to RESET */
       set_p0_status(p_hdr, SHMEM_STATUS_RESET);
       while (!is_p1_reset(p_hdr))
           usleep(10000);
       /* Prepare the data to be sent */
       ivsh_test_tx_pre_send(&tx_ctx, i);
       /* Set P0 status to INIT */
       set_p0_status(p_hdr, SHMEM_STATUS_INIT);
       while (!is_p1_initialized(p_hdr)) {
       }
       /* Set P1 status to READY */
       set_p1_status(p_hdr, SHMEM_STATUS_READY);
       usleep(2000);

       ivsh_test_tx_send(&tx_ctx);
       /* notify the receiver that the data has been written */
       ivsh_trigger_doorbell(p_ivsh_dev_ctx, p_ivsh_ctrl_ctx->peer_id, IVSH_TEST_VECTOR_ID);
       ivsh_wait_irq(p_ivsh_dev_ctx, IVSH_TEST_VECTOR_ID);

       ivsh_test_tx_deinit(&tx_ctx);
   }

   void ivsh_test_receiver(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_rx_context rx_ctx;
       volatile struct pay_load_header *p_hdr;

       /* Initialize the receiver-related data */
       ivsh_test_rx_init(&rx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = rx_ctx.p_hdr;
       while (!is_p0_reset(p_hdr))
           usleep(10000);
       set_p1_status(p_hdr, SHMEM_STATUS_RESET);

       while (!is_p0_initialized(p_hdr))
           usleep(100);
       set_p1_status(p_hdr, SHMEM_STATUS_INIT);
       set_p0_status(p_hdr, SHMEM_STATUS_READY);

       /* wait for P0 write done */
       ivsh_wait_irq(p_ivsh_dev_ctx, IVSH_TEST_VECTOR_ID);
       ivsh_test_rx_recv(&rx_ctx);
       usleep(100);

       ivsh_test_rx_deinit(&rx_ctx);
   }

Reference Sender and Receiver Sample Code Based on Polling Mode
===============================================================

.. code-block:: c

   /* open_ivshmem_device */
   p_ivsh_dev_ctx->tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
   p_ivsh_dev_ctx->epfd_timer = epoll_create1(0);    /* create epoll */
   events.events = EPOLLIN;
   epoll_ctl(p_ivsh_dev_ctx->epfd_timer, EPOLL_CTL_ADD, p_ivsh_dev_ctx->tfd, &events);
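
   /* Arming the timer is not shown in the original sample; a sketch
    * assuming a 1 ms periodic poll tick (the interval is an assumption): */
   struct itimerspec its = {
       .it_value    = { .tv_sec = 0, .tv_nsec = 1000000 },
       .it_interval = { .tv_sec = 0, .tv_nsec = 1000000 },
   };
   timerfd_settime(p_ivsh_dev_ctx->tfd, 0, &its, NULL);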

   /* close_ivshmem_device */
   close(p_ivsh_dev_ctx->tfd);
   p_ivsh_dev_ctx->tfd = -1;
   close(p_ivsh_dev_ctx->epfd_timer);
   p_ivsh_dev_ctx->epfd_timer = -1;

   struct pay_load_header
   {
       uint64_t p0_status;
       uint64_t p1_status;
   };

   void ivsh_test_sender(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_tx_context tx_ctx;
       volatile struct pay_load_header *p_hdr;
       int i = 0;   /* iteration index of the test run */

       ivsh_test_tx_init(&tx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = tx_ctx.p_hdr;
       set_p0_status(p_hdr, SHMEM_STATUS_RESET);
       while (!is_p1_reset(p_hdr))
           usleep(10000);

       ivsh_test_tx_pre_send(&tx_ctx, i);
       set_p0_status(p_hdr, SHMEM_STATUS_INIT);
       while (!is_p1_initialized(p_hdr)) {
       }
       /* Set P1 status to READY */
       set_p1_status(p_hdr, SHMEM_STATUS_READY);
       usleep(2000);

       ivsh_test_tx_send(&tx_ctx);
       ivsh_poll(p_ivsh_dev_ctx);
       ivsh_test_tx_deinit(&tx_ctx);
   }

   void ivsh_test_receiver(struct ivsh_dev_context *p_ivsh_dev_ctx, struct ivsh_ctrl_context *p_ivsh_ctrl_ctx)
   {
       struct ivsh_test_rx_context rx_ctx;
       volatile struct pay_load_header *p_hdr;

       ivsh_test_rx_init(&rx_ctx, p_ivsh_dev_ctx, p_ivsh_ctrl_ctx);
       p_hdr = rx_ctx.p_hdr;
       while (!is_p0_reset(p_hdr))
           usleep(10000);
       set_p1_status(p_hdr, SHMEM_STATUS_RESET);

       while (!is_p0_initialized(p_hdr))
           usleep(100);
       set_p1_status(p_hdr, SHMEM_STATUS_INIT);
       set_p0_status(p_hdr, SHMEM_STATUS_READY);

       ivsh_poll(p_ivsh_dev_ctx);
       ivsh_test_rx_recv(&rx_ctx);
       usleep(100);

       ivsh_test_rx_deinit(&rx_ctx);
   }

   int ivsh_poll(struct ivsh_dev_context *p_ivsh_dev_ctx)
   {
       struct epoll_event ev = {0};
       uint64_t res;
       int n;

       assert(p_ivsh_dev_ctx->cb);

       while (1) {
           if (p_ivsh_dev_ctx->epfd_timer < 0) {
               /* no timer: busy-poll the user callback */
               if (p_ivsh_dev_ctx->cb(p_ivsh_dev_ctx->param))
                   break;
           } else {
               /* timer armed: wait for the next timer tick */
               n = epoll_wait(p_ivsh_dev_ctx->epfd_timer, &ev, 1, -1);
               if (n == 1) {
                   read(p_ivsh_dev_ctx->tfd, &res, sizeof(res));
                   break;
               }
               if (n < 0 && errno != EINTR)
                   printf("epoll wait error %s\n", strerror(errno));
           }
       }
       return 0;
   }
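
The polling samples rely on the ``cb``/``param`` callback stored in the
device context; in the busy-poll path (no timer armed), ``ivsh_poll`` spins
on it until it returns nonzero. A hypothetical receiver-side callback
matching the sample flow above (the original sample does not show how the
callback is registered):

.. code-block:: c

   /* hypothetical: done once the sender has marked P1 READY
    * (see the sender flow above) */
   static int rx_ready_cb(void *param)
   {
       volatile struct pay_load_header *p_hdr = param;

       return p_hdr->p1_status == SHMEM_STATUS_READY;
   }

   /* registration sketch */
   dev_ctx.cb = rx_ready_cb;
   dev_ctx.param = (void *)p_hdr;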