.. _virtio-net:

Virtio-Net
##########

Virtio-net is the para-virtualization solution used in ACRN for
networking. The ACRN Device Model emulates virtual NICs for the User VM,
while the frontend virtio network driver, which follows the virtio
specification, runs in the User VM. (Refer to :ref:`introduction` and
:ref:`virtio-hld` for background introductions to ACRN and virtio.)

Here are some notes about virtio-net support in ACRN:

- Legacy devices are supported; modern devices are not
- Two virtqueues are used in virtio-net: an RX queue and a TX queue
- Indirect descriptors are supported
- The TAP backend is supported
- The control queue is not supported
- NIC multiple queues are not supported

Network Virtualization Architecture
***********************************

ACRN's network virtualization architecture is shown in
:numref:`net-virt-arch`. It illustrates the network virtualization
components that must cooperate for the User VM to send data to and
receive data from the outside world.

.. figure:: images/network-virt-arch.png
   :align: center
   :width: 900px
   :name: net-virt-arch

   Network Virtualization Architecture

(The green components are parts of the ACRN solution, while the gray
components are parts of the Linux kernel.)

Let's explore these components further.

Service VM/User VM Network Stack:
  This is the standard Linux TCP/IP stack and the most
  feature-rich TCP/IP implementation.

virtio-net Frontend Driver:
  This is the standard driver in the Linux kernel for virtual Ethernet
  devices. This driver matches devices with PCI vendor ID 0x1AF4 and PCI
  device ID 0x1000 (for legacy devices, as in our case) or 0x1041 (for
  modern devices). The virtual NIC supports two virtqueues, one for
  transmitting packets and the other for receiving packets. The frontend
  driver places empty buffers into one virtqueue for receiving packets
  and enqueues outgoing packets into the other virtqueue for transmission
  (see the sketch after this list). The size of each virtqueue is 1024,
  configurable in the virtio-net backend driver.

ACRN Hypervisor:
  The ACRN hypervisor is a type 1 hypervisor, running directly on the
  bare-metal hardware, and suitable for a variety of IoT and embedded
  device solutions. It fetches and analyzes guest instructions, puts
  the decoded information into the shared page as an IOREQ, and notifies
  or interrupts the HSM module in the Service VM for processing.

HSM Module:
  The Hypervisor Service Module (HSM) is a kernel module in the
  Service VM acting as a middle layer to support the Device Model
  and the hypervisor. The HSM forwards an IOREQ to the virtio-net backend
  driver for processing.

ACRN Device Model and virtio-net Backend Driver:
  The ACRN Device Model (DM) gets an IOREQ from a shared page and calls
  the virtio-net backend driver to process the request. The backend driver
  receives the data in a shared virtqueue and sends it to the TAP device.

Bridge and TAP Device:
  Bridge and TAP are standard virtual network infrastructures. They play
  an important role in communication among the Service VM, the User VM,
  and the outside world.

IGB Driver:
  IGB is the Linux kernel driver for the physical Network Interface Card
  (NIC), responsible for sending data to and receiving data from the
  physical NIC.
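The TX-side enqueue-and-kick pattern used by the frontend driver can be
sketched with the Linux kernel virtqueue API that also appears in the TX
flow later in this document (``virtqueue_add_outbuf`` and
``virtqueue_kick``). This is a simplified, kernel-context sketch rather
than the actual ``virtio_net`` implementation; the ``vq``, ``buf``, and
``len`` parameters are assumed to be supplied by the caller:

.. code-block:: c

   #include <linux/virtio.h>
   #include <linux/scatterlist.h>
   #include <linux/gfp.h>

   /* Enqueue one outgoing buffer on the TX virtqueue and notify the backend. */
   static int xmit_one_buf(struct virtqueue *vq, void *buf, unsigned int len)
   {
           struct scatterlist sg;
           int err;

           /* Describe the buffer holding the outgoing frame. */
           sg_init_one(&sg, buf, len);

           /* Place it in the shared TX virtqueue; buf is handed back on completion. */
           err = virtqueue_add_outbuf(vq, &sg, 1, buf, GFP_ATOMIC);
           if (err)
                   return err;

           /* Notify the device; this ultimately causes the iowrite16() trap
            * shown in the TX flow below. */
           virtqueue_kick(vq);
           return 0;
   }

On the receive side, the frontend uses the matching ``virtqueue_add_inbuf``
call to post the empty buffers that the backend later fills.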
The virtual network card (NIC) is implemented as a virtio legacy device
in the ACRN Device Model (DM). It is registered as a PCI virtio device
to the guest OS (User VM) and uses the standard virtio-net driver in the
Linux kernel (the guest kernel should be built with
``CONFIG_VIRTIO_NET=y``).

The virtio-net backend in the DM forwards the data received from the
frontend to the TAP device, then from the TAP device to the bridge, and
finally from the bridge to the physical NIC driver, and vice versa for
returning data from the NIC to the frontend.

ACRN Virtio-Network Calling Stack
*********************************

The components of ACRN network virtualization are shown in the
architecture diagram in :numref:`net-virt-arch`. In this section,
we use User VM data transmission (TX) and reception (RX) examples to
explain, step by step, how these components work together to implement
ACRN network virtualization.

Initialization in Device Model
==============================

**virtio_net_init**

- Present a virtual PCI-based NIC to the frontend
- Set up control plane callbacks
- Set up data plane callbacks, including TX and RX
- Set up the TAP backend

Initialization in Virtio-Net Frontend Driver
============================================

**virtio_pci_probe**

- Construct the virtio device using the virtual PCI device and register
  it to the virtio bus

**virtio_dev_probe --> virtnet_probe --> init_vqs**

- Register the network driver
- Set up the shared virtqueues

ACRN User VM TX Flow
====================

The following shows the ACRN User VM network TX flow, using TCP as an
example, through each layer:

**User VM TCP Layer**

.. code-block:: c

   tcp_sendmsg -->
   tcp_sendmsg_locked -->
   tcp_push_one -->
   tcp_write_xmit -->
   tcp_transmit_skb -->

**User VM IP Layer**

.. code-block:: c

   ip_queue_xmit -->
   ip_local_out -->
   __ip_local_out -->
   dst_output -->
   ip_output -->
   ip_finish_output -->
   ip_finish_output2 -->
   neigh_output -->
   neigh_resolve_output -->

**User VM MAC Layer**

.. code-block:: c

   dev_queue_xmit -->
   __dev_queue_xmit -->
   dev_hard_start_xmit -->
   xmit_one -->
   netdev_start_xmit -->
   __netdev_start_xmit -->

**User VM MAC Layer virtio-net Frontend Driver**

.. code-block:: c

   start_xmit --> // virtual NIC driver xmit in virtio_net
   xmit_skb -->
   virtqueue_add_outbuf --> // add out buffer to the shared virtqueue
   virtqueue_add -->

   virtqueue_kick --> // notify the backend
   virtqueue_notify -->
   vp_notify -->
   iowrite16 --> // trap here; the hypervisor is notified first

**ACRN Hypervisor**

.. code-block:: c

   vmexit_handler --> // vmexit because VMX_EXIT_REASON_IO_INSTRUCTION
   pio_instr_vmexit_handler -->
   emulate_io --> // IOREQ can't be processed in the HV; forward it to the HSM
   acrn_insert_request_wait -->
   fire_hsm_interrupt --> // interrupt the Service VM; the HSM is notified

**HSM Module**

.. code-block:: c

   hsm_intr_handler --> // HSM interrupt handler
   tasklet_schedule -->
   io_req_tasklet -->
   acrn_ioreq_distribute_request --> // IOREQ can't be processed in the HSM; forward it to the DM
   acrn_ioreq_notify_client -->
   wake_up_interruptible --> // wake up the DM to handle the IOREQ

**ACRN Device Model / virtio-net Backend Driver**

.. code-block:: c

   handle_vmexit -->
   vmexit_inout -->
   emulate_inout -->
   pci_emul_io_handler -->
   virtio_pci_write -->
   virtio_pci_legacy_write -->
   virtio_net_ping_txq --> // start the TX thread to process; the notify thread returns
   virtio_net_tx_thread --> // this is the TX thread
   virtio_net_proctx --> // call the corresponding backend (TAP) to process
   virtio_net_tap_tx -->
   writev --> // write data to the TAP device
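The last step above, ``writev`` on the TAP file descriptor, is a standard
POSIX call. The following is a minimal userspace sketch of that step,
assuming the TAP device is already open (the ``tap_fd``, ``frame``, and
``len`` names are illustrative, not ACRN code). The kernel TUN/TAP driver
treats each ``writev()`` as one Ethernet frame and injects it into the
Service VM network stack:

.. code-block:: c

   #include <sys/types.h>
   #include <sys/uio.h>

   /* Send one Ethernet frame to the Service VM through an open TAP fd. */
   static ssize_t tap_send_frame(int tap_fd, void *frame, size_t len)
   {
           struct iovec iov = {
                   .iov_base = frame, /* frame payload gathered from the TX virtqueue */
                   .iov_len  = len,
           };

           /* One writev() == one frame; the TAP driver hands it to
            * netif_receive_skb() as shown in the next step. */
           return writev(tap_fd, &iov, 1);
   }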
**Service VM TAP Device Forwarding**

.. code-block:: c

   do_writev -->
   vfs_writev -->
   do_iter_write -->
   do_iter_readv_writev -->
   call_write_iter -->
   tun_chr_write_iter -->
   tun_get_user -->
   netif_receive_skb -->
   netif_receive_skb_internal -->
   __netif_receive_skb -->
   __netif_receive_skb_core -->

**Service VM Bridge Forwarding**

.. code-block:: c

   br_handle_frame -->
   br_handle_frame_finish -->
   br_forward -->
   __br_forward -->
   br_forward_finish -->
   br_dev_queue_push_xmit -->

**Service VM MAC Layer**

.. code-block:: c

   dev_queue_xmit -->
   __dev_queue_xmit -->
   dev_hard_start_xmit -->
   xmit_one -->
   netdev_start_xmit -->
   __netdev_start_xmit -->

**Service VM MAC Layer IGB Driver**

.. code-block:: c

   igb_xmit_frame --> // IGB physical NIC driver xmit function

ACRN User VM RX Flow
====================

The following shows the ACRN User VM network RX flow, using TCP as an example.
Let's start with receiving a device interrupt. (Note that the hypervisor
is notified first when an interrupt arrives, even in passthrough cases.)

**Hypervisor Interrupt Dispatch**

.. code-block:: c

   vmexit_handler --> // vmexit because VMX_EXIT_REASON_EXTERNAL_INTERRUPT
   external_interrupt_vmexit_handler -->
   dispatch_interrupt -->
   common_handler_edge -->
   ptdev_interrupt_handler -->
   ptdev_enqueue_softirq --> // the interrupt is delivered in the bottom-half softirq

**Hypervisor Interrupt Injection**

.. code-block:: c

   do_softirq -->
   ptdev_softirq -->
   vlapic_intr_msi --> // inject the interrupt into the Service VM

   start_vcpu --> // VM entry here; pending interrupts are processed

**Service VM MAC Layer IGB Driver**

.. code-block:: c

   do_IRQ -->
   ...
   igb_msix_ring -->
   igb_poll -->
   napi_gro_receive -->
   napi_skb_finish -->
   netif_receive_skb_internal -->
   __netif_receive_skb -->
   __netif_receive_skb_core -->

**Service VM Bridge Forwarding**

.. code-block:: c

   br_handle_frame -->
   br_handle_frame_finish -->
   br_forward -->
   __br_forward -->
   br_forward_finish -->
   br_dev_queue_push_xmit -->

**Service VM MAC Layer**

.. code-block:: c

   dev_queue_xmit -->
   __dev_queue_xmit -->
   dev_hard_start_xmit -->
   xmit_one -->
   netdev_start_xmit -->
   __netdev_start_xmit -->

**Service VM MAC Layer TAP Driver**

.. code-block:: c

   tun_net_xmit --> // notify and wake up the reader process

**ACRN Device Model / virtio-net Backend Driver**

.. code-block:: c

   virtio_net_rx_callback --> // invoked when the TAP fd becomes readable
   virtio_net_tap_rx --> // read data from the TAP device, fill the virtqueue, inject an interrupt into the User VM
   vq_endchains -->
   vq_interrupt -->
   pci_generate_msi -->
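How the backend learns that the TAP device has data (``virtio_net_rx_callback``)
and pulls a frame out of it (``virtio_net_tap_rx``) can be illustrated with
standard POSIX calls. This is a simplified, hypothetical sketch rather than the
ACRN Device Model code; it assumes the TAP fd is already open and the
destination buffer comes from an RX virtqueue descriptor posted by the frontend:

.. code-block:: c

   #include <poll.h>
   #include <sys/types.h>
   #include <sys/uio.h>

   /* Wait for the TAP device to become readable, then read one frame. */
   static ssize_t tap_recv_frame(int tap_fd, void *buf, size_t len)
   {
           struct pollfd pfd = {
                   .fd     = tap_fd,
                   .events = POLLIN, /* a frame forwarded by tun_net_xmit() is pending */
           };
           struct iovec iov = {
                   .iov_base = buf,  /* guest RX buffer posted by the frontend */
                   .iov_len  = len,
           };

           if (poll(&pfd, 1, -1) <= 0)
                   return -1;

           /* One readv() == one frame; afterwards the backend updates the
            * used ring and injects an MSI into the User VM, as shown above. */
           return readv(tap_fd, &iov, 1);
   }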
**HSM Module**

.. code-block:: c

   hsm_dev_ioctl --> // process the IOCTL and call the hypercall to inject the interrupt
   hcall_inject_msi -->

**ACRN Hypervisor**

.. code-block:: c

   vmexit_handler --> // vmexit because VMX_EXIT_REASON_VMCALL
   vmcall_vmexit_handler -->
   hcall_inject_msi --> // inject the interrupt into the User VM
   vlapic_intr_msi -->

**User VM MAC Layer virtio-net Frontend Driver**

.. code-block:: c

   vring_interrupt --> // virtio-net frontend driver interrupt handler
   skb_recv_done --> // registered by virtnet_probe-->init_vqs-->virtnet_find_vqs
   virtqueue_napi_schedule -->
   __napi_schedule -->
   virtnet_poll -->
   virtnet_receive -->
   receive_buf -->

**User VM MAC Layer**

.. code-block:: c

   napi_gro_receive -->
   napi_skb_finish -->
   netif_receive_skb_internal -->
   __netif_receive_skb -->
   __netif_receive_skb_core -->

**User VM IP Layer**

.. code-block:: c

   ip_rcv -->
   ip_rcv_finish -->
   dst_input -->
   ip_local_deliver -->
   ip_local_deliver_finish -->

**User VM TCP Layer**

.. code-block:: c

   tcp_v4_rcv -->
   tcp_v4_do_rcv -->
   tcp_rcv_established -->
   tcp_data_queue -->
   tcp_queue_rcv -->
   __skb_queue_tail -->

   sk->sk_data_ready --> // the application is notified
How to Use TAP Interface
========================

The network infrastructure shown in :numref:`net-virt-infra` needs to be
prepared in the Service VM before we start. We need to create a bridge and at
least one TAP device (two TAP devices are needed to create a dual
virtual NIC) and attach a physical NIC and the TAP device to the bridge.

.. figure:: images/network-virt-service-vm-infrastruct.png
   :align: center
   :width: 900px
   :name: net-virt-infra

   Network Infrastructure in the Service VM

You can use Linux commands (e.g., ``ip`` and ``brctl``) to create this
network. In our case, we use systemd to create the network automatically
by default. You can check the files with the ``50-`` prefix in the
Service VM ``/usr/lib/systemd/network/`` directory:

- :acrn_raw:`50-acrn.netdev <misc/services/acrn_bridge/acrn.netdev>`
- :acrn_raw:`50-acrn.network <misc/services/acrn_bridge/acrn.network>`
- :acrn_raw:`50-tap0.netdev <misc/services/acrn_bridge/tap0.netdev>`
- :acrn_raw:`50-eth.network <misc/services/acrn_bridge/eth.network>`

When the Service VM is started, run ``ifconfig`` to show the devices created by
this systemd configuration:

.. code-block:: none

   acrn-br0  Link encap:Ethernet  HWaddr B2:50:41:FE:F7:A3
             inet addr:10.239.154.43  Bcast:10.239.154.255  Mask:255.255.255.0
             inet6 addr: fe80::b050:41ff:fefe:f7a3/64 Scope:Link
             UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
             RX packets:226932 errors:0 dropped:21383 overruns:0 frame:0
             TX packets:14816 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000
             RX bytes:100457754 (95.8 Mb)  TX bytes:83481244 (79.6 Mb)

   tap0      Link encap:Ethernet  HWaddr F6:A7:7E:52:50:C6
             UP BROADCAST MULTICAST  MTU:1500  Metric:1
             RX packets:0 errors:0 dropped:0 overruns:0 frame:0
             TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000
             RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

   enp3s0    Link encap:Ethernet  HWaddr 98:4F:EE:14:5B:74
             inet6 addr: fe80::9a4f:eeff:fe14:5b74/64 Scope:Link
             UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
             RX packets:279174 errors:0 dropped:0 overruns:0 frame:0
             TX packets:69923 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000
             RX bytes:107312294 (102.3 Mb)  TX bytes:87117507 (83.0 Mb)
             Memory:82200000-8227ffff

   lo        Link encap:Local Loopback
             inet addr:127.0.0.1  Mask:255.0.0.0
             inet6 addr: ::1/128 Scope:Host
             UP LOOPBACK RUNNING  MTU:65536  Metric:1
             RX packets:16 errors:0 dropped:0 overruns:0 frame:0
             TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000
             RX bytes:1216 (1.1 Kb)  TX bytes:1216 (1.1 Kb)

Run ``brctl show`` to see the bridge ``acrn-br0`` and its attached devices:

.. code-block:: none

   bridge name     bridge id               STP enabled     interfaces

   acrn-br0        8000.b25041fef7a3       no              tap0
                                                           enp3s0

Add a PCI slot to the Device Model ``acrn-dm`` command line (the MAC
address is optional):

.. code-block:: none

   -s 4,virtio-net,tap=<name>,[mac=<XX:XX:XX:XX:XX:XX>]

When the User VM is launched, run ``ifconfig`` to check the network.
``enp0s4`` is the virtual NIC created by ``acrn-dm``:

.. code-block:: none

   enp0s4    Link encap:Ethernet  HWaddr 00:16:3E:39:0F:CD
             inet addr:10.239.154.186  Bcast:10.239.154.255  Mask:255.255.255.0
             inet6 addr: fe80::216:3eff:fe39:fcd/64 Scope:Link
             UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
             RX packets:140 errors:0 dropped:8 overruns:0 frame:0
             TX packets:46 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000
             RX bytes:110727 (108.1 Kb)  TX bytes:4474 (4.3 Kb)

   lo        Link encap:Local Loopback
             inet addr:127.0.0.1  Mask:255.0.0.0
             inet6 addr: ::1/128 Scope:Host
             UP LOOPBACK RUNNING  MTU:65536  Metric:1
             RX packets:0 errors:0 dropped:0 overruns:0 frame:0
             TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000
             RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
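The ``tap=<name>`` option tells the Device Model which pre-created TAP
device to attach to. The attach step goes through the standard Linux
TUN/TAP interface; the following is a minimal, hypothetical userspace
sketch of that step (the function and variable names are illustrative,
not the actual ACRN DM code):

.. code-block:: c

   #include <fcntl.h>
   #include <string.h>
   #include <sys/ioctl.h>
   #include <unistd.h>
   #include <net/if.h>
   #include <linux/if_tun.h>

   /* Attach to an existing TAP device such as "tap0" and return its fd. */
   static int tap_attach(const char *name)
   {
           struct ifreq ifr;
           int fd = open("/dev/net/tun", O_RDWR);

           if (fd < 0)
                   return -1;

           memset(&ifr, 0, sizeof(ifr));
           ifr.ifr_flags = IFF_TAP | IFF_NO_PI; /* raw Ethernet frames, no extra header */
           strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

           if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
                   close(fd);
                   return -1;
           }
           return fd; /* the fd used for the writev()/readv() calls in the TX/RX flows */
   }

With ``IFF_NO_PI``, each read or write on the descriptor carries exactly one
raw Ethernet frame, which matches the one-frame-per-``writev()``/``readv()``
usage shown in the TX and RX flows above.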
How to Use MacVTap Interface
============================

In addition to the TAP interface, ACRN also supports the MacVTap interface.
MacVTap replaces the combination of the TAP and bridge drivers with
a single module based on the MacVLan driver. With MacVTap, each
virtual network interface is assigned its own MAC and IP address
and is directly attached to the physical interface of the host machine,
improving throughput and latency.

Create a MacVTap interface in the Service VM as shown here:

.. code-block:: none

   sudo ip link add link eth0 name macvtap0 type macvtap

where ``eth0`` is the name of the physical network interface and
``macvtap0`` is the name of the MacVTap interface being created.

Once the MacVTap interface is created, the User VM can be launched by adding
a PCI slot to the Device Model ``acrn-dm`` command line, as shown below:

.. code-block:: none

   -s 4,virtio-net,tap=macvtap0,[mac=<XX:XX:XX:XX:XX:XX>]

Performance Estimation
======================

We've introduced the network virtualization solution in ACRN, from the
top-level architecture to the detailed TX and RX flows. The control plane
and data plane are both processed in the ACRN Device Model, which may
introduce some overhead. This is not a bottleneck for NICs at 1000 Mbit/s
or below, and network bandwidth under virtualization can be very close to
native bandwidth. For a high-speed NIC (for example, 10 Gb or above), it is
necessary to separate the data plane from the control plane; vhost can be
used for that acceleration. For most IoT scenarios, processing in user
space is simple and sufficient.