.. _hld-security:

Security High-Level Design
##########################

.. primary author: Bing Zhu
   contributor: Yadong Qi

Introduction
************

This document describes the security high-level design in ACRN,
including information about:

- Secure booting in ACRN
- Hypervisor security enhancement, including memory management, secure
  hypervisor interfaces, etc.
- Platform security features virtualization, such as the virtualization
  of TPM (vTPM) and SGX (vSGX)

This document is for developers, validation teams, architects, and
maintainers of ACRN.

Readers should be familiar with the basic concepts of system
virtualization and the ACRN hypervisor implementation.


Background
**********

The ACRN hypervisor is a type-1 hypervisor, built for running multiple
guest OS instances, typical of an automotive infotainment system, on a
single Apollo Lake-I SoC platform. See :numref:`security-ACRN`.

.. figure:: images/security-image-HV-overview.png
   :width: 900px
   :align: center
   :name: security-ACRN

   ACRN Hypervisor Overview

This document focuses only on the security part of the automotive
system built on top of the ACRN hypervisor. This includes how to build a
secure system as well as how to virtualize the security features that
the system can provide.

Usages
======

As shown in :numref:`security-vehicle`, the ACRN hypervisor can be
used to build a Software Defined Cockpit (SDC) or an In-Vehicle Experience
(IVE) Solution that consolidates multiple VMs together on a single Intel
SoC in-vehicle platform.

.. figure:: images/security-image13.png
   :width: 900px
   :align: center
   :name: security-vehicle

   SDC and IVE System In-Vehicle


In this system, the ACRN hypervisor is running at the most privileged
level, VMX root mode, in virtualization technology terms.
The hypervisor
has full control of platform resources, including the processor, memory,
devices, and in some cases, secrets of the guest OS. The ACRN
hypervisor supports multiple guest VMs running in parallel in the less
privileged level called VMX non-root mode.

The Service VM is a special VM. While it runs as a guest VM in
VMX non-root mode, it behaves as a privileged guest VM controlling the
behavior of other guest VMs. The Service VM can create a guest VM, suspend and
resume a guest VM, and provide device mediation services (Device
Models) for other guest VMs it creates.

In an SDC system, the Service VM also contains safety-critical IC (Instrument
Cluster) applications. ACRN is designed to make sure the IC applications
are well isolated from other applications in the Service VM such as Device
Models (Mediators). A crash in another guest VM must not impact
the IC applications and must not cause a DoS (Denial of Service) condition.
Functional safety is out of scope of this document.

In :numref:`security-ACRN`, the other guest VMs are referred to as User VMs.
These VMs provide infotainment services (such as
navigation, music, and FM/AM radio) for the front seat or rear seat.

The User VM systems can be based on Linux (LaaG, Linux as a Guest) or
Android (AaaG, Android as a Guest) depending on the customer's needs
and board configuration. They can also be a mix of Linux and Android
systems.

In each User VM, a "side-car" OS system can accompany the normal OS system. We
call these two OS systems "secure world" and
"non-secure world", and they are isolated from each other by the
hypervisor. The secure world has a higher "privilege level" than the non-secure
world; for example, the secure world can access the non-secure world's
physical memory but not vice versa. This document discusses how this
security works and why it is required.

Evaluate carefully whether the Service VM can be treated as part of the
Trusted Computing Base (TCB). The Service VM may be a
fairly large system running many lines of code; thus, treating it as a
TCB doesn't make sense from a security perspective. To achieve the
design purpose of "defense in depth", system security designers
should always ask themselves, "What if the Service VM is compromised?" and
"What's the impact if this happens?" This HLD document discusses how to
security-harden the Service VM system and mitigate attacks on the Service VM.

ACRN High-Level Security Architecture
*************************************

This chapter provides a high-level architecture design overview of ACRN
security features and their development.

Secure / Verified Boot
======================

The security of the entire system built on top of the ACRN hypervisor
depends on the security from platform boot to User VM launching. Each layer
or module must verify the security of the next layer or module before
transferring control to it. Verification can consist of checking a
cryptographic signature on the executable of the next step before it is
launched.

Note that measured boot (as described well in this `boot security
technologies document
<https://firmwaresecurity.com/2015/07/29/survey-of-boot-security-technologies/>`_)
is not supported for ACRN and its guest VMs.

Boot Flow
---------
ACRN supports two verified boot sequences.

1) Verified Boot Sequence With SBL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As shown in :numref:`security-bootflow-sbl`, the Converged Security Engine
Firmware (CSE FW) behaves as the root of trust in this platform boot
flow. It authenticates and starts the BIOS (SBL), whereupon the SBL is
responsible for authenticating and verifying the ACRN hypervisor image.
The Service VM kernel is built together with the ACRN hypervisor as
one image bundle, so the signature of the whole image is verified by SBL
before launch.

.. figure:: images/security-image-bootflow-sbl.png
   :width: 900px
   :align: center
   :name: security-bootflow-sbl

   ACRN Boot Flow with SBL

2) Verified Boot Sequence With UEFI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As shown in :numref:`security-bootflow-uefi`, in this boot sequence, UEFI
authenticates and starts the ACRN hypervisor. Then the hypervisor returns
to the UEFI environment to authenticate and load the Service VM kernel
bootloader.

.. figure:: images/security-image-bootflow-uefi.png
   :width: 900px
   :align: center
   :name: security-bootflow-uefi

   ACRN Boot Flow with UEFI

Once the Service VM kernel starts, it loads all its
subsystems. To launch a User VM, a DM process is
started to launch the virtual BIOS (OVMF). Eventually, the OVMF is
responsible for verifying and launching the User VM kernel (or the
Android OS loader for an Android User VM).

Secure Boot
-----------

In the entire boot flow, the chain of trust must be unbroken. This is
achieved by the secure boot mechanism. Each module in the boot flow must
authenticate and verify the next module by using a cryptographic digital
signature algorithm.

A common image signing scheme uses cryptographic hashing and
public key cryptography with PKCS #1 v1.5 padding.

The 2018 minimal requirements for cryptographic strength are:

#. SHA256 for image cryptographic hashing.
#. RSA2048 for cryptographic digital signature signing and verification.

We strongly recommend that SHA512 and RSA3072+ be used for a product shipped
in 2018, especially for a product that has a long production life such as
an automotive vehicle.

The CSE FW image is signed with an Intel RSA private key. All other
images should be signed by the responsible OEM. Our customers and
partners are responsible for image signing, ensuring the key strength
meets security requirements, and storing the secret RSA private key
securely.

Guest Secure Boot With OVMF
---------------------------
Open Virtual Machine Firmware (OVMF) is an EDK II based project to enable UEFI
support for virtual machines in a virtualized environment. In ACRN, OVMF is
deployed to launch a User VM, as if the User VM were booted on a machine with
UEFI firmware.

UEFI Secure Boot defines how a platform's firmware can authenticate a digitally
signed UEFI image, such as an operating system loader or a UEFI driver stored
in an option ROM. This provides the capability to ensure that those UEFI images
are only loaded in an owner-authorized fashion and provides a common means to
ensure the platform's security and integrity over systems running UEFI-based
firmware.
UEFI Secure Boot is already supported by OVMF.

:numref:`security-secure-boot-uefi` shows a Secure Boot overview in UEFI.

.. figure:: images/security-image-secure-boot-uefi.png
   :width: 500px
   :align: center
   :name: security-secure-boot-uefi

   UEFI Secure Boot Overview

UEFI Secure Boot is controlled by a set of UEFI Authenticated Variables that specify
the UEFI Secure Boot Policy; the platform manufacturer or the platform owner enrolls the
policy objects, which include the n-tuple of keys {PK, KEK, db, dbx} as step 1.
During each successive boot, the UEFI secure boot implementation assesses the
policy in order to verify the signed images that are discovered in a host-bus adapter
or on a disk. If the images pass the policy, they are invoked.

UEFI Secure Boot implementations use these keys:

#. Platform Key (PK) is the top-level key in Secure Boot; UEFI supports a single PK,
   which is generally provided by the manufacturer.
#. Key Exchange Key (KEK) is used to sign Signature and Forbidden Signature Database updates.
#. Signature Database (db) contains keys and/or hashes of allowed EFI binaries.

Keys and certificates come in multiple formats:

#. ``.key`` PEM format private keys for EFI binary and EFI signature list signing.
#. ``.crt`` PEM format certificates for sbsign.
#. ``.cer`` DER format certificates for firmware.

In ACRN, User VM Secure Boot can be enabled as follows:

#. Generate keys (PK/KEK/DB) with a key generation tool such as Ubuntu
   KeyGeneration. ``PK.der``, ``KEK.der``, and ``db.der`` will be enrolled in UEFI
   BIOS. ``db.key`` and ``db.crt`` will be used to sign the User VM
   bootloader/kernel.
#. Create a virtual disk to hold ``PK.der``, ``KEK.der``, and ``db.der``, then launch
   the User VM with this virtual disk.
#. Start the OVMF in writeback mode to ensure the keys are persistently stored
   in the OVMF image.
#. Enroll the keys in the OVMF GUI by following the Secure Boot configuration
   flow and enable Secure Boot mode.
#. Perform writeback via reset in OVMF.
#. Sign the User VM images with ``db.key`` and ``db.crt``.
#. Boot the User VM with Secure Boot enabled.

.. _service_vm_hardening:

Service VM Hardening
--------------------

In the ACRN project, the reference Service VM is based on Ubuntu.
Customers may choose to use different open source OSes or their own
proprietary OS systems. To minimize the attack surfaces and achieve the
goal of "defense in depth", there are many common guidelines to ensure the
security of the Service VM system.

As shown in :numref:`security-bootflow-sbl` and
:numref:`security-bootflow-uefi` above, the integrity of the User VM
depends on the integrity of the DM module and vBIOS/vOSloader in the
Service VM. Hence, Service VM integrity is critical to the entire User VM security.
If the Service VM system is compromised, all the other User VMs may be
jeopardized.

In practice, the Service VM designer and implementer should obey at least the
following rules:

#. Verify that the Service VM is a closed system and doesn't allow the user to
   install any unauthorized third-party software or components.
#. Verify that external peripherals are constrained.
#. Enable kernel-based hardening techniques, for example, dm-verity (to
   ensure the integrity of the DM and vBIOS/vOSloaders), and kernel module
   signing.
#. Enable system level hardening such as MAC (Mandatory Access Control).

Detailed configurations and policies are out of scope for this document.
For good references on OS system security hardening and enhancement,
see `AGL security
<https://docs.automotivelinux.org/en/lamprey/#2_Architecture_Guides/2_Security_Blueprint/0_Overview/>`_
and `Android security <https://source.android.com/security/>`_.

Hypervisor Security Enhancement
===============================

This section describes the ACRN hypervisor security enhancement for
memory boundary access and interfaces between VMs and the hypervisor,
such as Hypercall APIs, I/O emulations, and EPT violation handling.

The main security goal of the ACRN hypervisor design is to prevent
privilege escalation and enforce isolation. For example, the design
must prevent:

- VMM privilege escalation (VMX non-root -> VMX root)
- Non-secure OS software (running in AaaG) accessing secure world TEE
  assets
- Unauthorized software executing in the hypervisor
- Cross-guest VM attacks
- Hypervisor secret information leakage

Memory Management Enhancement
-----------------------------

Background
~~~~~~~~~~

The ACRN hypervisor has ultimate access control of all the platform
memory spaces (see :ref:`memmgt-hld`). Note that on the APL platform,
`SGX <https://www.intel.com/content/www/us/en/developer/tools/software-guard-extensions/overview.html>`_ and `TME
<https://itpeernetwork.intel.com/memory-encryption/>`_
are not supported.

The hypervisor can read and write any physical memory space allocated
to any guest VM, and can even fetch instructions and execute the code in
the memory space from any guest VM. If the hypervisor's MMU is
misconfigured or the hypervisor is compromised by an attacker, it must be
constrained in some manner to prevent the hypervisor from accessing
guest memory space either maliciously or accidentally. As a best
security practice, any memory content from a guest VM memory space must
not be trusted by the hypervisor. In other words, there must be a trust
boundary for memory space between the hypervisor and guest VMs.

.. figure:: images/security-image14.png
   :width: 500px
   :align: center
   :name: security-hgmem

   Hypervisor and Guest Memory Layout

The hypervisor must appropriately configure the EPT tables to prevent
any guest from accessing (read/write/execute) the memory space owned by
the hypervisor.

Memory Access Restrictions
~~~~~~~~~~~~~~~~~~~~~~~~~~

The fundamental rules of restricting hypervisor memory access are:

#. By default, prohibit any access to all guest VM memory. This means
   that when the hypervisor initially sets up its own MMU paging tables
   (HVA->HPA mapping), it only grants permissions for hypervisor memory
   space (excluding guest VM memory).
#. Grant access permission for the hypervisor to read/write a specific guest
   VM memory region on demand. The hypervisor must never grant execution
   permission for itself to fetch any code instructions from guest
   memory space because there is no reason to do that.

In addition to these rules, the hypervisor must also implement generic
best-practice memory configurations for access to its own memory in host
CR3 MMU paging tables, such as splitting hypervisor code and data
(stack/heap) sections, and then applying W |oplus| X policy, which means if memory
is Writable, then the hypervisor must make it non-eXecutable. The
hypervisor must configure its code as read-only and executable, and
configure its data as read/write. Optionally, if there are read-only
data sections, it would be best if the hypervisor configures them as
read-only.

The following sections focus on the rules mentioned above for
memory access restriction on guest VM memory (not restrictions on the
hypervisor's own memory access).

SMAP/SMEP Enablement in the Hypervisor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For the hypervisor to isolate access to the guest VM memory space,
three typical solutions exist:

#. **Configure the hypervisor/VMM MMU CR3 paging tables by removing the
   execution permission (setting NX bit) or removing mapping completely
   (setting not-present) for the guest memory space.**

   In practice, the NX setting works very well to disable
   instruction fetching from any guest memory space. However, it is not
   suitable for read/write access isolation. For example, if the
   hypervisor removes the mapping to a guest memory page in host CR3
   paging tables, when the hypervisor wants to access that specific
   guest memory page, the hypervisor must first add the mapping back to its
   CR3 paging tables before accessing that page, and revert the mapping
   after accessing.

   This remapping causes code complexity and a performance penalty and
   may even require the hypervisor to flush the TLB. This solution won't
   be used by the ACRN hypervisor.

#. **Use CR0.WP (write-protection) bit.**

   This processor feature allows
   pages to be protected from supervisor-mode write access.
   If the host/VMM CR0.WP = 0, supervisor-mode write access is
   allowed to linear addresses with read-only access rights. If CR0.WP =
   1, they are not allowed. User-mode write access is never allowed
   for linear addresses with read-only access rights, regardless of the
   value of CR0.WP.

   To implement this WP protection, the hypervisor must first configure
   all the guest memory space as "user-mode" accessible memory, and as
   read-only access. In other words, for all those guest memory pages,
   the U/S bit must be set and the R/W bit must be cleared
   (U/S = 1, R/W = 0) in the corresponding host CR3 paging table entries.

   .. figure:: images/security-image3.png
      :width: 500px
      :align: center
      :name: security-gmem

      Configure Guest Memory as User-accessible

   This setting seems meaningless since all the code in the ACRN hypervisor
   is running in Ring 0 (supervisor-mode), and no code in the hypervisor
   will be executed in Ring 3 (no user-mode applications in the hypervisor /
   vmx-root).

   However, these settings are made in order to make use of the CR0.WP
   protection capability. When CR0.WP = 1 and hypervisor code
   running in Ring 0 maliciously attempts to write a user-accessible
   read-only memory page (in guest memory space), the processor thwarts this
   malicious behavior with a page fault (#PF) in the
   hypervisor. Whenever the hypervisor has a valid reason to
   write to user-accessible read-only memory (guest memory), it can
   disable CR0.WP (clear CR0.WP) before writing, and then set CR0.WP
   back to 1.

   This solution is better than the first solution above because it doesn't
   need to change the host CR3 paging tables to map or unmap guest memory
   pages and doesn't need to flush the TLB.
   However, it cannot prevent the hypervisor (running in Ring 0 mode) from
   reading guest memory space because the CR0.WP bit doesn't control read
   access behaviors. Read access protection is essential
   because guest memory may contain secrets, and if the
   hypervisor can be hacked to read those memory contents, the
   secrets may leak to attackers.

#. **Use processor SMEP and SMAP capabilities.**

   This solution is the best solution because SMAP can prevent the
   hypervisor from both reading and writing guest memory, and SMEP can
   prevent the hypervisor from fetching/executing code in guest memory. This
   solution also has minimal performance impact; like the CR0.WP
   protection, it doesn't require a TLB flush (which would incur a performance
   penalty) and has less code complexity.

The following sections will focus on this SMEP/SMAP protection. SMEP
and SMAP are widely used by modern operating systems such as
Windows and Linux for isolating kernel and user memory, and can
mitigate many vulnerability exploits.
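
The difference between the three options can be seen in a small model of the
processor's permission check for a Ring 0 access. The following C sketch is
illustrative only and is not ACRN code: ``ring0_access_allowed()`` and
``struct cpu_ctrl`` are hypothetical names, and the model is simplified
(a single paging-structure entry, explicit accesses only, no protection keys).
Only the bit positions of P, R/W, U/S, and NX are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64 page-table entry flag bits (architectural positions). */
#define PTE_P   (1ULL << 0)   /* present */
#define PTE_RW  (1ULL << 1)   /* writable */
#define PTE_US  (1ULL << 2)   /* user-accessible */
#define PTE_NX  (1ULL << 63)  /* no-execute */

enum access_type { ACCESS_READ, ACCESS_WRITE, ACCESS_EXEC };

/* Hypothetical snapshot of the control bits discussed in this section. */
struct cpu_ctrl {
    bool cr0_wp;
    bool cr4_smep;
    bool cr4_smap;
    bool rflags_ac;
};

/*
 * Simplified model: returns true if a Ring 0 access to a page mapped with
 * `pte` would succeed, false if the processor would raise #PF.
 */
bool ring0_access_allowed(uint64_t pte, enum access_type type,
                          const struct cpu_ctrl *cpu)
{
    bool user_page = (pte & PTE_US) != 0U;

    if ((pte & PTE_P) == 0U) {
        return false;           /* option 1: mapping removed */
    }
    if (type == ACCESS_EXEC) {
        /* NX blocks any fetch; SMEP blocks fetches from user pages. */
        return ((pte & PTE_NX) == 0U) && !(cpu->cr4_smep && user_page);
    }
    /* SMAP blocks reads and writes to user pages unless AC is set. */
    if (cpu->cr4_smap && user_page && !cpu->rflags_ac) {
        return false;
    }
    /* CR0.WP only blocks writes to read-only pages, never reads. */
    if ((type == ACCESS_WRITE) && cpu->cr0_wp && ((pte & PTE_RW) == 0U)) {
        return false;
    }
    return true;
}
```

In this model, a guest page mapped as ``PTE_P | PTE_US`` with only CR0.WP set
rejects writes but still permits reads, while a page mapped as
``PTE_P | PTE_US | PTE_RW`` with SMEP and SMAP set rejects reads, writes, and
instruction fetches until the AC flag is raised.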

Guest Memory Execution Prevention
+++++++++++++++++++++++++++++++++

SMEP is designed to prevent user memory malware (typically
attacker-supplied) from being executed in the kernel (Ring 0) privilege
level. As long as CR4.SMEP = 1, software operating in supervisor
mode cannot fetch instructions from linear addresses that are accessible
in user mode.

In the ACRN hypervisor, the attacker-supplied memory could be any guest
memory, because by design the hypervisor doesn't trust any data or code
from guest memory.

In order to activate SMEP protection, the ACRN hypervisor must:

#. Configure all the guest memory as user-accessible memory (U/S = 1),
   regardless of the NX bit and R/W bit settings in the corresponding host
   CR3 paging tables.
#. Set CR4.SMEP bit. In the entire life cycle of the hypervisor, this bit
   value always remains one.

As an alternative, the NX feature can be used for this purpose by setting the
corresponding NX (non-execution) bit for all the guest memory mapping
in host CR3 paging tables.

Since the hypervisor code never runs in Ring 3 mode, either of these two
solutions works very well. Both solutions are enabled in the ACRN
hypervisor.

Guest Memory Access Prevention
++++++++++++++++++++++++++++++

Supervisor Mode Access Prevention (SMAP) is yet another powerful
processor feature that makes it harder for malware to
"trick" the kernel into using instructions or data from a user-space
application program.

This feature is controlled by the CR4.SMAP bit. When that bit is set,
any attempt to access user-accessible memory pages while running in a
privileged or kernel mode will lead to a page fault.

However, there are times when the kernel legitimately needs to work with
user-accessible memory pages. The Intel processor defines a separate
"AC" flag (in the RFLAGS register) that controls the SMAP feature.
If the AC
flag is clear, SMAP protection is in force when CR4.SMAP=1; otherwise
access to user-accessible memory pages is allowed even if CR4.SMAP=1.
The "AC" flag provides suppression for SMAP enforcement.

To manipulate that flag relatively quickly, the STAC (set AC flag) and CLAC
(clear AC flag) instructions were introduced. Note that
STAC and CLAC can only be executed in kernel mode (CPL=0).

To activate SMAP protection in the ACRN hypervisor:

#. Configure all the guest memory as user-writable memory (U/S bit = 1,
   and R/W bit = 1) in corresponding host CR3 paging table entries, as
   shown in :numref:`security-smap` below.
#. Set CR4.SMAP bit. In the entire life cycle of the hypervisor, this bit
   value always remains one.
#. When needed, use STAC instruction to suppress SMAP protection, and
   use CLAC instruction to restore SMAP protection.

.. figure:: images/security-image5.png
   :width: 500px
   :align: center
   :name: security-smap

   Setting SMAP and Configuring U/S=1, R/W=1 for All Guest Memory Pages

For example, :numref:`security-hagm` shows a module of the hypervisor code
(running in Ring 0 mode) attempting to perform a legitimate read (or
write) access to a data area in a guest memory page.

.. figure:: images/security-image4.png
   :width: 500px
   :align: center
   :name: security-hagm

   Hypervisor Access to Guest Memory

The hypervisor can do these steps:

#. Execute STAC instruction to suppress SMAP protection.
#. Perform read/write access on guest DATA area.
#. Execute CLAC instruction to restore SMAP protection.

The attack surface can be minimized because there is only a
very small window between step 1 and step 3 in which the guest memory
can be accessed by hypervisor code running in ring 0.
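
The three steps above can be sketched as a bracketed access. Because STAC and
CLAC cannot be executed outside CPL 0, the following C example simulates the
AC flag with a variable; all names (``sim_stac()``, ``sim_clac()``,
``sim_guest_read()``, ``read_guest_data()``) are hypothetical and exist only
for illustration. In a real hypervisor the two helpers would execute the STAC
and CLAC instructions, and the rejected access would be a #PF raised by the
processor.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simulated RFLAGS.AC; real code would execute STAC/CLAC instead. */
bool sim_ac_flag;
/* Number of accesses rejected while SMAP was enforcing (would be #PF). */
unsigned int sim_smap_faults;

void sim_stac(void) { sim_ac_flag = true; }   /* step 1: suppress SMAP */
void sim_clac(void) { sim_ac_flag = false; }  /* step 3: restore SMAP */

/* Simulated read of user-accessible (guest) memory with CR4.SMAP = 1. */
bool sim_guest_read(void *dst, const void *guest_src, size_t len)
{
    if (!sim_ac_flag) {
        sim_smap_faults++;      /* access outside the STAC/CLAC window */
        return false;
    }
    memcpy(dst, guest_src, len);
    return true;
}

/* Legitimate access: keep the window between STAC and CLAC minimal. */
bool read_guest_data(void *dst, const void *guest_src, size_t len)
{
    bool ok;

    sim_stac();
    ok = sim_guest_read(dst, guest_src, len);   /* step 2 */
    sim_clac();
    return ok;
}
```

Calling ``sim_guest_read()`` directly fails and increments the fault counter,
whereas ``read_guest_data()`` succeeds and leaves the AC flag clear on return,
so any later stray access is rejected again.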

Rules to Access Guest Memory in the Hypervisor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the ACRN hypervisor, functions ``stac()`` and ``clac()`` wrap
STAC and CLAC instructions respectively, and functions
``copy_to_gpa()`` and ``copy_from_gpa()`` can be used to copy
an arbitrary amount of data to or from the VM memory area.

Whenever the hypervisor needs to perform legitimate read/write access to
guest memory pages, one of the functions above must be used. Otherwise, a
#PF will be triggered by the processor to prevent malware or
unintended access from or to the guest memory pages.

These functions must also internally check address validity,
for example, ensuring the input address accessed by the hypervisor has
a valid mapping (GVA->GPA mapping, GPA->HPA EPT mapping and HVA->HPA
host MMU mapping), and is not in the range of the hypervisor memory.
Details of these ordinary checks are out of scope in this document.


Avoidance of Memory Information Leakage
---------------------------------------

Protecting the hypervisor's memory is critical to the security of the
entire platform. The hypervisor must prevent any memory content (e.g.,
stack or heap) from leaking to guest VMs. Some of the hypervisor memory
content may contain platform secrets such as SEEDs, which are used as
the root key for its guest VMs. `Xen Advisories
<https://xenbits.xen.org/xsa/>`_ have many examples of past hypervisor
memory leaks. ACRN developers can refer to this link to understand how
to avoid such leaks in coding.

Memory content from one guest VM might be leaked to another guest VM.
In ACRN and Device Model design, when one guest VM is destroyed or
crashes, its memory content should be scrubbed either by the hypervisor
or the Service VM Device Model process. Otherwise, if the memory is
re-allocated to another guest VM, the secrets of the
previous guest VM would remain readable in that memory.

.. _secure-hypervisor-interface:

Secure Hypervisor Interface
---------------------------

Hypercall API Interface Hardening
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The hypercall API is the primary interface between a guest VM and the
hypervisor.

.. figure:: images/security-image-HC-interface-restriction.png
   :width: 900px
   :align: center
   :name: security-hir

   Hypercall Interface Restriction

As shown in :numref:`security-hir`, there are some restrictions for
hypercall invocation in the hypervisor design:

#. Hypercalls from ring 1~3 of any guest VM are not allowed. The
   hypervisor must discard such hypercalls and inject ``#GP(0)`` instead. Only ring-0
   hypercalls from the guest VM are handled by the hypervisor.
#. All the hypercalls (except world\_switch hypercall) must be called
   from the ring-0 driver of the Service VM.
   The world\_switch hypercall is used by the TIPC (Trusty IPC) driver to
   switch guest VM context between secure world and non-secure world.
   Further details will be discussed in the :ref:`secure_trusty` section.
   When a vCPU issues an unpermitted hypercall, the hypervisor shall either
   inject ``#UD`` (if the VM cannot issue hypercalls at all) or return ``-EINVAL``
   (if the VM is allowed to issue hypercalls but not this specific one).
#. For those hypercalls that may cause data inconsistency within the hypervisor
   when they are executed concurrently, such as ``hcall_create_vm()`` or
   ``hcall_destroy_vm()``, a spinlock is used to ensure these hypercalls
   are processed in the hypervisor in a serialized way.
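
The first two restrictions can be summarized as a small decision function.
The C sketch below is illustrative and is not the ACRN dispatcher: the type
and function names, the ``HC_ID_WORLD_SWITCH`` value, and the capability
flags are hypothetical. The order in which the checks are applied is also an
assumption, since this document does not specify it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of the permission check for one hypercall request. */
enum hc_verdict {
    HC_HANDLE,          /* dispatch to the hypercall handler */
    HC_INJECT_GP,       /* inject #GP(0): caller not in ring 0 */
    HC_INJECT_UD,       /* inject #UD: VM cannot issue hypercalls at all */
    HC_RETURN_EINVAL    /* return -EINVAL: this hypercall is not permitted */
};

/* Hypothetical per-VM capabilities. */
struct vm_caps {
    bool is_service_vm;     /* may issue management hypercalls */
    bool may_world_switch;  /* has a secure world (Trusty) */
};

#define HC_ID_WORLD_SWITCH 1U   /* hypothetical hypercall number */

enum hc_verdict check_hypercall(const struct vm_caps *vm, uint8_t cpl,
                                uint32_t hc_id)
{
    /* A VM with no hypercall permission at all gets #UD. */
    if (!vm->is_service_vm && !vm->may_world_switch) {
        return HC_INJECT_UD;
    }
    /* Hypercalls from rings 1 to 3 are discarded with #GP(0). */
    if (cpl != 0U) {
        return HC_INJECT_GP;
    }
    /* world_switch is the only hypercall not reserved to the Service VM. */
    if (hc_id == HC_ID_WORLD_SWITCH) {
        return vm->may_world_switch ? HC_HANDLE : HC_RETURN_EINVAL;
    }
    return vm->is_service_vm ? HC_HANDLE : HC_RETURN_EINVAL;
}
```

Keeping the permission decision in one pure function makes it easy to review
and to test exhaustively, independently of the handlers it guards.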

In addition to the above rules, there are other regular checks in the
hypercall implementation to prevent hypercalls from being misused. For
example, all the parameters must be sanitized, unexpected hypervisor
memory overwrite must be avoided, any hypervisor memory content/secrets
must not be leaked to guests, and any memory/code injection must be
eliminated.

I/O Emulation Handler
~~~~~~~~~~~~~~~~~~~~~

I/O port monitoring is also widely used by the ACRN hypervisor to
emulate legacy I/O access behaviors.

Typically, the I/O instructions could be IN, INS/INSB/INSW/INSD, OUT,
OUTS/OUTSB/OUTSW/OUTSD with arbitrary port (although not all the I/O
ports are monitored by the hypervisor). As with other interfaces (e.g.,
hypercalls), the hypervisor performs security checks for all the I/O
access parameters to make sure the emulation behaviors are correct.

EPT Violation Handler
~~~~~~~~~~~~~~~~~~~~~

The Extended Page Table (EPT) is typically used by the hypervisor to
monitor MMIO (or other types of ordinary memory access) operations from a
guest VM. The hypervisor then emulates the MMIO instructions according to
the designed behavior.

As with I/O emulation, this interface could also be manipulated by
malware in a guest VM to compromise system security.

Other VMEXIT Handlers
~~~~~~~~~~~~~~~~~~~~~

There are some other VMEXIT handlers in the hypervisor that might take
untrusted parameters and registers from a guest VM, for example, MSR write
VMEXIT and APIC VMEXIT.

Sanity checks are performed by the hypervisor to avoid security issues when
handling those special VMEXITs.

Guest Instruction Emulation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Instruction emulation implemented by the hypervisor must also be
security-checked. Emulating x86 instructions is complicated, and many
security CVEs have been reported in the KVM/XEN/QEMU
community.
This is a "hotspot" where the hypervisor may potentially
have vulnerabilities.

Security validation process and secure code review must ensure all the
instruction emulations behave as defined in the `IA32 SDM
document <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html>`_.

Virtual Power Life Cycle Management
-----------------------------------

In a virtualization environment, each User VM can have its
virtual power managed just like native behavior. For example, if a User VM
is required to enter S3 (Suspend to RAM) to save power,
then the hypervisor and the DM process in the Service VM must handle it correctly.
Similarly, virtual cold/warm reboot is also supported. How to implement
virtual power life cycle management is out of scope in this document.

This subsection is intended to describe the security issues for those
power cycles.

User VM Power On and Shutdown
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The memory of the User VM is allocated dynamically by the DM
process in the Service VM before the User VM is launched. When the User VM
is shut down (or crashes), its memory is freed to the Service VM memory space.
Later on, when a new User VM is launched, the DM may allocate
the same memory region (or an overlapping one) for this new User VM.

In the virtualization environment, a security goal is to ensure User VM
isolation, not only for runtime memory isolation (e.g., with EPT),
but also for data at rest isolation.

In this situation, if the memory content of a previous User VM is not
scrubbed by either the DM or the hypervisor, then the newly launched User VM could
access the previous User VM's secrets by scanning the memory regions
allocated for the new User VM.

In ACRN, the memory content is scrubbed in the Device Model after the guest
VM is shut down.
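
A scrub routine of this kind can be sketched in C as follows. This is a
minimal illustration, not the Device Model implementation: the function names
``scrub_guest_memory()`` and ``region_is_scrubbed()`` are hypothetical, and
the choice of a ``volatile`` byte loop (so that the compiler cannot remove
the stores as dead writes to memory that is about to be freed) is an
assumption about one reasonable way to do it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Overwrite a guest memory region with zeros before it is released.
 * The volatile qualifier keeps the compiler from eliding the stores.
 */
void scrub_guest_memory(void *base, size_t len)
{
    volatile uint8_t *p = (volatile uint8_t *)base;
    size_t i;

    for (i = 0U; i < len; i++) {
        p[i] = 0U;
    }
}

/* Check helper: true if every byte of the region is zero. */
bool region_is_scrubbed(const void *base, size_t len)
{
    const uint8_t *p = (const uint8_t *)base;
    size_t i;

    for (i = 0U; i < len; i++) {
        if (p[i] != 0U) {
            return false;
        }
    }
    return true;
}
```

The scrub must run before the region is returned to the allocator; once the
memory has been handed to a new User VM, it is too late to protect the
previous VM's data.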

User VM Reboot
~~~~~~~~~~~~~~

The behavior of a virtual **cold** reboot of a User VM is the same as that of
the virtual power-on and shutdown events described previously. There is a
special case: virtual **warm** reboot.

When a User VM encounters a panic, its kernel may trigger a warm reboot, so
that in the next power cycle, a special purpose-built OS image is
launched to dump the memory content for debugging analysis. In a warm
reboot, the memory content must be preserved after a virtual power
cycle. However, this violates the security rules above.

This typically is fine in project ACRN, because in the next virtual
power cycle, the same memory content won't be re-allocated to another
User VM.

But there is a new issue when the secure world (TEE/Trusty) is considered,
because the memory content of the secure world must not be dumped by a
non-secure world User VM. More details will be discussed in
the section on :ref:`platform_root_of_trust`.

Normally, this warm reboot (crashdump) feature is a debug feature, and
must be disabled in a production release. Users who want to use this
feature must possess the private signing key to re-sign the image after
enabling the configuration.

.. _user_vm_suspend_resume:

User VM Suspend/Resume
~~~~~~~~~~~~~~~~~~~~~~

There are no special design considerations for normal User VMs without secure
world support, as long as the EPT/VT-d memory protection/isolation is
active during the entire suspended time.

The secure world (Trusty/TEE) is a special case for virtual suspend. Unlike
the non-secure world of User VMs, whose memory content can be read/written by
the Service VM, the memory content of the secure world of User VMs must not be
visible to the Service VM. This is designed for security with defense in depth.

During the entire process of User VM sleep/suspend, the memory protection
for the secure world is preserved too. The physical memory region of the
secure world is removed from the EPT paging tables of every guest VM,
including the Service VM.

Third-Party Libraries
---------------------

All the third-party libraries must be examined before use to verify
there are no known vulnerabilities in the library source code.
Typically, the CVE site https://cve.mitre.org/cve/search_cve_list.html
can be used to search for known vulnerabilities.

.. _platform_root_of_trust:

Platform Root of Trust Key/Seed Derivation
==========================================

For security reasons, each guest VM requires a root key, which is used to
derive many other individual keys for different purposes, for example,
secure storage encryption, keystore master key, and HMAC keys.

On the APL platform, CSE FW generates a platform SEED (pSEED, 256-bit)
that is unique per device because it is derived from a unique chipset secret
burned into the chip.

Then on each boot, the SBL BIOS is responsible for retrieving the pSEED
from CSE FW, and deriving two other derivatives (dSEED and uSEED).

.. figure:: images/security-image-platform-seed-derivation.png
   :width: 900px
   :align: center
   :name: security-seed

   Platform SEED (pSEED) Derivation

As shown in :numref:`security-seed` above, the hypervisor then derives
multiple child SEEDs for multiple guest VMs. A guest VM must not be able
to know the SEEDs of any other guest VMs.

The algorithm used in the hypervisor to derive keys is HKDF (HMAC-based
Extract-and-Expand Key Derivation Function), `RFC5869
<https://tools.ietf.org/html/rfc5869>`_. The crypto library `mbedtls
<https://github.com/ARMmbed/mbedtls>`_ has been chosen for project ACRN.

The parameters of HKDF derivation in the hypervisor are:

#. VMInfo = VM name (from the hypervisor configuration file)
#. theHash = SHA-256
#. OutSeedLen = 64 in bytes
#. Guest Dev and User SEED (dvSEED/uvSEED)

   ``dvSEED = HKDF(theHash, nil, dSEED, VMInfo\|"devseed", OutSeedLen)``

   ``uvSEED = HKDF(theHash, nil, uSEED, VMInfo\|"userseed", OutSeedLen)``

.. _secure_trusty:

Secure Isolated World (Trusty)
==============================

This section explains how to build a secure isolated world in a specific
guest VM such as the Android User VM. (See :ref:`trusty_tee` for more
information.)

On the APL platform, the secure world is used to run a
virtualization-based Trusty TEE in an isolated world that serves
Android as a Guest (AaaG), so that it can obtain the relevant Android
certification from Google by fulfilling Android CDD requirements. There is
also a plan for Trusty to provide security services for LaaG User VMs.

Refer to this Google website for `Trusty details
<https://source.android.com/security/trusty/>`_ and for `Android CDD
documents <https://source.android.com/compatibility/cdd>`_.

Secure World Architecture Design
--------------------------------

To support a VT-TEE (Virtualization Technology based TEE) Trusty on
ACRN, the hypervisor creates an isolated secure world in a User VM.

.. figure:: images/security-image-secure-world.png
   :width: 900px
   :align: center
   :name: security-secure-world

   Secure World

In :numref:`security-secure-world`, the Trusty OS runs in the User VM secure
world and a Linux- or Android-based User VM runs in the non-secure world.

By design, the secure world is able to read and write to all the non-secure
world's memory space. But non-secure world applications cannot have
access to the secure world's memory. This is guaranteed by switching
between different EPT tables when a world switch (WS) hypercall is invoked.
The
WS hypercall can have parameters to specify the command ID of the service
requested from the non-secure world.

To implement the "one VM, two worlds" architecture, there is a single
User VM structure per User VM in the hypervisor, but two vCPU structures that
save the non-secure world and secure world virtual logical processor states,
respectively.

Whenever there is a WS hypercall from the non-secure world, the hypervisor
copies the non-secure world CPU contexts from the Guest VMCS to the non-secure
world vCPU structure to save them, copies the secure world CPU
contexts from the secure world vCPU structure to the Guest VMCS, and then does
a VMRESUME to the secure world, and vice versa. The EPTP pointer is also
updated accordingly in the VMCS (not shown in
:numref:`security-secure-world`).

Trusty (Secure World) Memory Mapping View
-----------------------------------------

As per the secure world design, Trusty can have read/write access to the
non-secure world's memory, but the non-secure world cannot access the Trusty
secure world's memory. In the hypervisor EPT configuration shown in
:numref:`security-mem-view` below, the secure world EPTP page table
hierarchy must contain the non-secure world address space, while the Trusty
world's address space must be removed from the non-secure world EPTP
page table hierarchy.

Since there is no need to allow Trusty to execute memory from the non-secure
world, for security reasons, the execution (X) permission must be removed
for the non-secure world address space in the secure world EPT table
configuration.

To save page tables and share the mappings for the non-secure world address
space, the hypervisor relocates the secure world's GPA to a very high
position: 511G-512G. Hence, the PML4 for the Trusty World is separate from
that of the non-secure world.
PDPT/PD/PT for low memory (<511G) are shared by both the
Trusty World's EPT and the non-secure world's EPT. PDPT/PD/PT for high
memory (>=511G) are valid for the Trusty World's EPT only.

.. figure:: images/security-image8.png
   :width: 900px
   :align: center
   :name: security-mem-view

   Memory View for User VM Non-secure World and Secure World

Trusty/TEE Hypercalls
---------------------

Two hypercalls are introduced to assist in secure world (Trusty/TEE)
execution on top of the hypervisor.

Hypercall - Trusty Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a User VM is created by the DM in the Service VM, if this User VM
supports a secure isolated world, then this hypercall will be invoked
by OSLoader (it could be the Android OS loader in
:numref:`security-bootflow-sbl` and
:numref:`security-bootflow-uefi` above) to create or initialize the
secure world (Trusty/TEE).

.. figure:: images/security-image9.png
   :width: 900px
   :align: center
   :name: security-start-flow

   Secure World Start Flow

In :numref:`security-start-flow` above, the OSLoader is responsible for
loading the TEE/Trusty image to a dedicated and reserved memory region and
locating the entry point of the TEE/Trusty executable. It then executes a
hypercall that exits to the hypervisor handler.

For security, the hypervisor removes the GPA->HPA
mapping of the secure world from the EPT paging tables of both the User VM
non-secure world and the Service VM. This prevents the
non-secure world and the Service VM from accessing the memory region of the
secure world, as previously mentioned.
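
The unmapping step can be pictured with a small C model. This is a toy
sketch only: it uses a flat, eight-page translation table instead of the
real multi-level EPT hierarchy, and every name in it (``toy_ept``,
``toy_ept_unmap_hpa_range``, and so on) is hypothetical rather than taken
from the ACRN source. It shows the idea that the same host physical range
is removed from both the User VM non-secure world table and the Service VM
table, so that later translations into that range fail.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_EPT_PAGES 8U
#define TOY_PAGE_SIZE 4096ULL

/* One toy translation entry: guest page index -> host physical address. */
struct toy_ept_entry {
    uint64_t hpa;
    int present;
};

/* Toy stand-in for one EPT hierarchy (flat table, hypothetical). */
struct toy_ept {
    struct toy_ept_entry e[TOY_EPT_PAGES];
};

/* Map every guest page to the host page with the same index. */
void toy_ept_identity_map(struct toy_ept *ept)
{
    for (size_t i = 0; i < TOY_EPT_PAGES; i++) {
        ept->e[i].hpa = i * TOY_PAGE_SIZE;
        ept->e[i].present = 1;
    }
}

/* Remove every mapping whose HPA lies in [start, end); return the count. */
size_t toy_ept_unmap_hpa_range(struct toy_ept *ept, uint64_t start,
                               uint64_t end)
{
    size_t removed = 0;

    for (size_t i = 0; i < TOY_EPT_PAGES; i++) {
        if (ept->e[i].present && ept->e[i].hpa >= start &&
            ept->e[i].hpa < end) {
            ept->e[i].present = 0;
            ept->e[i].hpa = 0;
            removed++;
        }
    }
    return removed;
}

/* Translate a GPA; returns 0 on success, -1 if the page is not mapped. */
int toy_ept_translate(const struct toy_ept *ept, uint64_t gpa, uint64_t *hpa)
{
    size_t idx = (size_t)(gpa / TOY_PAGE_SIZE);

    if (idx >= TOY_EPT_PAGES || !ept->e[idx].present) {
        return -1;
    }
    *hpa = ept->e[idx].hpa + (gpa % TOY_PAGE_SIZE);
    return 0;
}
```

In this model, a failed translation plays the role of the EPT violation
that a non-secure world or Service VM access to secure world memory would
trigger.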

After everything is set up by the hypervisor, including vCPU context
initialization, the hypervisor eventually does a VMRESUME (step 4 in
:numref:`security-start-flow` above) to the entry point of the secure world
TEE/Trusty. The Trusty OS then starts in VMX non-root mode,
initializes itself, and loads its TAs (Trusted Applications) so that the
security services are ready before the non-secure OS starts.

After the Trusty OS completes its initialization, a world switching (WS, see
subsection below) hypercall is invoked (step 5 in
:numref:`security-start-flow` above), and then the hypervisor takes
control back, and resumes the OSLoader (step 6 in
:numref:`security-start-flow` above) to continue execution in the guest
VM non-secure world context.

Note that this Trusty initialization hypercall can only be called once
in the User VM life cycle.

Hypercall - Trusty Switching
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There is another special hypercall introduced only for world switching
between the non-secure world and secure world in a User VM.

.. figure:: images/security-image-world-switching-HC.png
   :width: 900px
   :align: center
   :name: security-ws

   World Switching Hypercall

Whenever this hypercall is invoked in a User VM, the hypervisor will
unconditionally switch to the other world. For example, if it is called
in the non-secure world, the hypervisor will then switch context to the secure
world. After the secure world completes its security tasks (or an external
interrupt occurs), this hypercall will be called again, then the hypervisor
will switch context back to the non-secure world.

During the entire world switching process, the Service VM is not involved. This
hypercall is only available to a User VM that supports dual worlds.
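
The unconditional switch described above, together with the context
save/restore and EPTP update described in the architecture section, can be
sketched in C as follows. This is an illustrative model under stated
assumptions: the structure and function names (``toy_vm``,
``toy_world_switch``) are hypothetical, the CPU context is reduced to two
registers, and plain structure fields stand in for the guest-state area
and EPTP field of the VMCS.

```c
#include <stdint.h>

enum world {
    NON_SECURE_WORLD = 0,
    SECURE_WORLD = 1
};

/* Reduced CPU context; a real context holds many more registers. */
struct cpu_ctx {
    uint64_t rip;
    uint64_t rsp;
};

/* Toy model of one User VM with two worlds (hypothetical layout). */
struct toy_vm {
    enum world cur;            /* world that is currently running */
    struct cpu_ctx saved[2];   /* per-world saved vCPU context */
    uint64_t eptp[2];          /* per-world EPT pointer */
    struct cpu_ctx vmcs_guest; /* stands in for VMCS guest state */
    uint64_t vmcs_eptp;        /* stands in for the VMCS EPTP field */
};

/* Switch unconditionally to the other world. */
void toy_world_switch(struct toy_vm *vm)
{
    enum world next = (vm->cur == NON_SECURE_WORLD) ? SECURE_WORLD
                                                    : NON_SECURE_WORLD;

    /* Save the running world's context out of the (modeled) VMCS. */
    vm->saved[vm->cur] = vm->vmcs_guest;

    /* Load the other world's context and its EPT pointer. */
    vm->vmcs_guest = vm->saved[next];
    vm->vmcs_eptp = vm->eptp[next];
    vm->cur = next;
}
```

Calling ``toy_world_switch`` twice returns the VM to the world it started
in with its original context, which mirrors the round trip from the
non-secure world to the secure world and back.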

Secure Storage Virtualization
-----------------------------

Secure storage is one of the security services provided by the secure world
(TEE/Trusty). In the current implementation, secure storage is built
on the RPMB partition in eMMC (or UFS and NVMe storage). Details of how
RPMB works are out of scope for this document.

Since the eMMC in APL SoC platforms only has a single RPMB
partition for tamper-resistant and anti-replay secure storage, the
secure storage (RPMB) should be virtualized in order to support multiple
guest User VMs. Although future generations of flash storage
(e.g., UFS 3.0 and NVMe) support multiple RPMB partitions, this
document focuses only on the virtualization solution for
single-RPMB flash storage devices in APL SoC platforms.

:numref:`security-storage` gives a high-level architecture overview of
secure storage virtualization.

.. figure:: images/security-image-secure-storage-virt.png
   :width: 900px
   :align: center
   :name: security-storage

   Secure Storage Virtualization

In :numref:`security-storage`, the rKey is the physical RPMB
authentication key used for authenticated data read/write access between
the Service VM kernel and the physical RPMB controller in the eMMC device. The
VrKey is the virtual RPMB authentication key used for authentication
between the DM module in the Service VM and its corresponding User VM secure
software. Each User VM (if secure storage is supported) has its own VrKey,
which is generated randomly when the DM process starts and is securely
distributed to the User VM secure world on each reboot. The rKey is fixed on
a specific platform unless the eMMC is replaced with another one.

The details of physical RPMB key (rKey) provisioning are out of scope.
In
the current project ACRN on APL platforms, the rKey is provisioned by
BIOS (SBL) right after a production device ends its manufacturing process.

On each reboot, the BIOS/SBL always retrieves the rKey from CSE FW
(or generates it from a special SEED that is retrieved from CSE FW; refer
to :ref:`platform_root_of_trust`). The SBL hands this over to the
ACRN hypervisor, and the hypervisor in turn sends it to the Service VM kernel.

As an example, the secure storage virtualization workflow for data write
access is as follows:

#. User VM secure world (e.g., Trusty) packs the encrypted data and signs it
   with the vRPMB authentication key (VrKey), and sends the data along
   with its signature over the RPMB FE driver in the User VM non-secure world.
#. After the DM process in the Service VM receives the data and signature, the
   vRPMB module in the DM verifies them with the shared secret (vRPMB
   authentication key, VrKey).
#. If verification is successful, the vRPMB module does data address remap
   (remembering that the multiple User VMs share a single physical RPMB
   partition), and forwards the data to the Service VM kernel. The kernel packs
   the data and signs it with the physical RPMB authentication key
   (rKey). Eventually, the data and its signature will be sent to the
   physical eMMC device.
#. If the verification is successful in the eMMC RPMB controller, the
   data will be written into the storage device.

The workflow for an authenticated data read is very similar to the flow
above, but in reverse order.

Note that there are some security considerations in this design:

#. The rKey protection is very critical in this system. If it is
   leaked, an attacker can overwrite the data on RPMB, which
   violates the "tamper-resistant & anti-replay" capability.
#. Typically, the vRPMB module in the DM process of the Service VM system can
   filter data access, preventing one User VM from performing read/write
   access to the data from another User VM. If the vRPMB module in the DM
   process is compromised, one User VM may also change/overwrite the secure
   data of the other User VM.

Keeping the Service VM system as secure as possible is a very important goal in
the system security design. Follow the recommendations in
:ref:`service_vm_hardening`.

SEED Derivation
---------------

Refer to the previous section: :ref:`platform_root_of_trust`.

Trusty/TEE S3 (Suspend to RAM)
------------------------------

Secure world S3 design is not yet finalized. However, there is a
temporary solution as explained below to make it work on top of ACRN.

Two new hypercalls are introduced: one saves the secure world processor
contexts/states; the other one restores the secure world processor
contexts/states.

The save state hypercall is called only in the secure world (Trusty/TEE OS)
when Trusty receives a signal that the entire system is about to enter S3
(the non-secure OS actually issues this power event). The restore state
hypercall is called only by vBIOS when the User VM is ready to
resume from the suspend state.

For security design considerations of handling secure world S3,
read the previous section: :ref:`user_vm_suspend_resume`.

Platform Security Feature Virtualization and Enablement
=======================================================

This section describes how the hypervisor enables host CPU features
(e.g., SGX) and platform features (e.g., HECI) so that guest
VMs can use those features.

TPM 2.0 Virtualization (vTPM)
-----------------------------

On APL platforms, Intel |reg| PTT (Platform Trust Technology) implements TPM
functionalities based on the TCG TPM 2.0 industry standard specification.
PTT exposes the TPM hardware interface as CRB (Command Response Buffer)
defined in the TCG TPM 2.0 spec.

However, in project ACRN, TPM virtualization doesn't assume it is based
on PTT or discrete TPM; both TPMs (2.0) are supported by design.
Customers are free to use either PTT or discrete TPM (but not at the same
time). PTT, however, is a built-in TPM 2.0 implementation in APL
platforms and does not require extra BOM cost (unlike discrete TPM).

Note that the underlying CSE FW/HW implements PTT functionalities;
however, this TPM 2.0 feature does not rely on MEI/HECI virtualization.

Unlike the virtualization of regular hardware, virtualizing a TPM must
address both security and trust.

The goal of virtualization is to provide TPM functionality to each guest
VM, such as:

#. Allow programs to interact with a TPM in a virtual system the same
   way they interact with a TPM on the physical system.
#. Give each User VM its own unique, emulated, software TPM, for example,
   vPCR and vNVRAM.
#. Maintain a one-to-one mapping between running vTPM instances and the
   logical vTPM in each VM.
