.. _hld_splitlock:

Handling Split-Locked Access in ACRN
####################################

A split lock is any atomic operation whose operand crosses two cache
lines. Because the operation must be atomic, the system locks the bus
while the CPU accesses the two cache lines. Blocking bus access from
other CPUs, plus the overhead of the bus locking protocol, degrades
overall system performance.

This document explains Split-locked Access, how to detect it, and how
ACRN handles it.

Split-Locked Access Introduction
********************************
Intel 64 and IA-32 multiple-processor systems support locked atomic
operations on locations in system memory. For example, the LOCK instruction
prefix can be prepended to the following instructions: ADD, ADC, AND, BTC, BTR, BTS,
CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD,
and XCHG, when these instructions use memory destination operand forms.
Reading or writing a single byte in system memory is always guaranteed to be
atomic. For larger operands, a locked atomic operation affects the system in
one of two ways, depending on where the destination operand lies:

- **The destination operand is located in the same cache line.**

  Cache coherency protocols ensure that atomic operations can be
  carried out on cached data structures with cache lock.

- **The destination operand is located in two cache lines.**

  This atomic operation is called a Split-locked Access. For this situation,
  the LOCK# bus signal is asserted to lock the system bus, to ensure
  the operation is atomic. See `Intel 64 and IA-32 Architectures Software Developer's Manual (SDM), Volume 3, (Section 8.1.2 Bus Locking) <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html>`_.

Split-locked Access can cause unexpectedly long latency for ordinary memory
operations by other CPUs while the bus is locked. This degraded system
performance can be hard to investigate.
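
The distinction between the two cases above comes down to whether the first
and last bytes of the operand fall in the same cache line. The following is a
minimal sketch of that check, not code from ACRN. The helper name
``crosses_cache_line`` is hypothetical, and the 64-byte cache line size is an
assumption that should be confirmed for the target CPU.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Assumption: 64-byte cache lines; confirm for the target CPU. */
   #define CACHE_LINE_SIZE 64U

   /*
    * Hypothetical helper: return true if an operand of `size` bytes
    * starting at `addr` spans two cache lines. A locked instruction on
    * such an operand would be a Split-locked Access.
    */
   bool crosses_cache_line(uintptr_t addr, size_t size)
   {
       uintptr_t first_line;
       uintptr_t last_line;

       if (size == 0U) {
           return false;
       }

       /* Index of the cache line holding the first and the last byte. */
       first_line = addr / CACHE_LINE_SIZE;
       last_line = (addr + size - 1U) / CACHE_LINE_SIZE;

       return first_line != last_line;
   }

For example, an 8-byte operand at offset 60 within a cache line occupies
bytes 60 through 67 and therefore crosses into the next line, while the same
operand at offset 56 ends at byte 63 and stays within one line.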

Split-Locked Access Detection
*****************************
The `Intel Tremont Microarchitecture
<https://newsroom.intel.com/news/intel-introduces-tremont-microarchitecture>`_
introduced a new CPU capability for detecting Split-locked Access. When
this feature is enabled, an alignment check exception (#AC) with error
code 0 is raised for instructions causing a Split-locked Access. Because
#AC is a fault, the instruction is not executed, giving the #AC handler
an opportunity to decide how to handle this instruction:

- It can allow the instruction to run with the LOCK# bus signal asserted,
  potentially impacting the performance of other CPUs.
- It can disable LOCK# assertion for the Split-locked Access, but this
  improperly makes the instruction non-atomic.
- It can terminate the software at this instruction.

Feature Enumeration and Control
*******************************
The #AC for Split-locked Access feature is enumerated and controlled via
CPUID and MSRs.

- CPUID.(EAX=0x7, ECX=0):EDX[30] (bit 30 of the EDX output value) indicates
  whether the platform has the IA32_CORE_CAPABILITIES MSR.

- Bit 5 of the IA32_CORE_CAPABILITIES MSR (0xcf) enumerates whether the CPU
  supports #AC for Split-locked Access (and has the TEST_CTRL MSR).

- Bit 29 of the TEST_CTL MSR (0x33) enables or disables #AC for Split-locked
  Access.

ACRN Handling Split-Locked Access
*********************************
Split-locked Access is not expected in the ACRN hypervisor itself, and
should never happen. However, such access could happen inside a VM. ACRN
support for handling Split-locked Access follows these design principles:

- Always enable #AC on Split-locked Access for the physical processors.

- Present a virtual split-lock capability to guests (VMs), and directly
  deliver the alignment check exception (#AC) to the guest. (This
  virtual split-lock capability helps the guest isolate violations from
  user land.)

- Guest writes to MSR_TEST_CTL are ignored, and guest reads return the
  value the guest wrote.

- Any Split-locked Access in the ACRN hypervisor is a software bug we must fix.

- If a Split-locked Access happens in a guest kernel, the guest may not be able to
  fix the issue gracefully. (The guest may behave differently than the
  native OS.) The real-time (RT) guest must avoid Split-locked Access
  and consider it a software bug.

Enable Split-Locked Access Handling Early
==========================================
This feature is enumerated at the Physical CPU (pCPU) pre-initialization
stage, where ACRN detects CPU capabilities. If the pCPU supports this
feature:

- ACRN enables it at each pCPU post-initialization stage.

- The ACRN hypervisor presents a virtual emulated TEST_CTRL MSR to each
  Virtual CPU (vCPU).
  Setting or clearing TEST_CTRL[bit 29] in a vCPU has no effect.

If the pCPU does not have this capability, a vCPU does not have the virtual
TEST_CTRL either.

Expected Behavior in ACRN
=========================
The ACRN hypervisor should never trigger a Split-locked Access, and it is
not allowed to run with one. If ACRN does trigger a
Split-locked Access, ACRN reports #AC at the instruction and stops
running. The offending HV instruction is considered a bug that must be
fixed.

Expected Behavior in VM
=======================
If a VM process has a Split-locked Access in user space, it will be
terminated by SIGBUS. When debugging inside a VM, you may find it
triggers an #AC even if alignment checking is disabled.

If a VM kernel has a Split-locked Access, it will hang or oops on an
#AC. A VM kernel may try to disable #AC for Split-locked Access and
continue, but it will fail. The ACRN hypervisor helps identify the
problem by reporting a warning message that the VM tried writing to
the TEST_CTRL MSR.
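
The virtual TEST_CTRL behavior described above (guest writes have no effect
on the hardware, guest reads return the written value, and a write produces
a warning) can be modeled in a few lines. This is only an illustrative
sketch and not ACRN's implementation: ``struct vcpu_sketch``, its fields,
and the three functions are hypothetical names, the physical MSR is
represented by a plain variable, and a counter stands in for the warning
message.

.. code-block:: c

   #include <stdint.h>

   #define MSR_TEST_CTL            0x33U
   #define TEST_CTL_SPLIT_LOCK_AC  (1ULL << 29)

   /* Hypothetical per-vCPU state; not ACRN's actual data structure. */
   struct vcpu_sketch {
       uint64_t phys_test_ctl;   /* stand-in for the physical MSR value */
       uint64_t guest_test_ctl;  /* value the guest sees when it reads */
       uint32_t write_warnings;  /* stand-in for the logged warning */
   };

   void vcpu_sketch_init(struct vcpu_sketch *vcpu)
   {
       /* The hypervisor keeps #AC for Split-locked Access enabled. */
       vcpu->phys_test_ctl = TEST_CTL_SPLIT_LOCK_AC;
       vcpu->guest_test_ctl = 0ULL;
       vcpu->write_warnings = 0U;
   }

   /* Guest write: remember the value, warn, leave the hardware untouched. */
   void vcpu_sketch_wrmsr_test_ctl(struct vcpu_sketch *vcpu, uint64_t val)
   {
       vcpu->guest_test_ctl = val;
       vcpu->write_warnings++;
   }

   /* Guest read: return whatever the guest last wrote. */
   uint64_t vcpu_sketch_rdmsr_test_ctl(const struct vcpu_sketch *vcpu)
   {
       return vcpu->guest_test_ctl;
   }

In this model, a guest kernel that clears bit 29 reads back its own value,
yet the physical setting is unchanged, which is why the guest's attempt to
disable #AC and continue does not succeed.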


Disable Split-Locked Access Detection
=====================================
If the CPU supports Split-locked Access detection, the ACRN hypervisor
uses it to prevent any VM from running split-locked instructions that
could degrade system performance. This detection can be disabled
(by deselecting the :term:`Enable split lock detection` option in
the ACRN Configurator tool) for deployments where this performance
impact is not a concern.
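
The enumeration bits listed in Feature Enumeration and Control and the
configuration option above combine into a simple decision, sketched below.
The function and macro names are hypothetical, and the register values are
passed in as parameters so the logic can be exercised outside a hypervisor.
Real code would obtain them with the CPUID and RDMSR instructions. Leaving
TEST_CTL unchanged when detection is not selected is an assumption of this
sketch, not a statement about ACRN's behavior.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   #define CPUID_07_EDX_CORE_CAP       (1U << 30)   /* CPUID.(EAX=7,ECX=0):EDX[30] */
   #define MSR_IA32_CORE_CAPABILITIES  0xcfU
   #define CORE_CAP_SPLIT_LOCK         (1ULL << 5)  /* IA32_CORE_CAPABILITIES[5] */
   #define MSR_TEST_CTL                0x33U
   #define TEST_CTL_SPLIT_LOCK_AC      (1ULL << 29) /* TEST_CTL[29] */

   /*
    * Hypothetical helper: true if the CPU supports #AC for Split-locked
    * Access. `core_cap` is only meaningful when CPUID reports that the
    * IA32_CORE_CAPABILITIES MSR exists, so that bit is checked first.
    */
   bool supports_split_lock_ac(uint32_t cpuid_07_edx, uint64_t core_cap)
   {
       if ((cpuid_07_edx & CPUID_07_EDX_CORE_CAP) == 0U) {
           return false;
       }
       return (core_cap & CORE_CAP_SPLIT_LOCK) != 0ULL;
   }

   /*
    * Hypothetical helper: compute the TEST_CTL value to program on a pCPU.
    * Bit 29 is set only when the CPU supports the feature and detection is
    * selected in the configuration; otherwise the value is left unchanged
    * (an assumption of this sketch).
    */
   uint64_t next_test_ctl(uint64_t cur, bool supported, bool config_enabled)
   {
       if (supported && config_enabled) {
           return cur | TEST_CTL_SPLIT_LOCK_AC;
       }
       return cur;
   }

Checking the CPUID bit before interpreting the IA32_CORE_CAPABILITIES value
matters because reading an MSR that the CPU does not implement raises a
fault.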