1# Runtime Lock Validation in Zircon and Fuchsia 2 3## Introduction 4 5Lock validation is a technique for checking the consistency of locking behavior 6in a program to find potential deadlock hazards. This document discusses 7relevant aspects of the static and dynamic approaches to lock validation and 8presents the foundation for the runtime lock validation library available in 9Zircon and Fuchsia. 10 11## Background 12 13Lock validation may be performed either statically or dynamically. The following 14summarizes the important differences between static and dynamic approaches to 15lock validation: 16 17* When the validation is performed: compile time vs. run time. 18* How effective the validator is at finding potential problems. 19* What level of involvement is required by the programmer. 20* The overhead cost of the validation itself. 21 22### Static Validation 23 24Static validation is typically performed at compile time by analyzing the call 25graphs produced by the compiler or other source-level processor. With this 26approach it is necessary to instrument the code and locking primitives with 27annotations to inform the validator about which types represent locks and which 28rules to apply (or not) to the code that uses the lock types. 29 30The benefits of static validation include early detection of issues at build 31time, deterministic validation results, and zero runtime overhead. This 32combination of properties make it attractive to always enable static validation, 33ensuring that locking issues are often caught before code makes it into the 34build, without impacting the performance of the build artifacts. 35 36Static validation also has some down sides. One problem is that static 37validation requires correct, consistent application of a variety of annotations 38to both locks and code to provide useful results. This can cause maintenance 39issues unless diligent code review standards are implemented. Another issue is 40that static validation has limited visibility and can be fooled by conditional 41paths, dynamic dispatch, move semantics, and lock dependencies that span 42compilation units. 43 44### Dynamic Validation 45 46Dynamic validation is performed online at runtime by observing the relationships 47between locks as the code executes. With this approach it is generally 48sufficient to instrument just the locking primitives and acquire/release 49operations to provide the information required for validation. 50 51The benefits of dynamic validation include simpler instrumentation (from the 52user's perspective) and potentially greater visibility into the actual runtime 53behavior of the program. This makes dynamic validation useful in large code 54bases, where it may not be possible for static validation to see the full 55set of possible lock interactions. 56 57The main downsides of dynamic validation are runtime overhead and execution 58coverage requirements. Because dynamic validation must track lock interactions 59at runtime, each acquire and release incurs a non-zero execution cost to update 60tracking data, in addition to the memory overhead of the tracking data itself. 61Runtime tracking also has the consequence that code paths that are not executed 62cannot be analyzed by the validator. This may increase the burden on the 63developer and QA to ensure sufficient execution coverage if that is not already 64a project requirement. 65 66 67### Locking Ordering Invariant 68 69The job of the lock validator is to determine whether or not the lock invariants 70of the program hold. The primary invariant is the order between two or more 71locks: all paths in a program that acquire two or more locks must do so in an 72order consistent with every other path involving two or more of the same locks to 73avoid the potential for deadlock. Environments that deal with hardware 74interrupts, such as embedded systems and kernels, have an additional ordering 75invariant to avoid interrupt-induced deadlocks. These invariants are illustrated 76in the following subsections. 77 78##### Basic Inversion 79 80The simplest form of inversion occurs when a program has two locks that are 81both acquired sequentially with inconsistent orders in different paths. 82 83For example, a program with the locks **A** and **B** and code paths 84**P<sub>1</sub>** and **P<sub>2</sub>** and the following behavior has the 85potential for deadlock: 86 87Path **P<sub>1</sub>** acquires and releases the locks in the sequence: 88 891. Acquire(**A**) 902. Acquire(**B**) 913. Release(**B**) 924. Release(**A**) 93 94Path **P<sub>2</sub>** acquires and releases the locks in the inverted sequence: 95 961. Acquire(**B**) 972. Acquire(**A**) 983. Release(**A**) 994. Release(**B**) 100 101With the right interleaving, perhaps due to both paths executing concurrently 102on different threads, a deadlock occurs when path **P<sub>1</sub>** holds lock 103**A** and blocks waiting for lock **B**, while path **P<sub>2</sub>** holds lock 104**B** and blocks waiting for lock **A**. 105 106##### Circular Dependency 107 108Inversion may also occur between more than two locks and paths. This kind of 109inversion is much harder to recognize through manual inspection because each 110pair of locks involved may appear to be correctly ordered in every path involving 111just the pairs, and yet a potential deadlock may still exist given overall 112ordering of the locks. 113 114For example, a program with the locks **A**, **B**, and **C**; paths 115**P<sub>1</sub>**, **P<sub>2</sub>**, and **P<sub>3</sub>**; with the following 116behavior has the potential for deadlock: 117 118Path **P<sub>1</sub>** acquires and releases the locks in the sequence: 119 1201. Acquire(**A**) 1212. Acquire(**B**) 1223. Release(**B**) 1234. Release(**A**) 124 125Path **P<sub>2</sub>** acquires and releases the locks in the sequence: 126 1271. Acquire(**B**) 1282. Acquire(**C**) 1293. Release(**C**) 1304. Release(**B**) 131 132Path **P<sub>3</sub>** acquires and releases the locks in the sequence: 133 1341. Acquire(**C**) 1352. Acquire(**A**) 1363. Release(**A**) 1374. Release(**C**) 138 139With the right interleaving of paths **P<sub>1</sub>**, **P<sub>2</sub>**, 140and **P<sub>3</sub>** a deadlock occurs as each path acquires the lock at the 141first step and waits for the lock at the second step. In practice this situation 142may be compounded by the existence of many different paths that produce the same 143pairwise lock sequences. 144 145##### IRQ-Safe Ordering 146 147In systems that deal with hardware interrupts the ordering between irq-safe and 148non-irq-safe locks is critical: a non-irq-safe lock must never be acquired while 149holding an irq-safe lock to prevent indirect lock inversions. Irq-safe locks 150preserve ordering between irq and non-irq context; a consistent order of two or 151more irq-safe locks is guaranteed to be safe for paths running in both irq and 152non-irq context. The same is not true for non-irq-safe locks. The reason for this 153is that non-irq-safe locks permit irq handlers to effectively insert the locks 154acquired by the handler at arbitrary points in the interrupted task's lock 155sequences. 156 157For example, a system with non-irq-safe lock **A** and irq-safe lock 158**B<sub>irq</sub>**; paths **P<sub>1</sub>**, **P<sub>2</sub>**, and irq path 159**P<sub>irq</sub>**; with the following behavior has the potential for deadlock: 160 161Path **P<sub>1</sub>** on **CPU1** acquires and releases the lock in sequence: 162 1631. Acquire(**A**) 1642. _**P<sub>irq</sub>** interrupts here on **CPU1**_ 1653. Release(**A**) 166 167Path **P<sub>irq</sub>** on **CPU1** acquires and releases the lock in sequence: 168 1691. Acquire(**B<sub>irq</sub>**) 1702. Release(**B<sub>irq</sub>**) 171 172Path **P<sub>2</sub>** on **CPU2** acquires and releases the locks in sequence: 173 1741. Acquire(**B<sub>irq</sub>**) 1752. Acquire(**A**) 1763. Release(**A**) 1774. Release(**B<sub>irq</sub>**) 178 179With the right interleaving of paths **P<sub>1</sub>**, **P<sub>2</sub>**, and 180**P<sub>irq</sub>** a deadlock occurs as **P<sub>irq</sub>** attempts to acquire 181**B<sub>irq</sub>** while **P<sub>2</sub>** holds **B<sub>irq</sub>** and blocks 182waiting for **A**. This is an indirect lock inversion: **P<sub>irq</sub>** 183effectively inserts an acquire/release sequence of **B<sub>irq</sub>** in the 184middle of the acquire/release sequence of **A** in path **P<sub>1</sub>**, which 185is inconsistent with the lock sequence for the same locks in path 186**P<sub>2</sub>**. 187 188### Performing Validation 189 190The invariants discussed in the previous section can be validated using a finite 191directed graph. The directed graph tracks the identity and order of locks as the 192analysis traverses the code paths. Such a graph can be built either by traversing 193the call graphs generated by a compiler or source-level processor (static 194analysis) or by observing the ordering of locks during program execution (dynamic 195analysis). This section introduce the process in abstract terms that apply to 196either approach, in preparation for developing a concrete dynamic analysis 197strategy later on. 198 199In the most general terms, building a directed graph from a code path requires 200maintaining a list of actively held locks as the path is traversed: a node 201representing a lock is added to the list whenever the lock is acquired and 202removed from the list whenever the lock is released. In addition to maintaining 203the active list, a directed edge is added to the graph from a vertex representing 204the newly acquired lock to each vertex representing a lock already in the list. 205 206#### Basic Inversion Example 207 208This section illustrates a directed graph approach to detect a basic two-lock 209inversion. 210 211Recall from the earlier example a program with the locks **A** and **B**; code 212paths **P<sub>1</sub>** and **P<sub>2</sub>**; and the following behavior: 213 214Path **P<sub>1</sub>** acquires and releases the locks in the sequence: 215 2161. Acquire(**A**) 2172. Acquire(**B**) 2183. Release(**B**) 2194. Release(**A**) 220 221Path **P<sub>2</sub>** acquires and releases the locks in the inverted sequence: 222 2231. Acquire(**B**) 2242. Acquire(**A**) 2253. Release(**A**) 2264. Release(**B**) 227 228##### Analysis of Path **P<sub>1</sub>** 229 230Starting with path **P<sub>1</sub>** we define and update the directed graph. 231 232Let **L<sub>1</sub>** be the ordered _active_ list of locks held by path 233**P<sub>1</sub>**. 234 235Let **G** = (**V**, **E**) be the directed graph, having the set of vertices 236**V** representing observed locks and the set of directed edges between vertices 237**E**. 238 239Initial state: 240 241| **L<sub>1</sub>** | **V** | **E** | 242|-------------------|-------|-------| 243| () | {} | {} | 244 245After **P<sub>1</sub>** step 1: 246 247| **L<sub>1</sub>** | **V** | **E** | 248|-------------------|---------|-------| 249| (**A**) | {**A**} | {} | 250 251This step adds lock **A** to the active list and introduces a vertex for the 252same lock to the directed graph. Since there are no other locks in the active 253list no edges are added. 254 255After **P<sub>1</sub>** step 2: 256 257| **L<sub>1</sub>** | **V** | **E** | 258|-------------------|----------------|------------------| 259| (**A**, **B**) | {**A**, **B**} | {(**B**, **A**)} | 260 261This step adds lock **B** to the active list and also introduces a corresponding 262vertex to the graph. This time the active list does contain a lock, so an edge 263from the new lock to the existing lock is added to the graph. This edge 264represents that lock **B** now _depends_ on lock **A** preceding it in any 265other path that involves both locks. 266 267After **P<sub>1</sub>** step 3: 268 269| **L<sub>1</sub>** | **V** | **E** | 270|-------------------|----------------|------------------| 271| (**A**) | {**A**, **B**} | {(**B**, **A**)} | 272 273Lock **B** is removed from the active list. No updates to the graph. 274 275After **P<sub>1</sub>** step 4: 276 277| **L<sub>1</sub>** | **V** | **E** | 278|-------------------|----------------|------------------| 279| () | {**A**, **B**} | {(**B**, **A**)} | 280 281Lock **A** is removed from the active list. No updates to the graph. 282 283##### Analysis of Path **P<sub>2</sub>** 284 285Let **L<sub>2</sub>** be the active list of locks held by **P<sub>2</sub>**. 286 287Initial state: 288 289| **L<sub>2</sub>** | **V** | **E** | 290|-------------------|----------------|------------------| 291| () | {**A**, **B**} | {(**B**, **A**)} | 292 293In this case the initial state is the final state from path **P<sub>1</sub>**. 294 295After **P<sub>2</sub>** step 1: 296 297| **L<sub>2</sub>** | **V** | **E** | 298|-------------------|----------------|------------------| 299| (**B**) | {**A**, **B**} | {(**B**, **A**)} | 300 301This step adds lock **B** to the active list. As there are no other locks in the 302active list no edges are added to the graph. Since **B** already has a vertex in 303the graph there is also no change to **V**. 304 305After **P<sub>2</sub>** step 2: 306 307| **L<sub>2</sub>** | **V** | **E** | 308|-------------------|----------------|----------------------------------| 309| (**B**, **A**) | {**A**, **B**} | {(**B**, **A**), (**A**, **B**)} | 310 311This step adds lock **A** to the active list. Since this lock already has a 312vertex there is no change to **V**. However, because there is a lock in the 313active list an edge from the new lock to the existing lock is added to the 314graph. With this new edge the graph now forms a cycle between vertices **A** and 315**B**, indicating that ordering between these locks is not consistent between 316the two paths considered thus far and that a potential deadlock exists. 317 318#### Circular Dependency Example 319 320This section illustrates a directed graph approach to detect a circular 321dependency inversion using previously discussed example from the invariants 322section. This illustration is somewhat abbreviated due to the similarity to the 323previous illustration. 324 325Consider a program with the locks **A**, **B**, and **C** and paths 326**P<sub>1</sub>**, **P<sub>2</sub>**, and **P<sub>3</sub>** and the following 327behavior: 328 329Path **P<sub>1</sub>** acquires and releases the locks in the sequence: 330 3311. Acquire(**A**) 3322. Acquire(**B**) 3333. Release(**B**) 3344. Release(**A**) 335 336Path **P<sub>2</sub>** acquires and releases the locks in the sequence: 337 3381. Acquire(**B**) 3392. Acquire(**C**) 3403. Release(**C**) 3414. Release(**B**) 342 343Path **P<sub>3</sub>** acquires and releases the locks in the sequence: 344 3451. Acquire(**C**) 3462. Acquire(**A**) 3473. Release(**A**) 3484. Release(**C**) 349 350##### Analysis of Path **P<sub>1</sub>** 351 352Let **L<sub>1</sub>** be the ordered _active_ list of locks held by path 353**P<sub>1</sub>**. 354 355Let **G** = (**V**, **E**) be the directed graph, having the set of vertices 356**V** representing observed locks and the set of directed edges between vertices 357**E**. 358 359Initial state: 360 361| **L<sub>1</sub>** | **V** | **E** | 362|-------------------|-------|-------| 363| () | {} | {} | 364 365After **P<sub>1</sub>** step 1: 366 367| **L<sub>1</sub>** | **V** | **E** | 368|-------------------|---------|-------| 369| (**A**) | {**A**} | {} | 370 371 372After **P<sub>1</sub>** step 2: 373 374| **L<sub>1</sub>** | **V** | **E** | 375|-------------------|----------------|------------------| 376| (**A**, **B**) | {**A**, **B**} | {(**B**, **A**)} | 377 378 379After **P<sub>1</sub>** step 3: 380 381| **L<sub>1</sub>** | **V** | **E** | 382|-------------------|----------------|------------------| 383| (**A**) | {**A**, **B**} | {(**B**, **A**)} | 384 385 386After **P<sub>1</sub>** step 4: 387 388| **L<sub>1</sub>** | **V** | **E** | 389|-------------------|----------------|------------------| 390| () | {**A**, **B**} | {(**B**, **A**)} | 391 392##### Analysis of Path **P<sub>2</sub>** 393 394Let **L<sub>2</sub>** be the ordered _active_ list of locks held by path 395**P<sub>2</sub>**. 396 397Initial state: 398 399| **L<sub>2</sub>** | **V** | **E** | 400|-------------------|----------------|------------------| 401| () | {**A**, **B**} | {(**B**, **A**)} | 402 403 404After **P<sub>2</sub>** step 1: 405 406| **L<sub>2</sub>** | **V** | **E** | 407|-------------------|----------------|------------------| 408| (**B**) | {**A**, **B**} | {(**B**, **A**)} | 409 410After **P<sub>2</sub>** step 2: 411 412| **L<sub>2</sub>** | **V** | **E** | 413|-------------------|-----------------------|----------------------------------| 414| (**B**, **C**) | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} | 415 416This step adds lock **C** to the active list and also introduces a corresponding 417vertex to the graph. The active list contains the lock **B**, so an edge is added 418from **C** to **B**. 419 420After **P<sub>2</sub>** step 3: 421 422| **L<sub>2</sub>** | **V** | **E** | 423|-------------------|-----------------------|----------------------------------| 424| (**B**) | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} | 425 426 427After **P<sub>2</sub>** step 4: 428 429| **L<sub>2</sub>** | **V** | **E** | 430|-------------------|-----------------------|----------------------------------| 431| () | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} | 432 433##### Analysis of Path **P<sub>3</sub>** 434 435Let **L<sub>3</sub>** be the ordered _active_ list of locks held by path 436**P<sub>3</sub>**. 437 438Initial state: 439 440| **L<sub>3</sub>** | **V** | **E** | 441|-------------------|-----------------------|----------------------------------| 442| () | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} | 443 444 445After **P<sub>3</sub>** step 1: 446 447| **L<sub>3</sub>** | **V** | **E** | 448|-------------------|-----------------------|----------------------------------| 449| (**C**) | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} | 450 451After **P<sub>3</sub>** step 2: 452 453| **L<sub>3</sub>** | **V** | **E** | 454|-------------------|-----------------------|--------------------------------------------------| 455| (**C**, **A**) | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**), (**A**, **C**)} | 456 457This step adds lock **A** to the active list. The active list contains the lock 458**C**, so an edge is added from **A** to **C**. With this new edge the graph now 459forms a cycle in the vertices (**A**, **B**, **C**), indicating a circular 460dependency and the potential for deadlock if paths **P<sub>1</sub>**, 461**P<sub>2</sub>**, and **P<sub>3</sub>** are interleaved in the right way. 462 463#### IRQ-Safe Ordering Example 464 465This section illustrates a directed graph approach to detect irq-safe order 466violations using the previously discussed example from the invariants section. 467 468Recall the example system with non-irq-safe lock **A** and irq-safe lock 469**B<sub>irq</sub>**; paths **P<sub>1</sub>**, **P<sub>2</sub>**, and irq path 470**P<sub>irq</sub>**; with the following behavior: 471 472Path **P<sub>1</sub>** acquires and releases the lock in sequence: 473 4741. Acquire(**A**) 4752. Release(**A**) 476 477Path **P<sub>irq</sub>** acquires and releases the lock in sequence: 478 4791. Acquire(**B<sub>irq</sub>**) 4802. Release(**B<sub>irq</sub>**) 481 482Path **P<sub>2</sub>** acquires and releases the locks in sequence: 483 4841. Acquire(**B<sub>irq</sub>**) 4852. Acquire(**A**) 4863. Release(**A**) 4874. Release(**B<sub>irq</sub>**) 488 489##### Analysis of Path **P<sub>1</sub>** 490 491Let **L<sub>1</sub>** be the ordered _active_ list of locks held by path 492**P<sub>1</sub>**. 493 494Let **G** = (**V**, **E**) be the directed graph, having the set of vertices 495**V** representing observed locks and the set of directed edges between vertices 496**E**. 497 498Initial state: 499 500| **L<sub>1</sub>** | **V** | **E** | 501|-------------------|-------|-------| 502| () | {} | {} | 503 504After **P<sub>1</sub>** step 1: 505 506| **L<sub>1</sub>** | **V** | **E** | 507|-------------------|---------|-------| 508| (**A**) | {**A**} | {} | 509 510After **P<sub>1</sub>** step 2: 511 512| **L<sub>1</sub>** | **V** | **E** | 513|-------------------|---------|-------| 514| () | {**A**} | {} | 515 516##### Analysis of Path **P<sub>irq</sub>** 517 518Let **L<sub>irq</sub>** be the ordered _active_ list of locks held by path 519**P<sub>irq</sub>**. 520 521Initial state: 522 523| **L<sub>irq</sub>** | **V** | **E** | 524|---------------------|---------|-------| 525| () | {**A**} | {} | 526 527After **P<sub>irq</sub>** step 1: 528 529| **L<sub>irq</sub>** | **V** | **E** | 530|-----------------------|------------------------------|-------| 531| (**B<sub>irq</sub>**) | {**A**, **B<sub>irq</sub>**} | {} | 532 533After **P<sub>irq</sub>** step 2: 534 535| **L<sub>irq</sub>** | **V** | **E** | 536|---------------------|------------------------------|-------| 537| () | {**A**, **B<sub>irq</sub>**} | {} | 538 539##### Analysis of Path **P<sub>irq</sub>** 540 541Let **L<sub>2</sub>** be the ordered _active_ list of locks held by path 542**P<sub>2</sub>**. 543 544Initial state: 545 546| **L<sub>2</sub>** | **V** | **E** | 547|-------------------|---------|-------| 548| () | {**A**} | {} | 549 550After **P<sub>2</sub>** step 1: 551 552| **L<sub>2</sub>** | **V** | **E** | 553|-----------------------|------------------------------|-------| 554| (**B<sub>irq</sub>**) | {**A**, **B<sub>irq</sub>**} | {} | 555 556After **P<sub>2</sub>** step 2: 557 558| **L<sub>2</sub>** | **V** | **E** | 559|------------------------------|------------------------------|--------------------------------| 560| (**B<sub>irq</sub>**, **A**) | {**A**, **B<sub>irq</sub>**} | {(**A**, **B<sub>irq</sub>**)} | 561 562This step adds lock **A** to the active list. The active list contains lock 563**B<sub>irq</sub>**, so an edge is added from **A** to **B<sub>irq</sub>**. 564Because this is an edge from a non-irq-safe lock to an irq-safe lock the irq-safe 565ordering invariant is violated and a potential deadlock exists. 566 567## From Theory to Implementation 568 569This section develops a concrete strategy to implement a directed graph 570validator, based on the analysis techniques of the previous section. 571 572The implementation strategy has the following goals: 573 5741. Avoid dynamic allocation if possible. 5752. Minimize the overhead of validation. 5763. Support environments that manage hardware interrupts. 577 578### Removing Redundancy with Lock Classes 579 580In the analysis earlier in this document, locks are considered abstractly with 581the implication that the tracked objects are individual instances of locks. 582While tracking individual instances produces correct results, it has several 583consequences that might be avoided: 584 5851. Tracking structures must be dynamically adjusted as lock instances come into 586 and out of existence, possibly requiring dynamic allocation or other 587 per-instance data storage. 5882. The graph contains redundant information when multiple instances of locks are 589 used identically by the same code paths. 5903. Relatedly, it may take longer to identify violations by locks that serve 591 identical functions, but have not yet individually propagated through all of 592 the necessary code paths. 593 594A key observation is that locks that serve identical functions should follow the 595same ordering rules, regardless of the number of instances. 596 597Consider the following types with lock members and an operation that mutates 598both types: 599 600```C++ 601struct Foo { 602 Mutex lock; 603 int data; GUARDED_BY(lock); 604}; 605 606struct Bar { 607 Mutex lock; 608 int data; GUARDED_BY(lock); 609}; 610 611void Swap(Foo* foo, Bar* bar) { 612 foo->lock.Acquire(); 613 bar->lock.Acquire(); 614 615 int temp = foo->data; 616 foo->data = bar->data; 617 bar->data = temp; 618 619 bar->Release(); 620 foo->Release(); 621} 622``` 623 624Since operation `Swap` may operate on any instance of `Foo` and any instance of 625`Bar` it follows that `Swap` establishes an order between the locks of all 626instances of `Foo` and `Bar`; failure to apply this order consistently in other 627parts of a program could result in a deadlock when the same instances of `Foo` 628and `Bar` are locked concurrently in different orders. 629 630Note that it is possible to intentionally or unintentionally segregate different 631collections of `Foo` and `Bar` such that instances locked in different orders 632never overlap. This is still dangerous however, because seemingly innocuous 633changes to the inputs, structure, or timing of the program could defeat the 634segregation and introduce a potential deadlock. This problem can be avoided 635entirely by treating all instances of `Foo` and `Bar` equivalently and applying 636the same ordering rules throughout the program. 637 638Ensuring universal ordering throughout the program can be achieved by tracking 639classes of locks instead of lock instances: each lock member in each type 640represents a unique lock class. The relationships between each lock class can 641be tracked and analyzed using the same directed graph techniques as with 642individual locks. 643 644Tracking lock classes has the following benefits: 645 6461. Statically allocated memory: because all lock classes are known at compile 647 time, tracking structures can be allocated up front as static global data. 6482. Elimination of redundant graph nodes: locks in the same class use the same 649 tracking structures. 6503. Faster detection of invariant violations: violations are detected when 651 lock class orders are inconsistent, even if the individual instances involved 652 have never been used together. 653 654#### Additional Ordering Rules 655 656Tracking lock classes introduces additional ordering considerations when locking 657multiple locks of the same class. Because individual instances are not tracked 658it is necessary to take additional steps to ensure consistency when multiple 659locks of the same class must be acquired at the same time. 660 661##### Externally Ordered Locks 662 663Nesting locks of the same class is necessary when a hierarchical or other 664ordered data structure has locks in each node and more than one per-node lock 665must be held at a time. In this situation the data structure or access pattern 666must provide a stable ordering that is used to guarantee ordering of the locks. 667 668Validation of nestable lock classes requires only that the external order is 669recorded in the active locks list for each nestable lock and compared when new 670locks of the same class are added to the list. A consequence of this design is 671that other lock classes may not be interspersed between nested locks of the 672same class, only wholly before or after a collection of nested locks. 673 674For example, non-nestable lock classes **A** and **B**, and nestable lock class 675**N** may be interspersed like this: 676 677**A**, **N<sub>0</sub>**, **N<sub>1</sub>**, ... **N<sub>n</sub>**, **B** 678 679But not like this: 680 681**A**, **N<sub>0</sub>**, **B**, **N<sub>1</sub>**, ... **N<sub>n</sub>** or 682**A**, **N<sub>0</sub>**, **N<sub>1</sub>**, **B**, ... **N<sub>n</sub>** or 683... etc 684 685In most situations this is a reasonable constraint, as interspersing other locks 686within a nested structure with arbitrary depth is likely to result in inversions 687as the structure is updated at runtime. On the other hand, in situations where 688nesting is bounded to a few levels it may be more effective to define separate 689lock classes for each level instead of using a nested class -- in this case 690other locks may be allowed at a specific level following normal lock ordering 691rules. 692 693##### Address Ordering 694 695It is difficult to generalize lock ordering between locks of the same class 696without an externally provided order when the locks are acquired at different 697times. It is possible however, to provide an ordering guarantee when acquiring 698multiple locks at the same time, without temporal separation. In this situation 699the locks may be ordered by address, guaranteeing that any path that acquires 700the same set locks produces a consistent locking order. 701 702For example, consider an operation **F**(**S<sub>a</sub>**, **S<sub>b</sub>**) 703that operates on two instances of structure **S**, each with a lock of class 704**L** and, as part of the operation **F** must lock both locks. 705 706If instance **S<usb>0</sub>** is ordered in memory before instance 707**S<sub>1</sub>** then the locks have the same relative ordering as their 708containing instances. We can consider the locks to have the subclasses 709**L<sub>0</sub>** and **L<sub>1</sub>** respectively. 710 711A lock ordering problem arises if we perform the operation with different 712orders: 713 714**F**(**S<sub>0</sub>**, **S<sub>1</sub>**) and 715**F**(**S<sub>1</sub>**, **S<sub>0</sub>**) 716 717Without intervention these produce the inverted lock sequences: 718 719**L<sub>0</sub>**, **L<sub>1</sub>** and **L<sub>1</sub>**, **L<sub>0</sub>** 720 721Since **F** has simultaneous access to both locks at the same time, it is 722possible to order the locks by address, resulting in a consistent lock 723sequence regardless of the original order of the arguments. 724 725Now suppose we add two more lock classes to the sequence: class **A** acquired 726before operation **F** and class **B** acquired after operation **F**. The 727resulting lock sequence is: 728 729**A**, **L<sub>0</sub>**, **L<sub>1</sub>**, **B** 730 731Note that this looks similar to the nested lock class sequence diagram in the 732previous section. It is in fact the same situation, only the ordering of locks 733is provided by address rather than an external order. This means that the same 734bookkeeping in the active threads list can be used for both situations. 735 736#### Lock Class Tracking Data Structure 737 738This section discusses implementation details for tracking lock classes and 739concrete processing techniques to detect potential deadlocks. 740 741Each lock class has a statically allocated node in the directed graph 742representing all locks belonging to that class. Each node has the following data 743structures: 744 745##### Lock-Free, Wait-Free Hash Set 746 747Each lock class node has a hash set that tracks the edges from the lock class to 748the lock classes ordered before it. 749 750**TODO**: Add implementation details of the hash set. 751 752##### Lock-Free, Wait-Free Disjoint Set Structures 753 754Each lock class node has a parent pointer used to track nodes that are connected 755in cycles in the directed graph. This permits reporting cycles that have been 756previously by the loop detection algorithm without fully re-traversing the graph. 757 758**TODO**: Add implementation details of the disjoint set structure. 759 760##### Thread-Local Lock List 761 762Each thread maintains a thread-local list of the locks it currently holds. 763 764**TODO**: Add implementation details of the thread-local lock list. 765 766##### Loop Detection Thread 767 768Whenever a new edge is added to the directed graph, the loop detection thread is 769triggered to traverse the graph to find circular dependencies involving more than 770two locks. Tarjan's strongly connected sets algorithm is an efficient choice, 771with worst case complexity of **O**(|**E**| + |**V**|). This algorithm is stable 772even when traversing a graph that is updated concurrently by other threads. 773 774**TODO**: Add implementation details of the loop detection thread. 775 776## References 777 7781. Clang static [thread safety analysis](https://clang.llvm.org/docs/ThreadSafetyAnalysis.html). 7792. LLVM runtime [thread sanitizer](https://github.com/google/sanitizers/wiki/ThreadSanitizerDeadlockDetector). 7803. Linux Kernel [lockdep subsystem](https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt). 781 782