1# Runtime Lock Validation in Zircon and Fuchsia
2
3## Introduction
4
5Lock validation is a technique for checking the consistency of locking behavior
6in a program to find potential deadlock hazards. This document discusses
7relevant aspects of the static and dynamic approaches to lock validation and
8presents the foundation for the runtime lock validation library available in
9Zircon and Fuchsia.
10
11## Background
12
13Lock validation may be performed either statically or dynamically. The following
14summarizes the important differences between static and dynamic approaches to
15lock validation:
16
17* When the validation is performed: compile time vs. run time.
18* How effective the validator is at finding potential problems.
19* What level of involvement is required by the programmer.
20* The overhead cost of the validation itself.
21
22### Static Validation
23
24Static validation is typically performed at compile time by analyzing the call
25graphs produced by the compiler or other source-level processor. With this
26approach it is necessary to instrument the code and locking primitives with
27annotations to inform the validator about which types represent locks and which
28rules to apply (or not) to the code that uses the lock types.
29
30The benefits of static validation include early detection of issues at build
31time, deterministic validation results, and zero runtime overhead. This
32combination of properties make it attractive to always enable static validation,
33ensuring that locking issues are often caught before code makes it into the
34build, without impacting the performance of the build artifacts.
35
36Static validation also has some down sides. One problem is that static
37validation requires correct, consistent application of a variety of annotations
38to both locks and code to provide useful results. This can cause maintenance
39issues unless diligent code review standards are implemented. Another issue is
40that static validation has limited visibility and can be fooled by conditional
41paths, dynamic dispatch, move semantics, and lock dependencies that span
42compilation units.
43
44### Dynamic Validation
45
46Dynamic validation is performed online at runtime by observing the relationships
47between locks as the code executes. With this approach it is generally
48sufficient to instrument just the locking primitives and acquire/release
49operations to provide the information required for validation.
50
51The benefits of dynamic validation include simpler instrumentation (from the
52user's perspective) and potentially greater visibility into the actual runtime
53behavior of the program. This makes dynamic validation useful in large code
54bases, where it may not be possible for static validation to see the full
55set of possible lock interactions.
56
57The main downsides of dynamic validation are runtime overhead and execution
58coverage requirements. Because dynamic validation must track lock interactions
59at runtime, each acquire and release incurs a non-zero execution cost to update
60tracking data, in addition to the memory overhead of the tracking data itself.
61Runtime tracking also has the consequence that code paths that are not executed
62cannot be analyzed by the validator. This may increase the burden on the
63developer and QA to ensure sufficient execution coverage if that is not already
64a project requirement.
65
66
67### Locking Ordering Invariant
68
69The job of the lock validator is to determine whether or not the lock invariants
70of the program hold. The primary invariant is the order between two or more
71locks: all paths in a program that acquire two or more locks must do so in an
72order consistent with every other path involving two or more of the same locks to
73avoid the potential for deadlock. Environments that deal with hardware
74interrupts, such as embedded systems and kernels, have an additional ordering
75invariant to avoid interrupt-induced deadlocks. These invariants are illustrated
76in the following subsections.
77
78##### Basic Inversion
79
80The simplest form of inversion occurs when a program has two locks that are
81both acquired sequentially with inconsistent orders in different paths.
82
83For example, a program with the locks **A** and **B** and code paths
84**P<sub>1</sub>** and **P<sub>2</sub>** and the following behavior has the
85potential for deadlock:
86
87Path **P<sub>1</sub>** acquires and releases the locks in the sequence:
88
891. Acquire(**A**)
902. Acquire(**B**)
913. Release(**B**)
924. Release(**A**)
93
94Path **P<sub>2</sub>** acquires and releases the locks in the inverted sequence:
95
961. Acquire(**B**)
972. Acquire(**A**)
983. Release(**A**)
994. Release(**B**)
100
101With the right interleaving, perhaps due to both paths executing concurrently
102on different threads, a deadlock occurs when path **P<sub>1</sub>** holds lock
103**A** and blocks waiting for lock **B**, while path **P<sub>2</sub>** holds lock
104**B** and blocks waiting for lock **A**.
105
106##### Circular Dependency
107
108Inversion may also occur between more than two locks and paths. This kind of
109inversion is much harder to recognize through manual inspection because each
110pair of locks involved may appear to be correctly ordered in every path involving
111just the pairs, and yet a potential deadlock may still exist given overall
112ordering of the locks.
113
114For example, a program with the locks **A**, **B**, and **C**; paths
115**P<sub>1</sub>**, **P<sub>2</sub>**, and **P<sub>3</sub>**; with the following
116behavior has the potential for deadlock:
117
118Path **P<sub>1</sub>** acquires and releases the locks in the sequence:
119
1201. Acquire(**A**)
1212. Acquire(**B**)
1223. Release(**B**)
1234. Release(**A**)
124
125Path **P<sub>2</sub>** acquires and releases the locks in the sequence:
126
1271. Acquire(**B**)
1282. Acquire(**C**)
1293. Release(**C**)
1304. Release(**B**)
131
132Path **P<sub>3</sub>** acquires and releases the locks in the sequence:
133
1341. Acquire(**C**)
1352. Acquire(**A**)
1363. Release(**A**)
1374. Release(**C**)
138
139With the right interleaving of paths **P<sub>1</sub>**, **P<sub>2</sub>**,
140and **P<sub>3</sub>** a deadlock occurs as each path acquires the lock at the
141first step and waits for the lock at the second step. In practice this situation
142may be compounded by the existence of many different paths that produce the same
143pairwise lock sequences.
144
145##### IRQ-Safe Ordering
146
147In systems that deal with hardware interrupts the ordering between irq-safe and
148non-irq-safe locks is critical: a non-irq-safe lock must never be acquired while
149holding an irq-safe lock to prevent indirect lock inversions. Irq-safe locks
150preserve ordering between irq and non-irq context; a consistent order of two or
151more irq-safe locks is guaranteed to be safe for paths running in both irq and
152non-irq context. The same is not true for non-irq-safe locks. The reason for this
153is that non-irq-safe locks permit irq handlers to effectively insert the locks
154acquired by the handler at arbitrary points in the interrupted task's lock
155sequences.
156
157For example, a system with non-irq-safe lock **A** and irq-safe lock
158**B<sub>irq</sub>**; paths **P<sub>1</sub>**, **P<sub>2</sub>**, and irq path
159**P<sub>irq</sub>**; with the following behavior has the potential for deadlock:
160
161Path **P<sub>1</sub>** on **CPU1** acquires and releases the lock in sequence:
162
1631. Acquire(**A**)
1642. _**P<sub>irq</sub>** interrupts here on **CPU1**_
1653. Release(**A**)
166
167Path **P<sub>irq</sub>** on **CPU1** acquires and releases the lock in sequence:
168
1691. Acquire(**B<sub>irq</sub>**)
1702. Release(**B<sub>irq</sub>**)
171
172Path **P<sub>2</sub>** on **CPU2** acquires and releases the locks in sequence:
173
1741. Acquire(**B<sub>irq</sub>**)
1752. Acquire(**A**)
1763. Release(**A**)
1774. Release(**B<sub>irq</sub>**)
178
179With the right interleaving of paths **P<sub>1</sub>**, **P<sub>2</sub>**, and
180**P<sub>irq</sub>** a deadlock occurs as **P<sub>irq</sub>** attempts to acquire
181**B<sub>irq</sub>** while **P<sub>2</sub>** holds **B<sub>irq</sub>** and blocks
182waiting for **A**. This is an indirect lock inversion: **P<sub>irq</sub>**
183effectively inserts an acquire/release sequence of **B<sub>irq</sub>** in the
184middle of the acquire/release sequence of **A** in path **P<sub>1</sub>**, which
185is inconsistent with the lock sequence for the same locks in path
186**P<sub>2</sub>**.
187
188### Performing Validation
189
190The invariants discussed in the previous section can be validated using a finite
191directed graph. The directed graph tracks the identity and order of locks as the
192analysis traverses the code paths. Such a graph can be built either by traversing
193the call graphs generated by a compiler or source-level processor (static
194analysis) or by observing the ordering of locks during program execution (dynamic
195analysis). This section introduce the process in abstract terms that apply to
196either approach, in preparation for developing a concrete dynamic analysis
197strategy later on.
198
199In the most general terms, building a directed graph from a code path requires
200maintaining a list of actively held locks as the path is traversed: a node
201representing a lock is added to the list whenever the lock is acquired and
202removed from the list whenever the lock is released. In addition to maintaining
203the active list, a directed edge is added to the graph from a vertex representing
204the newly acquired lock to each vertex representing a lock already in the list.
205
206#### Basic Inversion Example
207
208This section illustrates a directed graph approach to detect a basic two-lock
209inversion.
210
211Recall from the earlier example a program with the locks **A** and **B**; code
212paths **P<sub>1</sub>** and **P<sub>2</sub>**; and the following behavior:
213
214Path **P<sub>1</sub>** acquires and releases the locks in the sequence:
215
2161. Acquire(**A**)
2172. Acquire(**B**)
2183. Release(**B**)
2194. Release(**A**)
220
221Path **P<sub>2</sub>** acquires and releases the locks in the inverted sequence:
222
2231. Acquire(**B**)
2242. Acquire(**A**)
2253. Release(**A**)
2264. Release(**B**)
227
228##### Analysis of Path **P<sub>1</sub>**
229
230Starting with path **P<sub>1</sub>** we define and update the directed graph.
231
232Let **L<sub>1</sub>** be the ordered _active_ list of locks held by path
233**P<sub>1</sub>**.
234
235Let **G** = (**V**, **E**) be the directed graph, having the set of vertices
236**V** representing observed locks and the set of directed edges between vertices
237**E**.
238
239Initial state:
240
241| **L<sub>1</sub>** | **V** | **E** |
242|-------------------|-------|-------|
243| ()                | {}    | {}    |
244
245After **P<sub>1</sub>** step 1:
246
247| **L<sub>1</sub>** | **V**   | **E** |
248|-------------------|---------|-------|
249| (**A**)           | {**A**} | {}    |
250
251This step adds lock **A** to the active list and introduces a vertex for the
252same lock to the directed graph. Since there are no other locks in the active
253list no edges are added.
254
255After **P<sub>1</sub>** step 2:
256
257| **L<sub>1</sub>** | **V**          | **E**            |
258|-------------------|----------------|------------------|
259| (**A**, **B**)    | {**A**, **B**} | {(**B**, **A**)} |
260
261This step adds lock **B** to the active list and also introduces a corresponding
262vertex to the graph. This time the active list does contain a lock, so an edge
263from the new lock to the existing lock is added to the graph. This edge
264represents that lock **B** now _depends_ on lock **A** preceding it in any
265other path that involves both locks.
266
267After **P<sub>1</sub>** step 3:
268
269| **L<sub>1</sub>** | **V**          | **E**            |
270|-------------------|----------------|------------------|
271| (**A**)           | {**A**, **B**} | {(**B**, **A**)} |
272
273Lock **B** is removed from the active list. No updates to the graph.
274
275After **P<sub>1</sub>** step 4:
276
277| **L<sub>1</sub>** | **V**          | **E**            |
278|-------------------|----------------|------------------|
279| ()                | {**A**, **B**} | {(**B**, **A**)} |
280
281Lock **A** is removed from the active list. No updates to the graph.
282
283##### Analysis of Path **P<sub>2</sub>**
284
285Let **L<sub>2</sub>** be the active list of locks held by **P<sub>2</sub>**.
286
287Initial state:
288
289| **L<sub>2</sub>** | **V**          | **E**            |
290|-------------------|----------------|------------------|
291| ()                | {**A**, **B**} | {(**B**, **A**)} |
292
293In this case the initial state is the final state from path **P<sub>1</sub>**.
294
295After **P<sub>2</sub>** step 1:
296
297| **L<sub>2</sub>** | **V**          | **E**            |
298|-------------------|----------------|------------------|
299| (**B**)           | {**A**, **B**} | {(**B**, **A**)} |
300
301This step adds lock **B** to the active list. As there are no other locks in the
302active list no edges are added to the graph. Since **B** already has a vertex in
303the graph there is also no change to **V**.
304
305After **P<sub>2</sub>** step 2:
306
307| **L<sub>2</sub>** | **V**          | **E**                            |
308|-------------------|----------------|----------------------------------|
309| (**B**, **A**)    | {**A**, **B**} | {(**B**, **A**), (**A**, **B**)} |
310
311This step adds lock **A** to the active list. Since this lock already has a
312vertex there is no change to **V**. However, because there is a lock in the
313active list an edge from the new lock to the existing lock is added to the
314graph. With this new edge the graph now forms a cycle between vertices **A** and
315**B**, indicating that ordering between these locks is not consistent between
316the two paths considered thus far and that a potential deadlock exists.
317
318#### Circular Dependency Example
319
320This section illustrates a directed graph approach to detect a circular
321dependency inversion using previously discussed example from the invariants
322section. This illustration is somewhat abbreviated due to the similarity to the
323previous illustration.
324
325Consider a program with the locks **A**, **B**, and **C** and paths
326**P<sub>1</sub>**, **P<sub>2</sub>**, and **P<sub>3</sub>** and the following
327behavior:
328
329Path **P<sub>1</sub>** acquires and releases the locks in the sequence:
330
3311. Acquire(**A**)
3322. Acquire(**B**)
3333. Release(**B**)
3344. Release(**A**)
335
336Path **P<sub>2</sub>** acquires and releases the locks in the sequence:
337
3381. Acquire(**B**)
3392. Acquire(**C**)
3403. Release(**C**)
3414. Release(**B**)
342
343Path **P<sub>3</sub>** acquires and releases the locks in the sequence:
344
3451. Acquire(**C**)
3462. Acquire(**A**)
3473. Release(**A**)
3484. Release(**C**)
349
350##### Analysis of Path **P<sub>1</sub>**
351
352Let **L<sub>1</sub>** be the ordered _active_ list of locks held by path
353**P<sub>1</sub>**.
354
355Let **G** = (**V**, **E**) be the directed graph, having the set of vertices
356**V** representing observed locks and the set of directed edges between vertices
357**E**.
358
359Initial state:
360
361| **L<sub>1</sub>** | **V** | **E** |
362|-------------------|-------|-------|
363| ()                | {}    | {}    |
364
365After **P<sub>1</sub>** step 1:
366
367| **L<sub>1</sub>** | **V**   | **E** |
368|-------------------|---------|-------|
369| (**A**)           | {**A**} | {}    |
370
371
372After **P<sub>1</sub>** step 2:
373
374| **L<sub>1</sub>** | **V**          | **E**            |
375|-------------------|----------------|------------------|
376| (**A**, **B**)    | {**A**, **B**} | {(**B**, **A**)} |
377
378
379After **P<sub>1</sub>** step 3:
380
381| **L<sub>1</sub>** | **V**          | **E**            |
382|-------------------|----------------|------------------|
383| (**A**)           | {**A**, **B**} | {(**B**, **A**)} |
384
385
386After **P<sub>1</sub>** step 4:
387
388| **L<sub>1</sub>** | **V**          | **E**            |
389|-------------------|----------------|------------------|
390| ()                | {**A**, **B**} | {(**B**, **A**)} |
391
392##### Analysis of Path **P<sub>2</sub>**
393
394Let **L<sub>2</sub>** be the ordered _active_ list of locks held by path
395**P<sub>2</sub>**.
396
397Initial state:
398
399| **L<sub>2</sub>** | **V**          | **E**            |
400|-------------------|----------------|------------------|
401| ()                | {**A**, **B**} | {(**B**, **A**)} |
402
403
404After **P<sub>2</sub>** step 1:
405
406| **L<sub>2</sub>** | **V**          | **E**            |
407|-------------------|----------------|------------------|
408| (**B**)           | {**A**, **B**} | {(**B**, **A**)} |
409
410After **P<sub>2</sub>** step 2:
411
412| **L<sub>2</sub>** | **V**                 | **E**                            |
413|-------------------|-----------------------|----------------------------------|
414| (**B**, **C**)    | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} |
415
416This step adds lock **C** to the active list and also introduces a corresponding
417vertex to the graph. The active list contains the lock **B**, so an edge is added
418from **C** to **B**.
419
420After **P<sub>2</sub>** step 3:
421
422| **L<sub>2</sub>** | **V**                 | **E**                            |
423|-------------------|-----------------------|----------------------------------|
424| (**B**)           | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} |
425
426
427After **P<sub>2</sub>** step 4:
428
429| **L<sub>2</sub>** | **V**                 | **E**                            |
430|-------------------|-----------------------|----------------------------------|
431| ()                | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} |
432
433##### Analysis of Path **P<sub>3</sub>**
434
435Let **L<sub>3</sub>** be the ordered _active_ list of locks held by path
436**P<sub>3</sub>**.
437
438Initial state:
439
440| **L<sub>3</sub>** | **V**                 | **E**                            |
441|-------------------|-----------------------|----------------------------------|
442| ()                | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} |
443
444
445After **P<sub>3</sub>** step 1:
446
447| **L<sub>3</sub>** | **V**                 | **E**                            |
448|-------------------|-----------------------|----------------------------------|
449| (**C**)           | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**)} |
450
451After **P<sub>3</sub>** step 2:
452
453| **L<sub>3</sub>** | **V**                 | **E**                                            |
454|-------------------|-----------------------|--------------------------------------------------|
455| (**C**, **A**)    | {**A**, **B**, **C**} | {(**B**, **A**), (**C**, **B**), (**A**, **C**)} |
456
457This step adds lock **A** to the active list. The active list contains the lock
458**C**, so an edge is added from **A** to **C**. With this new edge the graph now
459forms a cycle in the vertices (**A**, **B**, **C**), indicating a circular
460dependency and the potential for deadlock if paths **P<sub>1</sub>**,
461**P<sub>2</sub>**, and **P<sub>3</sub>** are interleaved in the right way.
462
463#### IRQ-Safe Ordering Example
464
465This section illustrates a directed graph approach to detect irq-safe order
466violations using the previously discussed example from the invariants section.
467
468Recall the example system with non-irq-safe lock **A** and irq-safe lock
469**B<sub>irq</sub>**; paths **P<sub>1</sub>**, **P<sub>2</sub>**, and irq path
470**P<sub>irq</sub>**; with the following behavior:
471
472Path **P<sub>1</sub>** acquires and releases the lock in sequence:
473
4741. Acquire(**A**)
4752. Release(**A**)
476
477Path **P<sub>irq</sub>** acquires and releases the lock in sequence:
478
4791. Acquire(**B<sub>irq</sub>**)
4802. Release(**B<sub>irq</sub>**)
481
482Path **P<sub>2</sub>** acquires and releases the locks in sequence:
483
4841. Acquire(**B<sub>irq</sub>**)
4852. Acquire(**A**)
4863. Release(**A**)
4874. Release(**B<sub>irq</sub>**)
488
489##### Analysis of Path **P<sub>1</sub>**
490
491Let **L<sub>1</sub>** be the ordered _active_ list of locks held by path
492**P<sub>1</sub>**.
493
494Let **G** = (**V**, **E**) be the directed graph, having the set of vertices
495**V** representing observed locks and the set of directed edges between vertices
496**E**.
497
498Initial state:
499
500| **L<sub>1</sub>** | **V** | **E** |
501|-------------------|-------|-------|
502| ()                | {}    | {}    |
503
504After **P<sub>1</sub>** step 1:
505
506| **L<sub>1</sub>** | **V**   | **E** |
507|-------------------|---------|-------|
508| (**A**)           | {**A**} | {}    |
509
510After **P<sub>1</sub>** step 2:
511
512| **L<sub>1</sub>** | **V**   | **E** |
513|-------------------|---------|-------|
514| ()                | {**A**} | {}    |
515
516##### Analysis of Path **P<sub>irq</sub>**
517
518Let **L<sub>irq</sub>** be the ordered _active_ list of locks held by path
519**P<sub>irq</sub>**.
520
521Initial state:
522
523| **L<sub>irq</sub>** | **V**   | **E** |
524|---------------------|---------|-------|
525| ()                  | {**A**} | {}    |
526
527After **P<sub>irq</sub>** step 1:
528
529| **L<sub>irq</sub>**   | **V**                        | **E** |
530|-----------------------|------------------------------|-------|
531| (**B<sub>irq</sub>**) | {**A**, **B<sub>irq</sub>**} | {}    |
532
533After **P<sub>irq</sub>** step 2:
534
535| **L<sub>irq</sub>** | **V**                        | **E** |
536|---------------------|------------------------------|-------|
537| ()                  | {**A**, **B<sub>irq</sub>**} | {}    |
538
539##### Analysis of Path **P<sub>irq</sub>**
540
541Let **L<sub>2</sub>** be the ordered _active_ list of locks held by path
542**P<sub>2</sub>**.
543
544Initial state:
545
546| **L<sub>2</sub>** | **V**   | **E** |
547|-------------------|---------|-------|
548| ()                | {**A**} | {}    |
549
550After **P<sub>2</sub>** step 1:
551
552| **L<sub>2</sub>**     | **V**                        | **E** |
553|-----------------------|------------------------------|-------|
554| (**B<sub>irq</sub>**) | {**A**, **B<sub>irq</sub>**} | {}    |
555
556After **P<sub>2</sub>** step 2:
557
558| **L<sub>2</sub>**            | **V**                        | **E**                          |
559|------------------------------|------------------------------|--------------------------------|
560| (**B<sub>irq</sub>**, **A**) | {**A**, **B<sub>irq</sub>**} | {(**A**, **B<sub>irq</sub>**)} |
561
562This step adds lock **A** to the active list. The active list contains lock
563**B<sub>irq</sub>**, so an edge is added from **A** to **B<sub>irq</sub>**.
564Because this is an edge from a non-irq-safe lock to an irq-safe lock the irq-safe
565ordering invariant is violated and a potential deadlock exists.
566
567## From Theory to Implementation
568
569This section develops a concrete strategy to implement a directed graph
570validator, based on the analysis techniques of the previous section.
571
572The implementation strategy has the following goals:
573
5741. Avoid dynamic allocation if possible.
5752. Minimize the overhead of validation.
5763. Support environments that manage hardware interrupts.
577
578### Removing Redundancy with Lock Classes
579
580In the analysis earlier in this document, locks are considered abstractly with
581the implication that the tracked objects are individual instances of locks.
582While tracking individual instances produces correct results, it has several
583consequences that might be avoided:
584
5851. Tracking structures must be dynamically adjusted as lock instances come into
586   and out of existence, possibly requiring dynamic allocation or other
587   per-instance data storage.
5882. The graph contains redundant information when multiple instances of locks are
589   used identically by the same code paths.
5903. Relatedly, it may take longer to identify violations by locks that serve
591   identical functions, but have not yet individually propagated through all of
592   the necessary code paths.
593
594A key observation is that locks that serve identical functions should follow the
595same ordering rules, regardless of the number of instances.
596
597Consider the following types with lock members and an operation that mutates
598both types:
599
600```C++
601struct Foo {
602    Mutex lock;
603    int data; GUARDED_BY(lock);
604};
605
606struct Bar {
607    Mutex lock;
608    int data; GUARDED_BY(lock);
609};
610
611void Swap(Foo* foo, Bar* bar) {
612    foo->lock.Acquire();
613    bar->lock.Acquire();
614
615    int temp = foo->data;
616    foo->data = bar->data;
617    bar->data = temp;
618
619    bar->Release();
620    foo->Release();
621}
622```
623
624Since operation `Swap` may operate on any instance of `Foo` and any instance of
625`Bar` it follows that `Swap` establishes an order between the locks of all
626instances of `Foo` and `Bar`; failure to apply this order consistently in other
627parts of a program could result in a deadlock when the same instances of `Foo`
628and `Bar` are locked concurrently in different orders.
629
630Note that it is possible to intentionally or unintentionally segregate different
631collections of `Foo` and `Bar` such that instances locked in different orders
632never overlap. This is still dangerous however, because seemingly innocuous
633changes to the inputs, structure, or timing of the program could defeat the
634segregation and introduce a potential deadlock. This problem can be avoided
635entirely by treating all instances of `Foo` and `Bar` equivalently and applying
636the same ordering rules throughout the program.
637
638Ensuring universal ordering throughout the program can be achieved by tracking
639classes of locks instead of lock instances: each lock member in each type
640represents a unique lock class. The relationships between each lock class can
641be tracked and analyzed using the same directed graph techniques as with
642individual locks.
643
644Tracking lock classes has the following benefits:
645
6461. Statically allocated memory: because all lock classes are known at compile
647   time, tracking structures can be allocated up front as static global data.
6482. Elimination of redundant graph nodes: locks in the same class use the same
649   tracking structures.
6503. Faster detection of invariant violations: violations are detected when
651   lock class orders are inconsistent, even if the individual instances involved
652   have never been used together.
653
654#### Additional Ordering Rules
655
656Tracking lock classes introduces additional ordering considerations when locking
657multiple locks of the same class. Because individual instances are not tracked
658it is necessary to take additional steps to ensure consistency when multiple
659locks of the same class must be acquired at the same time.
660
661##### Externally Ordered Locks
662
663Nesting locks of the same class is necessary when a hierarchical or other
664ordered data structure has locks in each node and more than one per-node lock
665must be held at a time. In this situation the data structure or access pattern
666must provide a stable ordering that is used to guarantee ordering of the locks.
667
668Validation of nestable lock classes requires only that the external order is
669recorded in the active locks list for each nestable lock and compared when new
670locks of the same class are added to the list. A consequence of this design is
671that other lock classes may not be interspersed between nested locks of the
672same class, only wholly before or after a collection of nested locks.
673
674For example, non-nestable lock classes **A** and **B**, and nestable lock class
675**N** may be interspersed like this:
676
677**A**, **N<sub>0</sub>**, **N<sub>1</sub>**, ... **N<sub>n</sub>**, **B**
678
679But not like this:
680
681**A**, **N<sub>0</sub>**, **B**, **N<sub>1</sub>**, ... **N<sub>n</sub>** or
682**A**, **N<sub>0</sub>**, **N<sub>1</sub>**, **B**, ... **N<sub>n</sub>** or
683... etc
684
685In most situations this is a reasonable constraint, as interspersing other locks
686within a nested structure with arbitrary depth is likely to result in inversions
687as the structure is updated at runtime. On the other hand, in situations where
688nesting is bounded to a few levels it may be more effective to define separate
689lock classes for each level instead of using a nested class -- in this case
690other locks may be allowed at a specific level following normal lock ordering
691rules.
692
693##### Address Ordering
694
695It is difficult to generalize lock ordering between locks of the same class
696without an externally provided order when the locks are acquired at different
697times. It is possible however, to provide an ordering guarantee when acquiring
698multiple locks at the same time, without temporal separation. In this situation
699the locks may be ordered by address, guaranteeing that any path that acquires
700the same set locks produces a consistent locking order.
701
702For example, consider an operation **F**(**S<sub>a</sub>**, **S<sub>b</sub>**)
703that operates on two instances of structure **S**, each with a lock of class
704**L** and, as part of the operation **F** must lock both locks.
705
706If instance **S<usb>0</sub>** is ordered in memory before instance
707**S<sub>1</sub>** then the locks have the same relative ordering as their
708containing instances. We can consider the locks to have the subclasses
709**L<sub>0</sub>** and **L<sub>1</sub>** respectively.
710
711A lock ordering problem arises if we perform the operation with different
712orders:
713
714**F**(**S<sub>0</sub>**, **S<sub>1</sub>**) and
715**F**(**S<sub>1</sub>**, **S<sub>0</sub>**)
716
717Without intervention these produce the inverted lock sequences:
718
719**L<sub>0</sub>**, **L<sub>1</sub>** and **L<sub>1</sub>**, **L<sub>0</sub>**
720
721Since **F** has simultaneous access to both locks at the same time, it is
722possible to order the locks by address, resulting in a consistent lock
723sequence regardless of the original order of the arguments.
724
725Now suppose we add two more lock classes to the sequence: class **A** acquired
726before operation **F** and class **B** acquired after operation **F**. The
727resulting lock sequence is:
728
729**A**, **L<sub>0</sub>**, **L<sub>1</sub>**, **B**
730
731Note that this looks similar to the nested lock class sequence diagram in the
732previous section. It is in fact the same situation, only the ordering of locks
733is provided by address rather than an external order. This means that the same
734bookkeeping in the active threads list can be used for both situations.
735
736#### Lock Class Tracking Data Structure
737
738This section discusses implementation details for tracking lock classes and
739concrete processing techniques to detect potential deadlocks.
740
741Each lock class has a statically allocated node in the directed graph
742representing all locks belonging to that class. Each node has the following data
743structures:
744
745##### Lock-Free, Wait-Free Hash Set
746
747Each lock class node has a hash set that tracks the edges from the lock class to
748the lock classes ordered before it.
749
750**TODO**: Add implementation details of the hash set.
751
752##### Lock-Free, Wait-Free Disjoint Set Structures
753
754Each lock class node has a parent pointer used to track nodes that are connected
755in cycles in the directed graph. This permits reporting cycles that have been
756previously by the loop detection algorithm without fully re-traversing the graph.
757
758**TODO**: Add implementation details of the disjoint set structure.
759
760##### Thread-Local Lock List
761
762Each thread maintains a thread-local list of the locks it currently holds.
763
764**TODO**: Add implementation details of the thread-local lock list.
765
766##### Loop Detection Thread
767
768Whenever a new edge is added to the directed graph, the loop detection thread is
769triggered to traverse the graph to find circular dependencies involving more than
770two locks. Tarjan's strongly connected sets algorithm is an efficient choice,
771with worst case complexity of **O**(|**E**| + |**V**|). This algorithm is stable
772even when traversing a graph that is updated concurrently by other threads.
773
774**TODO**: Add implementation details of the loop detection thread.
775
776## References
777
7781. Clang static [thread safety analysis](https://clang.llvm.org/docs/ThreadSafetyAnalysis.html).
7792. LLVM runtime [thread sanitizer](https://github.com/google/sanitizers/wiki/ThreadSanitizerDeadlockDetector).
7803. Linux Kernel [lockdep subsystem](https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt).
781
782