# Jitterentropy: tuning the configuration

The jitterentropy library, written by Stephan Mueller, is available at
<https://github.com/smuellerDD/jitterentropy-library> and documented at
<http://www.chronox.de/jent.html>. In Zircon, it's used as a simple entropy
source to seed the system CPRNG.

[The companion document about basic configuration options to jitterentropy](config-basic.md)
describes two options that fundamentally affect how jitterentropy runs. This document instead
describes the numeric parameters that control how fast jitterentropy runs and how much entropy it
collects, without fundamentally altering its principles of operation. It also describes how to
test various parameters and what to look for in the output (e.g. when adding support for a new
device, or when doing a more thorough job of optimizing the parameters).

[TOC]
16
## A rundown of jitterentropy's parameters

The following tunable parameters control how fast jitterentropy runs, and how fast it collects
entropy:

### [`kernel.jitterentropy.ll`](../kernel_cmdline.md#kernel_jitterentropy_ll_num)

"`ll`" stands for "LFSR loops". Jitterentropy uses a (deliberately inefficient) implementation of
an LFSR to exercise the CPU, as part of its noise generation. The inner loop shifts the LFSR 64
times; the outer loop repeats `kernel.jitterentropy.ll`-many times.
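As a rough sketch of that loop structure (the feedback taps and the way the timing delta is folded
in below are simplified stand-ins, not the actual upstream implementation):

```c
#include <stdint.h>

// Simplified sketch of the LFSR noise loop: the inner loop shifts a
// 64-bit LFSR once per bit, and the outer loop repeats it `ll` times.
// The feedback taps (0x1B) and the delta folding are illustrative only.
static uint64_t lfsr_loops(uint64_t state, uint64_t time_delta, unsigned ll) {
    for (unsigned outer = 0; outer < ll; outer++) {
        for (unsigned bit = 0; bit < 64; bit++) {
            state ^= (time_delta >> bit) & 1;  // fold in one bit of the delta
            uint64_t msb = state >> 63;
            state <<= 1;
            if (msb)
                state ^= 0x1B;  // illustrative feedback polynomial
        }
    }
    return state;
}
```

The point of the real code is not the LFSR output itself but how long the intentionally
unoptimized loop takes to run; `ll` scales that running time linearly.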

In my experience, the LFSR code significantly slows jitterentropy down, but doesn't generate very
much entropy. I tested this on RPi3 and qemu-arm64 with qualitatively similar results, but it hasn't
been tested on x86 yet. This is something to consider when tuning: using fewer LFSR loops tends to
lead to better overall performance.

Note that setting `kernel.jitterentropy.ll=0` causes jitterentropy to choose the number of LFSR
loops in a "random-ish" way. As described in [the basic config doc](config-basic.md), I discourage
the use of `kernel.jitterentropy.ll=0`.

### [`kernel.jitterentropy.ml`](../kernel_cmdline.md#kernel_jitterentropy_ml_num)

"`ml`" stands for "memory access loops". Jitterentropy walks through a moderately large chunk of
RAM, reading and writing each byte. The size of the chunk and the access pattern are controlled by
the two parameters below. The memory access loop is repeated `kernel.jitterentropy.ml`-many times.

In my experience, the memory access loops are a good source of raw entropy. Again, I've only tested
this on RPi3 and qemu-arm64 so far.

Much like `kernel.jitterentropy.ll`, setting `kernel.jitterentropy.ml=0` makes jitterentropy
choose a "random-ish" value for the memory access loop count. I discourage this as well.

### [`kernel.jitterentropy.bs`](../kernel_cmdline.md#kernel_jitterentropy_bs_num)

"`bs`" stands for "block size". Jitterentropy divides its chunk of RAM into blocks of this size.
The memory access loop starts with byte 0 of block 0, then "byte -1" of block 1 (which is actually
the last byte of block 0), then "byte -2" of block 2 (i.e. the second-to-last byte of block 1), and
so on. This pattern ensures that every byte gets hit, and that most accesses go into different blocks.
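In index terms, access *k* touches byte `(k * (bs - 1)) % (bs * bc)`: the walk advances by one
block minus one byte at each step. A sketch that checks the coverage claim (this models the index
pattern only; the real loop also reads and writes each byte):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Model of the access pattern: step through a bs*bc-byte buffer with a
// stride of (bs - 1) bytes, and report whether one full pass (bs*bc
// accesses) touches every byte.
static bool pattern_covers_all(size_t bs, size_t bc) {
    enum { kMaxSize = 4096 };  // enough for small demo sizes
    unsigned char hit[kMaxSize];
    size_t size = bs * bc;
    if (size == 0 || size > kMaxSize)
        return false;
    memset(hit, 0, sizeof(hit));
    size_t idx = 0;
    for (size_t k = 0; k < size; k++) {
        hit[idx] = 1;
        idx = (idx + bs - 1) % size;  // "byte -k of block k", modulo the chunk
    }
    for (size_t i = 0; i < size; i++) {
        if (!hit[i])
            return false;
    }
    return true;
}
```

Every byte gets hit in one pass exactly when `gcd(bs - 1, bs * bc) == 1`; e.g. `bs=64` works with
a power-of-two block count (stride 63 is odd), whereas `bs=7, bc=3` would cycle through only a
third of the bytes (gcd(6, 21) = 3).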

I have usually tested jitterentropy with `kernel.jitterentropy.bs=64`, based on the size of a cache
line. I haven't yet tested whether there's a better option on some or all platforms.

### [`kernel.jitterentropy.bc`](../kernel_cmdline.md#kernel_jitterentropy_bc_num)

"`bc`" stands for "block count". Jitterentropy uses this many blocks of RAM, each of size
`kernel.jitterentropy.bs`, in its memory access loops.

Since I choose `kernel.jitterentropy.bs=64`, I usually choose `kernel.jitterentropy.bc=1024`.
This means using 64KB of RAM, which is enough to overflow the L1 cache.

The comment before `jent_memaccess` in the
[jitterentropy source code](../../third_party/lib/jitterentropy/jitterentropy-base.c#234)
suggests choosing the block size and count so that the RAM used is bigger than L1. Confusingly, the
default values in upstream jitterentropy (block size = 32, block count = 64) aren't big enough to
overflow L1.
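To make the footprint arithmetic concrete (the 32 KiB figure used in the checks below is an
assumed typical L1 data cache size, not something the jitterentropy source specifies):

```c
#include <stdint.h>

// RAM footprint of the memory access loop is simply block size * block
// count. With bs=64, bc=1024 this is 64 KiB (bigger than an assumed
// 32 KiB L1d); the upstream defaults bs=32, bc=64 give only 2 KiB.
static uint32_t footprint_bytes(uint32_t bs, uint32_t bc) {
    return bs * bc;
}
```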

## Tuning process

The basic idea is simple: on a particular target device, try different values for the parameters.
Collect a large amount of data for each parameter set (ideally around 1MB), then
[run the NIST test suite to analyze the data](../entropy_quality_tests.md#running-the-nist-test-suite).
Determine which parameters give the best entropy per unit time. The time taken to draw the entropy
samples is logged on the system under test.

One complication is the startup testing built into jitterentropy. This essentially draws and
discards 400 samples, performing some basic analysis along the way (mostly making sure that the
clock is monotonic and has a high enough resolution and variability). A more accurate test would
reboot twice for each set of parameters: once to collect around 1MB of data for analysis, and a
second time to boot with the "right" amount of entropy (as computed from the entropy estimate in
the first phase, with appropriate safety margins; see
["Determining the entropy\_per\_1000\_bytes statistic"](#determining-the-entropy_per_1000_bytes-statistic),
below). This second phase of testing simulates a real boot, including the startup tests. After
completing the second phase, choose the parameter set that boots fastest. Of course, each phase of
testing should be repeated a few times to reduce random variation.

## Determining the entropy\_per\_1000\_bytes statistic

The `crypto::entropy::Collector` interface in
[kernel/lib/crypto/include/lib/crypto/entropy/collector.h](../../kernel/lib/crypto/include/lib/crypto/entropy/collector.h)
requires a parameter `entropy_per_1000_bytes` from its instantiations. The value relevant to
jitterentropy is currently hard-coded in
[kernel/lib/crypto/entropy/jitterentropy\_collector.cpp](../../kernel/lib/crypto/entropy/jitterentropy_collector.cpp).
This value is meant to measure how much min-entropy is contained in each byte of data produced by
jitterentropy (since the bytes aren't independent and uniformly distributed, this will be less than
8 bits). The "per 1000 bytes" part simply makes it possible to specify fractional amounts of
entropy, like "0.123 bits / byte", without requiring fractional arithmetic (since `float` is
disallowed in kernel code, and fixed-point arithmetic is confusing).
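As a sketch of the integer arithmetic this scaling enables (the function name here is mine, not the
kernel's): to accumulate `b` bits of min-entropy, a collector must draw
`ceil(b * 1000 / entropy_per_1000_bytes)` bytes from the source:

```c
#include <stdint.h>

// How many bytes must be drawn from the source to accumulate
// `bits_wanted` bits of min-entropy, given entropy_per_1000_bytes
// (bits of min-entropy per 1000 bytes of output). Rounds up, using
// integer arithmetic only. Illustrative, not the kernel's actual code.
static uint64_t bytes_needed(uint64_t bits_wanted, uint64_t entropy_per_1000_bytes) {
    return (bits_wanted * 1000 + entropy_per_1000_bytes - 1) / entropy_per_1000_bytes;
}
```

For example, at 0.123 bits/byte (`entropy_per_1000_bytes = 123`), drawing 256 bits of entropy
requires 2082 bytes of output.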

The value should be determined by using the NIST test suite to analyze random data samples, as
described in
[the entropy quality tests document](../entropy_quality_tests.md#running-the-nist-test-suite).
The test suite produces an estimate of the min-entropy; repeated tests of the same RNG have (in my
experience) varied by a few tenths of a bit (which is pretty significant when entropy values can be
around 0.5 bits per byte of data!). After getting good, consistent results from the test suite,
apply a safety factor (i.e. divide the entropy estimate by 2), and update the value of
`entropy_per_1000_bytes` (don't forget to multiply by 1000).
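That recipe (halve the estimate, then multiply by 1000) stays in integer arithmetic if the
measured estimate is expressed in millibits per byte. A sketch, with the function name my own:

```c
#include <stdint.h>

// Convert a measured min-entropy estimate, expressed in millibits per
// byte (e.g. 0.62 bits/byte -> 620), into entropy_per_1000_bytes with
// the 2x safety factor applied. Since entropy_per_1000_bytes equals
// bits-per-byte * 1000 = millibits per byte, the conversion reduces to
// a halving. Illustrative, not the kernel's actual code.
static uint32_t safe_entropy_per_1000_bytes(uint32_t measured_millibits_per_byte) {
    return measured_millibits_per_byte / 2;
}
```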

Note that `entropy_per_1000_bytes` should probably eventually be made configurable instead of
hard-coded in jitterentropy\_collector.cpp. A kernel cmdline or even a preprocessor symbol could work.

## Notes about the testing script

The `scripts/entropy-test/jitterentropy/test-tunable` script automates the process of looping
through a large test matrix. The downside is that tests run in sequence on a single machine, so (1)
an error will stall the test pipeline, meaning supervision *is* required, and (2) the machine is
constantly rebooted rather than cold-booted (plus it's a netboot-reboot), which could conceivably
confound the tests. Still, it beats hitting power-off/power-on a thousand times by hand!

Some happy notes:

1. When netbooting, the script leaves bootserver running while waiting for netcp to successfully
   export the data file. If the system hangs, you can power it off and back on, and the existing
   bootserver process will restart the failed test.

2. If the test is going to run (say) 16 combinations of parameters 10 times each, it will go like
   this:

       test # 0: ml = 1   ll = 1  bc = 1  bs = 1
       test # 1: ml = 1   ll = 1  bc = 1  bs = 64
       test # 2: ml = 1   ll = 1  bc = 32 bs = 1
       test # 3: ml = 1   ll = 1  bc = 32 bs = 64
       ...
       test #15: ml = 128 ll = 16 bc = 32 bs = 64
       test #16: ml = 1   ll = 1  bc = 1  bs = 1
       test #17: ml = 1   ll = 1  bc = 1  bs = 64
       ...

   (The output files are numbered starting with 0, so I started with 0 above.)

   So, if test #17 fails, you can delete the results from tests #16 and #17 and re-run 9 more
   iterations of each test, keeping the complete results from the first iteration. In theory, the
   tests could be smarter and also keep the existing result from test #16, but the current shell
   scripts aren't that sophisticated.

The scripts don't implement the two-phase process suggested in the ["Tuning process"](#tuning-process)
section above. It's certainly possible, but again, the existing scripts aren't that sophisticated.

## Open questions

### How much do we trust the low-entropy extreme?

It's *a priori* possible that we maximize entropy per unit time by choosing small parameter values.
The most extreme case is of course `ll=1, ml=1, bs=1, bc=1`, but even something like `ll=1, ml=1,
bs=64, bc=32` is an example of what I'm thinking of. Part of the concern is the variability in the
test suite: if hypothetically the tests are only accurate to within 0.2 bits of entropy per byte,
and they're reporting 0.15 bits of entropy per byte, what do we make of it? Hopefully running the
same test a few hundred times in a row will reveal a clear modal value, but it's still a little
risky to rely on that low estimate being accurate.

The NIST publication states (line 1302, page 35, second draft) that the estimators "work well when
the entropy-per-sample is greater than 0.1". This is fairly low, so hopefully it isn't an issue in
practice. Still, the fact that there is a lower bound means we should probably leave a fairly
conservative envelope around it.

### How device-dependent is the optimal choice of parameters?

There's evidently a significant difference in the actual "bits of entropy per byte" metric on
different architectures or different hardware. Is it possible that most systems are optimal at
similar parameter values (so that we can just hard-code these values into
`kernel/lib/crypto/entropy/jitterentropy_collector.cpp`)? Or do we need to put the parameters into
MDI or into a preprocessor macro, so that we can use different defaults on a per-platform basis (or
at whatever level of granularity is appropriate)?

### Can we even record optimal parameters with enough granularity?

As mentioned above, one of our targets is "x86", which is what runs on any x86 PC. Naturally, x86
PCs can vary quite a bit. Even if we did something like add preprocessor symbols like
`JITTERENTROPY_LL_VALUE` etc. to the build, customized in `kernel/project/target/pc-x86.mk`, could
we pick a good value for *all PCs*?

If not, what are our options?

1. We could store a lookup table based on values accessible at runtime (like the exact CPU model,
   the core memory size, cache line size, etc.). This seems rather unwieldy. Maybe if we could find
   one or two simple properties to key off of, say "CPU core frequency" and "L1 cache size", we
   could make this relatively non-terrible.

2. We could try an adaptive approach: monitor the quality of the entropy stream, and adjust the
   parameters accordingly on the fly. This would take a lot of testing and justification if we want
   to trust it.

3. We could settle for "good enough" parameters on most devices, with the option to tune via kernel
   cmdlines or a similar mechanism. This seems like the most likely outcome to me. I expect that
   "good enough" parameters will be easy to find, and not disruptive enough to justify extreme
   solutions.
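Option 1 could look something like the sketch below; the struct layout, the table entries, and the
choice of keying on L1 size alone are all hypothetical placeholders, not tuned or measured values:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical lookup table for option 1, keyed on the detected L1 data
// cache size. Every name and number here is an illustrative placeholder.
struct jent_params {
    uint32_t ll, ml, bs, bc;
};

struct l1_keyed_entry {
    uint32_t l1_bytes;        // detected L1d size
    struct jent_params params;
};

static const struct l1_keyed_entry kParamTable[] = {
    {16 * 1024, {1, 32, 64, 512}},   // placeholder values
    {32 * 1024, {1, 32, 64, 1024}},
    {64 * 1024, {1, 32, 64, 2048}},
};

// Return the entry whose L1 size matches exactly; fall back to the last
// (largest) entry when nothing matches.
static struct jent_params lookup_params(uint32_t l1_bytes) {
    size_t n = sizeof(kParamTable) / sizeof(kParamTable[0]);
    for (size_t i = 0; i < n; i++) {
        if (kParamTable[i].l1_bytes == l1_bytes)
            return kParamTable[i].params;
    }
    return kParamTable[n - 1].params;
}
```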