# Banjo compiler

## Pipeline

### Loading a file

All the Banjo files in a library compilation are loaded by a
[SourceManager](lib/source_manager.h), whose job is to own the buffers
backing those files. These buffers are kept alive for the entire
pipeline, because later stages hold views into them. Tokens, for
example, are essentially a string view plus some metadata describing
their source location (a file and position).

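The ownership relationship described above can be sketched as follows.
This is a minimal illustration, not the code in
`lib/source_manager.h`: the `SourceFile` and `Token` layouts, the `Add`
method, and the `MakeToken` helper are all hypothetical names chosen
for the example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not the real declarations.
struct SourceFile {
    std::string filename;
    std::string data;  // the buffer backing the file
};

// A token: a view into a file's buffer plus its source location.
struct Token {
    std::string_view text;
    const SourceFile* file;
    std::size_t offset;  // position of the token within the file
};

class SourceManager {
public:
    // Takes ownership of a buffer. The returned reference stays valid for
    // the manager's lifetime because each file is a separate heap object.
    const SourceFile& Add(std::string filename, std::string data) {
        files_.push_back(std::make_unique<SourceFile>(
            SourceFile{std::move(filename), std::move(data)}));
        return *files_.back();
    }

private:
    std::vector<std::unique_ptr<SourceFile>> files_;
};

// Builds a token covering `length` bytes of `file`, starting at `offset`.
Token MakeToken(const SourceFile& file, std::size_t offset, std::size_t length) {
    return Token{std::string_view(file.data).substr(offset, length), &file, offset};
}
```

Because a token only borrows its text, it is valid only while the
`SourceManager` that owns the buffer is alive, which is why the buffers
must outlive every later stage.
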
### Parsing a file

The Banjo compiler first parses each file into an in-memory AST, which
is defined by the structures in [ast.h](lib/ast.h). Parsing starts by
reading the file into memory and lexing its contents into a
[token](lib/token.h) stream. The [parser](lib/parser.cpp) proper then
parses the stream into the hierarchical AST. At this point the names of
types are unresolved (they could end up pointing to types in another
file or library, or simply be garbage), and nested declarations are
still nested in the AST.

This step fails if any of the given files is not valid Banjo.

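To make the lexing step concrete, here is a toy lexer that splits a
buffer into tokens that are views into that buffer. It is a sketch
only: the two token kinds and the `Lex` function are assumptions made
for illustration, and the real token set is the one in `lib/token.h`.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical token kinds for this sketch.
enum class Kind { kIdentifier, kSymbol };

struct Token {
    Kind kind;
    std::string_view text;  // view into the source buffer
};

// True for characters allowed in an identifier in this sketch.
bool IsIdentifierChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Splits `source` into identifier tokens and single-character symbol
// tokens, skipping whitespace. `source` must outlive the returned tokens.
std::vector<Token> Lex(std::string_view source) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        if (std::isspace(static_cast<unsigned char>(source[i])) != 0) {
            ++i;
        } else if (IsIdentifierChar(source[i])) {
            std::size_t start = i;
            while (i < source.size() && IsIdentifierChar(source[i])) {
                ++i;
            }
            tokens.push_back({Kind::kIdentifier, source.substr(start, i - start)});
        } else {
            tokens.push_back({Kind::kSymbol, source.substr(i, 1)});
            ++i;
        }
    }
    return tokens;
}
```

A parser would then consume this stream to build the nested AST, and
would report an error on the first token sequence that is not valid.
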
### Flattening a library

Once all the files are parsed into AST nodes, it's time to flatten the
representation.

Recall that some declarations can be nested. For instance, a const
declaration can appear inside an interface or struct declaration.

Flattening pulls all the declarations out to one level, which entails
computing fully qualified names for nested types.

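A minimal sketch of flattening, assuming a hypothetical `Decl` node
that holds a name and its nested declarations. The `.` separator and
the `Flatten` function are illustrative choices, not necessarily what
the compiler uses.

```cpp
#include <string>
#include <vector>

// Hypothetical AST node: a named declaration that may contain nested ones.
struct Decl {
    std::string name;
    std::vector<Decl> nested;
};

// Appends the fully qualified name of `decl`, and of everything nested in
// it, to `out`. The "." separator is an assumption made for this sketch.
void Flatten(const Decl& decl, const std::string& prefix,
             std::vector<std::string>& out) {
    std::string qualified = prefix.empty() ? decl.name : prefix + "." + decl.name;
    out.push_back(qualified);
    for (const Decl& child : decl.nested) {
        Flatten(child, qualified, out);
    }
}
```

Flattening a struct `Point` that contains a const `kMax`, with the
prefix `mylib`, yields the two names `mylib.Point` and
`mylib.Point.kMax` at a single level.
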
### Resolving names in a library

Many parts of a Banjo file refer to each other by name. For instance, a
struct may have a field whose type is given by the (possibly
qualified) name of some other struct. Any name that can't be resolved
(because it is not present in any of the given files or library
dependencies) causes compilation to fail at this stage.

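In its simplest form, resolution is a lookup of every referenced name
in the set of flattened declarations. The sketch below assumes names
are plain strings and uses a hypothetical `FirstUnresolved` helper; the
real compiler works on richer name and declaration types.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Returns the first name in `references` that is not in `declared`, or
// std::nullopt if every reference resolves. `declared` stands for all
// names from the given files and the library dependencies.
std::optional<std::string> FirstUnresolved(
    const std::set<std::string>& declared,
    const std::vector<std::string>& references) {
    for (const std::string& name : references) {
        if (declared.count(name) == 0) {
            return name;  // compilation would fail on this name
        }
    }
    return std::nullopt;
}
```

Returning the offending name rather than a plain boolean lets the
caller report which reference failed to resolve.
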
### Computing layout

At this stage the layouts of all data structures are computed. This
includes both the coding tables for all of the messages defined by the
library and the wire formats of those messages. The in-memory
representation of this layout is defined by the structures in
[coded_ast.h](lib/coded_ast.h).

This step can fail in a few ways. If a given message statically
exceeds the limits of a channel message, compilation fails. Statically
exceeding the recursion limit of Banjo decoding also causes
compilation to fail.

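The static size check can be illustrated with a generic layout rule.
The sketch below uses natural alignment as in C structs, which is an
assumption and not necessarily the rule the compiler applies; the
`Field` type, the function names, and the `max_bytes` parameter are
hypothetical, and no real channel limit is encoded here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical field description: size and alignment in bytes.
struct Field {
    std::uint32_t size;
    std::uint32_t alignment;  // must be nonzero
};

// Rounds `offset` up to the next multiple of `alignment`.
std::uint32_t AlignTo(std::uint32_t offset, std::uint32_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Lays the fields out in order, padding each to its alignment, and pads
// the total to the largest alignment seen.
std::uint32_t StructSize(const std::vector<Field>& fields) {
    std::uint32_t offset = 0;
    std::uint32_t max_alignment = 1;
    for (const Field& field : fields) {
        offset = AlignTo(offset, field.alignment) + field.size;
        max_alignment = std::max(max_alignment, field.alignment);
    }
    return AlignTo(offset, max_alignment);
}

// True if the laid-out size fits within `max_bytes`, a caller-supplied
// limit standing in for the channel message limit.
bool FitsInMessage(const std::vector<Field>& fields, std::uint32_t max_bytes) {
    return StructSize(fields) <= max_bytes;
}
```

Because every size here is known at compile time, a message that is
too large can be rejected before any code is generated.
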
### Backend generation

At this stage, nothing about the Banjo library per se should cause
compilation to fail. Failures can still come from something particular
to a certain language binding, or from the compiler being given a bogus
location for its output.

#### C

C bindings are directly generated from the library layout.

#### JSON

All other language bindings are generated by another program.