A complete rewrite of the LLVMC compiler driver is proposed, aimed at
making it more configurable and useful.

As it stands, the current version of LLVMC does not meet its stated goals
of configurability and extensibility, and is therefore rarely used. The
need for enhancements to LLVMC is also reflected in [1]_. The proposed
rewrite will fix these deficiencies and provide an extensible,
future-proof solution.

A compiler driver's job is essentially to find a way to transform a set
of input files into a set of targets, depending on the user-provided
options. Since several methods of transformation can potentially exist,
it's natural to use a directed graph to represent all of them. In this
graph, nodes are tools -- e.g., ``gcc -S`` is a tool that generates
assembly from C language files -- and edges between the nodes indicate
that the output of one tool can be given as input to another -- e.g.,
``gcc -S -o - file.c | as``. We'll call this graph the compilation graph.
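
The compilation graph described above can be sketched as a small
adjacency-list structure. This is only an illustration: the names
``CompilationGraph``, ``addTool``, ``addEdge``, ``name`` and ``successors``
are hypothetical and not part of the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: nodes are tools, and an edge A -> B means that
// the output of tool A can be given as input to tool B.
class CompilationGraph {
public:
  // Adds a tool node and returns its index.
  std::size_t addTool(const std::string &name) {
    names_.push_back(name);
    edges_.emplace_back();
    return names_.size() - 1;
  }

  // Records that the output of `from` can be fed to `to`.
  void addEdge(std::size_t from, std::size_t to) {
    assert(from < names_.size() && to < names_.size());
    edges_[from].push_back(to);
  }

  const std::string &name(std::size_t tool) const { return names_.at(tool); }

  const std::vector<std::size_t> &successors(std::size_t tool) const {
    return edges_.at(tool);
  }

private:
  std::vector<std::string> names_;              // one entry per tool
  std::vector<std::vector<std::size_t>> edges_; // outgoing edges per tool
};
```

With this sketch, the ``gcc -S -o - file.c | as`` pipeline corresponds to
two nodes, ``gcc -S`` and ``as``, joined by a single edge.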

The proposed design revolves around the compilation graph and the
following core abstractions:

- Target - An (intermediate) compilation target.

- Action - A shell command template that represents a basic compilation
  transformation -- example: ``gcc -S $INPUT_FILE -o $OUTPUT_FILE``.

- Tool - Encapsulates information about a concrete tool used in the
  compilation process and produces Actions. Its operation depends on
  command-line options provided by the user.

- GraphBuilder - Constructs the compilation graph. Its operation
  depends on command-line options.

- GraphTraverser - Traverses the compilation graph and constructs a
  sequence of Actions needed to build the target file. Its operation
  depends on command-line options.
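
To make the Action abstraction more concrete, here is a minimal sketch of
a command template being expanded into a command line. The placeholder
names ``$INPUT_FILE`` and ``$OUTPUT_FILE`` come from the example above;
the ``replaceAll`` helper and the ``commandLine`` method are assumptions
made for illustration only.

```cpp
#include <cassert>
#include <string>

// Replaces every occurrence of `key` in `text` with `value`.
static std::string replaceAll(std::string text, const std::string &key,
                              const std::string &value) {
  assert(!key.empty());
  std::string::size_type pos = 0;
  while ((pos = text.find(key, pos)) != std::string::npos) {
    text.replace(pos, key.size(), value);
    pos += value.size(); // continue after the inserted text
  }
  return text;
}

// Hypothetical Action: a shell command template with two placeholders.
struct Action {
  std::string commandTemplate;

  // Expands the template for one input/output file pair.
  std::string commandLine(const std::string &input,
                          const std::string &output) const {
    std::string cmd = replaceAll(commandTemplate, "$INPUT_FILE", input);
    return replaceAll(cmd, "$OUTPUT_FILE", output);
  }
};
```

A real implementation would also have to quote file names for the shell;
that is left out here for brevity.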

A high-level view of the compilation process:

1. Configuration libraries (see below) are loaded and the
   compilation graph is constructed from the tool descriptions.

2. Information about possible options is gathered from (the nodes of)
   the compilation graph.

3. Options are parsed based on data gathered in step 2.

4. A sequence of Actions needed to build the target is constructed
   using the compilation graph and provided options.

5. The resulting action sequence is executed.
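
Step 4 of the process above can be illustrated with a breadth-first
search. The sketch below is a deliberate simplification: it ignores
options, searches over languages rather than over tool nodes, and returns
tool names instead of Actions. ``ToolDesc`` and ``planActions`` are
hypothetical names, not part of the proposal.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical, simplified tool description: one input language, one
// output language.
struct ToolDesc {
  std::string name;
  std::string inputLanguage;
  std::string outputLanguage;
};

// Breadth-first search for the shortest chain of tools that turns `from`
// into `to`. Returns tool names in execution order; the result is empty
// if no chain exists or if `from` equals `to`.
std::vector<std::string> planActions(const std::vector<ToolDesc> &tools,
                                     const std::string &from,
                                     const std::string &to) {
  // For each reached language, the tool that produced it (null for `from`).
  std::map<std::string, const ToolDesc *> producedBy;
  std::queue<std::string> pending;
  producedBy[from] = nullptr;
  pending.push(from);

  while (!pending.empty()) {
    const std::string lang = pending.front();
    pending.pop();
    if (lang == to)
      break;
    for (const ToolDesc &tool : tools) {
      if (tool.inputLanguage == lang &&
          producedBy.count(tool.outputLanguage) == 0) {
        producedBy[tool.outputLanguage] = &tool;
        pending.push(tool.outputLanguage);
      }
    }
  }

  std::vector<std::string> plan;
  if (producedBy.count(to) == 0)
    return plan;
  // Walk back from the target to the source, then reverse the result.
  for (const ToolDesc *tool = producedBy[to]; tool != nullptr;
       tool = producedBy[tool->inputLanguage]) {
    assert(producedBy.count(tool->inputLanguage) == 1);
    plan.push_back(tool->name);
  }
  std::reverse(plan.begin(), plan.end());
  return plan;
}
```

In the proposed design this job belongs to the GraphTraverser, which
would additionally take global options into account when choosing the
path.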

To make this design extensible, TableGen [2]_ will be used for
automatic generation of the Tool classes. Users wanting to customize
LLVMC need to write a configuration library consisting of a set of
TableGen descriptions of compilation tools plus a number of hooks
that influence compilation graph construction and traversal. LLVMC
will have the ability to load user configuration libraries at runtime;
in fact, its own basic functionality will be implemented as a
configuration library.

TableGen specification example
------------------------------

This small example specifies a Tool that converts C source to object
files. Note that it is only a mock-up of intended functionality, not a
final specification::

    def GCC : Tool<
     GCCProperties, // Properties of this tool
     GCCOptions     // Options description for this tool
    >;

    def GCCProperties : ToolProperties<[
     ToolName<"GCC">,
     InputLanguageName<"C">,
     OutputLanguageName<"Object-Code">,
     InputFileExtension<"c">,
     OutputFileExtension<"o">,
     CommandFormat<"gcc -c $OPTIONS $FILES">
    ]>;

    def GCCOptions : ToolOptions<[
     Option<
       "-Wall",                 // Option name
       [None],                  // Allowed values
       [AddOption<"-Wall">]>,   // Action

     Option<
       "-Wextra",               // Option name
       [None],                  // Allowed values
       [AddOption<"-Wextra">]>, // Action

     Option<
       "-W",                    // Option name
       [None],                  // Allowed values
       [AddOption<"-W">]>,      // Action

     Option<
       "-D",                    // Option name
       [AnyString],             // Allowed values
       [AddOptionWithArgument<"-D",GetOptionArgument<"-D">>]
       // Action:
       // If the driver was given option "-D<argument>", add
       // option "-D" with the same argument to the invocation string of
       // this tool.
       >
    ]>;

Example of generated code
-------------------------

The specification above compiles to the following code (again, it's a
mock-up)::

    class GCC : public Tool {
    public:
      // Properties
      static const char* ToolName = "GCC";
      static const char* InputLanguageName = "C";
      static const char* OutputLanguageName = "Object-Code";
      static const char* InputFileExtension = "c";
      static const char* OutputFileExtension = "o";
      static const char* CommandFormat = "gcc -c $OPTIONS $FILES";

      // Options
      OptionsDescription SupportedOptions() {
        OptionsDescription supportedOptions;

        supportedOptions.Add(Option("-Wall"));
        supportedOptions.Add(Option("-Wextra"));
        supportedOptions.Add(Option("-W"));
        supportedOptions.Add(Option("-D", AllowedArgs::ANY_STRING));

        return supportedOptions;
      }

      Action GenerateAction(Options providedOptions) {
        Action generatedAction(CommandFormat);
        Option curOpt;

        // Forward each option the user provided to the tool invocation.
        curOpt = providedOptions.Get("-D");
        if (curOpt) {
          assert(curOpt.HasArgument());
          generatedAction.AddOption(Option("-D", curOpt.GetArgument()));
        }

        curOpt = providedOptions.Get("-Wall");
        if (curOpt)
          generatedAction.AddOption(Option("-Wall"));

        curOpt = providedOptions.Get("-Wextra");
        if (curOpt)
          generatedAction.AddOption(Option("-Wextra"));

        curOpt = providedOptions.Get("-W");
        if (curOpt)
          generatedAction.AddOption(Option("-W"));

        return generatedAction;
      }
    };

    // defined somewhere...

    class Action {
    public:
      void AddOption(const Option& opt) {...}
      int Run(const Filenames& fnms) {...}
    };

Because one of the main tasks of the compiler driver is to correctly
handle user-provided options, it is important to define this process
precisely. The intent of the proposed scheme is to function as
a drop-in replacement for GCC.

The option syntax is specified by the following formal grammar::

    <command-line>      ::=  <option>*
    <option>            ::=  <positional-option> | <named-option>
    <named-option>      ::=  -[-]<option-name>[<delimiter><option-argument>]
    <delimiter>         ::=  ',' | '=' | ' '
    <positional-option> ::=  <string>
    <option-name>       ::=  <string>
    <option-argument>   ::=  <string>

This roughly corresponds to the GCC option syntax. Note that grouping
of short options (as in ``ls -la``) is forbidden.

Example::

    llvmc -O3 -Wa,-foo,-bar -pedantic -std=c++0x a.c b.c c.c

Option arguments can also have special forms. For example, an argument
can be a comma-separated list (as in ``-Wa,-foo,-bar``). In such cases,
it's up to the option handler to parse the argument.
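
The grammar above can be illustrated with a small tokenizer sketch. It
handles only the ``,`` and ``=`` delimiters; the space delimiter is left
out because it requires knowing which options take an argument. The names
``ParsedOption``, ``parseToken`` and ``splitList`` are hypothetical, and
treating a bare ``-`` or ``--`` as a positional option is an assumption
of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ParsedOption {
  bool positional;      // true for arguments such as "a.c"
  std::string name;     // option name without leading dashes, or the
                        // whole token for positional options
  std::string argument; // empty if no argument was attached
};

// Parses a single command-line token according to the grammar.
ParsedOption parseToken(const std::string &token) {
  assert(!token.empty());
  // Assumption: a bare "-" or "--" is treated as positional.
  if (token[0] != '-' || token == "-" || token == "--")
    return {true, token, ""};
  // Skip one or two leading dashes.
  std::string::size_type start = (token.compare(0, 2, "--") == 0) ? 2 : 1;
  std::string::size_type delim = token.find_first_of(",=", start);
  if (delim == std::string::npos)
    return {false, token.substr(start), ""};
  return {false, token.substr(start, delim - start),
          token.substr(delim + 1)};
}

// Splits a comma-separated argument such as "-foo,-bar" into its items,
// as an option handler for a list-valued option might do.
std::vector<std::string> splitList(const std::string &argument) {
  std::vector<std::string> items;
  std::string::size_type begin = 0;
  while (true) {
    std::string::size_type comma = argument.find(',', begin);
    if (comma == std::string::npos) {
      items.push_back(argument.substr(begin));
      return items;
    }
    items.push_back(argument.substr(begin, comma - begin));
    begin = comma + 1;
  }
}
```

Under these assumptions, ``-Wa,-foo,-bar`` parses as option ``Wa`` with
argument ``-foo,-bar``, which the handler can then split into ``-foo``
and ``-bar``. Because grouping is forbidden, ``-O3`` is a single option
named ``O3``.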

According to their meaning, options are classified into the following
categories:

- Global options - Options that influence compilation graph
  construction/traversal. Example: ``-E`` (stop after preprocessing).

- Local options - Options that influence one or several Actions in
  the generated action sequence. Example: ``-O3`` (turn on optimization).

- Prefix options - Options that influence the meaning of the following
  command-line arguments. Example: ``-x language`` (specify language for
  the input files explicitly). Prefix options can be local or global.

- Built-in options - Options that are hard-coded into the
  driver. Examples: ``--help``, ``-o file``/``-pipe`` (redirect output).
  Can be local or global.

Because the compiler driver, as a single point of access to the LLVM
tool set, is a frequently used tool, it is desirable to make its name
as short and easy to type as possible. Possible names include ``llcc``
and ``lcc``, by analogy with ``gcc``.

1. Should global-options-influencing hooks be written by hand or
   auto-generated from TableGen specifications?

.. [1] LLVM Bug#686

       http://llvm.org/bugs/show_bug.cgi?id=686

.. [2] TableGen Fundamentals

       http://llvm.org/docs/TableGenFundamentals.html