===================================
Customizing LLVMC: Reference Manual
===================================

:Author: Mikhail Glushenkov <foldr@codedegers.com>

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. The structure of this graph is completely determined
by plugins, which can be either statically or dynamically linked. This
makes it possible to easily adapt LLVMC for other purposes - for
example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc -O3 -Wall hello.cpp

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to force files ending with ".c" to compile as
C++, use the ``-x`` option, just like you would do it with ``gcc``::

    $ # hello.c is really a C++ file
    $ llvmc -x c++ hello.c

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    [A lot of link-time errors skipped]
    $ llvmc --linker=c++ hello.o

LLVMC has some built-in options that can't be overridden:

* ``-o FILE`` - Output file name.

* ``-x LANGUAGE`` - Specify the language of the following input files
  until the next ``-x`` option.

* ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
  ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.

* ``-v`` - Enable verbose mode, i.e. print out all executed commands.

* ``--view-graph`` - Show a graphical representation of the compilation
  graph. Requires that you have ``dot`` and ``gv`` programs
  installed. Hidden option, useful for debugging.

* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
  current directory with the compilation graph description in the
  Graphviz format. Hidden option, useful for debugging.

* ``--save-temps`` - Write temporary files to the current directory
  and do not delete them on exit. Hidden option, useful for debugging.

* ``--help``, ``--help-hidden``, ``--version`` - These options have
  their standard meaning.

Compiling LLVMC plugins
=======================

It's easiest to start working on your own LLVMC plugin by copying the
skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::

    $ cd $LLVMC_DIR/plugins
    $ cp -r Simple MyPlugin
    Makefile PluginMain.cpp Simple.td

As you can see, our basic plugin consists of only two files (not
counting the build script). ``Simple.td`` contains the TableGen
description of the compilation graph; its format is documented in the
following sections. ``PluginMain.cpp`` is just a helper file used to
compile the auto-generated C++ code produced from the TableGen source. It
can also contain hook definitions (described in the section on hooks
below).

The first thing that you should do is to change the ``LLVMC_PLUGIN``
variable in the ``Makefile`` to avoid conflicts (since this variable
is used to name the resulting library)::

    LLVMC_PLUGIN=MyPlugin

It is also a good idea to rename ``Simple.td`` to something less
generic::

    $ mv Simple.td MyPlugin.td

Note that the plugin source directory must be placed under
``$LLVMC_DIR/plugins`` to make use of the existing build
infrastructure. To build a version of the LLVMC executable called
``mydriver`` with your plugin compiled in, use the following command::

    $ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver

To build your plugin as a dynamic library, just ``cd`` to its source
directory and run ``make``. The resulting file will be called
``LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
``LLVMCMyPlugin.so``). This library can then be loaded with the
``-load`` option. Example::

    $ cd $LLVMC_DIR/plugins/Simple
    $ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so

Sometimes, you will want a 'bare-bones' version of LLVMC that has no
built-in plugins. It can be compiled with the following command::

    $ make BUILTIN_PLUGINS=""

How plugins are loaded
======================

It is possible for LLVMC plugins to depend on each other. For example,
one can create edges between nodes defined in some other plugin. To
make this work, however, that plugin should be loaded first. To
achieve this, the concept of plugin priority was introduced. By
default, every plugin has priority zero; to specify the priority
explicitly, put the following line in your ``.td`` file::

    def Priority : PluginPriority<$PRIORITY_VALUE>;
    # Where PRIORITY_VALUE is some integer > 0

Plugins are loaded in order of their (increasing) priority, starting
with 0. Therefore, the plugin with the highest priority value will be
loaded last.

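
The loading rule can be sketched in C++ as a sort by priority. This is
only an illustration, not LLVMC's actual loader: the ``PluginInfo``
record and the ``loadOrder`` function are hypothetical names, and keeping
equal-priority plugins in their original order is an assumption the
manual does not state::

    #include <algorithm>
    #include <string>
    #include <vector>

    // Hypothetical record; LLVMC's real plugin bookkeeping is not shown here.
    struct PluginInfo {
        std::string name;
        int priority;  // 0 unless overridden with PluginPriority
    };

    // Returns plugin names in load order: increasing priority.
    // Assumption: plugins with equal priority keep their input order.
    std::vector<std::string> loadOrder(std::vector<PluginInfo> plugins) {
        std::stable_sort(plugins.begin(), plugins.end(),
                         [](const PluginInfo &a, const PluginInfo &b) {
                             return a.priority < b.priority;
                         });
        std::vector<std::string> names;
        for (const PluginInfo &p : plugins)
            names.push_back(p.name);
        return names;
    }

A plugin that adds edges to nodes defined elsewhere would therefore be
given a higher priority value than the plugin that defines those nodes.
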
Customizing LLVMC: the compilation graph
========================================

Each TableGen configuration file should include the common
definitions::

    include "llvm/CompilerDriver/Common.td"
    // include "llvm/CompilerDriver/Tools.td"
    // which contains some useful tool definitions.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file
``plugins/Base/Base.td`` for an example) is just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<"root", "llvm_gcc_c">,
        Edge<"root", "llvm_gcc_assembler">,
        Edge<"llvm_gcc_c", "llc">,
        Edge<"llvm_gcc_cpp", "llc">,
        OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
        OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
        OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,

As you can see, the edges can be either default or optional, where
optional edges are differentiated by an additional ``case`` expression
used to calculate the weight of this edge. Notice also that we refer
to tools via their names (as strings). This makes it possible to add
edges to an existing compilation graph in plugins without having to
know about all tool definitions used in the graph.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).

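
The weighting rule described above can be sketched in C++ as follows.
This is a simplified model, not LLVMC's implementation: the ``Edge``
struct, ``edgeWeight`` and ``pickTarget`` are hypothetical names, every
true test is assumed to contribute a plain ``(inc_weight)``, and ties are
assumed to go to the first edge listed::

    #include <string>
    #include <vector>

    // Hypothetical model of one outgoing edge of a node.
    struct Edge {
        std::string target;       // name of the tool the edge leads to
        bool optional;            // false for a default edge
        std::vector<bool> tests;  // outcome of each test in the case expression
    };

    // Default edges weigh 1; optional edges weigh 0 + 2*N,
    // where N is the number of tests that evaluated to true.
    int edgeWeight(const Edge &e) {
        if (!e.optional)
            return 1;
        int weight = 0;
        for (bool passed : e.tests)
            if (passed)
                weight += 2;
        return weight;
    }

    // Picks the target of the edge with the maximum weight.
    // Assumption: on a tie the first edge wins; returns "" for no edges.
    std::string pickTarget(const std::vector<Edge> &edges) {
        const Edge *best = nullptr;
        for (const Edge &e : edges)
            if (!best || edgeWeight(e) > edgeWeight(*best))
                best = &e;
        return best ? best->target : std::string();
    }

With one default edge to ``llc`` and one optional edge to ``opt`` guarded
by a single test, the optional edge wins (weight 2 against 1) when the
test is true and loses (weight 0 against 1) when it is false.
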
To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc --view-graph``. You will need ``dot`` and
``gv`` installed for this to work properly.

Writing a tool description
==========================

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name. Can be either a string or a
    list, in case the tool supports multiple input languages.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (prefix_list_option "L", (forward),
            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
            (help "pass options to linker"))

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating
    aliases. Unlike other option types, aliases are not allowed to
    have any properties besides the aliased option name. Usage
    example: ``(alias_option "preprocess", "E")``

* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``forward_as`` - Change the name of this option, but forward the
    argument unchanged. Example: ``(forward_as "--disable-optimize")``.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.

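
The effect of ``unpack_values`` on an argument such as
``-Wa,-foo=bar,-baz`` can be illustrated with a small C++ helper. This is
a sketch of the described behaviour only; ``unpackValues`` is a
hypothetical function, not part of LLVMC, and skipping empty items is an
assumption::

    #include <sstream>
    #include <string>
    #include <vector>

    // Splits the comma-separated values that follow `prefix` in `arg`.
    // `prefix` includes the leading dash, e.g. "-Wa".
    // Returns an empty list if `arg` does not start with `prefix`.
    std::vector<std::string> unpackValues(const std::string &arg,
                                          const std::string &prefix) {
        std::vector<std::string> values;
        if (arg.compare(0, prefix.size(), prefix) != 0)
            return values;
        std::stringstream rest(arg.substr(prefix.size()));
        std::string item;
        while (std::getline(rest, item, ','))
            if (!item.empty())  // assumption: empty items are dropped
                values.push_back(item);
        return values;
    }

Called with ``"-Wa,-foo=bar,-baz"`` and the prefix ``"-Wa"``, the helper
yields the two values ``-foo=bar`` and ``-baz``, which is what the
property appends to the tool invocation.
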
Option list - specifying all options in a single place
======================================================

It can be handy to have all information about options gathered in a
single place to provide an overview. This can be achieved by using a
so-called ``OptionList``::

    def Options : OptionList<[
    (switch_option "E", (help "Help string")),
    (alias_option "quiet", "q")

``OptionList`` is also a good place to specify option aliases.

Tool-specific option properties like ``append_cmd`` have (obviously)
no meaning in the context of ``OptionList``, so the only properties
allowed there are ``help`` and ``required``.

Option lists are used at file scope. See the file
``plugins/Clang/Clang.td`` for an example of ``OptionList`` usage.

Using hooks and environment variables in the ``cmd_line`` property
==================================================================

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to write your own hooks, just add their definitions to the
``PluginMain.cpp`` or drop a ``.cpp`` file into the
``$LLVMC_DIR/driver`` directory. Hooks should live in the ``hooks``
namespace and have the signature ``std::string hooks::MyHookName (void)``.
They can be used from the ``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

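
A pair of hooks matching the ``cmd_line`` above might look like the
following sketch. Only the namespace and the signature come from this
manual; the returned values and the ``MYPLUGIN_TOOL_PREFIX`` environment
variable are made up for illustration::

    #include <cstdlib>
    #include <string>

    namespace hooks {

    // Hypothetical hook: directory prefix for the tool, taken from an
    // (invented) environment variable with a fixed fallback.
    std::string MyHook() {
        const char *prefix = std::getenv("MYPLUGIN_TOOL_PREFIX");
        return (prefix && *prefix) ? prefix : "/usr/local";
    }

    // Hypothetical hook: name of the output file.
    std::string AnotherHook() {
        return "a.out";
    }

    }  // namespace hooks

Because a hook is ordinary C++, it can compute its result in ways that a
plain ``$ENV`` substitution cannot, such as falling back to a default.
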
It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options use
the ``case`` expression (documented below)::

    "llvm-g++ -E -x c $INFILE -o $OUTFILE",
    "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))

Conditional evaluation: the ``case`` expression
===============================================

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is modeled on the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

::

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Evaluates to "cmdline1" if option "-A" is provided on the
    // command line; otherwise to "cmdline2" if "-B" is provided;
    // otherwise to "cmdline3".
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")

Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification - in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.

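
The two evaluation modes can be contrasted in a short C++ sketch. It
models only ``switch_on`` tests, and the names ``Switches``,
``caseWeight`` and ``caseCmdLine`` are hypothetical; LLVMC generates its
own code from the TableGen description::

    #include <set>
    #include <string>
    #include <utility>
    #include <vector>

    // The set of switches given on the command line, without the dash.
    using Switches = std::set<std::string>;

    // Edge-weight context: every clause whose switch is on contributes.
    int caseWeight(const std::vector<std::pair<std::string, int>> &clauses,
                   const Switches &on) {
        int weight = 0;
        for (const auto &clause : clauses)
            if (on.count(clause.first))
                weight += clause.second;
        return weight;
    }

    // Command-line context: the first clause whose switch is on wins,
    // and later clauses are not checked; `fallback` plays the role of
    // the (default) clause.
    std::string caseCmdLine(
            const std::vector<std::pair<std::string, std::string>> &clauses,
            const std::string &fallback, const Switches &on) {
        for (const auto &clause : clauses)
            if (on.count(clause.first))
                return clause.second;
        return fallback;
    }

With both ``-A`` and ``-B`` given, the weight form adds up to 10, while
the command-line form stops at the first match and yields ``cmdline1``.
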
Case expressions can also be nested, i.e. the following is legal::

    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...)

You should, however, try to avoid doing that because it hurts
readability. It is usually better to split tool descriptions and/or
use TableGen inheritance instead.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line switch is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for the discussion of
    different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the language of the input
    file equals the argument. At the moment works only with the
    ``cmd_line`` property on non-join nodes. Example:
    ``(in_language ...)``.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user. Example: ``(not_empty "o")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed.

  - ``or`` - Another logical combinator that returns true if at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. Language map definition looks
like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,

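
The lookup that such a map enables can be sketched in C++. The table
below copies only the two entries shown above, and ``languageOf`` is a
hypothetical helper rather than an LLVMC function; returning an empty
string for an unknown suffix is an assumption::

    #include <map>
    #include <string>

    // Maps a file name to a language name via its suffix.
    // Returns "" when the file has no suffix or the suffix is unknown.
    std::string languageOf(const std::string &fileName) {
        static const std::map<std::string, std::string> suffixToLang = {
            {"cc", "c++"}, {"cp", "c++"}, {"cxx", "c++"}, {"cpp", "c++"},
            {"CPP", "c++"}, {"c++", "c++"}, {"C", "c++"},
            {"c", "c"},
        };
        std::string::size_type dot = fileName.rfind('.');
        if (dot == std::string::npos)
            return std::string();
        auto it = suffixToLang.find(fileName.substr(dot + 1));
        return it == suffixToLang.end() ? std::string() : it->second;
    }

Note that the comparison is case-sensitive: ``hello.C`` maps to ``c++``
while ``hello.c`` maps to ``c``, just as the suffix lists specify.
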
When writing LLVMC plugins, it can be useful to get a visual view of
the resulting compilation graph. This can be achieved via the command
line option ``--view-graph``. This command assumes that Graphviz [2]_ and
Ghostview [3]_ are installed. There is also a ``--write-graph`` option that
creates a Graphviz source file (``compilation-graph.dot``) in the
current directory.

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html

.. [2] Graphviz
       http://www.graphviz.org/

.. [3] Ghostview
       http://pages.cs.wisc.edu/~ghost/