1 ===================================
2 Customizing LLVMC: Reference Manual
3 ===================================
5 This file was automatically generated by rst2html.
6 Please do not edit directly!
7 The ReST source lives in the directory 'tools/llvmc/doc'.
Written by `Mikhail Glushenkov <mailto:foldr@codedgers.com>`__
20 LLVMC is a generic compiler driver, designed to be customizable and
21 extensible. It plays the same role for LLVM as the ``gcc`` program
22 does for GCC - LLVMC's job is essentially to transform a set of input
23 files into a set of targets depending on configuration rules and user
24 options. What makes LLVMC different is that these transformation rules
25 are completely customizable - in fact, LLVMC knows nothing about the
26 specifics of transformation (even the command-line options are mostly
27 not hard-coded) and regards the transformation structure as an
28 abstract graph. The structure of this graph is completely determined
29 by plugins, which can be either statically or dynamically linked. This
30 makes it possible to easily adapt LLVMC for other purposes - for
31 example, as a build tool for game resources.
33 Because LLVMC employs TableGen_ as its configuration language, you
34 need to be familiar with it to customize LLVMC.
36 .. _TableGen: http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html
42 LLVMC tries hard to be as compatible with ``gcc`` as possible,
43 although there are some small differences. Most of the time, however,
44 you shouldn't be able to notice them::
46 $ # This works as expected:
47 $ llvmc -O3 -Wall hello.cpp
One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``): the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to force files ending with ".c" to compile as
C++, use the ``-x`` option, just as you would with ``gcc``::
58 $ # hello.c is really a C++ file
59 $ llvmc -x c++ hello.c
63 On the other hand, when using LLVMC as a linker to combine several C++
64 object files you should provide the ``--linker`` option since it's
65 impossible for LLVMC to choose the right linker in that case::
69 [A lot of link-time errors skipped]
70 $ llvmc --linker=c++ hello.o
74 By default, LLVMC uses ``llvm-gcc`` to compile the source code. It is
75 also possible to choose the work-in-progress ``clang`` compiler with
76 the ``-clang`` option.
82 LLVMC has some built-in options that can't be overridden in the
83 configuration libraries:
85 * ``-o FILE`` - Output file name.
87 * ``-x LANGUAGE`` - Specify the language of the following input files
88 until the next -x option.
90 * ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
91 ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.
93 * ``-v`` - Enable verbose mode, i.e. print out all executed commands.
95 * ``--check-graph`` - Check the compilation for common errors like
96 mismatched output/input language names, multiple default edges and
97 cycles. Hidden option, useful for debugging.
99 * ``--view-graph`` - Show a graphical representation of the compilation
100 graph. Requires that you have ``dot`` and ``gv`` programs
101 installed. Hidden option, useful for debugging.
103 * ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
104 current directory with the compilation graph description in the
105 Graphviz format. Hidden option, useful for debugging.
107 * ``--save-temps`` - Write temporary files to the current directory
108 and do not delete them on exit. Hidden option, useful for debugging.
110 * ``--help``, ``--help-hidden``, ``--version`` - These options have
111 their standard meaning.
114 Compiling LLVMC plugins
115 =======================
117 It's easiest to start working on your own LLVMC plugin by copying the
118 skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::
120 $ cd $LLVMC_DIR/plugins
121 $ cp -r Simple MyPlugin
124 Makefile PluginMain.cpp Simple.td
As you can see, our basic plugin consists of only two files (not
counting the build script). ``Simple.td`` contains a TableGen
description of the compilation graph; its format is documented in the
following sections. ``PluginMain.cpp`` is just a helper file used to
compile the auto-generated C++ code produced from the TableGen source. It
can also contain hook definitions (see `below`__).
135 The first thing that you should do is to change the ``LLVMC_PLUGIN``
136 variable in the ``Makefile`` to avoid conflicts (since this variable
137 is used to name the resulting library)::
139 LLVMC_PLUGIN=MyPlugin
It is also a good idea to rename ``Simple.td`` to something less generic::
144 $ mv Simple.td MyPlugin.td
146 Note that the plugin source directory must be placed under
147 ``$LLVMC_DIR/plugins`` to make use of the existing build
148 infrastructure. To build a version of the LLVMC executable called
149 ``mydriver`` with your plugin compiled in, use the following command::
152 $ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver
To build your plugin as a dynamic library, just ``cd`` to its source
directory and run ``make``. The resulting file will be called
``LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
``LLVMCMyPlugin.so``). This library can then be loaded with the
``-load`` option. Example::
160 $ cd $LLVMC_DIR/plugins/Simple
162 $ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so
164 Sometimes, you will want a 'bare-bones' version of LLVMC that has no
165 built-in plugins. It can be compiled with the following command::
168 $ make BUILTIN_PLUGINS=""
171 Customizing LLVMC: the compilation graph
172 ========================================
Each TableGen configuration file should include the common definitions::
177 include "llvm/CompilerDriver/Common.td"
Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.
186 The definition of the compilation graph (see file
187 ``plugins/Base/Base.td`` for an example) is just a list of edges::
189 def CompilationGraph : CompilationGraph<[
190 Edge<"root", "llvm_gcc_c">,
191 Edge<"root", "llvm_gcc_assembler">,
194 Edge<"llvm_gcc_c", "llc">,
195 Edge<"llvm_gcc_cpp", "llc">,
198 OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
200 OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
204 OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
205 (case (input_languages_contain "c++"), (inc_weight),
206 (or (parameter_equals "linker", "g++"),
207 (parameter_equals "linker", "c++")), (inc_weight))>,
212 As you can see, the edges can be either default or optional, where
213 optional edges are differentiated by an additional ``case`` expression
214 used to calculate the weight of this edge. Notice also that we refer
215 to tools via their names (as strings). This makes it possible to add
216 edges to an existing compilation graph in plugins without having to
217 know about all tool definitions used in the graph.
The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2. The default weight of an optional edge can also be
changed by using the ``default`` clause of the ``case`` expression.
228 When passing an input file through the graph, LLVMC picks the edge
229 with the maximum weight. To avoid ambiguity, there should be only one
230 default edge between two nodes (with the exception of the root node,
231 which gets a special treatment - there you are allowed to specify one
232 default edge *per language*).
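The weight rules above can be illustrated with a minimal Python sketch. It is a toy model, not LLVMC's implementation: the names ``edge_weight`` and ``pick_edge`` and the dictionary layout are hypothetical, and each optional edge is assumed to carry a list of (test, increment) pairs.

```python
# Toy model of LLVMC edge selection; every name here is illustrative only.

def edge_weight(edge, options):
    """Default edges weigh 1; optional edges start at 0 and gain weight
    for every test in their case expression that evaluates to true."""
    if edge["tests"] is None:            # default edge
        return 1
    weight = 0
    for test, increment in edge["tests"]:
        if test(options):
            weight += increment          # a plain inc_weight adds 2
    return weight


def pick_edge(edges, options):
    """Return the outgoing edge with the maximum weight."""
    return max(edges, key=lambda e: edge_weight(e, options))


# Two outgoing edges of one node: a default edge and an optional edge
# whose single test checks for the "opt" switch.
edges = [
    {"to": "llc", "tests": None},
    {"to": "opt", "tests": [(lambda opts: "opt" in opts, 2)]},
]

print(pick_edge(edges, {"opt"})["to"])   # optional edge weighs 2, beats 1
print(pick_edge(edges, set())["to"])     # optional edge weighs 0, loses to 1
```

With the ``opt`` switch set, the optional edge outweighs the default one; without it, the default edge is taken.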
When multiple plugins are loaded, their compilation graphs are merged
together. Since multiple edges that have the same end nodes are not
allowed (i.e. the graph is not a multigraph), an edge defined in
several plugins will be replaced by the definition from the plugin
that was loaded last. Plugin load order can be controlled by using the
plugin priority feature described below.
To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc --view-graph``. You will need ``dot`` and
``gv`` installed for this to work properly.
Command-line options that the plugin supports are defined by using an
``OptionList``::

    def Options : OptionList<[
    (switch_option "E", (help "Help string")),
    (alias_option "quiet", "q")
    ]>;
257 As you can see, the option list is just a list of DAGs, where each DAG
258 is an option description consisting of the option name and some
259 properties. A plugin can define more than one option list (they are
260 all merged together in the end), which can be handy if one wants to
261 separate option groups syntactically.
263 * Possible option types:
265 - ``switch_option`` - a simple boolean switch without arguments,
266 for example ``-O2`` or ``-time``.
268 - ``parameter_option`` - option that takes one argument, for
269 example ``-std=c99``. It is also allowed to use spaces instead of
270 the equality sign: ``-std c99``.
   - ``parameter_list_option`` - same as the above, but more than one
     option occurrence is allowed.
275 - ``prefix_option`` - same as the parameter_option, but the option
276 name and argument do not have to be separated. Example:
277 ``-ofile``. This can be also specified as ``-o file``; however,
278 ``-o=file`` will be parsed incorrectly (``=file`` will be
279 interpreted as option value).
   - ``prefix_list_option`` - same as the above, but more than one
     occurrence of the option is allowed; example: ``-lm -lpthread``.
284 - ``alias_option`` - a special option type for creating
285 aliases. Unlike other option types, aliases are not allowed to
286 have any properties besides the aliased option name. Usage
287 example: ``(alias_option "preprocess", "E")``
290 * Possible option properties:
   - ``help`` - help string associated with this option. Used for
     ``--help`` output.

   - ``required`` - this option is obligatory.
297 - ``hidden`` - this option should not appear in the ``--help``
298 output (but should appear in the ``--help-hidden`` output).
   - ``really_hidden`` - the option should not appear in any help
     output.
303 - ``extern`` - this option is defined in some other plugin, see below.
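The parsing rules given above for ``parameter_option`` and ``prefix_option`` can be sketched in Python. This is only an illustration built from the examples in the list (``-std=c99``, ``-std c99``, ``-ofile``, ``-o file``, ``-o=file``). The function names are hypothetical and LLVMC's real option parser may behave differently in cases the text does not mention.

```python
# Toy parser for two of the option types; not LLVMC's real implementation.

def parse_parameter_option(name, argv):
    """parameter_option: accepts '-name=value' and '-name value'."""
    flag = "-" + name
    for i, arg in enumerate(argv):
        if arg.startswith(flag + "="):
            return arg[len(flag) + 1:]
        if arg == flag and i + 1 < len(argv):
            return argv[i + 1]
    return None


def parse_prefix_option(name, argv):
    """prefix_option: accepts '-namevalue' and '-name value'."""
    flag = "-" + name
    for i, arg in enumerate(argv):
        if arg == flag and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith(flag) and arg != flag:
            # Everything after the name is the value, including a leading "=".
            return arg[len(flag):]
    return None


print(parse_parameter_option("std", ["-std=c99"]))
print(parse_prefix_option("o", ["-o=file"]))
```

The second call shows why ``-o=file`` is a pitfall for prefix options: the value comes back as ``=file``.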
Sometimes, when linking several plugins together, one plugin needs to
access options defined in some other plugin. Because of the way
options are implemented, such options should be marked as
``extern``. This is what the ``extern`` option property is
for. Example::
315 (switch_option "E", (extern))
318 See also the section on plugin `priorities`__.
324 Conditional evaluation
325 ======================
The 'case' construct is the main means by which programmability is
achieved in LLVMC. It can be used to calculate edge weights, program
actions and modify the shell commands to be executed. The 'case'
expression is modeled on the similarly-named construct in
functional languages and takes the form ``(case (test_1), statement_1,
(test_2), statement_2, ... (test_N), statement_N)``. A statement
is evaluated only if the corresponding test evaluates to true.
337 // Edge weight calculation
339 // Increases edge weight by 5 if "-A" is provided on the
340 // command-line, and by 5 more if "-B" is also provided.
342 (switch_on "A"), (inc_weight 5),
343 (switch_on "B"), (inc_weight 5))
346 // Tool command line specification
348 // Evaluates to "cmdline1" if the option "-A" is provided on the
349 // command line; to "cmdline2" if "-B" is provided;
350 // otherwise to "cmdline3".
353 (switch_on "A"), "cmdline1",
354 (switch_on "B"), "cmdline2",
355 (default), "cmdline3")
Note the slight difference in how 'case' expressions are handled for
edge weights and for command line specifications - in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.
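The two evaluation modes can be contrasted with a short Python sketch. It assumes, as the examples above suggest, that every matching clause contributes in the edge-weight context while the first matching clause wins in the command-line context. The helper names are hypothetical.

```python
# Toy evaluators for the two 'case' contexts; names are illustrative only.

def switch_on(name):
    return lambda options: name in options


def default(options):
    return True


def eval_weight_case(clauses, options):
    """Edge-weight context: every clause whose test is true contributes."""
    return sum(increment for test, increment in clauses if test(options))


def eval_cmdline_case(clauses, options):
    """Command-line context: the first clause whose test is true wins."""
    for test, value in clauses:
        if test(options):
            return value
    return None


weight_clauses = [(switch_on("A"), 5), (switch_on("B"), 5)]
cmdline_clauses = [
    (switch_on("A"), "cmdline1"),
    (switch_on("B"), "cmdline2"),
    (default, "cmdline3"),
]

both = {"A", "B"}
print(eval_weight_case(weight_clauses, both))    # 5 + 5
print(eval_cmdline_case(cmdline_clauses, both))  # "B" is never consulted
```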
363 Case expressions can also be nested, i.e. the following is legal::
365 (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...)
368 You should, however, try to avoid doing that because it hurts
369 readability. It is usually better to split tool descriptions and/or
370 use TableGen inheritance instead.
372 * Possible tests are:
374 - ``switch_on`` - Returns true if a given command-line switch is
375 provided by the user. Example: ``(switch_on "opt")``.
   - ``parameter_equals`` - Returns true if a command-line parameter equals
     a given value.
     Example: ``(parameter_equals "W", "all")``.

   - ``element_in_list`` - Returns true if a command-line parameter
     list contains a given value.
     Example: ``(element_in_list "l", "pthread")``.
385 - ``input_languages_contain`` - Returns true if a given language
386 belongs to the current input language set.
387 Example: ``(input_languages_contain "c++")``.
   - ``in_language`` - Evaluates to true if the input file language
     equals the argument. At the moment works only with ``cmd_line``
     and ``actions`` (on non-join nodes).
392 Example: ``(in_language "c++")``.
   - ``not_empty`` - Returns true if a given option (which should be
     either a parameter or a parameter list) is set by the
     user.
     Example: ``(not_empty "o")``.
399 - ``empty`` - The opposite of ``not_empty``. Equivalent to ``(not (not_empty
400 X))``. Provided for convenience.
402 - ``default`` - Always evaluates to true. Should always be the last
403 test in the ``case`` expression.
   - ``and`` - A standard logical combinator that returns true iff all
     of its arguments return true. Used like this: ``(and (test1),
     (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed.

   - ``or`` - Another logical combinator that returns true if at least
     one of its arguments returns true. Example: ``(or (test1),
     (test2), ... (testN))``.
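As a rough illustration of how these tests compose, the following Python sketch models a few of them as predicates over a dictionary of parsed options. The dictionary layout and the trailing underscores on ``and_`` and ``or_`` are assumptions made for the example and do not reflect LLVMC internals.

```python
# Hypothetical option store: switches are booleans, parameters are strings,
# parameter lists are Python lists.
options = {"opt": True, "W": "all", "l": ["m", "pthread"], "o": ""}


def switch_on(name):
    return lambda opts: opts.get(name) is True


def parameter_equals(name, value):
    return lambda opts: opts.get(name) == value


def element_in_list(name, value):
    return lambda opts: value in opts.get(name, [])


def not_empty(name):
    return lambda opts: bool(opts.get(name))


def and_(*tests):
    return lambda opts: all(test(opts) for test in tests)


def or_(*tests):
    return lambda opts: any(test(opts) for test in tests)


# (and (switch_on "opt"), (parameter_equals "W", "all"))
print(and_(switch_on("opt"), parameter_equals("W", "all"))(options))
# (or (not_empty "o"), (element_in_list "l", "pthread"))
print(or_(not_empty("o"), element_in_list("l", "pthread"))(options))
```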
415 Writing a tool description
416 ==========================
418 As was said earlier, nodes in the compilation graph represent tools,
419 which are described separately. A tool definition looks like this
420 (taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::
422 def llvm_gcc_cpp : Tool<[
424 (out_language "llvm-assembler"),
425 (output_suffix "bc"),
426 (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
430 This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
431 ``llvm-g++``. As you can see, a tool definition is just a list of
432 properties; most of them should be self-explanatory. The ``sink``
433 property means that this tool should be passed all command-line
434 options that aren't mentioned in the option list.
436 The complete list of all currently implemented tool properties follows.
438 * Possible tool properties:
440 - ``in_language`` - input language name. Can be either a string or a
441 list, in case the tool supports multiple input languages.
443 - ``out_language`` - output language name. Tools are not allowed to
444 have multiple output languages.
446 - ``output_suffix`` - output file suffix. Can also be changed
447 dynamically, see documentation on actions.
449 - ``cmd_line`` - the actual command used to run the tool. You can
450 use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
451 with ``>``, hook invocations (``$CALL``), environment variables
452 (via ``$ENV``) and the ``case`` construct.
454 - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
455 list of input files and joins them together. Used for linkers.
457 - ``sink`` - all command-line options that are not handled by other
458 tools are passed to this tool.
   - ``actions`` - A single big ``case`` expression that specifies how
     this tool reacts to command-line options (described in more detail
     below).
467 A tool often needs to react to command-line options, and this is
468 precisely what the ``actions`` property is for. The next example
469 illustrates this feature::
471 def llvm_gcc_linker : Tool<[
472 (in_language "object-code"),
473 (out_language "executable"),
474 (output_suffix "out"),
475 (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
477 (actions (case (not_empty "L"), (forward "L"),
478 (not_empty "l"), (forward "l"),
480 [(append_cmd "-dummy1"), (append_cmd "-dummy2")])
The ``actions`` tool property is implemented on top of the omnipresent
``case`` expression. It associates one or more different *actions*
with given conditions - in the example, the actions are ``forward``,
which forwards a given option unchanged, and ``append_cmd``, which
appends a given string to the tool execution command. Multiple actions
can be associated with a single condition by using a list of actions
(used in the example to append some dummy options). The same ``case``
construct can also be used in the ``cmd_line`` property to modify the
tool command line.

The "join" property used in the example means that this tool behaves
like a linker.
496 The list of all possible actions follows.
   - ``append_cmd`` - append a string to the tool invocation
     command.
     Example: ``(case (switch_on "pthread"), (append_cmd ...))``.
   - ``error`` - exit with error.
506 Example: ``(error "Mixing -c and -S is not allowed!")``.
508 - ``forward`` - forward an option unchanged.
509 Example: ``(forward "Wall")``.
   - ``forward_as`` - Change the name of an option, but forward the
     argument unchanged.
     Example: ``(forward_as "O0", "--disable-optimization")``.
   - ``output_suffix`` - modify the output suffix of this
     tool.
     Example: ``(output_suffix "i")``.
519 - ``stop_compilation`` - stop compilation after this tool processes
520 its input. Used without arguments.
   - ``unpack_values`` - used for splitting and forwarding
     comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
     converted to ``-foo=bar -baz`` and appended to the tool invocation
     command.
     Example: ``(unpack_values "Wa,")``.
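The effect of a few of these actions on the final command line can be sketched in Python. The sketch assumes that every clause whose test holds contributes its action, in order, and it models only switches and prefix lists. The function names mirror the action names but are otherwise hypothetical.

```python
# Toy model of how actions extend a tool's command line; illustrative only.

def not_empty(name):
    return lambda opts: bool(opts.get(name))


def append_cmd(text):
    return lambda opts: text.split()


def forward(name):
    def action(opts):
        value = opts[name]
        if value is True:                        # switch, e.g. -Wall
            return ["-" + name]
        return ["-" + name + v for v in value]   # prefix list, e.g. -lm
    return action


def forward_as(name, new_name):
    return lambda opts: [new_name]


def unpack_values(name):
    return lambda opts: opts[name].split(",")


def build_command(base, clauses, opts):
    """Append the output of every action whose test holds, in order."""
    cmd = base.split()
    for test, action in clauses:
        if test(opts):
            cmd += action(opts)
    return " ".join(cmd)


opts = {"l": ["m", "pthread"], "Wa,": "-foo=bar,-baz", "O0": True}
clauses = [
    (not_empty("l"), forward("l")),
    (not_empty("Wa,"), unpack_values("Wa,")),
    (not_empty("O0"), forward_as("O0", "--disable-optimization")),
    (not_empty("v"), append_cmd("-dummy1")),     # not set, so skipped
]
print(build_command("llvm-gcc hello.o -o a.out", clauses, opts))
```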
If you are adding support for a new language to LLVMC, you'll need to
modify the language map, which defines mappings from file extensions
to language names. It is used to choose the proper toolchain(s) for a
given input file set. A language map definition looks like this::
536 def LanguageMap : LanguageMap<
537 [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
538 LangToSuffixes<"c", ["c"]>,
542 For example, without those definitions the following command wouldn't work::
545 llvmc: Unknown suffix: cpp
547 The language map entries should be added only for tools that are
548 linked with the root node. Since tools are not allowed to have
549 multiple output languages, for nodes "inside" the graph the input and
550 output languages should match. This is enforced at compile-time.
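The lookup that the language map enables can be sketched in Python. The two entries are taken from the definition above. The ``language_of`` function, its ``forced`` parameter (standing in for the ``-x`` option) and the exception type are assumptions made for this example.

```python
import os

# Entries copied from the language map shown above.
LANGUAGE_MAP = {
    "c++": ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"],
    "c": ["c"],
}

# Invert the map once: suffix -> language name.
SUFFIX_TO_LANGUAGE = {
    suffix: language
    for language, suffixes in LANGUAGE_MAP.items()
    for suffix in suffixes
}


def language_of(path, forced=None):
    """Return the language of an input file, or raise on unknown suffixes."""
    if forced is not None:               # models the -x override
        return forced
    suffix = os.path.splitext(path)[1].lstrip(".")
    try:
        return SUFFIX_TO_LANGUAGE[suffix]
    except KeyError:
        raise ValueError("Unknown suffix: " + suffix)


print(language_of("hello.cpp"))
print(language_of("hello.c", forced="c++"))
```

Suffix matching is case-sensitive here, which is why ``c`` and ``C`` can map to different languages.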
558 Hooks and environment variables
559 -------------------------------
561 Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
562 this is not sufficient: for example, we may want to specify tool names
563 in the configuration file. This can be achieved via the mechanism of
564 hooks - to write your own hooks, just add their definitions to the
565 ``PluginMain.cpp`` or drop a ``.cpp`` file into the
566 ``$LLVMC_DIR/driver`` directory. Hooks should live in the ``hooks``
567 namespace and have the signature ``std::string hooks::MyHookName
568 (void)``. They can be used from the ``cmd_line`` tool property::
570 (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
572 It is also possible to use environment variables in the same manner::
574 (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
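To make the substitution concrete, here is a small Python sketch that expands ``$CALL(...)`` and ``$ENV(...)`` in a command-line string. It only models the textual effect. The ``expand`` function, the hook table and the sample values are hypothetical, and real hooks are C++ functions in the ``hooks`` namespace.

```python
import re


def expand(cmd_line, hooks, env):
    """Replace $CALL(Name) with hooks[Name]() and $ENV(VAR) with env[VAR]."""
    def substitute(match):
        kind, name = match.group(1), match.group(2)
        return hooks[name]() if kind == "CALL" else env[name]
    return re.sub(r"\$(CALL|ENV)\((\w+)\)", substitute, cmd_line)


# Stand-ins for hook functions and the process environment.
hooks = {"MyHook": lambda: "/opt/tools", "AnotherHook": lambda: "out.bc"}
env = {"VAR1": "/usr/local", "VAR2": "result.o"}

print(expand("$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)", hooks, env))
print(expand("$ENV(VAR1)/path/to/file -o $ENV(VAR2)", hooks, env))
```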
576 To change the command line string based on user-provided options use
577 the ``case`` expression (documented `above`__)::
582 "llvm-g++ -E -x c $INFILE -o $OUTFILE",
584 "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
590 How plugins are loaded
591 ----------------------
593 It is possible for LLVMC plugins to depend on each other. For example,
594 one can create edges between nodes defined in some other plugin. To
595 make this work, however, that plugin should be loaded first. To
596 achieve this, the concept of plugin priority was introduced. By
597 default, every plugin has priority zero; to specify the priority
598 explicitly, put the following line in your plugin's TableGen file::
    def Priority : PluginPriority<$PRIORITY_VALUE>;
    // Where PRIORITY_VALUE is some integer > 0
Plugins are loaded in order of their (increasing) priority, starting
with 0. Therefore, the plugin with the highest priority value will be
loaded last.
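Combined with the rule that an edge defined in several plugins is taken from the plugin loaded last, the load order determines which definition survives. A minimal Python sketch of that interaction follows. The ``merge_graphs`` function and the plugin records are hypothetical, and the string attached to each edge is a placeholder for its definition.

```python
# Toy model of priority-ordered loading and graph merging; illustrative only.

def merge_graphs(plugins):
    """Load plugins in increasing priority order and merge their edges.

    An edge is keyed by its end nodes, so a plugin loaded later replaces
    an edge that an earlier plugin defined."""
    merged = {}
    for plugin in sorted(plugins, key=lambda p: p["priority"]):
        for source, target, definition in plugin["edges"]:
            merged[(source, target)] = (definition, plugin["name"])
    return merged


plugins = [
    {"name": "Override", "priority": 1,
     "edges": [("root", "llvm_gcc_c", "optional")]},
    {"name": "Base", "priority": 0,
     "edges": [("root", "llvm_gcc_c", "default"),
               ("llvm_gcc_c", "llc", "default")]},
]

merged = merge_graphs(plugins)
print(merged[("root", "llvm_gcc_c")])   # taken from the priority-1 plugin
```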
When writing LLVMC plugins, it can be useful to get a visual view of
the resulting compilation graph. This can be achieved via the command
line option ``--view-graph``. This command assumes that Graphviz_ and
Ghostview_ are installed. There is also a ``--write-graph`` option that
creates a Graphviz source file (``compilation-graph.dot``) in the
current directory.
Another useful option is ``--check-graph``. It checks the compilation
graph for common errors like mismatched output/input language names,
multiple default edges and cycles. These checks can't be performed at
compile-time because the plugins can load code dynamically. When
invoked with ``--check-graph``, ``llvmc`` doesn't perform any
compilation tasks and returns the number of encountered errors as its
exit status.
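The three kinds of checks can be sketched in Python on a toy graph. This is not the algorithm LLVMC uses. The data layout, the tool languages and the reading of "multiple default edges" as more than one default edge leaving a non-root node are all assumptions made for the example.

```python
# Toy graph checker; illustrative only.

def check_graph(tools, edges):
    """Return the number of problems found.

    tools: name -> (list of input languages, output language)
    edges: list of (source, target, is_default) triples
    """
    errors = 0

    # 1. The output language of the source must be accepted by the target.
    for source, target, _ in edges:
        if source != "root" and tools[source][1] not in tools[target][0]:
            errors += 1

    # 2. At most one default edge may leave a non-root node (assumption).
    defaults = {}
    for source, _, is_default in edges:
        if is_default and source != "root":
            defaults[source] = defaults.get(source, 0) + 1
    errors += sum(1 for count in defaults.values() if count > 1)

    # 3. The graph must not contain cycles (depth-first search).
    graph = {}
    for source, target, _ in edges:
        graph.setdefault(source, []).append(target)
    state = {}

    def has_cycle(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(has_cycle(nxt) for nxt in graph.get(node, []))
        state[node] = "done"
        return found

    if any(has_cycle(node) for node in list(graph)):
        errors += 1
    return errors


# Hypothetical tool languages for a three-tool pipeline.
tools = {
    "llvm_gcc_c": (["c"], "llvm-assembler"),
    "opt": (["llvm-assembler"], "llvm-assembler"),
    "llc": (["llvm-assembler"], "assembler"),
}
good = [
    ("root", "llvm_gcc_c", True),
    ("llvm_gcc_c", "llc", True),
    ("llvm_gcc_c", "opt", False),
    ("opt", "llc", True),
]
print(check_graph(tools, good))
```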
625 .. _Graphviz: http://www.graphviz.org/
626 .. _Ghostview: http://pages.cs.wisc.edu/~ghost/
`Mikhail Glushenkov <mailto:foldr@codedgers.com>`__

`LLVM Compiler Infrastructure <http://llvm.org>`__

Last modified: $Date: 2008-12-11 11:34:48 -0600 (Thu, 11 Dec 2008) $