1 ===================================
2 Customizing LLVMC: Reference Manual
3 ===================================
Written by Mikhail Glushenkov <foldr@codedgers.com>
20 LLVMC is a generic compiler driver, designed to be customizable and
21 extensible. It plays the same role for LLVM as the ``gcc`` program
22 does for GCC - LLVMC's job is essentially to transform a set of input
23 files into a set of targets depending on configuration rules and user
24 options. What makes LLVMC different is that these transformation rules
25 are completely customizable - in fact, LLVMC knows nothing about the
26 specifics of transformation (even the command-line options are mostly
27 not hard-coded) and regards the transformation structure as an
28 abstract graph. The structure of this graph is completely determined
29 by plugins, which can be either statically or dynamically linked. This
30 makes it possible to easily adapt LLVMC for other purposes - for
31 example, as a build tool for game resources.
33 Because LLVMC employs TableGen_ as its configuration language, you
34 need to be familiar with it to customize LLVMC.
36 .. _TableGen: http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html
42 LLVMC tries hard to be as compatible with ``gcc`` as possible,
43 although there are some small differences. Most of the time, however,
44 you shouldn't be able to notice them::
46 $ # This works as expected:
47 $ llvmc -O3 -Wall hello.cpp
One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to force files ending with ".c" to compile as
C++, use the ``-x`` option, just as you would with ``gcc``::
58 $ # hello.c is really a C++ file
59 $ llvmc -x c++ hello.c
63 On the other hand, when using LLVMC as a linker to combine several C++
64 object files you should provide the ``--linker`` option since it's
65 impossible for LLVMC to choose the right linker in that case::
69 [A lot of link-time errors skipped]
70 $ llvmc --linker=c++ hello.o
74 By default, LLVMC uses ``llvm-gcc`` to compile the source code. It is
75 also possible to choose the work-in-progress ``clang`` compiler with
76 the ``-clang`` option.
82 LLVMC has some built-in options that can't be overridden in the
83 configuration libraries:
85 * ``-o FILE`` - Output file name.
87 * ``-x LANGUAGE`` - Specify the language of the following input files
88 until the next -x option.
90 * ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
91 ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.
93 * ``-v`` - Enable verbose mode, i.e. print out all executed commands.
* ``--check-graph`` - Check the compilation graph for common errors such as mismatched
output/input language names, multiple default edges and cycles. Because of
plugins, these checks can't be performed at compile-time. Exit with code zero if
no errors were found; otherwise return the number of errors found. Hidden
option, useful for debugging LLVMC plugins.
101 * ``--view-graph`` - Show a graphical representation of the compilation graph
102 and exit. Requires that you have ``dot`` and ``gv`` programs installed. Hidden
103 option, useful for debugging LLVMC plugins.
105 * ``--write-graph`` - Write a ``compilation-graph.dot`` file in the current
106 directory with the compilation graph description in Graphviz format (identical
107 to the file used by the ``--view-graph`` option). The ``-o`` option can be used
108 to set the output file name. Hidden option, useful for debugging LLVMC plugins.
110 * ``--save-temps`` - Write temporary files to the current directory
111 and do not delete them on exit. Hidden option, useful for debugging.
113 * ``--help``, ``--help-hidden``, ``--version`` - These options have
114 their standard meaning.
117 Compiling LLVMC plugins
118 =======================
120 It's easiest to start working on your own LLVMC plugin by copying the
121 skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::
123 $ cd $LLVMC_DIR/plugins
124 $ cp -r Simple MyPlugin
127 Makefile PluginMain.cpp Simple.td
As you can see, our basic plugin consists of only two files (not
counting the build script). ``Simple.td`` contains the TableGen
description of the compilation graph; its format is documented in the
following sections. ``PluginMain.cpp`` is just a helper file used to
compile the auto-generated C++ code produced from the TableGen source. It
can also contain hook definitions (see `below`__).
138 The first thing that you should do is to change the ``LLVMC_PLUGIN``
139 variable in the ``Makefile`` to avoid conflicts (since this variable
140 is used to name the resulting library)::
142 LLVMC_PLUGIN=MyPlugin
It is also a good idea to rename ``Simple.td`` to something less generic::
147 $ mv Simple.td MyPlugin.td
149 Note that the plugin source directory must be placed under
150 ``$LLVMC_DIR/plugins`` to make use of the existing build
151 infrastructure. To build a version of the LLVMC executable called
152 ``mydriver`` with your plugin compiled in, use the following command::
155 $ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver
To build your plugin as a dynamic library, just ``cd`` to its source
directory and run ``make``. The resulting file will be called
``LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
``LLVMCMyPlugin.so``). This library can then be loaded with the
``-load`` option. Example::
163 $ cd $LLVMC_DIR/plugins/Simple
165 $ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so
167 Sometimes, you will want a 'bare-bones' version of LLVMC that has no
168 built-in plugins. It can be compiled with the following command::
171 $ make BUILTIN_PLUGINS=""
174 Customizing LLVMC: the compilation graph
175 ========================================
Each TableGen configuration file should include the common definitions::
180 include "llvm/CompilerDriver/Common.td"
Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.
189 The definition of the compilation graph (see file
190 ``plugins/Base/Base.td`` for an example) is just a list of edges::
192 def CompilationGraph : CompilationGraph<[
193 Edge<"root", "llvm_gcc_c">,
194 Edge<"root", "llvm_gcc_assembler">,
197 Edge<"llvm_gcc_c", "llc">,
198 Edge<"llvm_gcc_cpp", "llc">,
201 OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
203 OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
207 OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
208 (case (input_languages_contain "c++"), (inc_weight),
209 (or (parameter_equals "linker", "g++"),
210 (parameter_equals "linker", "c++")), (inc_weight))>,
215 As you can see, the edges can be either default or optional, where
216 optional edges are differentiated by an additional ``case`` expression
217 used to calculate the weight of this edge. Notice also that we refer
218 to tools via their names (as strings). This makes it possible to add
219 edges to an existing compilation graph in plugins without having to
220 know about all tool definitions used in the graph.
Default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N, where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2. The default weight of an optional edge can be changed
by using the ``default`` clause of the ``case`` expression.
231 When passing an input file through the graph, LLVMC picks the edge
232 with the maximum weight. To avoid ambiguity, there should be only one
233 default edge between two nodes (with the exception of the root node,
234 which gets a special treatment - there you are allowed to specify one
235 default edge *per language*).
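
The weight rules above can be modelled in a few lines of C++. This is only an illustrative sketch, not LLVMC's actual implementation: the ``Edge`` struct and the ``weight`` and ``pickEdge`` functions are hypothetical names, ``dec_weight`` is not modelled, and resolving ties in favour of the first edge is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of an outgoing edge; not LLVMC's real data structure.
struct Edge {
    std::string target;  // name of the tool the edge points to
    bool optional;       // optional edges start at weight 0, default edges at 1
    int testsPassed;     // number of `case` tests that evaluated to true
    int step;            // weight added per passed test (2 unless overridden)
};

// Weight as described in the text: 1 for a default edge,
// 0 + step * N for an optional edge.
int weight(const Edge& e) {
    return e.optional ? e.step * e.testsPassed : 1;
}

// Returns the edge with the maximum weight, or nullptr if there are none.
// Ties go to the first edge seen (an assumption of this sketch).
const Edge* pickEdge(const std::vector<Edge>& edges) {
    const Edge* best = nullptr;
    for (const Edge& e : edges)
        if (!best || weight(e) > weight(*best))
            best = &e;
    return best;
}
```

With a default edge to ``llc`` and an optional edge to ``opt``, the optional edge wins only when at least one of its tests is true.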
When multiple plugins are loaded, their compilation graphs are merged
together. Since multiple edges that have the same end nodes are not
allowed (i.e. the graph is not a multigraph), an edge defined in
several plugins will be replaced by the definition from the plugin
that was loaded last. Plugin load order can be controlled by using the
plugin priority feature described in the section on how plugins are loaded.
244 To get a visual representation of the compilation graph (useful for
245 debugging), run ``llvmc --view-graph``. You will need ``dot`` and
246 ``gsview`` installed for this to work properly.
Command-line options that the plugin supports are defined by using an ``OptionList``::
254 def Options : OptionList<[
255 (switch_option "E", (help "Help string")),
256 (alias_option "quiet", "q")
260 As you can see, the option list is just a list of DAGs, where each DAG
261 is an option description consisting of the option name and some
262 properties. A plugin can define more than one option list (they are
263 all merged together in the end), which can be handy if one wants to
264 separate option groups syntactically.
266 * Possible option types:
268 - ``switch_option`` - a simple boolean switch without arguments, for example
269 ``-O2`` or ``-time``. At most one occurrence is allowed.
271 - ``parameter_option`` - option that takes one argument, for example
272 ``-std=c99``. It is also allowed to use spaces instead of the equality
273 sign: ``-std c99``. At most one occurrence is allowed.
- ``parameter_list_option`` - same as the above, but more than one option
occurrence is allowed.
- ``prefix_option`` - same as the parameter_option, but the option name and
argument do not have to be separated. Example: ``-ofile``. This can be also
specified as ``-o file``; however, ``-o=file`` will be parsed incorrectly
(``=file`` will be interpreted as option value). At most one occurrence is
allowed.
- ``prefix_list_option`` - same as the above, but more than one occurrence of
the option is allowed; example: ``-lm -lpthread``.
287 - ``alias_option`` - a special option type for creating aliases. Unlike other
288 option types, aliases are not allowed to have any properties besides the
289 aliased option name. Usage example: ``(alias_option "preprocess", "E")``
292 * Possible option properties:
- ``help`` - help string associated with this option. Used for ``--help``
output.
- ``required`` - this option must be specified exactly once (or, in case of
the list options without the ``multi_val`` property, at least
once). Incompatible with ``zero_or_one`` and ``one_or_more``.
- ``one_or_more`` - the option must be specified at least once. Useful
only for list options in conjunction with ``multi_val``; for ordinary lists
it is synonymous with ``required``. Incompatible with ``required`` and
``zero_or_one``.
- ``zero_or_one`` - the option can be specified zero times or once. Useful
only for list options in conjunction with ``multi_val``. Incompatible with
``required`` and ``one_or_more``.
- ``hidden`` - the description of this option will not appear in
the ``--help`` output (but will appear in the ``--help-hidden``
output).
- ``really_hidden`` - the option will not be mentioned in any help
output.
317 - ``multi_val n`` - this option takes *n* arguments (can be useful in some
318 special cases). Usage example: ``(parameter_list_option "foo", (multi_val
319 3))``. Only list options can have this attribute; you can, however, use
320 the ``one_or_more`` and ``zero_or_one`` properties.
322 - ``extern`` - this option is defined in some other plugin, see below.
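
To make the ``prefix_option`` parsing rule from the list of option types more concrete, here is a rough C++ sketch of how ``-ofile``, ``-o file`` and ``-o=file`` would be interpreted. The function ``parsePrefixOption`` is a hypothetical helper written for this illustration; it is not part of LLVMC, whose real option parsing is not shown in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: extracts the value of the prefix option `name`
// (for example "o") from a list of arguments. Returns "" if absent.
std::string parsePrefixOption(const std::vector<std::string>& args,
                              const std::string& name) {
    const std::string flag = "-" + name;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == flag)  // "-o file": the value is the next argument
            return i + 1 < args.size() ? args[i + 1] : "";
        if (args[i].compare(0, flag.size(), flag) == 0)
            return args[i].substr(flag.size());  // "-ofile": value follows the name
    }
    return "";
}
```

Because everything after the option name is taken as the value, ``-o=file`` yields ``=file``, which is the pitfall mentioned above.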
Sometimes, when linking several plugins together, one plugin needs to
access options defined in some other plugin. Because of the way
options are implemented, such options must be marked as
``extern``. This is what the ``extern`` option property is
for. Example::
334 (switch_option "E", (extern))
337 See also the section on plugin `priorities`__.
343 Conditional evaluation
344 ======================
346 The 'case' construct is the main means by which programmability is
347 achieved in LLVMC. It can be used to calculate edge weights, program
348 actions and modify the shell commands to be executed. The 'case'
349 expression is designed after the similarly-named construct in
350 functional languages and takes the form ``(case (test_1), statement_1,
351 (test_2), statement_2, ... (test_N), statement_N)``. The statements
352 are evaluated only if the corresponding tests evaluate to true.
Examples::

    // Edge weight calculation

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Tool command line specification

    // Evaluates to "cmdline1" if the option "-A" is provided on the
    // command line; to "cmdline2" if "-B" is provided;
    // otherwise to "cmdline3".
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")

Note the slight difference in 'case' expression handling in contexts
of edge weights and command line specification - in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.
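
The two evaluation modes can be contrasted with a small C++ model. This is a sketch under the assumption that each clause is reduced to a (test result, value) pair; ``edgeWeight`` and ``cmdLine`` are hypothetical names and do not correspond to LLVMC functions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Edge-weight context: every clause whose test is true contributes.
int edgeWeight(const std::vector<std::pair<bool, int>>& clauses) {
    int w = 0;
    for (const auto& c : clauses)
        if (c.first) w += c.second;
    return w;
}

// Command-line context: the first clause whose test is true wins,
// and later tests are never looked at.
std::string cmdLine(const std::vector<std::pair<bool, std::string>>& clauses) {
    for (const auto& c : clauses)
        if (c.first) return c.second;
    return "";  // no clause matched; a trailing `default` clause prevents this
}
```

With both ``-A`` and ``-B`` given, the weight form adds both increments, while the command-line form stops at ``"cmdline1"``.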
382 Case expressions can also be nested, i.e. the following is legal::
384 (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...)
387 You should, however, try to avoid doing that because it hurts
388 readability. It is usually better to split tool descriptions and/or
389 use TableGen inheritance instead.
391 * Possible tests are:
393 - ``switch_on`` - Returns true if a given command-line switch is
394 provided by the user. Example: ``(switch_on "opt")``.
- ``parameter_equals`` - Returns true if a command-line parameter equals
a given value.
Example: ``(parameter_equals "W", "all")``.
- ``element_in_list`` - Returns true if a command-line parameter
list contains a given value.
Example: ``(element_in_list "l", "pthread")``.
404 - ``input_languages_contain`` - Returns true if a given language
405 belongs to the current input language set.
406 Example: ``(input_languages_contain "c++")``.
- ``in_language`` - Evaluates to true if the input file language
equals the argument. At the moment works only with ``cmd_line``
and ``actions`` (on non-join nodes).
Example: ``(in_language "c++")``.
- ``not_empty`` - Returns true if a given option (which should be
either a parameter or a parameter list) is set by the
user.
Example: ``(not_empty "o")``.
418 - ``empty`` - The opposite of ``not_empty``. Equivalent to ``(not (not_empty
419 X))``. Provided for convenience.
421 - ``default`` - Always evaluates to true. Should always be the last
422 test in the ``case`` expression.
- ``and`` - A standard logical combinator that returns true iff all
of its arguments return true. Used like this: ``(and (test1),
(test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed.
- ``or`` - Another logical combinator that returns true iff at least
one of its arguments returns true. Example: ``(or (test1),
(test2), ... (testN))``.
434 Writing a tool description
435 ==========================
437 As was said earlier, nodes in the compilation graph represent tools,
438 which are described separately. A tool definition looks like this
439 (taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::
441 def llvm_gcc_cpp : Tool<[
443 (out_language "llvm-assembler"),
444 (output_suffix "bc"),
445 (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
449 This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
450 ``llvm-g++``. As you can see, a tool definition is just a list of
451 properties; most of them should be self-explanatory. The ``sink``
452 property means that this tool should be passed all command-line
453 options that aren't mentioned in the option list.
455 The complete list of all currently implemented tool properties follows.
457 * Possible tool properties:
459 - ``in_language`` - input language name. Can be either a string or a
460 list, in case the tool supports multiple input languages.
462 - ``out_language`` - output language name. Tools are not allowed to
463 have multiple output languages.
465 - ``output_suffix`` - output file suffix. Can also be changed
466 dynamically, see documentation on actions.
468 - ``cmd_line`` - the actual command used to run the tool. You can
469 use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
470 with ``>``, hook invocations (``$CALL``), environment variables
471 (via ``$ENV``) and the ``case`` construct.
473 - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
474 list of input files and joins them together. Used for linkers.
476 - ``sink`` - all command-line options that are not handled by other
477 tools are passed to this tool.
- ``actions`` - A single big ``case`` expression that specifies how
this tool reacts to command-line options (described in more detail
below).
486 A tool often needs to react to command-line options, and this is
487 precisely what the ``actions`` property is for. The next example
488 illustrates this feature::
490 def llvm_gcc_linker : Tool<[
491 (in_language "object-code"),
492 (out_language "executable"),
493 (output_suffix "out"),
494 (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
496 (actions (case (not_empty "L"), (forward "L"),
497 (not_empty "l"), (forward "l"),
499 [(append_cmd "-dummy1"), (append_cmd "-dummy2")])
The ``actions`` tool property is implemented on top of the omnipresent
``case`` expression. It associates one or more different *actions*
with given conditions - in the example, the actions are ``forward``,
which forwards a given option unchanged, and ``append_cmd``, which
appends a given string to the tool execution command. Multiple actions
can be associated with a single condition by using a list of actions
(used in the example to append some dummy options). The same ``case``
construct can also be used in the ``cmd_line`` property to modify the
tool command line.

The ``join`` property used in the example means that this tool behaves
like a linker.
515 The list of all possible actions follows.
- ``append_cmd`` - append a string to the tool invocation
command.
Example: ``(case (switch_on "pthread"), (append_cmd ...))``.
- ``error`` - exit with error.
Example: ``(error "Mixing -c and -S is not allowed!")``.
- ``forward`` - forward an option unchanged.
Example: ``(forward "Wall")``.
- ``forward_as`` - change the name of an option, but forward the
argument unchanged.
Example: ``(forward_as "O0", "--disable-optimization")``.
- ``output_suffix`` - modify the output suffix of this
tool.
Example: ``(output_suffix "i")``.
- ``stop_compilation`` - stop compilation after this tool processes
its input. Used without arguments.
- ``unpack_values`` - used for splitting and forwarding
comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
converted to ``-foo=bar -baz`` and appended to the tool invocation
command.
Example: ``(unpack_values "Wa,")``.
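
The transformation performed by ``unpack_values`` can be sketched in C++ as follows. The function ``unpackValues`` is a hypothetical stand-in written for illustration; it only mirrors the behaviour described above (strip the option prefix, turn commas into spaces) and is not taken from the LLVMC sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given a full option such as "-Wa,-foo=bar,-baz" and
// its prefix "-Wa,", returns the unpacked values "-foo=bar -baz".
// Returns "" when the option does not start with the prefix.
std::string unpackValues(const std::string& option, const std::string& prefix) {
    if (option.compare(0, prefix.size(), prefix) != 0)
        return "";
    std::string out = option.substr(prefix.size());
    for (char& c : out)
        if (c == ',') c = ' ';  // each comma separates two forwarded options
    return out;
}
```

The result is what would be appended to the tool invocation command.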
If you are adding support for a new language to LLVMC, you'll need to
modify the language map, which defines mappings from file extensions
to language names. It is used to choose the proper toolchain(s) for a
given input file set. A language map definition looks like this::
555 def LanguageMap : LanguageMap<
556 [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
557 LangToSuffixes<"c", ["c"]>,
561 For example, without those definitions the following command wouldn't work::
564 llvmc: Unknown suffix: cpp
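
Conceptually, the language map is a lookup table from file suffix to language name. The following C++ sketch models that lookup for the two entries shown above; ``languageMap`` and ``languageOf`` are hypothetical names, and reporting an unknown suffix by throwing an exception is an assumption of this sketch rather than LLVMC's documented behaviour.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Suffix -> language table mirroring the LangToSuffixes entries shown above.
const std::map<std::string, std::string>& languageMap() {
    static const std::map<std::string, std::string> table = {
        {"cc", "c++"}, {"cp", "c++"}, {"cxx", "c++"}, {"cpp", "c++"},
        {"CPP", "c++"}, {"c++", "c++"}, {"C", "c++"}, {"c", "c"}};
    return table;
}

// Returns the language for a file name; throws if the suffix is
// missing or not present in the table.
std::string languageOf(const std::string& file) {
    const std::string::size_type dot = file.rfind('.');
    const std::string suffix =
        dot == std::string::npos ? "" : file.substr(dot + 1);
    const auto it = languageMap().find(suffix);
    if (it == languageMap().end())
        throw std::runtime_error("Unknown suffix: " + suffix);
    return it->second;
}
```

Note that the lookup is case-sensitive, which is why ``C`` maps to C++ while ``c`` maps to C.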
566 The language map entries should be added only for tools that are
567 linked with the root node. Since tools are not allowed to have
568 multiple output languages, for nodes "inside" the graph the input and
569 output languages should match. This is enforced at compile-time.
577 Hooks and environment variables
578 -------------------------------
Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool paths
or names in the configuration file. This can be easily achieved via
the hooks mechanism. To write your own hooks, just add their
definitions to ``PluginMain.cpp`` or drop a ``.cpp`` file into
your plugin directory. Hooks should live in the ``hooks`` namespace
and have the signature ``std::string hooks::MyHookName ([const char*
Arg0 [, const char* Arg1 [, ...]]])``. They can be used from the
``cmd_line`` tool property::
590 (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
592 To pass arguments to hooks, use the following syntax::
594 (cmd_line "$CALL(MyHook, 'Arg1', 'Arg2', 'Arg # 3')/path/to/file -o1 -o2")
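
A hook definition matching the signature given above might look like the following sketch. The hook names ``ToolDir`` and ``JoinPath``, the ``MYTOOLS_DIR`` environment variable and the ``/usr/local/bin`` fallback are all made up for this example; only the ``hooks`` namespace and the ``std::string`` return type with ``const char*`` arguments come from the text.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

namespace hooks {

// Hypothetical hook without arguments: returns a tool directory taken from
// the (assumed) MYTOOLS_DIR environment variable, or a fallback if unset.
std::string ToolDir() {
    const char* dir = std::getenv("MYTOOLS_DIR");
    return dir ? dir : "/usr/local/bin";
}

// Hypothetical hook with two arguments: joins a directory and a file name.
std::string JoinPath(const char* Dir, const char* File) {
    return std::string(Dir) + "/" + File;
}

}  // namespace hooks
```

Such hooks would then be referenced as ``$CALL(ToolDir)`` or ``$CALL(JoinPath, '/opt/tools', 'mytool')`` inside a ``cmd_line`` string, following the syntax shown above.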
596 It is also possible to use environment variables in the same manner::
598 (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
600 To change the command line string based on user-provided options use
601 the ``case`` expression (documented `above`__)::
606 "llvm-g++ -E -x c $INFILE -o $OUTFILE",
608 "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
614 How plugins are loaded
615 ----------------------
617 It is possible for LLVMC plugins to depend on each other. For example,
618 one can create edges between nodes defined in some other plugin. To
619 make this work, however, that plugin should be loaded first. To
620 achieve this, the concept of plugin priority was introduced. By
621 default, every plugin has priority zero; to specify the priority
622 explicitly, put the following line in your plugin's TableGen file::
    def Priority : PluginPriority<$PRIORITY_VALUE>;
    // Where PRIORITY_VALUE is some integer > 0
Plugins are loaded in order of their (increasing) priority, starting
with 0. Therefore, the plugin with the highest priority value will be
loaded last.
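
The interaction between load order and graph merging (an edge defined by several plugins takes the definition of the plugin loaded last) can be modelled as below. This is a simplified sketch: the ``Plugin`` struct, the string label attached to each edge and the ``mergeGraphs`` function are hypothetical, and keeping the original order for equal priorities is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using EdgeKey = std::pair<std::string, std::string>;  // (from, to)
using EdgeMap = std::map<EdgeKey, std::string>;       // edge -> defining label

// Hypothetical plugin model: a priority plus the edges it defines.
struct Plugin {
    int priority;
    EdgeMap edges;
};

// Loads plugins in increasing priority order; an edge defined by several
// plugins ends up with the definition from the plugin loaded last.
EdgeMap mergeGraphs(std::vector<Plugin> plugins) {
    std::stable_sort(plugins.begin(), plugins.end(),
                     [](const Plugin& a, const Plugin& b) {
                         return a.priority < b.priority;
                     });
    EdgeMap graph;
    for (const Plugin& p : plugins)
        for (const auto& e : p.edges)
            graph[e.first] = e.second;  // later definitions overwrite earlier ones
    return graph;
}
```

A plugin with priority 1 therefore overrides an edge that a priority-0 plugin also defines, regardless of the order in which the two are listed.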
When writing LLVMC plugins, it can be useful to get a visual view of
the resulting compilation graph. This can be achieved via the command
line option ``--view-graph``. This command assumes that Graphviz_ and
Ghostview_ are installed. There is also a ``--write-graph`` option that
creates a Graphviz source file (``compilation-graph.dot``) in the
current directory.
641 Another useful ``llvmc`` option is ``--check-graph``. It checks the
642 compilation graph for common errors like mismatched output/input
643 language names, multiple default edges and cycles. These checks can't
644 be performed at compile-time because the plugins can load code
645 dynamically. When invoked with ``--check-graph``, ``llvmc`` doesn't
646 perform any compilation tasks and returns the number of encountered
647 errors as its status code.
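
One of the checks mentioned above, cycle detection, can be illustrated with a standard depth-first search. The sketch below is not LLVMC's implementation; the ``Graph`` alias and the ``hasCycle`` and ``hasCycleFrom`` functions are hypothetical, and the graph is simplified to tool names with lists of successor names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Simplified graph: tool name -> names of the tools its edges point to.
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first search; `onPath` holds the nodes of the current DFS path.
bool hasCycleFrom(const Graph& g, const std::string& node,
                  std::set<std::string>& onPath, std::set<std::string>& done) {
    if (onPath.count(node)) return true;  // reached a node on the current path
    if (done.count(node)) return false;   // already fully explored
    onPath.insert(node);
    const auto it = g.find(node);
    if (it != g.end())
        for (const std::string& next : it->second)
            if (hasCycleFrom(g, next, onPath, done)) return true;
    onPath.erase(node);
    done.insert(node);
    return false;
}

// Returns true if any node of the graph lies on a cycle.
bool hasCycle(const Graph& g) {
    std::set<std::string> onPath, done;
    for (const auto& entry : g)
        if (hasCycleFrom(g, entry.first, onPath, done)) return true;
    return false;
}
```

A driver in the spirit of ``--check-graph`` would count each such finding and report the total as its exit status.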
649 .. _Graphviz: http://www.graphviz.org/
650 .. _Ghostview: http://pages.cs.wisc.edu/~ghost/
Mikhail Glushenkov <foldr@codedgers.com>
LLVM Compiler Infrastructure: http://llvm.org
666 Last modified: $Date: 2008-12-11 11:34:48 -0600 (Thu, 11 Dec 2008) $