1 ===================================
2 Customizing LLVMC: Reference Manual
3 ===================================
Written by `Mikhail Glushenkov <mailto:foldr@codedgers.com>`_
20 LLVMC is a generic compiler driver, designed to be customizable and
21 extensible. It plays the same role for LLVM as the ``gcc`` program
22 does for GCC - LLVMC's job is essentially to transform a set of input
23 files into a set of targets depending on configuration rules and user
24 options. What makes LLVMC different is that these transformation rules
25 are completely customizable - in fact, LLVMC knows nothing about the
26 specifics of transformation (even the command-line options are mostly
27 not hard-coded) and regards the transformation structure as an
28 abstract graph. The structure of this graph is completely determined
29 by plugins, which can be either statically or dynamically linked. This
30 makes it possible to easily adapt LLVMC for other purposes - for
31 example, as a build tool for game resources.
33 Because LLVMC employs TableGen_ as its configuration language, you
34 need to be familiar with it to customize LLVMC.
36 .. _TableGen: http://llvm.org/docs/TableGenFundamentals.html
42 LLVMC tries hard to be as compatible with ``gcc`` as possible,
43 although there are some small differences. Most of the time, however,
44 you shouldn't be able to notice them::
46 $ # This works as expected:
47 $ llvmc -O3 -Wall hello.cpp
One nice feature of LLVMC is that you don't have to distinguish between
different compilers for different languages (think ``g++`` vs. ``gcc``): the
right toolchain is chosen automatically based on input language names (which
are, in turn, determined from file extensions). If you want to force files
ending with ".c" to compile as C++, use the ``-x`` option, just like you would
with ``gcc``::

    $ # hello.c is really a C++ file
    $ llvmc -x c++ hello.c
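The ``-x`` rule (a language applies to the input files that follow it, until
the next ``-x``) can be pictured as a small pass over the argument list. The
C++ sketch below is only an illustration under stated assumptions:
``assign_languages`` is a hypothetical helper, not part of LLVMC, and an empty
language string stands for "no ``-x`` seen yet, decide from the file
extension".

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not LLVMC code): pairs every input file with the
// language forced by the most recent -x option. An empty language means
// "no -x seen yet, decide from the file extension".
std::vector<std::pair<std::string, std::string> >
assign_languages(const std::vector<std::string>& args) {
    std::vector<std::pair<std::string, std::string> > result;
    std::string current;  // language set by the last -x, if any
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "-x" && i + 1 < args.size()) {
            current = args[++i];  // applies to the files that follow
        } else if (args[i].empty() || args[i][0] == '-') {
            continue;  // other options are ignored in this sketch
        } else {
            result.push_back(std::make_pair(args[i], current));
        }
    }
    return result;
}
```

For the arguments ``a.c -x c++ hello.c``, the sketch leaves ``a.c`` to the
extension-based lookup and tags ``hello.c`` as ``c++``.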
On the other hand, when using LLVMC as a linker to combine several C++
object files, you should provide the ``--linker`` option, since LLVMC
cannot choose the right linker from object files alone::

    $ llvmc -c hello.cpp
    $ llvmc hello.o
    [A lot of link-time errors skipped]
    $ llvmc --linker=c++ hello.o
74 By default, LLVMC uses ``llvm-gcc`` to compile the source code. It is also
75 possible to choose the ``clang`` compiler with the ``-clang`` option.
81 LLVMC has some built-in options that can't be overridden in the
82 configuration libraries:
84 * ``-o FILE`` - Output file name.
86 * ``-x LANGUAGE`` - Specify the language of the following input files
87 until the next -x option.
89 * ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
90 ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.
92 * ``-v`` - Enable verbose mode, i.e. print out all executed commands.
94 * ``--save-temps`` - Write temporary files to the current directory and do not
95 delete them on exit. This option can also take an argument: the
96 ``--save-temps=obj`` switch will write files into the directory specified with
97 the ``-o`` option. The ``--save-temps=cwd`` and ``--save-temps`` switches are
98 both synonyms for the default behaviour.
100 * ``--temp-dir`` - Write temporary files to the specified directory. This option
101 overrides ``--save-temps``.
103 * ``--check-graph`` - Check the compilation for common errors like mismatched
104 output/input language names, multiple default edges and cycles. Because of
105 plugins, these checks can't be performed at compile-time. Exit with code zero
106 if no errors were found, and return the number of found errors
107 otherwise. Hidden option, useful for debugging LLVMC plugins.
109 * ``--view-graph`` - Show a graphical representation of the compilation graph
110 and exit. Requires that you have ``dot`` and ``gv`` programs installed. Hidden
111 option, useful for debugging LLVMC plugins.
* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the current
  directory with the compilation graph description in Graphviz format (identical
  to the file used by the ``--view-graph`` option). The ``-o`` option can be
  used to set the output file name. Hidden option, useful for debugging LLVMC
  plugins.
119 * ``--help``, ``--help-hidden``, ``--version`` - These options have
120 their standard meaning.
122 Compiling LLVMC plugins
123 =======================
It's easiest to start working on your own LLVMC plugin by copying the
skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::

    $ cd $LLVMC_DIR/plugins
    $ cp -r Simple MyPlugin
    $ ls MyPlugin
    Makefile PluginMain.cpp Simple.td
As you can see, our basic plugin consists of only two files (not
counting the build script). ``Simple.td`` contains the TableGen
description of the compilation graph; its format is documented in the
following sections. ``PluginMain.cpp`` is just a helper file used to
compile the auto-generated C++ code produced from the TableGen source. It
can also contain hook definitions (see `Hooks and environment variables`_).
143 The first thing that you should do is to change the ``LLVMC_PLUGIN``
144 variable in the ``Makefile`` to avoid conflicts (since this variable
145 is used to name the resulting library)::
147 LLVMC_PLUGIN=MyPlugin
It is also a good idea to rename ``Simple.td`` to something less
generic::

    $ mv Simple.td MyPlugin.td
To build your plugin as a dynamic library, just ``cd`` to its source
directory and run ``make``. The resulting file will be called
``plugin_llvmc_$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
``plugin_llvmc_MyPlugin.so``). This library can then be loaded with the
``-load`` option. Example::

    $ cd $LLVMC_DIR/plugins/Simple
    $ make
    $ llvmc -load $LLVM_DIR/Release/lib/plugin_llvmc_Simple.so
164 Compiling standalone LLVMC-based drivers
165 ========================================
167 By default, the ``llvmc`` executable consists of a driver core plus several
168 statically linked plugins (``Base`` and ``Clang`` at the moment). You can
169 produce a standalone LLVMC-based driver executable by linking the core with your
170 own plugins. The recommended way to do this is by starting with the provided
171 ``Skeleton`` example (``$LLVMC_DIR/example/Skeleton``)::
173 $ cd $LLVMC_DIR/example/
174 $ cp -r Skeleton mydriver
180 If you're compiling LLVM with different source and object directories, then you
181 must perform the following additional steps before running ``make``::
183 # LLVMC_SRC_DIR = $LLVM_SRC_DIR/tools/llvmc/
184 # LLVMC_OBJ_DIR = $LLVM_OBJ_DIR/tools/llvmc/
185 $ cp $LLVMC_SRC_DIR/example/mydriver/Makefile \
186 $LLVMC_OBJ_DIR/example/mydriver/
187 $ cd $LLVMC_OBJ_DIR/example/mydriver
190 Another way to do the same thing is by using the following command::
193 $ make LLVMC_BUILTIN_PLUGINS=MyPlugin LLVMC_BASED_DRIVER_NAME=mydriver
195 This works with both srcdir == objdir and srcdir != objdir, but assumes that the
196 plugin source directory was placed under ``$LLVMC_DIR/plugins``.
198 Sometimes, you will want a 'bare-bones' version of LLVMC that has no
199 built-in plugins. It can be compiled with the following command::
202 $ make LLVMC_BUILTIN_PLUGINS=""
205 Customizing LLVMC: the compilation graph
206 ========================================
Each TableGen configuration file should include the common
definitions::

    include "llvm/CompilerDriver/Common.td"
Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.
The definition of the compilation graph (see file
``plugins/Base/Base.td`` for an example) is just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<"root", "llvm_gcc_c">,
        Edge<"root", "llvm_gcc_assembler">,
        ...

        Edge<"llvm_gcc_c", "llc">,
        Edge<"llvm_gcc_cpp", "llc">,
        ...

        OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
                                          (inc_weight))>,
        OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
                                            (inc_weight))>,
        ...

        OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...
        ]>;
246 As you can see, the edges can be either default or optional, where
247 optional edges are differentiated by an additional ``case`` expression
248 used to calculate the weight of this edge. Notice also that we refer
249 to tools via their names (as strings). This makes it possible to add
250 edges to an existing compilation graph in plugins without having to
251 know about all tool definitions used in the graph.
The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N, where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight``; in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2. The default weight of an optional edge can be
changed by using the ``default`` clause of the ``case`` construct.
262 When passing an input file through the graph, LLVMC picks the edge
263 with the maximum weight. To avoid ambiguity, there should be only one
264 default edge between two nodes (with the exception of the root node,
265 which gets a special treatment - there you are allowed to specify one
266 default edge *per language*).
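The weighting rules above can be summarised in a few lines of C++. This is a
minimal sketch, not LLVMC's implementation: ``EdgeModel``, ``edge_weight`` and
``pick_edge`` are hypothetical names, and each entry in ``increments`` stands
for one ``case`` test that evaluated to true (2 for a bare ``inc_weight``, or
the explicit value passed to it).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one outgoing edge; not LLVMC's real data structure.
struct EdgeModel {
    std::string target;           // name of the tool the edge leads to
    bool is_default;              // true for Edge<...>, false for OptionalEdge<...>
    std::vector<int> increments;  // one entry per `case` test that was true
};

// Default edges weigh 1; optional edges start at 0 and add every increment.
int edge_weight(const EdgeModel& e) {
    if (e.is_default) return 1;
    int weight = 0;
    for (std::size_t i = 0; i < e.increments.size(); ++i)
        weight += e.increments[i];
    return weight;
}

// Returns the target of the heaviest edge, or "" if there are no edges.
// On a tie the first edge wins, which is an arbitrary choice of this sketch.
std::string pick_edge(const std::vector<EdgeModel>& edges) {
    std::string best;
    int best_weight = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        int w = edge_weight(edges[i]);
        if (i == 0 || w > best_weight) {
            best = edges[i].target;
            best_weight = w;
        }
    }
    return best;
}
```

With a default edge to ``llc`` and an optional edge to ``opt`` guarded by
``(switch_on "opt")``, the sketch picks ``opt`` (weight 2) when the switch is
given and ``llc`` (weight 1) otherwise.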
When multiple plugins are loaded, their compilation graphs are merged
together. Since multiple edges that have the same end nodes are not
allowed (i.e. the graph is not a multigraph), an edge defined in
several plugins will be replaced by the definition from the plugin
that was loaded last. Plugin load order can be controlled by using the
plugin priority feature described below.
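The merge rule can be sketched as follows. The code is illustrative only:
``PluginModel``, ``EdgeKey`` and ``merge_graphs`` are hypothetical names, edges
are reduced to a description string, and plugins are assumed to be loaded in
order of increasing priority as described in the section on plugin loading.
How ties between equal priorities are resolved is not specified in this
manual; the sketch simply keeps the input order.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> EdgeKey;  // (from, to)
typedef std::map<EdgeKey, std::string> EdgeMap;       // edge -> description

// Hypothetical stand-in for a loaded plugin.
struct PluginModel {
    std::string name;
    int priority;
    EdgeMap edges;
};

bool lower_priority(const PluginModel& a, const PluginModel& b) {
    return a.priority < b.priority;
}

// Loads plugins in order of increasing priority; an edge defined by several
// plugins ends up with the definition from the plugin loaded last.
EdgeMap merge_graphs(std::vector<PluginModel> plugins) {
    std::stable_sort(plugins.begin(), plugins.end(), lower_priority);
    EdgeMap merged;
    for (std::size_t i = 0; i < plugins.size(); ++i) {
        const EdgeMap& edges = plugins[i].edges;
        for (EdgeMap::const_iterator it = edges.begin(); it != edges.end(); ++it)
            merged[it->first] = it->second;  // later definition replaces earlier
    }
    return merged;
}
```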
To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc --view-graph``. You will need ``dot`` and
``gv`` installed for this to work properly.
Command-line options that the plugin supports are defined by using an
``OptionList``::

    def Options : OptionList<[
    (switch_option "E", (help "Help string")),
    (alias_option "quiet", "q")
    ...
    ]>;
291 As you can see, the option list is just a list of DAGs, where each DAG
292 is an option description consisting of the option name and some
293 properties. A plugin can define more than one option list (they are
294 all merged together in the end), which can be handy if one wants to
295 separate option groups syntactically.
* Possible option types:

  - ``switch_option`` - a simple boolean switch without arguments, for example
    ``-O2`` or ``-time``. At most one occurrence is allowed.

  - ``parameter_option`` - option that takes one argument, for example
    ``-std=c99``. It is also allowed to use spaces instead of the equality
    sign: ``-std c99``. At most one occurrence is allowed.

  - ``parameter_list_option`` - same as the above, but more than one option
    occurrence is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name and
    argument do not have to be separated. Example: ``-ofile``. This can also be
    specified as ``-o file``; however, ``-o=file`` will be parsed incorrectly
    (``=file`` will be interpreted as the option value). At most one occurrence
    is allowed.

  - ``prefix_list_option`` - same as the above, but more than one occurrence of
    the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating aliases. Unlike other
    option types, aliases are not allowed to have any properties besides the
    aliased option name. Usage example: ``(alias_option "preprocess", "E")``
* Possible option properties:

  - ``help`` - help string associated with this option. Used for ``--help``
    output.

  - ``required`` - this option must be specified exactly once (or, in case of
    the list options without the ``multi_val`` property, at least
    once). Incompatible with ``zero_or_one`` and ``one_or_more``.

  - ``one_or_more`` - the option must be specified at least one time. Useful
    only for list options in conjunction with ``multi_val``; for ordinary lists
    it is synonymous with ``required``. Incompatible with ``required`` and
    ``zero_or_one``.

  - ``zero_or_one`` - the option can be specified zero or one times. Useful
    only for list options in conjunction with ``multi_val``. Incompatible with
    ``required`` and ``one_or_more``.

  - ``hidden`` - the description of this option will not appear in
    the ``--help`` output (but will appear in the ``--help-hidden``
    output).

  - ``really_hidden`` - the option will not be mentioned in any help
    output.

  - ``multi_val n`` - this option takes *n* arguments (can be useful in some
    special cases). Usage example: ``(parameter_list_option "foo", (multi_val
    3))``. Only list options can have this attribute; you can, however, use
    the ``one_or_more`` and ``zero_or_one`` properties.

  - ``init`` - this option has a default value, either a string (if it is a
    parameter), or a boolean (if it is a switch; boolean constants are called
    ``true`` and ``false``). List options can't have this attribute. Usage
    examples: ``(switch_option "foo", (init true))``; ``(prefix_option "bar",
    (init "baz"))``.

  - ``extern`` - this option is defined in some other plugin, see below.
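The note on ``prefix_option`` parsing (``-ofile`` works, ``-o=file`` yields the
value ``=file``) follows directly from taking everything after the option name
as the argument. The sketch below illustrates that reading;
``parse_prefix_option`` is a hypothetical helper written for this manual, not
the parser LLVMC actually uses.

```cpp
#include <string>

// Hypothetical helper: if `arg` is "-<name>" immediately followed by a value,
// stores that value and returns true. A bare "-<name>" returns false, meaning
// the value (if any) is expected in the next command-line argument.
bool parse_prefix_option(const std::string& arg, const std::string& name,
                         std::string& value) {
    const std::string flag = "-" + name;
    if (arg.size() <= flag.size() || arg.compare(0, flag.size(), flag) != 0)
        return false;
    value = arg.substr(flag.size());  // everything after the name, verbatim
    return true;
}
```

Because the remainder is taken verbatim, ``-o=file`` produces ``=file`` rather
than ``file``, which is the behaviour the option type description warns about.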
Sometimes, when linking several plugins together, one plugin needs to
access options defined in some other plugin. Because of the way
options are implemented, such options must be marked as
``extern``. This is what the ``extern`` option property is
for. Example::

    (switch_option "E", (extern))

If an external option has additional attributes besides ``extern``, they are
ignored. See also the section on plugin priorities (`How plugins are loaded`_).
381 Conditional evaluation
382 ======================
384 The 'case' construct is the main means by which programmability is
385 achieved in LLVMC. It can be used to calculate edge weights, program
386 actions and modify the shell commands to be executed. The 'case'
387 expression is designed after the similarly-named construct in
388 functional languages and takes the form ``(case (test_1), statement_1,
389 (test_2), statement_2, ... (test_N), statement_N)``. The statements
390 are evaluated only if the corresponding tests evaluate to true.
Examples::

    // Edge weight calculation

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Tool command line specification

    // Evaluates to "cmdline1" if the option "-A" is provided on the
    // command line; to "cmdline2" if "-B" is provided;
    // otherwise to "cmdline3".
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")
Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification: in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.
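The two evaluation modes can be contrasted in a short C++ model. This is a
sketch under stated assumptions, not LLVMC code: ``Switches``, ``Test``,
``switch_on``, ``default_test``, ``eval_weight`` and ``eval_cmdline`` are
hypothetical names, and only the ``switch_on`` and ``default`` tests are
modelled.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

typedef std::set<std::string> Switches;             // switches given by the user
typedef std::function<bool(const Switches&)> Test;  // one `case` test

Test switch_on(const std::string& name) {
    return [name](const Switches& s) { return s.count(name) > 0; };
}

Test default_test() {
    return [](const Switches&) { return true; };
}

// Edge-weight context: every clause whose test is true contributes.
int eval_weight(const std::vector<std::pair<Test, int> >& clauses,
                const Switches& s) {
    int weight = 0;
    for (std::size_t i = 0; i < clauses.size(); ++i)
        if (clauses[i].first(s)) weight += clauses[i].second;
    return weight;
}

// Command-line context: the first clause whose test is true wins.
std::string eval_cmdline(
    const std::vector<std::pair<Test, std::string> >& clauses,
    const Switches& s) {
    for (std::size_t i = 0; i < clauses.size(); ++i)
        if (clauses[i].first(s)) return clauses[i].second;
    return "";  // no clause matched and no default was given
}
```

With both ``-A`` and ``-B`` given, the weight clauses from the first example
add up to 10, while the command-line clauses from the second example stop at
``"cmdline1"``.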
Case expressions can also be nested, i.e. the following is legal::

    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...)
          (default), ...)
425 You should, however, try to avoid doing that because it hurts
426 readability. It is usually better to split tool descriptions and/or
427 use TableGen inheritance instead.
* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line switch is
    provided by the user. Example: ``(switch_on "opt")``.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value.
    Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter
    list contains a given value.
    Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set.
    Example: ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the input file language
    equals the argument. At the moment works only with ``cmd_line``
    and ``actions`` (on non-join nodes).
    Example: ``(in_language "c++")``.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user.
    Example: ``(not_empty "o")``.

  - ``empty`` - The opposite of ``not_empty``. Equivalent to ``(not (not_empty
    X))``. Provided for convenience.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed.

  - ``or`` - Another logical combinator that returns true if at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.
472 Writing a tool description
473 ==========================
As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;
487 This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
488 ``llvm-g++``. As you can see, a tool definition is just a list of
489 properties; most of them should be self-explanatory. The ``sink``
490 property means that this tool should be passed all command-line
491 options that aren't mentioned in the option list.
493 The complete list of all currently implemented tool properties follows.
495 * Possible tool properties:
497 - ``in_language`` - input language name. Can be either a string or a
498 list, in case the tool supports multiple input languages.
500 - ``out_language`` - output language name. Tools are not allowed to
501 have multiple output languages.
503 - ``output_suffix`` - output file suffix. Can also be changed
504 dynamically, see documentation on actions.
506 - ``cmd_line`` - the actual command used to run the tool. You can
507 use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
508 with ``>``, hook invocations (``$CALL``), environment variables
509 (via ``$ENV``) and the ``case`` construct.
511 - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
512 list of input files and joins them together. Used for linkers.
514 - ``sink`` - all command-line options that are not handled by other
515 tools are passed to this tool.
  - ``actions`` - A single big ``case`` expression that specifies how
    this tool reacts to command-line options (described in more detail
    below).
524 A tool often needs to react to command-line options, and this is
525 precisely what the ``actions`` property is for. The next example
526 illustrates this feature::
    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (actions (case (not_empty "L"), (forward "L"),
                       (not_empty "l"), (forward "l"),
                       // The "dummy" option below is purely illustrative.
                       (not_empty "dummy"),
                                 [(append_cmd "-dummy1"), (append_cmd "-dummy2")])
        ]>;
The ``actions`` tool property is implemented on top of the omnipresent
``case`` expression. It associates one or more different *actions*
with given conditions; in the example, the actions are ``forward``,
which forwards a given option unchanged, and ``append_cmd``, which
appends a given string to the tool execution command. Multiple actions
can be associated with a single condition by using a list of actions
(used in the example to append some dummy options). The same ``case``
construct can also be used in the ``cmd_line`` property to modify the
tool command line.

The ``join`` property used in the example means that this tool behaves
like a linker.
The list of all possible actions follows.

* Possible actions:

  - ``append_cmd`` - append a string to the tool invocation
    command.
    Example: ``(case (switch_on "pthread"), (append_cmd
    "-lpthread"))``.

  - ``error`` - exit with error.
    Example: ``(error "Mixing -c and -S is not allowed!")``.

  - ``forward`` - forward an option unchanged.
    Example: ``(forward "Wall")``.

  - ``forward_as`` - Change the name of an option, but forward the
    argument unchanged.
    Example: ``(forward_as "O0", "--disable-optimization")``.

  - ``output_suffix`` - modify the output suffix of this
    tool.
    Example: ``(output_suffix "i")``.

  - ``stop_compilation`` - stop compilation after this tool processes
    its input. Used without arguments.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.
    Example: ``(unpack_values "Wa,")``.
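The transformation performed by ``unpack_values`` can be pictured with a short
C++ function. This is a sketch of the described behaviour only, not LLVMC's
code; the function below is a hypothetical helper that happens to share the
action's name, and it takes the prefix with its leading dash (``-Wa,``) for
simplicity.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: strips `prefix` (e.g. "-Wa,") from `arg` and splits the
// rest on commas. Returns an empty vector if `arg` lacks the prefix.
std::vector<std::string> unpack_values(const std::string& arg,
                                       const std::string& prefix) {
    std::vector<std::string> out;
    if (arg.compare(0, prefix.size(), prefix) != 0) return out;
    std::string::size_type pos = prefix.size();
    while (pos <= arg.size()) {
        std::string::size_type comma = arg.find(',', pos);
        if (comma == std::string::npos) comma = arg.size();
        if (comma > pos) out.push_back(arg.substr(pos, comma - pos));
        pos = comma + 1;  // continue after the comma (or past the end)
    }
    return out;
}
```

Applied to ``-Wa,-foo=bar,-baz`` with the prefix ``-Wa,``, the sketch yields
the two values ``-foo=bar`` and ``-baz``, matching the example in the list
above.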
If you are adding support for a new language to LLVMC, you'll need to
modify the language map, which defines mappings from file extensions
to language names. It is used to choose the proper toolchain(s) for a
given input file set. A language map definition looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;

For example, without those definitions the following command wouldn't work::

    $ llvmc hello.cpp
    llvmc: Unknown suffix: cpp
604 The language map entries should be added only for tools that are
605 linked with the root node. Since tools are not allowed to have
606 multiple output languages, for nodes "inside" the graph the input and
607 output languages should match. This is enforced at compile-time.
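The lookup that the language map enables can be sketched in C++ as a plain
suffix-to-language table. The names ``SuffixMap``, ``make_suffix_map`` and
``language_of`` are hypothetical, the table contains only the two entries
shown in the example above, and the error text merely mimics the message
quoted there.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

typedef std::map<std::string, std::string> SuffixMap;  // suffix -> language

// Builds a table equivalent to the two LangToSuffixes entries shown above.
SuffixMap make_suffix_map() {
    SuffixMap m;
    const char* cxx[] = {"cc", "cp", "cxx", "cpp", "CPP", "c++", "C"};
    for (std::size_t i = 0; i < sizeof(cxx) / sizeof(cxx[0]); ++i)
        m[cxx[i]] = "c++";
    m["c"] = "c";
    return m;
}

// Returns the language for `file`, or throws if the suffix is not in the map.
std::string language_of(const std::string& file, const SuffixMap& m) {
    std::string::size_type dot = file.rfind('.');
    std::string suffix =
        (dot == std::string::npos) ? std::string() : file.substr(dot + 1);
    SuffixMap::const_iterator it = m.find(suffix);
    if (it == m.end())
        throw std::runtime_error("Unknown suffix: " + suffix);
    return it->second;
}
```

Note that the lookup is case-sensitive, which is why ``hello.C`` maps to
``c++`` while ``hello.c`` maps to ``c``.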
615 Hooks and environment variables
616 -------------------------------
Normally, LLVMC executes programs from the system ``PATH``. Sometimes
this is not sufficient: for example, we may want to specify tool paths
or names in the configuration file. This can be achieved via
the hooks mechanism. To write your own hooks, just add their
definitions to ``PluginMain.cpp`` or drop a ``.cpp`` file into
your plugin directory. Hooks should live in the ``hooks`` namespace
and have the signature ``std::string hooks::MyHookName ([const char*
Arg0 [, const char* Arg1 [, ...]]])``. They can be used from the
``cmd_line`` tool property::
628 (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
630 To pass arguments to hooks, use the following syntax::
632 (cmd_line "$CALL(MyHook, 'Arg1', 'Arg2', 'Arg # 3')/path/to/file -o1 -o2")
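A hook matching the signature described above might look like the following
sketch. Everything here is hypothetical and for illustration only: the hook
names ``ToolDir`` and ``JoinPath``, the ``MYDRIVER_TOOL_DIR`` environment
variable and the ``/usr/local/bin`` fallback are assumptions, not part of
LLVMC.

```cpp
#include <cstdlib>
#include <string>

namespace hooks {

// Hypothetical hook without arguments: returns a tool directory taken from an
// (assumed) environment variable, falling back to a fixed default.
std::string ToolDir() {
    const char* dir = std::getenv("MYDRIVER_TOOL_DIR");
    return (dir && *dir) ? dir : "/usr/local/bin";
}

// Hypothetical hook with two arguments: joins a directory and a file name,
// inserting a '/' only when the directory does not already end with one.
std::string JoinPath(const char* dir, const char* file) {
    std::string result(dir);
    if (!result.empty() && result[result.size() - 1] != '/')
        result += '/';
    return result + file;
}

}  // namespace hooks
```

Following the syntax shown above, such hooks would be referenced as
``$CALL(ToolDir)`` or ``$CALL(JoinPath, '/opt/llvm', 'llc')`` inside a
``cmd_line`` string.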
634 It is also possible to use environment variables in the same manner::
636 (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
To change the command line string based on user-provided options, use
the ``case`` expression (documented in `Conditional evaluation`_)::

    (cmd_line
      (case
        (switch_on "E"), "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default), "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
652 How plugins are loaded
653 ----------------------
655 It is possible for LLVMC plugins to depend on each other. For example,
656 one can create edges between nodes defined in some other plugin. To
657 make this work, however, that plugin should be loaded first. To
658 achieve this, the concept of plugin priority was introduced. By
659 default, every plugin has priority zero; to specify the priority
660 explicitly, put the following line in your plugin's TableGen file::
662 def Priority : PluginPriority<$PRIORITY_VALUE>;
663 # Where PRIORITY_VALUE is some integer > 0
Plugins are loaded in order of their (increasing) priority, starting
with 0. Therefore, the plugin with the highest priority value will be
loaded last.
When writing LLVMC plugins, it can be useful to get a visual view of
the resulting compilation graph. This can be achieved via the command
line option ``--view-graph``. This command assumes that Graphviz_ and
Ghostview_ are installed. There is also a ``--write-graph`` option that
creates a Graphviz source file (``compilation-graph.dot``) in the
current directory.
679 Another useful ``llvmc`` option is ``--check-graph``. It checks the
680 compilation graph for common errors like mismatched output/input
681 language names, multiple default edges and cycles. These checks can't
682 be performed at compile-time because the plugins can load code
683 dynamically. When invoked with ``--check-graph``, ``llvmc`` doesn't
684 perform any compilation tasks and returns the number of encountered
685 errors as its status code.
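One of the checks named above, mismatched output/input language names, can be
sketched in C++ as follows. This is a simplified illustration and not the
check LLVMC implements: ``ToolModel`` and ``count_language_mismatches`` are
hypothetical names, edges leaving the ``root`` node are skipped, and an edge
that refers to an unknown tool is counted as an error.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tool description.
struct ToolModel {
    std::vector<std::string> in_languages;
    std::string out_language;
};

typedef std::pair<std::string, std::string> EdgeModel;  // (from, to)

// Counts edges whose source tool produces a language that the target tool
// does not accept. The count could then serve as the process exit status.
int count_language_mismatches(const std::map<std::string, ToolModel>& tools,
                              const std::vector<EdgeModel>& edges) {
    int errors = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        if (edges[i].first == "root") continue;  // root has no output language
        std::map<std::string, ToolModel>::const_iterator from =
            tools.find(edges[i].first);
        std::map<std::string, ToolModel>::const_iterator to =
            tools.find(edges[i].second);
        if (from == tools.end() || to == tools.end()) {
            ++errors;  // edge refers to an undefined tool
            continue;
        }
        const std::vector<std::string>& in = to->second.in_languages;
        if (std::find(in.begin(), in.end(), from->second.out_language) ==
            in.end())
            ++errors;
    }
    return errors;
}
```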
687 .. _Graphviz: http://www.graphviz.org/
688 .. _Ghostview: http://pages.cs.wisc.edu/~ghost/
690 Conditioning on the executable name
691 -----------------------------------
For now, the executable name (the value passed to the driver in ``argv[0]``) is
accessible only in the C++ code (i.e. hooks). Use the following code::

    extern const char* ProgramName;

    std::string MyHook() {
        // ...
        if (strcmp(ProgramName, "mydriver") == 0) {
            // ...
        }
    }
707 In general, you're encouraged not to make the behaviour dependent on the
708 executable file name, and use command-line switches instead. See for example how
709 the ``Base`` plugin behaves when it needs to choose the correct linker options
710 (think ``g++`` vs. ``gcc``).
726 Last modified: $Date: 2008-12-11 11:34:48 -0600 (Thu, 11 Dec 2008) $