<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>The LLVM Compiler Driver (llvmc)</title>
<link rel="stylesheet" href="llvm.css" type="text/css">
- <style type="text/css">
- TR, TD { border: 2px solid gray; padding: 4pt 4pt 2pt 2pt; }
- TH { border: 2px solid gray; font-weight: bold; font-size: 105%; }
- TABLE { text-align: center; border: 2px solid black;
- border-collapse: collapse; margin-top: 1em; margin-left: 1em;
- margin-right: 1em; margin-bottom: 1em; }
- .td_left { border: 2px solid gray; text-align: left; }
- </style>
<meta name="author" content="Reid Spencer">
<meta name="description"
content="A description of the use and design of the LLVM Compiler Driver.">
<div class="doc_section"> <a name="introduction">Introduction</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
- <p>The <tt>llvmc</tt> <a href="def_tool">tool</a> is a configurable compiler
- <a href="def_driver">driver</a>. As such, it isn't the compiler, optimizer,
- or linker itself but it drives (invokes) other software that perform those
+ <p>The <tt>llvmc</tt> <a href="#def_tool">tool</a> is a configurable compiler
+ <a href="#def_driver">driver</a>. As such, it isn't a compiler, optimizer,
+ or linker itself, but it drives (invokes) other software that performs those
tasks. If you are familiar with the GNU Compiler Collection's <tt>gcc</tt>
tool, <tt>llvmc</tt> is very similar.</p>
<p>The following introductory sections will help you understand why this tool
<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="purpose">Purpose</a></div>
<div class="doc_text">
- <p><tt>llvmc</tt> was invented to make compilation with LLVM based compilers
- easier. To accomplish this, <tt>llvmc</tt> strives to:</p>
+ <p><tt>llvmc</tt> was invented to make compilation of user programs with
+ LLVM-based tools easier. To accomplish this, <tt>llvmc</tt> strives to:</p>
<ul>
<li>Be the single point of access to most of the LLVM tool set.</li>
<li>Hide the complexities of the LLVM tools through a single interface.</li>
with LLVM, because it:</p>
<ul>
<li>Makes integration of existing non-LLVM tools simple.</li>
- <li>Extends the capabilities of minimal front ends by optimizing their
+ <li>Extends the capabilities of minimal compiler tools by optimizing their
output.</li>
<li>Reduces the number of interfaces a compiler writer must know about
before a working compiler can be completed (essentially only the VMCore
<dt><b>Read Configuration Files</b></dt>
<dd>Based on the options and the suffixes of the filenames presented, a set
of configuration files are read to configure the actions <tt>llvmc</tt> will
- take. Configuration files are provided by either LLVM or the front end
+ take. Configuration files are provided by either LLVM or the
compiler tools that <tt>llvmc</tt> invokes. These files determine what
actions <tt>llvmc</tt> will take in response to the user's request. See
the section on <a href="#configuration">configuration</a> for more details.
<code>
llvmc -O2 x.c y.c z.c -o xyz</code>
<p>must produce <i>exactly</i> the same results as:</p>
- <code>
- llvmc -O2 x.c
- llvmc -O2 y.c
- llvmc -O2 z.c
- llvmc -O2 x.o y.o z.o -o xyz</code>
+ <pre><tt>
+ llvmc -O2 x.c -o x.o
+ llvmc -O2 y.c -o y.o
+ llvmc -O2 z.c -o z.o
+ llvmc -O2 x.o y.o z.o -o xyz</tt></pre>
<p>To accomplish this, <tt>llvmc</tt> uses a very simple goal oriented
procedure to do its work. The overall goal is to produce a functioning
executable. To accomplish this, <tt>llvmc</tt> always attempts to execute a
program.</dd>
</dl>
<p>The following table shows the inputs, outputs, and command line options
- applicabe to each phase.</p>
+ applicable to each phase.</p>
<table>
<tr>
<th style="width: 10%">Phase</th>
</ul></td>
<td class="td_left"><ul>
<li>LLVM Assembly</li>
- <li>LLVM Bytecode</li>
+ <li>LLVM Bitcode</li>
<li>LLVM C++ IR</li>
</ul></td>
<td class="td_left"><dl>
<td><b>Optimization</b></td>
<td class="td_left"><ul>
<li>LLVM Assembly</li>
- <li>LLVM Bytecode</li>
+ <li>LLVM Bitcode</li>
</ul></td>
<td class="td_left"><ul>
- <li>LLVM Bytecode</li>
+ <li>LLVM Bitcode</li>
</ul></td>
<td class="td_left"><dl>
<dt><tt>-Ox</tt>
- <dd>This group of options affects the amount of optimization
+ <dd>This group of options controls the amount of optimization
performed.</dd>
</dl></td>
</tr>
<tr>
<td><b>Linking</b></td>
<td class="td_left"><ul>
- <li>LLVM Bytecode</li>
+ <li>LLVM Bitcode</li>
<li>Native Object Code</li>
<li>LLVM Library</li>
<li>Native Library</li>
</ul></td>
<td class="td_left"><ul>
- <li>LLVM Bytecode Executable</li>
+ <li>LLVM Bitcode Executable</li>
<li>Native Executable</li>
</ul></td>
<td class="td_left"><dl>
<div class="doc_text">
<p>This section of the document describes the configuration files used by
<tt>llvmc</tt>. Configuration information is relatively static for a
- given release of LLVM and a front end compiler. However, the details may
+ given release of LLVM and a compiler tool. However, the details may
change from release to release of either. Users are encouraged to simply use
the various options of the <tt>llvmc</tt> command and ignore the configuration
of the tool. These configuration files are for compiler writers and LLVM
<p>Because <tt>llvmc</tt> just invokes other programs, it must deal with the
available command line options for those programs regardless of whether they
-were written for LLVM or not. Furthermore, not all compilation front ends will
-have the same capabilities. Some front ends will simply generate LLVM assembly
-code, others will be able to generate fully optimized byte code. In general,
+were written for LLVM or not. Furthermore, not all compiler tools will
+have the same capabilities. Some compiler tools will simply generate LLVM
+assembly code; others will be able to generate fully optimized bitcode. In general,
<tt>llvmc</tt> doesn't make any assumptions about the capabilities or command
line options of a sub-tool. It simply uses the details found in the
configuration files and leaves it to the compiler writer to specify the
configuration correctly.</p>
-<p>This approach means that new compiler front ends can be up and working very
-quickly. As a first cut, a front end can simply compile its source to raw
-(unoptimized) bytecode or LLVM assembly and <tt>llvmc</tt> can be configured
-to pick up the slack (translate LLVM assembly to bytecode, optimize the
-bytecode, generate native assembly, link, etc.). In fact, the front end need
-not use any LLVM libraries, and it could be written in any language (instead of
-C++). The configuration data will allow the full range of optimization,
-assembly, and linking capabilities that LLVM provides to be added to these kinds
-of tools. Enabling the rapid development of front-ends is one of the primary
-goals of <tt>llvmc</tt>.</p>
-
-<p>As a compiler front end matures, it may utilize the LLVM libraries and tools
-to more efficiently produce optimized bytecode directly in a single compilation
+<p>This approach means that new compiler tools can be up and working very
+quickly. As a first cut, a tool can simply compile its source to raw
+(unoptimized) bitcode or LLVM assembly and <tt>llvmc</tt> can be configured
+to pick up the slack (translate LLVM assembly to bitcode, optimize the
+bitcode, generate native assembly, link, etc.). In fact, the compiler tool
+need not use any LLVM libraries, and it could be written in any language
+(instead of C++). The configuration data will allow the full range of
+optimization, assembly, and linking capabilities that LLVM provides to be added
+to these kinds of tools. Enabling the rapid development of compiler tools is one
+of the primary goals of <tt>llvmc</tt>.</p>
+
+<p>As a compiler tool matures, it may utilize the LLVM libraries and tools
+to more efficiently produce optimized bitcode directly in a single compilation
and optimization program. In these cases, multiple tools would not be needed
and the configuration data for the compiler would change.</p>
<p>Configuring <tt>llvmc</tt> to the needs and capabilities of a source language
-compiler is relatively straight forward. A compiler writer must provide a
+compiler is relatively straightforward. A compiler writer must provide a
definition of what to do for each of the five compilation phases for each of
the optimization levels. The specification consists simply of prototypical
command lines into which <tt>llvmc</tt> can substitute command line
</div>
<!-- _______________________________________________________________________ -->
-<div class="doc_subsection"><a name="filetypes"></a>Configuration Files</div>
+<div class="doc_subsection"><a name="filetypes">Configuration Files</a></div>
+<div class="doc_subsubsection"><a name="filecontents">File Contents</a></div>
<div class="doc_text">
- <h3>File Contents</h3>
<p>Each configuration file provides the details for a single source language
that is to be compiled. This configuration information tells <tt>llvmc</tt>
how to invoke the language's pre-processor, translator, optimizer, assembler
and linker. Note that a given source language needn't provide all these tools
as many of them exist in llvm currently.</p>
+</div>
- <h3>Directory Search</h3>
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection"><a name="dirsearch">Directory Search</a></div>
+<div class="doc_text">
<p><tt>llvmc</tt> always looks for files of a specific name. It uses the
first file with the name it's looking for by searching directories in the
following order:<br/>
<ol>
- <li>Any directory specified by the <tt>--config-dir</tt> option will be
+ <li>Any directory specified by the <tt>-config-dir</tt> option will be
checked first.</li>
<li>If the environment variable LLVM_CONFIG_DIR is set, and it contains
the name of a valid directory, that directory will be searched next.</li>
<p>The first file found in this search will be used. Other files with the
same name will be ignored even if they exist in one of the subsequent search
locations.</p>
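<p>The first-match rule described above can be sketched in a few lines of
Python. This is only an illustration, not <tt>llvmc</tt>'s actual
implementation: the function names <tt>build_search_dirs</tt> and
<tt>find_config_file</tt> are hypothetical, and the lower-priority directories
are passed in by the caller rather than guessed at here.</p>

```python
import os


def build_search_dirs(config_dir_option, environ, remaining_dirs):
    """Order the candidate directories by priority (hypothetical helper).

    config_dir_option: value given with -config-dir, or None.
    environ: a mapping such as os.environ.
    remaining_dirs: the lower-priority directories, supplied by the caller.
    """
    dirs = []
    if config_dir_option:
        dirs.append(config_dir_option)
    env_dir = environ.get("LLVM_CONFIG_DIR")
    # The variable only counts if it names a valid directory.
    if env_dir and os.path.isdir(env_dir):
        dirs.append(env_dir)
    dirs.extend(remaining_dirs)
    return dirs


def find_config_file(name, search_dirs):
    """Return the first existing file called `name`, or None.

    Later directories are never consulted once a match is found, so a
    file with the same name further down the list is ignored.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

<p>Because the lookup is by exact file name, no directory listing is needed;
each candidate path is simply tested for existence.</p>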
+</div>
- <h3>File Names</h3>
+<div class="doc_subsubsection"><a name="filenames">File Names</a></div>
+<div class="doc_text">
<p>In the directories searched, each configuration file is given a specific
name to foster faster lookup (so llvmc doesn't have to do directory searches).
The name of a given language specific configuration file is simply the same
<tt>cpp</tt>, <tt>C</tt>, or <tt>cxx</tt>. For languages that support multiple
file suffixes, multiple (probably identical) files (or symbolic links) will
need to be provided.</p>
+</div>
- <h3>What Gets Read</h3>
+<div class="doc_subsubsection"><a name="whatgetsread">What Gets Read</a></div>
+<div class="doc_text">
<p>Which configuration files are read depends on the command line options and
the suffixes of the file names provided on <tt>llvmc</tt>'s command line. Note
- that the <tt>--x LANGUAGE</tt> option alters the language that <tt>llvmc</tt>
+ that the <tt>-x LANGUAGE</tt> option alters the language that <tt>llvmc</tt>
uses for the subsequent files on the command line. Only the configuration
files actually needed to complete <tt>llvmc</tt>'s task are read. Other
language specific files will be ignored.</p>
<ul>
<li>The file encoding is ASCII.</li>
<li>The file is line oriented. There should be one configuration definition
- per line. Lines are terminated by the newline character (0x0A).</li>
+ per line. Lines are terminated by the newline (0x0A) and/or carriage return
+ (0x0D) characters.</li>
<li>A backslash (<tt>\</tt>) before a newline causes the newline to be
ignored. This is useful for line continuation of long definitions. A
backslash anywhere else is recognized as a backslash.</li>
<li>Integers are simply sequences of digits.</li>
<li>Commands start with a program name and are followed by a sequence of
words that are passed to that program as command line arguments. Program
- arguments that begin and end with the <tt>@</tt> sign will have their value
+ arguments that begin and end with the <tt>%</tt> sign will have their value
substituted. Program names beginning with <tt>/</tt> are considered to be
absolute. Otherwise the <tt>PATH</tt> will be applied to find the program to
execute.</li>
<th>Description</th>
<th>Default</th>
</tr>
+ <tr><td colspan="4"><h4>LLVMC ITEMS</h4></td></tr>
+ <tr>
+ <td><b>version</b></td>
+ <td>string</td>
+ <td class="td_left">Provides the version string for the contents of this
+ configuration file. What is accepted as a legal configuration file
+ will change over time and this item tells <tt>llvmc</tt> which version
+ should be expected.</td>
+ <td><i>b</i></td>
+ </tr>
<tr><td colspan="4"><h4>LANG ITEMS</h4></td></tr>
<tr>
<td><b>lang.name</b></td>
<td><b>translator.command</b></td>
<td>command</td>
<td class="td_left">This provides the command prototype that will be used
- to run the translator. Valid substitutions are <tt>@in@</tt> for the
- input file and <tt>@out@</tt> for the output file.</td>
+ to run the translator. Valid substitutions are <tt>%in%</tt> for the
+ input file and <tt>%out%</tt> for the output file.</td>
<td><blank></td>
</tr>
<tr>
<td><b>translator.output</b></td>
- <td><tt>native</tt>, <tt>bytecode</tt> or <tt>assembly</tt></td>
+ <td><tt>bitcode</tt> or <tt>assembly</tt></td>
<td class="td_left">This item specifies the kind of output the language's
translator generates.</td>
- <td><tt>bytecode</tt></td>
+ <td><tt>bitcode</tt></td>
</tr>
<tr>
<td><b>translator.preprocesses</b></td>
whenever the final phase is not pre-processing.</td>
<td><tt>false</tt></td>
</tr>
- <tr>
- <td><b>translator.optimizers</b></td>
- <td>boolean</td>
- <td class="td_left">Indicates that the translator also optimizes. If
- this is true, then <tt>llvmc</tt> will skip the optimization phase
- whenever the final phase is optimization or later.</td>
- <td><tt>false</tt></td>
- </tr>
- <tr>
- <td><b>translator.groks_dash_o</b></td>
- <td>boolean</td>
- <td class="td_left">Indicates that the translator understands the
- <i>intent</i> of the various <tt>-O</tt><i>n</i> options to
- <tt>llvmc</tt>. This will cause the <tt>-O</tt><i>n</i> option to be
- given to the translator instead of the equivalent options provided by
- <tt>lang.opt</tt><i>n</i>.</td>
- <td><tt>false</tt></td>
- </tr>
<tr><td colspan="4"><h4>OPTIMIZER ITEMS</h4></td></tr>
<tr>
<td><b>optimizer.command</b></td>
<td>command</td>
<td class="td_left">This provides the command prototype that will be used
- to run the optimizer. Valid substitutions are <tt>@in@</tt> for the
- input file and <tt>@out@</tt> for the output file.</td>
+ to run the optimizer. Valid substitutions are <tt>%in%</tt> for the
+ input file and <tt>%out%</tt> for the output file.</td>
<td><blank></td>
</tr>
<tr>
<td><b>optimizer.output</b></td>
- <td><tt>native</tt>, <tt>bytecode</tt> or <tt>assembly</tt></td>
+ <td><tt>bitcode</tt> or <tt>assembly</tt></td>
<td class="td_left">This item specifies the kind of output the language's
- optimizer generates.</td>
- <td><tt>bytecode</tt></td>
+ optimizer generates. Valid values are "assembly" and "bitcode".</td>
+ <td><tt>bitcode</tt></td>
</tr>
<tr>
<td><b>optimizer.preprocesses</b></td>
whenever the final phase is optimization or later.</td>
<td><tt>false</tt></td>
</tr>
- <tr>
- <td><b>optimizer.groks_dash_o</b></td>
- <td>boolean</td>
- <td class="td_left">Indicates that the translator understands the
- <i>intent</i> of the various <tt>-O</tt><i>n</i> options to
- <tt>llvmc</tt>. This will cause the <tt>-O</tt><i>n</i> option to be
- given to the translator instead of the equivalent options provided by
- <tt>lang.opt</tt><i>n</i>.</td>
- <td><tt>false</tt></td>
- </tr>
<tr><td colspan="4"><h4>ASSEMBLER ITEMS</h4></td></tr>
<tr>
<td><b>assembler.command</b></td>
<td>command</td>
<td class="td_left">This provides the command prototype that will be used
- to run the assembler. Valid substitutions are <tt>@in@</tt> for the
- input file and <tt>@out@</tt> for the output file.</td>
+ to run the assembler. Valid substitutions are <tt>%in%</tt> for the
+ input file and <tt>%out%</tt> for the output file.</td>
<td><blank></td>
</tr>
- <tr><td colspan="4"><h4>LINKER ITEMS</h4></td></tr>
- <tr>
- <td><b>linker.libs</b></td>
- <td>library names</td>
- <td class="td_left">This provides the list of runtime libraries that the
- source language <i>could</i> link with. In general, the libraries
- needed will be encoded into the LLVM Assembly or bytecode file.
- However, this list tells <tt>llvmc</tt> the names of the ones that
- apply to this source language. The names provided here should be
- unadorned with no suffix and no "lib" prefix.
- </td>
- <td><blank></td>
- </tr>
- <tr>
- <td><b>linker.lib_paths</b></td>
- <td>Fully qualifed local path names</td>
- <td class="td_left">This item provides a list of potential directories
- in which the source language's runtime libraries might be located. If
- a given object file compiled with this language's translator is linked
- then those libraries will be given as <tt>-L</tt> options to the
- linker.</td>
- <td><tt><blank></tt></td>
- </tr>
- <tr>
- <td><b>linker.output</b></td>
- <td><tt>native</tt>, <tt>bytecode</tt> or <tt>assembly</tt></td>
- <td class="td_left">This item specifies the kind of output the language's
- translator generates.</td>
- <td><tt>bytecode</tt></td>
- </tr>
</tbody>
</table>
</div>
<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="substitutions">Substitutions</a></div>
<div class="doc_text">
- <p>On any configruation item that ends in <tt>command</tt>, you must
+ <p>On any configuration item that ends in <tt>command</tt>, you must
specify substitution tokens. Substitution tokens begin and end with a percent
sign (<tt>%</tt>) and are replaced by the corresponding text. Any substitution
token may be given on any <tt>command</tt> line but some are more useful than
others. In particular each command <em>should</em> have both an <tt>%in%</tt>
- and an <tt>%out%</tt> substittution. The table below provides definitions of
+ and an <tt>%out%</tt> substitution. The table below provides definitions of
each of the allowed substitution tokens.</p>
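<p>As a rough illustration of the substitution rule, the Python sketch below
expands a command prototype into an argument list. It is not taken from
<tt>llvmc</tt>; the name <tt>expand_command</tt> is hypothetical, and two
behaviours are assumptions made for the example: a token with no value simply
disappears, and a replacement is split on whitespace into separate
arguments.</p>

```python
import re

# A substitution token is a whole word that begins and ends with '%'.
TOKEN = re.compile(r"%([A-Za-z]+)%")


def expand_command(prototype, values):
    """Expand %name% tokens in `prototype` and return an argv list.

    values maps token names (without the percent signs) to replacement
    text. Assumption: tokens missing from `values` expand to nothing.
    """
    argv = []
    for word in prototype.split():
        match = TOKEN.fullmatch(word)
        if match is None:
            argv.append(word)  # ordinary word, passed through unchanged
            continue
        replacement = values.get(match.group(1), "")
        # Assumption: a replacement may hold several arguments.
        argv.extend(replacement.split())
    return argv


if __name__ == "__main__":
    proto = "stkrc -s 2048 %in% -o %out% %time% %stats% %force% %args%"
    print(expand_command(proto, {"in": "hello.st", "out": "hello.bc", "force": "-f"}))
```

<p>Splitting replacements on whitespace keeps the sketch short but would break
paths that contain spaces; a real driver would need to carry arguments as a
list instead.</p>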
<table>
<tbody>
<td class="td_left">Replaced with all the tool-specific arguments given
to <tt>llvmc</tt> via the <tt>-T</tt> set of options. This just allows
you to place these arguments in the correct place on the command line.
- If the %args% option does not appear on your command line, then you
- are explicitly disallowing the <tt>-T</tt> option for your tool.
+ If the <tt>%args%</tt> option does not appear on your command line,
+ then you are explicitly disallowing the <tt>-T</tt> option for your
+ tool.
+ </td>
+ </tr>
+ <tr>
+ <td><tt>%force%</tt></td>
+ <td class="td_left">Replaced with the <tt>-f</tt> option if it was
+ specified on the <tt>llvmc</tt> command line. This is intended to tell
+ the compiler tool to force the overwrite of output files.
</td>
+ </tr>
<tr>
<td><tt>%in%</tt></td>
<td class="td_left">Replaced with the full path of the input file. You
-gcse -dse -scalarrepl -sccp
lang.opt3=-simplifycfg -instcombine -mem2reg -load-vn \
-gcse -dse -scalarrepl -sccp -branch-combine -adce \
- -globaldce -inline -licm -pre
+ -globaldce -inline -licm
lang.opt4=-simplifycfg -instcombine -mem2reg -load-vn \
-gcse -dse -scalarrepl -sccp -ipconstprop \
- -branch-combine -adce -globaldce -inline -licm -pre
+ -branch-combine -adce -globaldce -inline -licm
lang.opt5=-simplifycfg -instcombine -mem2reg -load-vn \
-gcse -dse -scalarrepl -sccp -ipconstprop \
- -branch-combine -adce -globaldce -inline -licm -pre \
+ -branch-combine -adce -globaldce -inline -licm \
-block-placement
##########################################################
# To compile stacker source, we just run the stacker
# compiler with a default stack size of 2048 entries.
translator.command=stkrc -s 2048 %in% -o %out% %time% \
- %stats% %args%
+ %stats% %force% %args%
# stkrc doesn't preprocess but we set this to true so
# that we don't run the cp command by default.
# The translator is required to run.
translator.required=true
- # stkrc doesn't do any optimization, it just translates
- translator.optimizes=no
-
# stkrc doesn't handle the -On options
- translator.groks_dash_O=no
+ translator.output=bitcode
##########################################################
# Optimizer definitions
# For optimization, we use the LLVM "opt" program
optimizer.command=opt %in% -o %out% %opt% %time% %stats% \
- %args%
+ %force% %args%
- # opt doesn't (yet) grok -On
- optimizer.groks_dash_O=no
+ optimizer.required = true
# opt doesn't translate
optimizer.translates = no
# opt doesn't preprocess
optimizer.preprocesses=no
-##########################################################
-# Assembler definitions
-##########################################################
- assembler.command=llc %in% -o %out% %target% \
- "-regalloc=linearscan" %time% %stats%
+ # opt produces bitcode
+ optimizer.output = bitcode
##########################################################
-# Linker definitions
+# Assembler definitions
##########################################################
- linker.libs=stkr_runtime
- linker.paths=
+ assembler.command=llc %in% -o %out% %target% %time% %stats%
</tt></pre>
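<p>A file like the sample above could be read with a few lines of Python. The
sketch below is only an approximation of the format, not <tt>llvmc</tt>'s
parser: the name <tt>parse_config</tt> is hypothetical, treating lines that
start with <tt>#</tt> as comments is inferred from the sample, and collapsing
runs of whitespace inside a value is an assumption made to tidy up continued
lines.</p>

```python
def parse_config(text):
    """Parse `name=value` definitions into a dict (illustrative only)."""
    # A backslash directly before a line terminator joins the two lines.
    joined = text.replace("\\\r\n", "").replace("\\\n", "")
    items = {}
    for line in joined.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' lines carry no definition.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected name=value, got: " + line)
        # Assumption: whitespace left over from a continuation is collapsed.
        items[name.strip()] = " ".join(value.split())
    return items


if __name__ == "__main__":
    sample = (
        "# Optimizer definitions\n"
        "optimizer.command=opt %in% -o %out% %opt% %time% %stats% \\\n"
        "    %force% %args%\n"
        "optimizer.translates = no\n"
    )
    for key, val in parse_config(sample).items():
        print(key, "=>", val)
```

<p>Only the first <tt>=</tt> on a line separates the name from the value, so
arguments that themselves contain an equals sign stay part of the value.</p>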
-
+</div>
<!-- *********************************************************************** -->
<div class="doc_section"><a name="glossary">Glossary</a></div>
defined below.</p>
<dl>
<dt><a name="def_assembly"><b>assembly</b></a></dt>
- <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bytecode or
+ <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bitcode or
LLVM assembly code is assembled to a native code format (either target
specific assembly language or the platform's native object file format).
</dd>
<dd>Refers to <tt>llvmc</tt> itself.</dd>
<dt><a name="def_linking"><b>linking</b></a></dt>
- <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bytecode files
+ <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bitcode files
and (optionally) native system libraries are combined to form a complete
executable program.</dd>
<dt><a name="def_optimization"><b>optimization</b></a></dt>
- <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bytecode is
+ <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bitcode is
optimized.</dd>
<dt><a name="def_phase"><b>phase</b></a></dt>
<dt><a name="def_translation"><b>translation</b></a></dt>
<dd>A compilation <a href="#def_phase">phase</a> in which
<a href="#def_sourcelanguage">source language</a> code is translated into
- either LLVM assembly language or LLVM bytecode.</dd>
+ either LLVM assembly language or LLVM bitcode.</dd>
</dl>
</div>
<!-- *********************************************************************** -->
href="http://validator.w3.org/check/referer"><img
src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a><a
href="mailto:rspencer@x10sys.com">Reid Spencer</a><br>
-<a href="http://llvm.cs.uiuc.edu">The LLVM Compiler Infrastructure</a><br>
+<a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
Last modified: $Date$
</address>
<!-- vim: sw=2