=pod

=head1 NAME

llvmc - The LLVM Compiler Driver

=head1 SYNOPSIS

B<llvmc> [I<options>] [I<filenames>...]

=head1 DESCRIPTION

The B<llvmc> command is a configurable driver for invoking other 
LLVM (and non-LLVM) tools in order to compile, optimize and link software
for multiple languages. For those familiar with the GNU Compiler 
Collection's B<gcc> tool, it is very similar. This tool has the
following main goals or purposes:

=over

=item * A single point of access to the LLVM tool set.

=item * Hide the complexities of the LLVM tools through a single interface.

=item * Make integration of existing non-LLVM tools simple.

=item * Extend the capabilities of minimal front ends.

=item * Make the interface for compiling consistent for all languages.

=back

The tool itself does nothing with a user's program. It merely invokes other
tools to get the compilation tasks done.

The options supported by B<llvmc> generalize the compilation process and
provide a consistent and simple interface for multiple programming languages.
This makes it easier for developers to get their software compiled with LLVM.
Without B<llvmc>, developers would need to understand how to invoke the 
front-end compiler, optimizer, assembler, and linker in order to compile their 
programs. B<llvmc>'s sole mission is to trivialize that process.

=head2 Basic Operation

B<llvmc> always takes the following basic actions:

=over

=item * Command line options and filenames are collected.

The command line options provide the marching orders to B<llvmc> on what actions
it should perform. This is the I<request> the user is making of B<llvmc> and it
is interpreted first.

=item * Configuration files are read.

Based on the options and the suffixes of the filenames presented, a set of
configuration files is read to configure the actions B<llvmc> will take
(more on this later).

=item * Determine actions to take.

The tool chain needed to complete the task is determined. This is the primary 
work of B<llvmc>. It breaks the request specified by the command line options 
into a set of basic actions to be done: 

=over

=item * Pre-processing: gathering/filtering compiler input

=item * Compilation: source language to bytecode conversion

=item * Optimization: conversion of bytecode to something that runs faster

=item * Assembly: bytecode to native code conversion

=item * Linking: combining multiple bytecode files to produce an executable program

=back

=item * Execute actions.

The actions determined previously are executed sequentially and then
B<llvmc> terminates.

=back

=head1 OPTIONS

=head2 Control Options

Control options tell B<llvmc> what to do at a high level. The 
following control options are defined:

=over

=item B<-c> or B<--compile>

This option specifies that the linking phase is not to be run. All
previous phases, if applicable, will run. This is generally how a source
language module is compiled and optimized into a bytecode file.

=item B<-k> or B<--link> or default

This option (or the lack of any control option) specifies that all stages
of compilation, optimization, and linking should be attempted.  Source files
specified on the command line will be compiled and linked with objects and
libraries also specified. 

=item B<-S> or B<--assemble>

This option specifies that compilation should end in the creation of
an LLVM assembly file that can be later converted to an LLVM object
file.

=item B<-E> or B<--preprocess>

This option specifies that no compilation or linking should be
performed. Only pre-processing, if applicable to the language being
compiled, is performed. For languages that support it, this will
result in the output containing the raw input to the compiler.

=back
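As a rough illustration of how the control options above select phases, the
following Python sketch models where each option stops. It is not B<llvmc>'s
implementation: the phase names, the C<phases_for> function, and in particular
the assumption that B<-S> stops right after compilation are illustrative
guesses.

```python
# Hypothetical model of llvmc's control options; names are illustrative only.
PHASES = ["preprocess", "compile", "optimize", "assemble", "link"]

# Long spellings documented above, mapped to their short forms.
ALIASES = {
    "--preprocess": "-E",
    "--assemble": "-S",
    "--compile": "-c",
    "--link": "-k",
}

# Last phase run for each control option (None means no control option given).
# Assumption: -S stops after compilation because it ends in an LLVM assembly file.
LAST_PHASE = {
    "-E": "preprocess",
    "-S": "compile",
    "-c": "assemble",
    "-k": "link",
    None: "link",
}


def phases_for(option=None):
    """Return the phases that would run for a given control option."""
    option = ALIASES.get(option, option)
    last = LAST_PHASE[option]
    return PHASES[: PHASES.index(last) + 1]
```

Calling C<phases_for("-c")> in this model returns every phase except linking,
which mirrors the description of B<-c> above.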

=head2 Optimization Options

Optimization with B<llvmc> is based on goals and specified with
the following B<-O> options. The specific details of which
optimizations run are controlled by the configuration files because
each source language will have different needs.

=over

=item B<-O1> or B<-O0> (default, fast compilation)

Only those optimizations that will hasten the compilation (mostly by reducing 
the output) are applied. In general these are extremely fast and simple 
optimizations that reduce emitted code size. The goal here is not to make the 
resulting program fast but to make the compilation fast.  If not specified, 
this is the default level of optimization.

=item B<-O2> (basic optimization)

This level of optimization specifies a balance between generating good code
that will execute reasonably quickly and not spending too much time optimizing
the code to get there. For example, this level of optimization may include
things like global common subexpression elimination, aggressive dead code
elimination, and scalar replication.

=item B<-O3> (aggressive optimization)

This level of optimization aggressively optimizes each set of files compiled
together. However, no link-time inter-procedural optimization is performed.
This level implies all the optimizations of the B<-O1> and B<-O2> optimization
levels, and should also provide loop optimizations and compile-time
inter-procedural optimizations. Essentially, this level tries to do as much
as it can with the input it is given but doesn't do any link-time IPO.

=item B<-O4> (link-time optimization)

In addition to the previous three levels of optimization, this level of 
optimization aggressively optimizes each program at link time. It employs
basic analysis and basic link-time inter-procedural optimizations, 
considering the program as a whole.

=item B<-O5> (aggressive link-time optimization)

This is the same as B<-O4> except it employs aggressive analyses and
aggressive inter-procedural optimization. 

=item B<-O6> (profile guided optimization: not implemented)

This is the same as B<-O5> except that it employs profile-guided
reoptimization of the program after it has executed. Note that this implies
a single level of reoptimization based on runtime profile analysis. Once
the reoptimization has completed, the profiling instrumentation is
removed and final optimizations are employed.

=item B<-O7> (lifelong optimization: not implemented)

This is the same as B<-O5> and similar to B<-O6> except that reoptimization
is performed through the life of the program. That is, each run will update
the profile by which future reoptimizations are directed.

=back
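The levels above can be summarized as a small lookup table. The sketch below
is only a reading aid written in Python: the C<OptLevel> type and the
C<parse_opt_flag> function are hypothetical names, and the C<implemented>
field simply records which levels the text marks as "not implemented".

```python
# Hypothetical summary of the -O levels described above; not llvmc source code.
from dataclasses import dataclass


@dataclass(frozen=True)
class OptLevel:
    level: int
    goal: str
    link_time: bool      # includes link-time inter-procedural optimization
    implemented: bool    # False where the text says "not implemented"


LEVELS = {
    1: OptLevel(1, "fast compilation", False, True),
    2: OptLevel(2, "basic optimization", False, True),
    3: OptLevel(3, "aggressive optimization", False, True),
    4: OptLevel(4, "link-time optimization", True, True),
    5: OptLevel(5, "aggressive link-time optimization", True, True),
    6: OptLevel(6, "profile-guided optimization", True, False),
    7: OptLevel(7, "lifelong optimization", True, False),
}


def parse_opt_flag(flag="-O1"):
    """Map a -O flag to its level; -O0 is treated as a synonym for -O1."""
    if not flag.startswith("-O") or not flag[2:].isdigit():
        raise ValueError(f"not an optimization flag: {flag!r}")
    number = max(int(flag[2:]), 1)   # -O0 and -O1 name the same level
    if number not in LEVELS:
        raise ValueError(f"unknown optimization level: {flag!r}")
    return LEVELS[number]
```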

=head2 Input Options

=over

=item B<-l> I<LIBRARY>

This option instructs B<llvmc> to locate a library named I<LIBRARY> and search
it for unresolved symbols when linking the program.

=item B<-L> F<path>

This option instructs B<llvmc> to add F<path> to the list of places in which
the linker will search for libraries.

=item B<-x> I<LANGUAGE>

This option instructs B<llvmc> to regard the following input files as 
containing programs in the language I<LANGUAGE>. Normally, input file languages
are identified by their suffix but this option will override that default
behavior. The B<-x> option stays in effect until the end of the options or
a new B<-x> option is encountered.

=back
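The scoping rule for B<-x> (it applies to every following file until another
B<-x> appears) can be sketched as follows. This is a simplified Python model,
not the real argument parser: the suffix table and the C<assign_languages>
function are invented for illustration, and options that take a separate
argument (such as B<-o>) are not handled.

```python
# Illustrative only: the suffix table and function name are assumptions,
# not llvmc's real configuration.
import os

SUFFIX_LANGUAGE = {".c": "C", ".cpp": "C++", ".st": "Stacker"}  # hypothetical


def assign_languages(args):
    """Pair each input file with a language, honoring -x overrides."""
    forced = None                      # language set by the most recent -x
    result = []
    it = iter(args)
    for arg in it:
        if arg == "-x":
            forced = next(it, None)    # -x LANGUAGE applies to later files
            if forced is None:
                raise ValueError("-x requires a LANGUAGE argument")
        elif arg.startswith("-"):
            continue                   # other options are ignored in this sketch
        else:
            suffix = os.path.splitext(arg)[1]
            # Without -x, fall back to the suffix; unknown suffixes give None.
            result.append((arg, forced or SUFFIX_LANGUAGE.get(suffix)))
    return result
```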

=head2 Output Options

=over

=item B<-m>I<arch>

This option selects the back end code generator to use. The I<arch> portion
of the option names the back end to use.

=item B<--native>

Normally, B<llvmc> produces bytecode files at most stages of compilation.
With this option, B<llvmc> will arrange for native object files to be
generated with the B<-c> option, native assembly files to be generated
with the B<-S> option, and native executables to be generated with the
B<--link> option. In the case of the B<-E> option, the output will not
differ as there is no I<native> version of pre-processed output.

=item B<-o> F<filename>

Specify the output file name. The contents of the file depend on other
options.

=back

=head2 Information Options

=over

=item B<-n> or B<--noop>

This option tells B<llvmc> to do everything but actually execute the
resulting tools. In combination with the B<-v> option, this causes B<llvmc>
to merely print out what it would have done.

=item B<-v> or B<--verbose>

This option will cause B<llvmc> to print out (on standard output) each of the 
actions it takes to accomplish the objective. The output will immediately
precede the invocation of other tools.

=item B<--stats>

Print all statistics gathered during the compilation to the standard error. 
Note that this option is merely passed through to the sub-tools to do with 
as they please.

=item B<--time-passes>

Record the amount of time needed for each optimization pass and print it 
to standard error. Like B<--stats> this option is just passed through to
the sub-tools to do with as they please.

=item B<--time-programs>

Record the amount of time each program (compilation tool) takes and print
it to the standard error. 

=back

=head2 Language Specific Options

=over


=item B<-Tool,opt>=I<options>

Pass an arbitrary option to the optimizer.

=item B<-Tool,link>=I<options>

Pass an arbitrary option to the linker.

=item B<-Tool,asm>=I<options>

Pass an arbitrary option to the code generator.

=back

=head3 C/C++ Specific Options

=over

=item B<-I>F<path>

This option is just passed through to a C or C++ front end compiler to tell it
where include files can be found.

=back

=head2 Miscellaneous Options

=over

=item B<--help>

Print a summary of command line options.

=item B<-V> or B<--version>

This option will cause B<llvmc> to print out its version number
and terminate.

=back

=head2 Advanced Options

You had better know what you're doing if you use these options. Improper use
of these options can produce drastically wrong results.

=over 

=item B<--show-config> I<[suffixes...]>

When this option is given, the only action taken by B<llvmc> is to show its
final configuration state in the form of a configuration file. No compilation
tasks will be conducted when this option is given; processing will stop once
the configuration has been printed. The optional (comma separated) list of 
suffixes controls what is printed. Without any suffixes, the configuration
for all languages is printed. With suffixes, only the languages pertaining
to those file suffixes will be printed. The configuration information is
printed after all command line options and configuration files have been
read and processed. This allows the user to verify that the correct
configuration data has been read by B<llvmc>.

=item B<--config> :I<section>:I<name>=I<value>

This option instructs B<llvmc> to accept I<value> as the value for configuration
item I<name> in the section named I<section>. This is a quick way to override
a configuration item on the command line without resorting to changing the
configuration files. 

=item B<--config-file> F<dirname>

This option tells B<llvmc> to read configuration data from the I<directory>
named F<dirname>. Data from such directories will be read in the order
specified on the command line after all other standard config files have
been read. This allows users or groups of users to conveniently create 
their own configuration directories in addition to the standard ones to which 
they may not have write access.

=item B<--config-only-from> F<dirname>

This option tells B<llvmc> to skip the normal processing of configuration
files and only configure from the contents of the F<dirname> directory. Multiple
B<--config-only-from> options may be given in which case the directories are
read in the order given on the command line.


=item B<--emit-raw-code>

No optimization is done whatsoever. The compilers invoked by B<llvmc> with 
this option given will be instructed to produce raw, unoptimized code.  This 
option is useful only to front end language developers and therefore does not 
participate in the list of B<-O> options. This is distinctly different from
the B<-O0> option (a synonym for B<-O1>) because those optimizations will
reduce code size to make compilation faster. With B<--emit-raw-code>, only
the full raw code produced by the compiler will be generated.

=back
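To make the B<--config> argument form C<:section:name=value> concrete, here is
a minimal Python sketch of how such an argument could be split and applied.
The function names and the nested-dictionary representation of the
configuration are assumptions made for this example; the actual configuration
format is not defined by this document.

```python
# Hypothetical parser for the argument form ":section:name=value".
def parse_config_override(arg):
    """Split ':section:name=value' into (section, name, value)."""
    if not arg.startswith(":"):
        raise ValueError(f"expected ':section:name=value', got {arg!r}")
    # Split on the first '=' so that the value itself may contain '='.
    key, equals, value = arg[1:].partition("=")
    section, colon, name = key.partition(":")
    if not (equals and colon and section and name):
        raise ValueError(f"expected ':section:name=value', got {arg!r}")
    return section, name, value


def apply_overrides(config, overrides):
    """Return a copy of a {section: {name: value}} dict with overrides applied."""
    merged = {section: dict(items) for section, items in config.items()}
    for arg in overrides:
        section, name, value = parse_config_override(arg)
        merged.setdefault(section, {})[name] = value
    return merged
```

Because overrides are applied to a copy, the configuration read from files
stays intact, which matches the idea of a quick command line override that
does not change the configuration files.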

=head1 CONFIGURATION

=head2 Warning
  
Configuration information is relatively static for a given release of LLVM and
a front end compiler. However, the details may change from release to release.  
Users are encouraged to simply use the various options of the B<llvmc> command 
and ignore the configuration of the tool. These configuration files are for 
compiler writers and LLVM developers. Those wishing to simply use B<llvmc> 
don't need to understand this section but it may be instructive on what the tool
does.

=head2 Introduction

B<llvmc> is highly configurable both on the command line and in configuration
files. The options it understands are generic, consistent and simple by design.
Furthermore, the B<llvmc> options apply to the compilation of any LLVM enabled 
programming language. To be enabled as a supported source language compiler, a
compiler writer must provide a configuration file that tells B<llvmc> how to
invoke the compiler and what its capabilities are. The purpose of the
configuration files then is to allow compiler writers to specify to B<llvmc> how
the compiler should be invoked. Users may alter the compiler's B<llvmc>
configuration, but doing so is not advised.

Because B<llvmc> just invokes other programs, it must deal with the
available command line options for those programs regardless of whether they
were written for LLVM or not. Furthermore, not all compilation front ends will
have the same capabilities. Some front ends will simply generate LLVM assembly
code; others will be able to generate fully optimized bytecode. In general,
B<llvmc> doesn't make any assumptions about the capabilities or command line
options of a sub-tool. It simply uses the details found in the configuration
files and leaves it to the compiler writer to specify the configuration
correctly.

This approach means that new compiler front ends can be up and working very
quickly. As a first cut, a front end can simply compile its source to raw
(unoptimized) bytecode or LLVM assembly and B<llvmc> can be configured to pick
up the slack (translate LLVM assembly to bytecode, optimize the bytecode,
generate native assembly, link, etc.). In fact, the front end need not use
any LLVM libraries, and it could be written in any language (instead of C++).
The configuration data will allow the full range of optimization, assembly, 
and linking capabilities that LLVM provides to be added to these kinds of tools.
Enabling the rapid development of front-ends is one of the primary goals of
B<llvmc>.

As a compiler front end matures, it may utilize the LLVM libraries and tools to 
more efficiently produce optimized bytecode directly in a single compilation and
optimization program. In these cases, multiple tools would not be needed and
the configuration data for the compiler would change.

Configuring B<llvmc> to the needs and capabilities of a source language compiler
is relatively straightforward. The compilation process is broken down into five
phases:

=over

=item * Pre-processing (filter and combine source files)

=item * Translation (translate source language to LLVM assembly or bytecode)

=item * Optimization (make bytecode execute quickly)

=item * Assembly (converting bytecode to object code)

=item * Linking (converting translated code to an executable)

=back

A compiler writer must provide a definition of what to do for each of these five
phases for each of the optimization levels. The specification consists simply of
prototypical command lines into which B<llvmc> can substitute command line
arguments and file names. Note that any given phase can be completely blank if
the source language's compiler combines multiple phases into a single program.
For example, quite often pre-processing, translation, and optimization are
combined into a single program. The specification for such a compiler would have
blank entries for pre-processing and translation but a full command line for
optimization. 
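The idea of prototypical command lines with blank phases can be sketched in
Python as shown here. Everything concrete in this example is invented: the
configuration syntax is still undecided, so the C<$in> and C<$out>
placeholders, the dictionary layout, and the C<examplec>, C<example-asm>, and
C<example-link> tool names are purely hypothetical.

```python
# Hypothetical illustration: the configuration syntax is not yet defined, so
# the $in/$out placeholders, tool names, and dictionary layout are invented.
import shlex
from string import Template

PHASE_ORDER = ("preprocess", "translate", "optimize", "assemble", "link")

# A front end that folds pre-processing and translation into its optimizer
# leaves those two phases blank, as described above.
EXAMPLE_CONFIG = {
    "preprocess": "",
    "translate": "",
    "optimize": "examplec -O $in -o $out",     # made-up compiler
    "assemble": "example-asm $in -o $out",     # made-up assembler
    "link": "example-link $in -o $out",        # made-up linker
}


def build_commands(config, source, outputs):
    """Expand each non-blank phase template into an argument list."""
    commands = []
    current = source
    for phase in PHASE_ORDER:
        template = config.get(phase, "")
        if not template:
            continue                   # blank phase: nothing to run
        values = {"in": current, "out": outputs[phase]}
        # Split first, then substitute, so file names with spaces stay intact.
        commands.append(
            [Template(token).substitute(values) for token in shlex.split(template)]
        )
        current = outputs[phase]       # the next phase consumes this output
    return commands
```

In this model each phase reads the previous phase's output, so a blank phase
simply passes its input through to the next configured tool.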

=head2 Configuration File Types

There are two types of configuration files: the master configuration file
and the language specific configuration file.

The master configuration file contains the general configuration of B<llvmc> 
itself.  This includes things like the mapping between file extensions and 
source languages. This mapping is needed in order to quickly read only the
applicable language-specific configuration files (avoiding reading every config
file for every compilation task).

Language specific configuration files tell B<llvmc> how to invoke the language's
compiler for a variety of different tasks and what other tools are needed to
I<backfill> the compiler's missing features (e.g. optimization).

Language specific configuration files are placed in directories and given 
specific names to foster faster lookup. The name of a given configuration file
is the name of the source language.

=head2 Default Directory Locations

B<llvmc> will look for configuration files in two standard locations: the
LLVM installation directory (typically C</usr/local/llvm/etc>) and the user's 
home directory (typically C</home/user/.llvm>). In these directories a file named
C<master> provides the master configuration for B<llvmc>. Language specific
files will have a language specific name (e.g. C++, Stacker, Scheme, FORTRAN).
When reading the configuration files, the master files are always read first in
the following order:

=over

=item 1 C<master> in the LLVM installation directory.

=item 2 C<master> in the user's home directory.

=back

Then, based on the command line options and the suffixes of the file names
provided on B<llvmc>'s command line, one or more language specific configuration
files are read. Only the language specific configuration files actually needed
to complete B<llvmc>'s task are read. Other language specific files will be
ignored.
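The read order described above might be modeled like this in Python. The
default directories are the "typical" locations named in the text; the
C<config_read_order> function is a hypothetical name, and the order in which
the two directories are consulted for language specific files is an assumption,
since the text only fixes the order for the C<master> files.

```python
# Sketch of the read order described above; not llvmc's actual lookup code.
import os


def config_read_order(languages,
                      install_dir="/usr/local/llvm/etc",
                      home_dir=os.path.expanduser("~/.llvm")):
    """List the configuration files to consult, master files first."""
    dirs = [install_dir, home_dir]
    order = [os.path.join(d, "master") for d in dirs]
    # Assumption: language files are looked up in the same directory order.
    # Only the languages actually needed for this run are listed.
    for language in languages:
        order.extend(os.path.join(d, language) for d in dirs)
    return order
```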

Note that the user can affect this process in several ways using the various
B<--config-*> options and with the B<-x> I<LANGUAGE> option.

Although a user I<can> override the master configuration file, this is not
advised. The capability is retained so that compiler writers can affect the
master configuration (such as adding new file suffixes) while developing a new
compiler front end since they might not have write access to the installed
master configuration.

=head2 Syntax

The syntax of the configuration files is yet to be determined. There are three
viable options:

=over

=item * XML

=item * Windows .ini

=item * A format specific to B<llvmc>

=back

=head1 EXIT STATUS

If B<llvmc> succeeds, it will exit with 0.  Otherwise, if an error
occurs, it will exit with a non-zero value and no compilation actions
will be taken. If one of the compilation tools returns a non-zero 
status, pending actions will be discarded and B<llvmc> will return the
same result code as the failing compilation tool.
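The exit status rule for failing tools can be illustrated with a short Python
sketch. It is a model of the behavior described above, not B<llvmc>'s code:
the C<run_actions> function and its C<verbose> and C<noop> parameters are
hypothetical stand-ins for the B<-v> and B<-n> options.

```python
# Illustrative model of the exit-status rule; not llvmc's actual code.
import subprocess
import sys


def run_actions(commands, verbose=False, noop=False):
    """Run commands in order; stop at the first failure and return its status."""
    for cmd in commands:
        if verbose:
            print(" ".join(cmd))       # shown just before the tool is invoked
        if noop:
            continue                   # like -n: do everything but execute
        status = subprocess.call(cmd)
        if status != 0:
            return status              # pending actions are discarded
    return 0
```

A caller would typically pass the result to C<sys.exit>, so that the failing
tool's status becomes the status of the driver itself.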

=head1 SEE ALSO

L<gccas|gccas>, L<gccld|gccld>, L<llvm-as|llvm-as>, L<llvm-dis|llvm-dis>, 
L<llc|llc>, L<llvm-link|llvm-link>

=head1 AUTHORS

Reid Spencer

=cut