<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
- <link rel="stylesheet" href="llvm.css" type="text/css">
- <title>LLVM 3.0 Release Notes</title>
+ <link rel="stylesheet" href="_static/llvm.css" type="text/css">
+ <title>LLVM 3.2 Release Notes</title>
</head>
<body>
-<h1>LLVM 3.0 Release Notes</h1>
+<h1>LLVM 3.2 Release Notes</h1>
-<img align=right src="http://llvm.org/img/DragonSmall.png"
- width="136" height="136" alt="LLVM Dragon Logo">
+<div>
+<img style="float:right" src="http://llvm.org/img/DragonSmall.png"
+ width="136" height="136" alt="LLVM Dragon Logo">
+</div>
<ol>
<li><a href="#intro">Introduction</a></li>
<li><a href="#subproj">Sub-project Status Update</a></li>
- <li><a href="#externalproj">External Projects Using LLVM 3.0</a></li>
- <li><a href="#whatsnew">What's New in LLVM 3.0?</a></li>
+ <li><a href="#externalproj">External Projects Using LLVM 3.2</a></li>
+ <li><a href="#whatsnew">What's New in LLVM?</a></li>
<li><a href="GettingStarted.html">Installation Instructions</a></li>
<li><a href="#knownproblems">Known Problems</a></li>
<li><a href="#additionalinfo">Additional Information</a></li>
<p>Written by the <a href="http://llvm.org/">LLVM Team</a></p>
</div>
-<!--
-<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.0
+<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.2
release.<br>
You may prefer the
-<a href="http://llvm.org/releases/2.9/docs/ReleaseNotes.html">LLVM 2.9
+<a href="http://llvm.org/releases/3.1/docs/ReleaseNotes.html">LLVM 3.1
Release Notes</a>.</h1>
- -->
<!-- *********************************************************************** -->
<h2>
<div>
<p>This document contains the release notes for the LLVM Compiler
- Infrastructure, release 3.0. Here we describe the status of LLVM, including
+ Infrastructure, release 3.2. Here we describe the status of LLVM, including
major improvements from the previous release, improvements in various
- subprojects of LLVM, and some of the current users of the code.
- All LLVM releases may be downloaded from
- the <a href="http://llvm.org/releases/">LLVM releases web site</a>.</p>
+ subprojects of LLVM, and some of the current users of the code. All LLVM
+ releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM
+ releases web site</a>.</p>
<p>For more information about LLVM, including information about the latest
release, please check out the <a href="http://llvm.org/">main LLVM web
<div>
-<p>The LLVM 3.0 distribution currently consists of code from the core LLVM
- repository (which roughly includes the LLVM optimizers, code generators and
- supporting tools), and the Clang repository. In
- addition to this code, the LLVM Project includes other sub-projects that are
- in development. Here we include updates on these subprojects.</p>
+<p>The LLVM 3.2 distribution currently consists of code from the core LLVM
+ repository, which roughly includes the LLVM optimizers, code generators and
+ supporting tools, and the Clang repository. In addition to this code, the
+ LLVM Project includes other sub-projects that are in development. Here we
+ include updates on these subprojects.</p>
<!--=========================================================================-->
<h3>
production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
(32- and 64-bit), and for Darwin/ARM targets.</p>
-<p>In the LLVM 3.0 time-frame, the Clang team has made many improvements:</p>
-
+<p>In the LLVM 3.2 time-frame, the Clang team has made many improvements.
+ Highlights include:</p>
<ul>
- <li>Greatly improved support for building C++ applications, with greater
- stability and better diagnostics.</li>
-
- <li><a href="http://clang.llvm.org/cxx_status.html">Improved support</a> for
- the <a href="http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=50372">C++
- 2011</a> standard (aka "C++'0x"), including implementations of non-static data member
- initializers, alias templates, delegating constructors, range-based
- for loops, and implicitly-generated move constructors and move assignment
- operators, among others.</li>
-
- <li>Implemented support for some features of the upcoming C1x standard,
- including static assertions and generic selections.</li>
-
- <li>Better detection of include and linking paths for system headers and
- libraries, especially for Linux distributions.</li>
-
- <li>Several improvements to Objective-C support, including:
-
- <ul>
- <li><a href="http://clang.llvm.org/docs/AutomaticReferenceCounting.html">
- Automatic Reference Counting</a> (ARC) and an improved memory model
- cleanly separating object and C memory.</li>
-
- <li>A migration tool for moving manual retain/release code to ARC</li>
-
- <li>Better support for data hiding, allowing instance variables to be
- declared in implementation contexts or class extensions</li>
- <li>Weak linking support for Objective-C classes</li>
- <li>Improved static type checking by inferring the return type of methods
- such as +alloc and -init.</li>
- </ul>
-
- Some new Objective-C features require either the Mac OS X 10.7 / iOS 5
- Objective-C runtime, or version 1.6 or later of the GNUstep Objective-C
- runtime version.</li>
-
- <li>Improved support for OpenCL C, including the <tt>vec_step</tt> operator,
- address space qualifiers, improved vector literal support and code
- generation support for the <a href="#PTX">PTX target</a>.</li>
-
- <li>Implemented a number of optimizations in <tt>libclang</tt>, the Clang C
- interface, to improve the performance of code completion and the mapping
- from source locations to abstract syntax tree nodes.</li>
+ <li>...</li>
</ul>
+<p>For more details about the changes to Clang since the 3.1 release, see the
+ <a href="http://clang.llvm.org/docs/ReleaseNotes.html">Clang release
+ notes</a>.</p>
<p>If Clang rejects your code but another compiler accepts it, please take a
look at the <a href="http://clang.llvm.org/compatibility.html">language
</h3>
<div>
+
<p><a href="http://dragonegg.llvm.org/">DragonEgg</a> is a
<a href="http://gcc.gnu.org/wiki/plugins">gcc plugin</a> that replaces GCC's
- optimizers and code generators with LLVM's. It works with gcc-4.5 or gcc-4.6,
- targets the x86-32 and x86-64 processor families, and has been successfully
- used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD platforms. It fully
- supports Ada, C, C++ and Fortran. It has partial support for Go, Java, Obj-C
- and Obj-C++.</p>
-
-<p>The 3.0 release has the following notable changes:</p>
-
- <ul>
- <li>GCC version 4.6 is now fully supported.</li>
+ optimizers and code generators with LLVM's. It works with gcc-4.5 and gcc-4.6
+ (and partially with gcc-4.7), can target the x86-32/x86-64 and ARM processor
+ families, and has been successfully used on the Darwin, FreeBSD, KFreeBSD,
+ Linux and OpenBSD platforms. It fully supports Ada, C, C++ and Fortran. It
+ has partial support for Go, Java, Obj-C and Obj-C++.</p>
- <li>Patching and building GCC is no longer required: the plugin should work
- with your system GCC (version 4.5 or 4.6; on Debian/Ubuntu systems the
- gcc-4.5-plugin-dev or gcc-4.6-plugin-dev package is also needed).</li>
-
- <li>The <tt>-fplugin-arg-dragonegg-enable-gcc-optzns</tt> option, which runs
- GCC's optimizers as well as LLVM's, now works much better. This is the
- option to use if you want ultimate performance! It is still experimental
- though: it may cause the plugin to crash.</li>
-
- <li>The type and constant conversion logic has been almost entirely rewritten,
- fixing a multitude of obscure bugs.</li>
+<p>The 3.2 release has the following notable changes:</p>
+<ul>
+ <li>...</li>
</ul>
</div>
target-specific hooks required by code generation and other runtime
components. For example, when compiling for a 32-bit target, converting a
double to a 64-bit unsigned integer is compiled into a runtime call to the
- "__fixunsdfdi" function. The compiler-rt library provides highly optimized
- implementations of this and other low-level routines (some are 3x faster than
- the equivalent libgcc routines).</p>
+ <code>__fixunsdfdi</code> function. The compiler-rt library provides highly
+ optimized implementations of this and other low-level routines (some are 3x
+ faster than the equivalent libgcc routines).</p>
-<p>In the LLVM 3.0 timeframe, the target specific ARM code has converted to
- "unified" assembly syntax, and several new functions have been added to the
- library.</p>
+<p>The 3.2 release has the following notable changes:</p>
+
+<ul>
+ <li>...</li>
+</ul>
</div>
<div>
-<p>LLDB is a ground-up implementation of a command line debugger, as well as a
- debugger API that can be used from other applications. LLDB makes use of the
- Clang parser to provide high-fidelity expression parsing (particularly for
- C++) and uses the LLVM JIT for target support.</p>
+<p><a href="http://lldb.llvm.org">LLDB</a> is a ground-up implementation of a
+ command line debugger, as well as a debugger API that can be used from other
+ applications. LLDB makes use of the Clang parser to provide high-fidelity
+ expression parsing (particularly for C++) and uses the LLVM JIT for target
+ support.</p>
+
+<p>The 3.2 release has the following notable changes:</p>
-<p>LLDB has advanced by leaps and bounds in the 3.0 timeframe. It is
- dramatically more stable and useful, and includes both a
- new <a href="http://lldb.llvm.org/tutorial.html">tutorial</a> and
- a <a href="http://lldb.llvm.org/lldb-gdb.html">side-by-side comparison with
- GDB</a>.</p>
+<ul>
+ <li>...</li>
+</ul>
</div>
licensed</a> under the MIT and UIUC license, allowing it to be used more
permissively.</p>
-<p>Libc++ has been ported to FreeBSD and imported into the base system. It is
- planned to be the default STL implementation for FreeBSD 10.</p>
+<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>
+
+<ul>
+ <li>...</li>
+</ul>
</div>
<div>
- <p>The <a href="http://vmkit.llvm.org/">VMKit project</a> is an
- implementation of a Java Virtual Machine (Java VM or JVM) that uses LLVM for
- static and just-in-time compilation.
-
- <p>In the LLVM 3.0 time-frame, VMKit has had significant improvements on both
- runtime and startup performance:</p>
+<p>The <a href="http://vmkit.llvm.org/">VMKit project</a> is an implementation
+ of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and
+ just-in-time compilation.</p>
- <ul>
- <li>Precompilation: by compiling ahead of time a small subset of Java's core
- library, the startup performance have been highly optimized to the point that
- running a 'Hello World' program takes less than 30 milliseconds.</li>
+<p>The 3.2 release has the following notable changes:</p>
- <li>Customization: by customizing virtual methods for individual classes,
- the VM can statically determine the target of a virtual call, and decide to
- inline it.</li>
-
- <li>Inlining: the VM does more inlining than it did before, by allowing more
- bytecode instructions to be inlined, and thanks to customization. It also
- inlines GC barriers, and object allocations.</li>
-
- <li>New exception model: the generated code for a method that does not do
- any try/catch is not penalized anymore by the eventuality of calling a
- method that throws an exception. Instead, the method that throws the
- exception jumps directly to the method that could catch it.</li>
- </ul>
+<ul>
+ <li>...</li>
+</ul>
</div>
<!--=========================================================================-->
<h3>
-<a name="LLBrowse">LLBrowse: IR Browser</a>
+<a name="Polly">Polly: Polyhedral Optimizer</a>
</h3>
<div>
-<p><a href="http://llvm.org/svn/llvm-project/llbrowse/trunk/doc/LLBrowse.html">
- LLBrowse</a> is an interactive viewer for LLVM modules. It can load any LLVM
- module and displays its contents as an expandable tree view, facilitating an
- easy way to inspect types, functions, global variables, or metadata nodes. It
- is fully cross-platform, being based on the popular wxWidgets GUI
- toolkit.</p>
-
-</div>
+<p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em>
+ optimizer for data locality and parallelism. It provides high-level
+ loop optimizations and automatic parallelisation.</p>
+<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>
-<!--=========================================================================-->
-<!--
-<h3>
-<a name="klee">KLEE: A Symbolic Execution Virtual Machine</a>
-</h3>
+<ul>
+ <li>isl, the integer set library used by Polly, was relicensed to the MIT
+  license.</li>
+ <li>isl-based code generation
+  <ul>
+   <li>MIT-licensed replacement for CLooG (LGPLv2)</li>
+   <li>Fine-grained option handling (separation of core and border
+    computations, control overhead vs. code size)</li>
+  </ul>
+ </li>
+ <li>Support for Fortran and DragonEgg</li>
+ <li>OpenMP code generation fixes</li>
+</ul>
-<div>
-<p>
-<a href="http://klee.llvm.org/">KLEE</a> is a symbolic execution framework for
-programs in LLVM bitcode form. KLEE tries to symbolically evaluate "all" paths
-through the application and records state transitions that lead to fault
-states. This allows it to construct testcases that lead to faults and can even
-be used to verify some algorithms.
-</p>
-<p>UPDATE!</p>
-</div>-->
+</div>
</div>
<!-- *********************************************************************** -->
<h2>
- <a name="externalproj">External Open Source Projects Using LLVM 3.0</a>
+ <a name="externalproj">External Open Source Projects Using LLVM 3.2</a>
</h2>
<!-- *********************************************************************** -->
<div>
<p>An exciting aspect of LLVM is that it is used as an enabling technology for
- a lot of other language and tools projects. This section lists some of the
- projects that have already been updated to work with LLVM 3.0.</p>
-
-<!--=========================================================================-->
-<h3>AddressSanitizer</h3>
-
-<div>
-
-<p><a href="http://code.google.com/p/address-sanitizer/">AddressSanitizer</a>
- uses compiler instrumentation and a specialized malloc library to find C/C++
- bugs such as use-after-free and out-of-bound accesses to heap, stack, and
- globals. The key feature of the tool is speed: the average slowdown
- introduced by AddressSanitizer is less than 2x.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>ClamAV</h3>
-
-<div>
-
-<p><a href="http://www.clamav.net">Clam AntiVirus</a> is an open source (GPL)
- anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail
- gateways.</p>
-
-<p>Since version 0.96 it
- has <a href="http://vrt-sourcefire.blogspot.com/2010/09/introduction-to-clamavs-low-level.html">bytecode
- signatures</a> that allow writing detections for complex malware.
- It uses LLVM's JIT to speed up the execution of bytecode on X86, X86-64,
- PPC32/64, falling back to its own interpreter otherwise. The git version was
- updated to work with LLVM 3.0.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>clang_complete for VIM</h3>
-
-<div>
-
-<p><a href="https://github.com/Rip-Rip/clang_complete">clang_complete</a> is a
- VIM plugin, that provides accurate C/C++ autocompletion using the clang front
- end. The development version of clang complete, can directly use libclang
- which can maintain a cache to speed up auto completion.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>clReflect</h3>
-
-<div>
-
-<p><a href="https://bitbucket.org/dwilliamson/clreflect">clReflect</a> is a C++
- parser that uses clang/LLVM to derive a light-weight reflection database
- suitable for use in game development. It comes with a very simple runtime
- library for loading and querying the database, requiring no external
- dependencies (including CRT), and an additional utility library for object
- management and serialisation.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>Cling C++ Interpreter</h3>
-
-<div>
-
-<p><a href="http://cern.ch/cling">Cling</a> is an interactive compiler interface
- (aka C++ interpreter). It supports C++ and C, and uses LLVM's JIT and the
- Clang parser. It has a prompt interface, runs source files, calls into shared
- libraries, prints the value of expressions, even does runtime lookup of
- identifiers (dynamic scopes). And it just behaves like one would expect from
- an interpreter.</p>
+ a lot of other language and tools projects. This section lists some of the
+ projects that have already been updated to work with LLVM 3.2.</p>
-</div>
-
-<!--=========================================================================-->
-<h3>Crack Programming Language</h3>
+<h3>Crack</h3>
<div>
</div>
-<!--=========================================================================-->
-<h3>Eero</h3>
-
-<div>
-
-<p><a href="http://eerolanguage.org/">Eero</a> is a fully
- header-and-binary-compatible dialect of Objective-C 2.0, implemented with a
- patched version of the Clang/LLVM compiler. It features a streamlined syntax,
- Python-like indentation, and new operators, for improved readability and
- reduced code clutter. It also has new features such as limited forms of
- operator overloading and namespaces, and strict (type-and-operator-safe)
- enumerations. It is inspired by languages such as Smalltalk, Python, and
- Ruby.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>FAUST Real-Time Audio Signal Processing Language</h3>
+<h3>FAUST</h3>
<div>
<p><a href="http://faust.grame.fr/">FAUST</a> is a compiled language for
- real-time audio signal processing. The name FAUST stands for Functional
- AUdio STream. Its programming model combines two approaches: functional
- programming and block diagram composition. In addition with the C, C++, Java
- output formats, the Faust compiler can now generate LLVM bitcode, and works
- with LLVM 2.7-3.0.
- </p>
+ real-time audio signal processing. The name FAUST stands for Functional
+ AUdio STream. Its programming model combines two approaches: functional
+ programming and block diagram composition. In addition to the C, C++, Java
+ and JavaScript output formats, the Faust compiler can generate LLVM bitcode,
+ and works with LLVM 2.7-3.1.</p>
</div>
-<!--=========================================================================-->
<h3>Glasgow Haskell Compiler (GHC)</h3>
<div>
-<p>GHC is an open source, state-of-the-art programming suite for Haskell, a
- standard lazy functional programming language. It includes an optimizing
- static compiler generating good code for a variety of platforms, together
- with an interactive system for convenient, quick development.</p>
+<p><a href="http://www.haskell.org/ghc/">GHC</a> is an open source compiler and
+ programming suite for Haskell, a lazy functional programming language. It
+ includes an optimizing static compiler generating good code for a variety of
+ platforms, together with an interactive system for convenient, quick
+ development.</p>
<p>GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and
- later. Since LLVM 2.9, GHC now includes experimental support for the ARM
- platform with LLVM 3.0.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>gwXscript</h3>
-
-<div>
-
-<p><a href="http://botwars.tk/gwscript/">gwXscript</a> is an object oriented,
- aspect oriented programming language which can create both executables (ELF,
- EXE) and shared libraries (DLL, SO, DYNLIB). The compiler is implemented in
- its own language and translates scripts into LLVM-IR which can be optimized
- and translated into native code by the LLVM framework. Source code in
- gwScript contains definitions that expand the namespaces. So you can build
- your project and simply 'plug out' features by removing a file. The remaining
- project does not leave scars since you directly separate concerns by the
- 'template' feature of gwX. It is also possible to add new features to a
- project by just adding files and without editing the original project. This
- language is used for example to create games or content management systems
- that should be extendable.</p>
-
-<p>gwXscript is strongly typed and offers comfort with its native types string,
- hash and array. You can easily write new libraries in gwXscript or native
- code. gwXscript is type safe and users should not be able to crash your
- program or execute malicious code except code that is eating CPU time.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>include-what-you-use</h3>
-
-<div>
-
-<p><a href="http://code.google.com/p/include-what-you-use">include-what-you-use</a>
- is a tool to ensure that a file directly <code>#include</code>s
- all <code>.h</code> files that provide a symbol that the file uses. It also
- removes superfluous <code>#include</code>s from source files.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>ispc: The Intel SPMD Program Compiler</h3>
-
-<div>
-
-<p><a href="http://ispc.github.com">ispc</a> is a compiler for "single program,
- multiple data" (SPMD) programs. It compiles a C-based SPMD programming
- language to run on the SIMD units of CPUs; it often delivers 5-6x speedups on
- a single core of a CPU with an 8-wide SIMD unit compared to serial code,
- while still providing a clean and easy-to-understand programming model. For
- an introduction to the language and its performance,
- see <a href="http://ispc.github.com/example.html">the walkthrough</a> of a short
- example program. ispc is licensed under the BSD license.</p>
+ later.</p>
</div>
-<!--=========================================================================-->
-<h3>The Julia Programming Language</h3>
+<h3>Julia</h3>
<div>
-<p><a href="http://github.com/JuliaLang/julia">Julia</a> is a high-level,
- high-performance dynamic language for technical
- computing. It provides a sophisticated compiler, distributed parallel
- execution, numerical accuracy, and an extensive mathematical function
- library. The compiler uses type inference to generate fast code
- without any type declarations, and uses LLVM's optimization passes and
- JIT compiler. The language is designed around multiple dispatch,
- giving programs a large degree of flexibility. It is ready for use on many
- kinds of problems.</p>
-</div>
-
-<!--=========================================================================-->
-<h3>LanguageKit and Pragmatic Smalltalk</h3>
-
-<div>
-
-<p><a href="http://etoileos.com/etoile/features/languagekit/">LanguageKit</a> is
- a framework for implementing dynamic languages sharing an object model with
- Objective-C. It provides static and JIT compilation using LLVM along with
- its own interpreter. Pragmatic Smalltalk is a dialect of Smalltalk, built on
- top of LanguageKit, that interfaces directly with Objective-C, sharing the
- same object representation and message sending behaviour. These projects are
- developed as part of the Étoilé desktop environment.</p>
+<p><a href="https://github.com/JuliaLang/julia">Julia</a> is a high-level,
+ high-performance dynamic language for technical computing. It provides a
+ sophisticated compiler, distributed parallel execution, numerical accuracy,
+ and an extensive mathematical function library. The compiler uses type
+ inference to generate fast code without any type declarations, and uses
+ LLVM's optimization passes and JIT compiler. The
+ <a href="http://julialang.org/">Julia Language</a> is designed
+ around multiple dispatch, giving programs a large degree of flexibility. It
+ is ready for use on many kinds of problems.</p>
</div>
-<!--=========================================================================-->
-<h3>LuaAV</h3>
+<h3>LLVM D Compiler</h3>
<div>
-<p><a href="http://lua-av.mat.ucsb.edu/blog/">LuaAV</a> is a real-time
- audiovisual scripting environment based around the Lua language and a
- collection of libraries for sound, graphics, and other media protocols. LuaAV
- uses LLVM and Clang to JIT compile efficient user-defined audio synthesis
- routines specified in a declarative syntax.</p>
+<p><a href="https://github.com/ldc-developers/ldc">LLVM D Compiler</a> (LDC) is
+ a compiler for the D programming language. It is based on the DMD frontend
+ and uses LLVM as its backend.</p>
</div>
-<!--=========================================================================-->
-<h3>Mono</h3>
+<h3>Open Shading Language</h3>
<div>
-<p>An open source, cross-platform implementation of C# and the CLR that is
- binary compatible with Microsoft.NET. Has an optional, dynamically-loaded
- LLVM code generation backend in Mini, the JIT compiler.</p>
+<p><a href="https://github.com/imageworks/OpenShadingLanguage/">Open Shading
+ Language (OSL)</a> is a small but rich language for programmable shading in
+ advanced global illumination renderers and other applications, ideal for
+ describing materials, lights, displacement, and pattern generation. It uses
+ LLVM to JIT complex shader networks to x86 code at runtime.</p>
-<p>Note that we use a Git mirror of LLVM <a
- href="https://github.com/mono/llvm">with some patches</a>.</p>
+<p>OSL was developed by Sony Pictures Imageworks for use in its in-house
+ renderer used for feature film animation and visual effects, and is
+ distributed as open source software with the "New BSD" license.</p>
</div>
-<!--=========================================================================-->
-<h3>Polly</h3>
-
-<div>
-
-<p><a href="http://polly.grosser.es">Polly</a> is an advanced data-locality
- optimizer and automatic parallelizer. It uses an advanced, mathematical
- model to calculate detailed data dependency information which it uses to
- optimize the loop structure of a program. Polly can speed up sequential code
- by improving memory locality and consequently the cache use. Furthermore,
- Polly is able to expose different kind of parallelism which it exploits by
- introducing (basic) OpenMP and SIMD code. A mid-term goal of Polly is to
- automatically create optimized GPU code.</p>
-
-</div>
-
-<!--=========================================================================-->
<h3>Portable OpenCL (pocl)</h3>
<div>
-<p>Portable OpenCL is an open source implementation of the OpenCL standard which
- can be easily adapted for new targets. One of the goals of the project is
- improving performance portability of OpenCL programs, avoiding the need for
- target-dependent manual optimizations. A "native" target is included, which
- allows running OpenCL kernels on the host (CPU).</p>
+<p><a href="http://pocl.sourceforge.net/">pocl</a> aims to be an easily
+ portable open source implementation of OpenCL. Another major goal is
+ improving the performance portability of OpenCL programs with compiler
+ optimizations, reducing the need for target-dependent manual optimizations.
+ An important part of pocl is a set of LLVM passes used to statically
+ parallelize multiple work-items with the kernel compiler, even in the
+ presence of work-group barriers. This enables static parallelization of the
+ fine-grained static concurrency in the work groups in multiple ways (SIMD,
+ VLIW, superscalar, ...).</p>
</div>
-<!--=========================================================================-->
<h3>Pure</h3>
<div>
-<p><a href="http://pure-lang.googlecode.com/">Pure</a> is an
- algebraic/functional programming language based on term rewriting. Programs
- are collections of equations which are used to evaluate expressions in a
- symbolic fashion. The interpreter uses LLVM as a backend to JIT-compile Pure
- programs to fast native code. Pure offers dynamic typing, eager and lazy
- evaluation, lexical closures, a hygienic macro system (also based on term
- rewriting), built-in list and matrix support (including list and matrix
- comprehensions) and an easy-to-use interface to C and other programming
- languages (including the ability to load LLVM bitcode modules, and inline C,
- C++, Fortran and Faust code in Pure programs if the corresponding LLVM-enabled
- compilers are installed).</p>
-
-<p>Pure version 0.48 has been tested and is known to work with LLVM 3.0
- (and continues to work with older LLVM releases >= 2.5).</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>Renderscript</h3>
-
-<div>
-
-<p><a href="http://developer.android.com/guide/topics/renderscript/index.html">Renderscript</a>
- is Android's advanced 3D graphics rendering and compute API. It provides a
- portable C99-based language with extensions to facilitate common use cases
- for enhancing graphics and thread level parallelism. The Renderscript
- compiler frontend is based on Clang/LLVM. It emits a portable bitcode format
- for the actual compiled script code, as well as reflects a Java interface for
- developers to control the execution of the compiled bitcode. Executable
- machine code is then generated from this bitcode by an LLVM backend on the
- device. Renderscript is thus able to provide a mechanism by which Android
- developers can improve performance of their applications while retaining
- portability.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>SAFECode</h3>
-
-<div>
-
-<p><a href="http://safecode.cs.illinois.edu">SAFECode</a> is a memory safe C/C++
- compiler built using LLVM. It takes standard, unannotated C/C++ code,
- analyzes the code to ensure that memory accesses and array indexing
- operations are safe, and instruments the code with run-time checks when
- safety cannot be proven statically. SAFECode can be used as a debugging aid
- (like Valgrind) to find and repair memory safety bugs. It can also be used
- to protect code from security attacks at run-time.</p>
-
-</div>
-<!--=========================================================================-->
-<h3>The Stupid D Compiler (SDC)</h3>
-
-<div>
+<p><a href="http://pure-lang.googlecode.com/">Pure</a> is an
+ algebraic/functional programming language based on term rewriting. Programs
+ are collections of equations which are used to evaluate expressions in a
+ symbolic fashion. The interpreter uses LLVM as a backend to JIT-compile Pure
+ programs to fast native code. Pure offers dynamic typing, eager and lazy
+ evaluation, lexical closures, a hygienic macro system (also based on term
+ rewriting), built-in list and matrix support (including list and matrix
+ comprehensions) and an easy-to-use interface to C and other programming
+ languages (including the ability to load LLVM bitcode modules, and inline C,
+ C++, Fortran and Faust code in Pure programs if the corresponding
+ LLVM-enabled compilers are installed).</p>
-<p><a href="https://github.com/bhelyer/SDC">The Stupid D Compiler</a> is a
- project seeking to write a self-hosting compiler for the D programming
- language without using the frontend of the reference compiler (DMD).</p>
+<p>Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and
+ continues to work with older LLVM releases &gt;= 2.5).</p>
</div>
-<!--=========================================================================-->
<h3>TTA-based Co-design Environment (TCE)</h3>
<div>
-<p>TCE is a toolset for designing application-specific processors (ASP) based on
- the Transport triggered architecture (TTA). The toolset provides a complete
- co-design flow from C/C++ programs down to synthesizable VHDL and parallel
- program binaries. Processor customization points include the register files,
- function units, supported operations, and the interconnection network.</p>
+<p><a href="http://tce.cs.tut.fi/">TCE</a> is a toolset for designing
+ application-specific processors (ASP) based on the Transport triggered
+ architecture (TTA). The toolset provides a complete co-design flow from C/C++
+ programs down to synthesizable VHDL/Verilog and parallel program binaries.
+ Processor customization points include the register files, function units,
+ supported operations, and the interconnection network.</p>
<p>TCE uses Clang and LLVM for C/C++ language support, target independent
optimizations and also for parts of code generation. It generates new
</div>
-<!--=========================================================================-->
-<h3>Tart Programming Language</h3>
-
-<div>
-
-<p><a href="http://code.google.com/p/tart/">Tart</a> is a general-purpose,
- strongly typed programming language designed for application
- developers. Strongly inspired by Python and C#, Tart focuses on practical
- solutions for the professional software developer, while avoiding the clutter
- and boilerplate of legacy languages like Java and C++. Although Tart is still
- in development, the current implementation supports many features expected of
- a modern programming language, such as garbage collection, powerful
- bidirectional type inference, a greatly simplified syntax for template
- metaprogramming, closures and function literals, reflection, operator
- overloading, explicit mutability and immutability, and much more. Tart is
- flexible enough to accommodate a broad range of programming styles and
- philosophies, while maintaining a strong commitment to simplicity, minimalism
- and elegance in design.</p>
-
-</div>
-
-<!--=========================================================================-->
-<h3>ThreadSanitizer</h3>
-
-<div>
-
-<p><a href="http://code.google.com/p/data-race-test/">ThreadSanitizer</a> is a
- data race detector for (mostly) C and C++ code, available for Linux, Mac OS
- and Windows. On different systems, we use binary instrumentation frameworks
- (Valgrind and Pin) as frontends that generate the program events for the race
- detection algorithm. On Linux, there's an option of using LLVM-based
- compile-time instrumentation.</p>
-
-</div>
-
</div>
<!-- *********************************************************************** -->
<h2>
- <a name="whatsnew">What's New in LLVM 3.0?</a>
+ <a name="whatsnew">What's New in LLVM 3.2?</a>
</h2>
<!-- *********************************************************************** -->
<div>
<p>This release includes a huge number of bug fixes, performance tweaks and
- minor improvements. Some of the major improvements and new features are
+ minor improvements. Some of the major improvements and new features are
listed in this section.</p>
<!--=========================================================================-->
<div>
- <!-- Features that need text if they're finished for 3.1:
+ <!-- Features that need text if they're finished for 3.2:
ARM EHABI
combiner-aa?
strong phi elim
loop dependence analysis
CorrelatedValuePropagation
- lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.1.
+ lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.2.
Integrated assembler on by default for arm/thumb?
-->
llvm/lib/Archive - replace with lib object?
-->
-<p>LLVM 3.0 includes several major changes and big features:</p>
+<p>LLVM 3.2 includes several major changes and big features:</p>
<ul>
-<li>llvm-gcc is no longer supported, and not included in the release. We
- recommend switching to <a
- href="http://clang.llvm.org/">Clang</a> or <a
- href="http://dragonegg.llvm.org/">DragonEgg</a>.</li>
-
-<li>The linear scan register allocator has been replaced with a new "greedy"
- register allocator, enabling live range splitting and many other
- optimizations that lead to better code quality. Please see its <a
- href="http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html">blog post</a> or its talk at the <a
- href="http://llvm.org/devmtg/2011-11/">Developer Meeting</a>
- for more information.</li>
-<li>LLVM IR now includes full support for <a href="Atomics.html">atomics
- memory operations</a> intended to support the C++'11 and C'1x memory models.
- This includes <a href="LangRef.html#memoryops">atomic load and store,
- compare and exchange, and read/modify/write instructions</a> as well as a
- full set of <a href="LangRef.html#ordering">memory ordering constraints</a>.
- Please see the <a href="Atomics.html">Atomics Guide</a> for more
- information.
-</li>
-<li>The LLVM IR exception handling representation has been redesigned and
- reimplemented, making it more elegant, fixing a huge number of bugs, and
- enabling inlining and other optimizations. Please see its blog post (XXX
- not yet) and the <a href="ExceptionHandling.html">Exception Handling
- documentation</a> for more information.</li>
-<li>The LLVM IR Type system has been redesigned and reimplemented, making it
- faster and solving some long-standing problems.
- Please see its <a
- href="http://blog.llvm.org/2011/11/llvm-30-type-system-rewrite.html">blog
- post</a> for more information.</li>
-
-<li>The MIPS backend has made major leaps in this release, going from an
- experimental target to being virtually production quality and supporting a
- wide variety of MIPS subtargets. See the <a href="#MIPS">MIPS section</a>
- below for more information.</li>
-
-<li>The optimizer and code generator now supports gprof and gcov-style coverage
- and profiling information, and includes a new llvm-cov tool (but also works
- with gcov). Clang exposes coverage and profiling through GCC-compatible
- command line options.</li>
+ <li>...</li>
</ul>
</div>
<p>LLVM IR has several new features for better support of new targets and that
expose new optimization opportunities:</p>
- <ul>
- <li><a href="Atomics.html">Atomic memory accesses and memory ordering</a> are
- now directly expressible in the IR.</li>
- <li>A new <a href="LangRef.html#int_fma">llvm.fma intrinsic</a> directly
- represents floating point multiply accumulate operations without an
- intermediate rounding stage.</li>
- <li>A new llvm.expect intrinsic (XXX not documented in langref) allows a
- frontend to express expected control flow (and the __builtin_expect builtin
- from GNU C).</li>
- <li>The <a href="LangRef.html#int_prefetch">llvm.prefetch intrinsic</a> now
- takes a 4th argument that specifies whether the prefetch happens from the
- icache or dcache.</li>
- <li>The new <a href="LangRef.html#uwtable">uwtable function attribute</a>
- allows a frontend to control emission of unwind tables.</li>
- <li>The new <a href="LangRef.html#fnattrs">nonlazybind function
- attribute</a> allow optimization of Global Offset Table (GOT) accesses.</li>
- <li>The new <a href="LangRef.html#returns_twice">returns_twice attribute</a>
- allows better modeling of functions like setjmp.</li>
- <li>The <a href="LangRef.html#datalayout">target datalayout</a> string can now
- encode the natural alignment of the target's stack for better optimization.
- </li>
- </ul>
+<ul>
+ <li>Thread local variables may have a specified TLS model. See the
+ <a href="LangRef.html#globalvars">Language Reference Manual</a>.</li>
+ <li>...</li>
+</ul>
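+<p>As a minimal sketch of how source code might request a specific TLS model,
+ the C fragment below uses the GCC-style <code>tls_model</code> attribute.
+ That a given frontend lowers this attribute to the corresponding TLS model
+ marker on the IR global is an assumption here, and the names
+ <code>counter</code> and <code>bump_counter</code> are hypothetical.</p>

```c
/* Hypothetical per-thread counter. The attribute asks the compiler to use the
 * "initial-exec" TLS model instead of the default general-dynamic model.
 * Assumes a GCC-compatible compiler on an ELF platform. */
static __thread int counter __attribute__((tls_model("initial-exec")));

/* Increments the calling thread's private counter and returns the new value. */
int bump_counter(void) {
    return ++counter;
}
```

+<p>Each thread sees its own copy of <code>counter</code>; the model only
+ affects how the address of that copy is computed, not the program's
+ behaviour.</p>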
+
</div>
<!--=========================================================================-->
<div>
-<p>In addition to many minor performance tweaks and bug fixes, this
- release includes a few major enhancements and additions to the
- optimizers:</p>
+<p>In addition to many minor performance tweaks and bug fixes, this release
+ includes a few major enhancements and additions to the optimizers:</p>
-<ul>
-<li>The pass manager now has an extension API that allows front-ends and plugins
- to insert their own optimizations in the well-known places in the standard
- pass optimization pipeline.</li>
-
-<li>Information about <a href="BranchWeightMetadata.html">branch probability</a>
- and basic block frequency is now available within LLVM, based on a
- combination of static branch prediction heuristics and
- <code>__builtin_expect</code> calls. That information is currently used for
- register spill placement and if-conversion, with additional optimizations
- planned for future releases. The same framework is intended for eventual
- use with profile-guided optimization.</li>
-
-<li>The "-indvars" induction variable simplification pass only modifies
- induction variables when profitable. Sign and zero extension
- elimination, linear function test replacement, loop unrolling, and
- other simplifications that require induction variable analysis have
- been generalized so they no longer require loops to be rewritten into
- canonical form prior to optimization. This new design
- preserves more IR level information, avoids undoing earlier loop
- optimizations (particularly hand-optimized loops), and no longer
- requires the code generator to reconstruct loops into an optimal form -
- an intractable problem.</li>
-
-<li>LLVM now includes a pass to optimize retain/release calls for the
- <a href="http://clang.llvm.org/docs/AutomaticReferenceCounting.html">Automatic
- Reference Counting</a> (ARC) Objective-C language feature (in
- lib/Transforms/Scalar/ObjCARC.cpp). It is a decent example of implementing
- a source-language-specific optimization in LLVM.</li>
+<p> Loop Vectorizer - We've added a loop vectorizer and we are now able to
+ vectorize small loops. The loop vectorizer is disabled by default and
+ can be enabled using the <b>-mllvm -vectorize-loops</b> flag.
+ The SIMD vector width can be specified using the flag
+ <b>-mllvm -force-vector-width=4</b>.
+ The default value is <b>0</b> which means auto-select.
+ <br/>
+ We can now vectorize this function:
+
+ <pre class="doc_code">
+ unsigned sum_arrays(int *A, int *B, int start, int end) {
+ unsigned sum = 0;
+ for (int i = start; i &lt; end; ++i)
+ sum += A[i] + B[i] + i;
+
+ return sum;
+ }
+ </pre>
+ Loops are vectorized under the following conditions:
+ <ul>
+ <li>The innermost loop must consist of a single basic block.</li>
+ <li>The number of iterations is known before the loop starts to execute.</li>
+ <li>The loop counter needs to be incremented by one.</li>
+ <li>The loop trip count <b>can</b> be a variable.</li>
+ <li>Loops do <b>not</b> need to start at zero.</li>
+ <li>The induction variable can be used inside the loop.</li>
+ <li>Loop reductions are supported.</li>
+ <li>Arrays with affine access patterns do <b>not</b> need to be marked as 'noalias'; possible overlap is checked at runtime.</li>
+ <li>...</li>
+ </ul>
+
+</p>
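+<p>The runtime check mentioned in the list above can be pictured with a small
+ hand-written sketch. This is only an illustration of the idea, not the code
+ the vectorizer emits: the helper <code>ranges_disjoint</code> and the
+ function <code>add_arrays</code> are hypothetical names, and both branches
+ are written as plain scalar loops.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when the n-element int ranges starting at
 * a and b do not overlap in memory, 0 otherwise. */
static int ranges_disjoint(const int *a, const int *b, size_t n) {
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Adds src[i] into dst[i] for i in [0, n). */
void add_arrays(int *dst, const int *src, size_t n) {
    if (ranges_disjoint(dst, src, n)) {
        /* No overlap: iterations are independent, so a compiler would be
         * free to process several elements per step here. */
        for (size_t i = 0; i < n; ++i)
            dst[i] += src[i];
    } else {
        /* Possible overlap: keep strict element-by-element order. */
        for (size_t i = 0; i < n; ++i)
            dst[i] += src[i];
    }
}
```

+<p>The two branches compute the same result; the point of the check is that
+ only the first one may be reordered safely.</p>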
+
+<p>SROA - We've re-written SROA to be significantly more powerful.
+<!-- FIXME: Add more text here... --></p>
+
+<ul>
+ <li>Branch weight metadata is preserved through more of the optimizer.</li>
+ <li>...</li>
</ul>
</div>
<p>The LLVM Machine Code (aka MC) subsystem was created to solve a number of
problems in the realm of assembly, disassembly, object file format handling,
and a number of other related areas that CPU instruction-set level tools work
- in. For more information, please see
- the <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro
- to the LLVM MC Project Blog Post</a>.</p>
+ in. For more information, please see the
+ <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro
+ to the LLVM MC Project Blog Post</a>.</p>
<ul>
- <li>The MC layer has undergone significant refactoring to eliminate layering
- violations that caused it to pull in the LLVM compiler backend code.</li>
- <li>The ELF object file writers are much more full featured.</li>
- <li>The integrated assembler now supports #line directives.</li>
- <li>An early implementation of a JIT built on top of the MC framework (known
- as MC-JIT) has been implemented and will eventually replace the old JIT.
- It emits object files direct to memory and uses a runtime dynamic linker to
- resolve references and drive lazy compilation. The MC-JIT enables much
- greater code reuse between the JIT and the static compiler and provides
- better integration with the platform ABI as a result.
- </li>
- <li>The assembly printer now makes uses of assemblers instruction aliases
- (InstAliases) to print simplified mneumonics when possible.</li>
- <li>TableGen can now autogenerate MC expansion logic for pseudo
- instructions that expand to multiple MC instructions (through the
- PseudoInstExpansion class).</li>
- <li>A new llvm-dwarfdump tool provides a start of a drop-in
- replacement for the corresponding tool that use LLVM libraries. As part of
- this, LLVM has the beginnings of a dwarf parsing library.</li>
- <li>llvm-objdump has more output including, symbol by symbol disassembly,
- inline relocations, section headers, symbol tables, and section contents.
- Support for archive files has also been added.</li>
- <li>llvm-nm has gained support for archives of binary files.</li>
- <li>llvm-size has been added. This tool prints out section sizes.</li>
+ <li>...</li>
</ul>
</div>
<div>
+<p>Stack Coloring - We have implemented a new optimization pass
+ to merge stack objects which are used in disjoint regions of the code.
+ This optimization reduces the required stack space significantly, in cases
+ where it is clear to the optimizer that the stack slot is not shared.
+ We use the lifetime markers to tell the codegen that a certain alloca
+ is used within a region.</p>
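+<p>A sketch of the kind of source that could benefit, assuming the frontend
+ emits lifetime markers for block-scoped arrays: the two buffers below are
+ never live at the same time, so a single stack slot could serve both. The
+ names <code>checksum</code> and <code>process</code> are hypothetical.</p>

```c
#include <string.h>

/* Hypothetical helper: sums the bytes of a buffer. */
static unsigned checksum(const unsigned char *buf, size_t n) {
    unsigned sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += buf[i];
    return sum;
}

/* The arrays a and b live in disjoint scopes, so their lifetimes never
 * overlap and they are candidates for sharing one 256-byte stack slot. */
unsigned process(int mode) {
    unsigned result;
    if (mode) {
        unsigned char a[256];
        memset(a, 1, sizeof a);
        result = checksum(a, sizeof a);
    } else {
        unsigned char b[256];
        memset(b, 2, sizeof b);
        result = checksum(b, sizeof b);
    }
    return result;
}
```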
+
+<p> We now merge consecutive loads and stores. </p>
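+<p>As an illustration only, the following C functions show source patterns
+ with adjacent memory accesses of the sort such merging targets. Whether a
+ particular build combines them into wider operations depends on the target
+ and optimization level; the struct and function names are hypothetical.</p>

```c
#include <stdint.h>

/* Four byte-sized fields laid out next to each other in memory. */
struct quad {
    uint8_t a, b, c, d;
};

/* Four adjacent constant stores: a candidate for one wider store. */
void clear_quad(struct quad *q) {
    q->a = 0;
    q->b = 0;
    q->c = 0;
    q->d = 0;
}

/* Four adjacent load/store pairs: a candidate for one wider copy. */
void copy_quad(struct quad *dst, const struct quad *src) {
    dst->a = src->a;
    dst->b = src->b;
    dst->c = src->c;
    dst->d = src->d;
}
```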
+
<p>We have put a significant amount of work into the code generator
infrastructure, which allows us to implement more aggressive algorithms and
make it run faster:</p>
<ul>
-<li>XXX: Segmented stacks.</li>
-<li>LLVM generates substantially better code for indirect gotos due to a new
- tail duplication pass, which can be a substantial performance win for
- interpreter loops that use them.</li>
-<li>Exception handling and debug information is now emitted with CFI directives,
- yielding <a href="http://blog.mozilla.com/respindola/2011/05/12/cfi-directives/">much smaller executables</a> for some C++ applications.
-</li>
-
-<li>The code generator now supports vector "select" operations on vector
- comparisons, turning them into various optimized code sequences (e.g.
- using the SSE4/AVX "blend" instructions).</li>
-<li>The SSE execution domain fix pass and the ARM NEON move fix pass have been
- merged to a target independent execution dependency fix pass. This pass is
- used to select alternative equivalent opcodes in a way that minimizes
- execution domain crossings. Closely connected instructions are moved to
- the same execution domain when possible. Targets can override the
- <code>getExecutionDomain</code> and <code>setExecutionDomain</code> hooks
- to use the pass.</li>
+ <li>...</li>
</ul>
+
+<p> We added new TableGen infrastructure to support bundling for
+ Very Long Instruction Word (VLIW) architectures. TableGen can now
+ automatically generate a deterministic finite automaton from a VLIW
+ target's schedule description which can be queried to determine
+ legal groupings of instructions in a bundle.</p>
+
+<p> We have added a new target independent VLIW packetizer based on the
+ DFA infrastructure to group machine instructions into bundles.</p>
+
+</div>
+
+<h4>
+<a name="blockplacement">Basic Block Placement</a>
+</h4>
+
+<div>
+
+<p>A probability-based block placement and code layout algorithm was added to
+ LLVM's code generator. This layout pass supports probabilities derived from
+ static heuristics as well as source code annotations such as
+ <code>__builtin_expect</code>.</p>
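+<p>A source annotation of this kind might look like the sketch below, which
+ assumes a GCC-compatible compiler. The macro <code>UNLIKELY</code> and the
+ function <code>sum_nonnegative</code> are hypothetical names chosen for the
+ example.</p>

```c
#include <stddef.h>

/* Hypothetical convenience macro: tells the compiler the condition is
 * expected to be false most of the time. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Sums the values in v, or returns -1 as soon as a negative entry is seen.
 * The error return is annotated as the cold path, so a layout pass can keep
 * the common loop body contiguous. */
long sum_nonnegative(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (UNLIKELY(v[i] < 0))
            return -1;
        total += v[i];
    }
    return total;
}
```

+<p>The annotation changes only the expected branch direction, never the
+ result of the function.</p>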
+
</div>
<!--=========================================================================-->
<p>New features and major changes in the X86 target include:</p>
<ul>
-<li>The X86 backend, assembler and disassembler now have full support for AVX 1.
- To enable it pass <code>-mavx</code> to the compiler. AVX2 implementation is
- underway on mainline.</li>
-<li>The integrated assembler and disassembler now support a broad range of new
- instructions including Atom, Ivy Bridge, <a
- href="http://en.wikipedia.org/wiki/SSE4a">SSE4a/BMI</a> instructions, <a
- href="http://en.wikipedia.org/wiki/RdRand">rdrand</a> and many others.</li>
-<li>The X86 backend now fully supports the <a href="http://llvm.org/PR879">X87
- floating point stack inline assembly constraints</a>.</li>
-<li>The integrated assembler now supports the <tt>.code32</tt> and
- <tt>.code64</tt> directives to switch between 32-bit and 64-bit
- instructions.</li>
-<li>The X86 backend now synthesizes horizontal add/sub instructions from generic
- vector code when the appropriate instructions are enabled.</li>
-<li>The X86-64 backend generates smaller and faster code at -O0 due to
- improvements in fast instruction selection.</li>
-<li><a href="http://code.google.com/p/nativeclient/">Native Client</a>
- subtarget support has been added.</li>
-
-<li>The CRC32 intrinsics have been renamed. The intrinsics were previously
- <code>@llvm.x86.sse42.crc32.[8|16|32]</code>
- and <code>@llvm.x86.sse42.crc64.[8|64]</code>. They have been renamed to
- <code>@llvm.x86.sse42.crc32.32.[8|16|32]</code> and
- <code>@llvm.x86.sse42.crc32.64.[8|64]</code>.</li>
+ <li>...</li>
</ul>
</div>
<p>New features of the ARM target include:</p>
<ul>
-<li>The ARM backend generates much faster code for Cortex-A9 chips.</li>
-<li>The ARM backend has improved support for Cortex-M series processors.</li>
-<li>The ARM inline assembly constraints have been implemented and are now fully
- supported.</li>
-<li>NEON code produced by Clang often runs much faster due to improvements in
- the Scalar Replacement of Aggregates pass.</li>
-<li>The old ARM disassembler is replaced with a new one based on autogenerated
- encoding information from ARM .td files.</li>
-<li>The integrated assembler has made major leaps forward, but is still beta quality in LLVM 3.0.</li>
+ <li>...</li>
</ul>
+
+<!--_________________________________________________________________________-->
+
+<h4>
+<a name="armintegratedassembler">ARM Integrated Assembler</a>
+</h4>
+
+<div>
+
+<p>The ARM target now includes a full-featured macro assembler, including
+ direct-to-object module support for clang. The assembler is currently enabled
+ by default for Darwin only, pending testing and any additional
+ platform-specific support needed for Linux.</p>
+
+<p>Full support is included for Thumb1, Thumb2 and ARM modes, along with
+ subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p>
+
+<p>The assembler is Unified Syntax only (see the ARM Architecture Reference
+ Manual for details). While there is some, and growing, support for
+ pre-unified (divided) syntax, there are still significant gaps in that support.</p>
+
</div>
+</div>
<!--=========================================================================-->
<h3>
<div>
-<p>This release has seen major new work on just about every aspect of the MIPS
- backend. Some of the major new features include:</p>
+<p>New features and major changes in the MIPS target include:</p>
<ul>
- <li>Most MIPS32r1 and r2 instructions are now supported.</li>
- <li>LE/BE MIPS32r1/r2 has been tested extensively.</li>
- <li>O32 ABI has been fully tested.</li>
- <li>MIPS backend has migrated to using the MC infrastructure for assembly printing. Initial support for direct object code emission has been implemented too.</li>
- <li>Delay slot filler has been updated. Now it tries to fill delay slots with useful instructions instead of always filling them with NOPs.</li>
- <li>Support for old-style JIT is complete.</li>
- <li>Support for old architectures (MIPS1 and MIPS2) has been removed.</li>
- <li>Initial support for MIPS64 has been added.</li>
+ <li>...</li>
</ul>
+
</div>
<!--=========================================================================-->
<h3>
- <a name="PTX">PTX Target Improvements</a>
+<a name="PowerPC">PowerPC Target Improvements</a>
</h3>
<div>
- <p>
- The PTX back-end is still experimental, but is fairly usable for compute kernels
- in LLVM 3.0. Most scalar arithmetic is implemented, as well as intrinsics to
- access the special PTX registers and sync instructions. The major missing
- pieces are texture/sampler support and some vector operations.</p>
-
- <p>That said, the backend is already being used for domain-specific languages
- and works well with the <a href="http://www.pcc.me.uk/~peter/libclc/">libclc
- library</a> to supply OpenCL built-ins. With it, you can use Clang to compile
- OpenCL code into PTX and execute it by loading the resulting PTX as a binary
- blob using the nVidia OpenCL library. It has been tested with several OpenCL
- programs, including some from the nVidia GPU Computing SDK, and the performance
- is on par with the nVidia compiler.</p>
+<ul>
+<p>Many fixes and changes have been made across LLVM (and Clang) for better
+ compliance with the 64-bit PowerPC ELF Application Binary Interface,
+ interoperability with GCC, and overall 64-bit PowerPC support. Some
+ highlights include:</p>
+<ul>
+ <li> MCJIT support added.</li>
+ <li> PPC64 relocation support and (small code model) TOC handling
+ added.</li>
+ <li> Parameter passing and return value fixes (alignment issues,
+ padding, varargs support, proper register usage, odd-sized
+ structure support, float support, extension of return values
+ for i32 return values).</li>
+ <li> Fixes in spill and reload code for vector registers.</li>
+ <li> C++ exception handling enabled.</li>
+ <li> Changes to remediate double-rounding compatibility issues with
+ respect to GCC behavior.</li>
+ <li> Refactoring to disentangle ppc64-elf-linux ABI from Darwin
+ ppc64 ABI support.</li>
+ <li> Assorted new test cases and test case fixes (endian and word
+ size issues).</li>
+ <li> Fixes for big-endian codegen bugs, instruction encodings, and
+ instruction constraints.</li>
+ <li> Implemented -integrated-as support.</li>
+ <li> Additional support for Altivec compare operations.</li>
+ <li> IBM long double support.</li>
+</ul>
+<p>There have also been code generation improvements for both 32- and 64-bit
+ code. Instruction scheduling support for the Freescale e500mc and e5500
+ cores has been added.</p>
+</ul>
</div>
<div>
<ul>
-<li>Many PowerPC improvements have been implemented for ELF targets, including
- support for varargs and initial support for direct .o file emission.</li>
-
-<li>MicroBlaze scheduling itineraries were added that model the
- 3-stage and the 5-stage pipeline architectures. The 3-stage
- pipeline model can be selected with <code>-mcpu=mblaze3</code>
- and the 5-stage pipeline model can be selected with
- <code>-mcpu=mblaze5</code>.</li>
-
+ <li>...</li>
</ul>
</div>
<div>
<p>If you're already an LLVM user or developer with out-of-tree changes based on
- LLVM 2.9, this section lists some "gotchas" that you may run into upgrading
+ LLVM 3.1, this section lists some "gotchas" that you may run into upgrading
from the previous release.</p>
<ul>
-<li>LLVM 3.0 removes support for reading LLVM 2.8 and earlier files, and LLVM
- 3.1 will eliminate support for reading LLVM 2.9 files. Going forward, we
- aim for all future versions of LLVM to read bitcode files and .ll files
- produced by LLVM 3.0.</li>
-<li>Tablegen has been split into a library, allowing the clang tblgen pieces
- now live in the clang tree. The llvm version has been renamed to
- llvm-tblgen instead of tblgen.</li>
- <li>The <code>LLVMC</code> meta compiler driver was removed.</li>
- <li>The unused PostOrder Dominator Frontiers and LowerSetJmp passes were removed.</li>
-
-
- <li>The old <code>TailDup</code> pass was not used in the standard pipeline
- and was unable to update ssa form, so it has been removed.
- <li>The syntax of volatile loads and stores in IR has been changed to
- "<code>load volatile</code>"/"<code>store volatile</code>". The old
- syntax ("<code>volatile load</code>"/"<code>volatile store</code>")
- is still accepted, but is now considered deprecated and will be removed in
- 3.1.</li>
- <li>llvm-gcc's frontend tests have been removed from llvm/test/Frontend*, sunk
- into the clang and dragonegg testsuites.</li>
- <li>The old atomic intrinsics (<code>llvm.memory.barrier</code> and
- <code>llvm.atomic.*</code>) are now gone. Please use the new atomic
- instructions, described in the <a href="Atomics.html">atomics guide</a>.
- <li>LLVM's configure script doesn't depend on llvm-gcc anymore, eliminating a
- strange circular dependence between projects.</li>
+ <li>The CellSPU port has been removed. It can still be found in older
+ versions.</li>
+ <li>...</li>
</ul>
-<h4>Windows (32-bit)</h4>
+</div>
+
+<!--=========================================================================-->
+<h3>
+<a name="api_changes">Internal API Changes</a>
+</h3>
+
<div>
+<p>In addition, many APIs have changed in this release. Some of the major
+ LLVM API changes are:</p>
+
+<p> We've added a new interface for allowing IR-level passes to access
+ target-specific information. A new IR-level pass, called
+ "TargetTransformInfo", provides a number of low-level interfaces.
+ LSR and LowerInvoke already use the new interface. </p>
+
+<p> The TargetData structure has been renamed to DataLayout and moved to VMCore
+to remove a dependency on Target. </p>
+
<ul>
- <li>On Win32(MinGW32 and MSVC), Windows 2000 will not be supported.
- Windows XP or higher is required.</li>
+ <li>...</li>
</ul>
</div>
+<!--=========================================================================-->
+<h3>
+<a name="tools_changes">Tools Changes</a>
+</h3>
+
+<div>
+
+<p>In addition, some tools have changed in this release. Some of the changes
+ are:</p>
+
+<ul>
+ <li>...</li>
+</ul>
+
</div>
+
<!--=========================================================================-->
<h3>
-<a name="api_changes">Internal API Changes</a>
+<a name="python">Python Bindings</a>
</h3>
<div>
-<p>In addition, many APIs have changed in this release. Some of the major
- LLVM API changes are:</p>
+<p>Officially supported Python bindings have been added! Feature support is far
+ from complete. The current bindings support interfaces to:</p>
<ul>
- <li>The biggest and most pervasive change is that the type system has been
- rewritten: <code>PATypeHolder</code> and <code>OpaqueType</code> are gone,
- and all APIs deal with <code>Type*</code> instead of <code>const
- Type*</code>. If you need to create recursive structures, then create a
- named structure, and use <code>setBody()</code> when all its elements are
- built. Type merging and refining is gone too: named structures are not
- merged with other structures, even if their layout is identical. (of
- course anonymous structures are still uniqued by layout).</li>
-
- <li><code>PHINode::reserveOperandSpace</code> has been removed. Instead, you
- must specify how many operands to reserve space for when you create the
- PHINode, by passing an extra argument
- into <code>PHINode::Create</code>.</li>
-
- <li>PHINodes no longer store their incoming BasicBlocks as operands. Instead,
- the list of incoming BasicBlocks is stored separately, and can be accessed
- with new functions <code>PHINode::block_begin</code>
- and <code>PHINode::block_end</code>.</li>
-
- <li>Various functions now take an <code>ArrayRef</code> instead of either a
- pair of pointers (or iterators) to the beginning and end of a range, or a
- pointer and a length. Others now return an <code>ArrayRef</code> instead
- of a reference to a <code>SmallVector</code>
- or <code>std::vector</code>. These include:
-<ul>
-<!-- Please keep this list sorted. -->
-<li><code>CallInst::Create</code></li>
-<li><code>ComputeLinearIndex</code> (in <code>llvm/CodeGen/Analysis.h</code>)</li>
-<li><code>ConstantArray::get</code></li>
-<li><code>ConstantExpr::getExtractElement</code></li>
-<li><code>ConstantExpr::getGetElementPtr</code></li>
-<li><code>ConstantExpr::getInBoundsGetElementPtr</code></li>
-<li><code>ConstantExpr::getIndices</code></li>
-<li><code>ConstantExpr::getInsertElement</code></li>
-<li><code>ConstantExpr::getWithOperands</code></li>
-<li><code>ConstantFoldCall</code> (in <code>llvm/Analysis/ConstantFolding.h</code>)</li>
-<li><code>ConstantFoldInstOperands</code> (in <code>llvm/Analysis/ConstantFolding.h</code>)</li>
-<li><code>ConstantVector::get</code></li>
-<li><code>DIBuilder::createComplexVariable</code></li>
-<li><code>DIBuilder::getOrCreateArray</code></li>
-<li><code>ExtractValueInst::Create</code></li>
-<li><code>ExtractValueInst::getIndexedType</code></li>
-<li><code>ExtractValueInst::getIndices</code></li>
-<li><code>FindInsertedValue</code> (in <code>llvm/Analysis/ValueTracking.h</code>)</li>
-<li><code>gep_type_begin</code> (in <code>llvm/Support/GetElementPtrTypeIterator.h</code>)</li>
-<li><code>gep_type_end</code> (in <code>llvm/Support/GetElementPtrTypeIterator.h</code>)</li>
-<li><code>GetElementPtrInst::Create</code></li>
-<li><code>GetElementPtrInst::CreateInBounds</code></li>
-<li><code>GetElementPtrInst::getIndexedType</code></li>
-<li><code>InsertValueInst::Create</code></li>
-<li><code>InsertValueInst::getIndices</code></li>
-<li><code>InvokeInst::Create</code></li>
-<li><code>IRBuilder::CreateCall</code></li>
-<li><code>IRBuilder::CreateExtractValue</code></li>
-<li><code>IRBuilder::CreateGEP</code></li>
-<li><code>IRBuilder::CreateInBoundsGEP</code></li>
-<li><code>IRBuilder::CreateInsertValue</code></li>
-<li><code>IRBuilder::CreateInvoke</code></li>
-<li><code>MDNode::get</code></li>
-<li><code>MDNode::getIfExists</code></li>
-<li><code>MDNode::getTemporary</code></li>
-<li><code>MDNode::getWhenValsUnresolved</code></li>
-<li><code>SimplifyGEPInst</code> (in <code>llvm/Analysis/InstructionSimplify.h</code>)</li>
-<li><code>TargetData::getIndexedOffset</code></li>
-</ul></li>
-
- <li>All forms of <code>StringMap::getOrCreateValue</code> have been remove
- except for the one which takes a <code>StringRef</code>.</li>
-
- <li>The <code>LLVMBuildUnwind</code> function from the C API was removed. The
- LLVM <code>unwind</code> instruction has been deprecated for a long time
- and isn't used by the current front-ends. So this was removed during the
- exception handling rewrite.</li>
-
- <li>The <code>LLVMAddLowerSetJmpPass</code> function from the C API was
- removed because the <code>LowerSetJmp</code> pass was removed.</li>
-
- <li>The <code>DIBuilder</code> interface used by front ends to encode
- debugging information in the LLVM IR now expects clients to
- use <code>DIBuilder::finalize()</code> at the end of translation unit to
- complete debugging information encoding.</li>
-
- <li>TargetSelect.h moved to Support/ from Target/</li>
-
- <li>UpgradeIntrinsicCall no longer upgrades pre-2.9 intrinsic calls (for
- example <code>llvm.memset.i32</code>).</li>
-
- <li>It is mandatory to initialize all out-of-tree passes too and their dependencies now with
- <code>INITIALIZE_PASS{BEGIN,END,}</code>
- and <code>INITIALIZE_{PASS,AG}_DEPENDENCY</code>.</li>
-
- <li>The interface for MemDepResult in MemoryDependenceAnalysis has been
- enhanced with new return types Unknown and NonFuncLocal, in addition to
- the existing types Clobber, Def, and NonLocal.</li>
+ <li>...</li>
</ul>
</div>
<p>LLVM is generally a production quality compiler, and is used by a broad range
of applications and shipping in many products. That said, not every
subsystem is as mature as the aggregate, particularly the more obscure
- targets. If you run into a problem, please check the <a
- href="http://llvm.org/bugs/">LLVM bug database</a> and submit a bug if
- there isn't already one or ask on the <a
- href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev
- list</a>.</p>
+ targets. If you run into a problem, please check
+ the <a href="http://llvm.org/bugs/">LLVM bug database</a> and submit a bug if
+ there isn't already one or ask on
+ the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">LLVMdev
+ list</a>.</p>
<p>Known problem areas include:</p>
<ul>
- <li>The Alpha, Blackfin, CellSPU, MSP430, PTX, SystemZ and
- XCore backends are experimental, and the Alpha, Blackfin and SystemZ
- targets have already been removed from mainline.</li>
+ <li>The CellSPU, MSP430, PTX and XCore backends are experimental.</li>
<li>The integrated assembler, disassembler, and JIT is not supported by
- several targets. If an integrated assembler is not supported, then a
+ several targets. If an integrated assembler is not supported, then a
system assembler is required. For more details, see the <a
href="CodeGenerator.html#targetfeatures">Target Features Matrix</a>.
</li>
-
- <li>The C backend has numerous problems and is not being actively maintained.
- Depending on it for anything serious is not advised.</li>
</ul>
</div>
-</div>
-
<!-- *********************************************************************** -->
<h2>
<a name="additionalinfo">Additional Information</a>
</div>
-<!--=========================================================================-->
-
-<!-- EH details: to be moved to a blog post:
-
-
-
-
-<p>One of the biggest changes is that 3.0 has a new exception handling
- system. The old system used LLVM intrinsics to convey the exception handling
- information to the code generator. It worked in most cases, but not
- all. Inlining was especially difficult to get right. Also, the intrinsics
- could be moved away from the <code>invoke</code> instruction, making it hard
- to recover that information.</p>
-
-<p>The new EH system makes exception handling a first-class member of the IR. It
- adds two new instructions:</p>
-
-<ul>
- <li><a href="LangRef.html#i_landingpad"><code>landingpad</code></a> —
- this instruction defines a landing pad basic block. It contains all of the
- information that's needed by the code generator. It's also required to be
- the first non-PHI instruction in the landing pad. In addition, a landing
- pad may be jumped to only by the unwind edge of an <code>invoke</code>
- instruction.</li>
-
- <li><a href="LangRef.html#i_resume"><code>resume</code></a> — this
- instruction causes the current exception to resume traveling up the
- stack. It replaces the <code>@llvm.eh.resume</code> intrinsic.</li>
-</ul>
-
-<p>Converting from the old EH API to the new EH API is rather simple, because a
- lot of complexity has been removed. The two intrinsics,
- <code>@llvm.eh.exception</code> and <code>@llvm.eh.selector</code> have been
- superseded by the <code>landingpad</code> instruction. Instead of generating
- a call to <code>@llvm.eh.exception</code> and <code>@llvm.eh.selector</code>:
-
-<div class="doc_code">
-<pre>
-Function *ExcIntr = Intrinsic::getDeclaration(TheModule,
- Intrinsic::eh_exception);
-Function *SlctrIntr = Intrinsic::getDeclaration(TheModule,
- Intrinsic::eh_selector);
-
-// The exception pointer.
-Value *ExnPtr = Builder.CreateCall(ExcIntr, "exc_ptr");
-
-std::vector<Value*> Args;
-Args.push_back(ExnPtr);
-Args.push_back(Builder.CreateBitCast(Personality,
- Type::getInt8PtrTy(Context)));
-
-<i>// Add selector clauses to Args.</i>
-
-// The selector call.
-Builder.CreateCall(SlctrIntr, Args, "exc_sel");
-</pre>
-</div>
-
-<p>You should instead generate a <code>landingpad</code> instruction, that
- returns an exception object and selector value:</p>
-
-<div class="doc_code">
-<pre>
-LandingPadInst *LPadInst =
- Builder.CreateLandingPad(StructType::get(Int8PtrTy, Int32Ty, NULL),
- Personality, 0);
-
-Value *LPadExn = Builder.CreateExtractValue(LPadInst, 0);
-Builder.CreateStore(LPadExn, getExceptionSlot());
-
-Value *LPadSel = Builder.CreateExtractValue(LPadInst, 1);
-Builder.CreateStore(LPadSel, getEHSelectorSlot());
-</pre>
-</div>
-
-<p>It's now trivial to add the individual clauses to the <code>landingpad</code>
- instruction.</p>
-
-<div class="doc_code">
-<pre>
-<i><b>// Adding a catch clause</b></i>
-Constant *TypeInfo = getTypeInfo();
-LPadInst->addClause(TypeInfo);
-
-<i><b>// Adding a C++ catch-all</b></i>
-LPadInst->addClause(Constant::getNullValue(Builder.getInt8PtrTy()));
-
-<i><b>// Adding a cleanup</b></i>
-LPadInst->setCleanup(true);
-
-<i><b>// Adding a filter clause</b></i>
-std::vector<Constant*> TypeInfos;
-Constant *TypeInfo = getFilterTypeInfo();
-TypeInfos.push_back(Builder.CreateBitCast(TypeInfo, Builder.getInt8PtrTy()));
-
-ArrayType *FilterTy = ArrayType::get(Int8PtrTy, TypeInfos.size());
-LPadInst->addClause(ConstantArray::get(FilterTy, TypeInfos));
-</pre>
-</div>
-
-<p>Converting from using the <code>@llvm.eh.resume</code> intrinsic to
- the <code>resume</code> instruction is trivial. It takes the exception
- pointer and exception selector values returned by
- the <code>landingpad</code> instruction:</p>
-
-<div class="doc_code">
-<pre>
-Type *UnwindDataTy = StructType::get(Builder.getInt8PtrTy(),
- Builder.getInt32Ty(), NULL);
-Value *UnwindData = UndefValue::get(UnwindDataTy);
-Value *ExcPtr = Builder.CreateLoad(getExceptionObjSlot());
-Value *ExcSel = Builder.CreateLoad(getExceptionSelSlot());
-UnwindData = Builder.CreateInsertValue(UnwindData, ExcPtr, 0, "exc_ptr");
-UnwindData = Builder.CreateInsertValue(UnwindData, ExcSel, 1, "exc_sel");
-Builder.CreateResume(UnwindData);
-</pre>
-</div>
-
-
-
-
- -->
-
-
<!-- *********************************************************************** -->
<hr>