-.. code-block:: c++
-
-    unsigned sum_arrays(int *A, int *B, int start, int end) {
-      unsigned sum = 0;
-      for (int i = start; i < end; ++i)
-        sum += A[i] + B[i] + i;
-      return sum;
-    }
-
-The vectorizer handles loops with the following properties:
-
-#. The innermost loop must have a single basic block.
-#. The number of iterations is known before the loop starts to execute.
-#. The loop counter needs to be incremented by one.
-#. The loop trip count **can** be a variable.
-#. Loops do **not** need to start at zero.
-#. The induction variable can be used inside the loop.
-#. Loop reductions are supported.
-#. Arrays with affine access pattern do **not** need to be marked as
-   '``noalias``' and are checked at runtime.
-#. ...
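As a rough illustration of several items in the list above, the sketch below pairs a scalar loop with a hand-written version shaped the way a vectorized loop is often described: a runtime overlap check, a main body that processes four elements per step, and a scalar remainder loop. The function names and the width of four are assumptions made for this example; this is not the code LLVM actually emits.

```cpp
#include <cstdint>

// Scalar reference: variable trip count, nonzero start, and the induction
// variable used inside the body.
void add_index(int *dst, const int *src, int start, int end) {
  for (int i = start; i < end; ++i)
    dst[i] = src[i] + i;
}

// Hypothetical hand-written equivalent (a sketch, not compiler output).
void add_index_sketch(int *dst, const int *src, int start, int end) {
  const int VF = 4;  // assumed vector width for this sketch
  if (start >= end)
    return;

  // Runtime check standing in for a missing noalias guarantee:
  // do the accessed ranges [start, end) of dst and src overlap?
  const auto dst_lo = reinterpret_cast<std::uintptr_t>(dst + start);
  const auto dst_hi = reinterpret_cast<std::uintptr_t>(dst + end);
  const auto src_lo = reinterpret_cast<std::uintptr_t>(src + start);
  const auto src_hi = reinterpret_cast<std::uintptr_t>(src + end);
  const bool overlap = dst_lo < src_hi && src_lo < dst_hi;

  int i = start;
  if (!overlap) {
    // Main body: one "vector" step handles VF consecutive elements.
    for (; i <= end - VF; i += VF)
      for (int lane = 0; lane < VF; ++lane)
        dst[i + lane] = src[i + lane] + (i + lane);
  }
  // Scalar loop: the remainder, or every iteration if the ranges overlap.
  for (; i < end; ++i)
    dst[i] = src[i] + i;
}
```

Both functions should produce the same results for any valid arguments; the second only spells out the control flow that the list items imply.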
-
-SROA - We've rewritten SROA to be significantly more powerful and to generate
-code that is much friendlier to the rest of the optimization pipeline.
-Previously this pass had scaling problems that required it to only operate on
-relatively small aggregates, and at times it would mistakenly replace a large
-aggregate with a single very large integer in order to make it a scalar SSA
-value. The result was a large number of i1024 and i2048 values representing any
-small stack buffer. These in turn slowed down many subsequent optimization
-paths.
-
-The new SROA pass uses a different algorithm that allows it to promote to
-scalars only the pieces of the aggregate actively in use. Because of this, it
-doesn't require any thresholds. It also always deduces the scalar values from
-the uses of the aggregate rather than the specific LLVM type of the aggregate.
-These features combine both to optimize more code with the pass and to improve
-the compile time of many functions dramatically.
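To make the idea concrete, here is a small C++ function in which only part of a local aggregate is ever used. The struct and function names are made up for this example, and whether a particular compiler version fully scalarizes it is an assumption rather than something the notes guarantee.

```cpp
// Hypothetical aggregate: large overall, but only x and y are used below.
struct Point3 {
  int x, y, z;
  int pad[64];  // never read or written in sum_xy
};

int sum_xy(int a, int b) {
  Point3 p;     // a candidate for scalar replacement
  p.x = a;      // only these two pieces are "actively in use",
  p.y = b;      // so only they need to become scalar values
  return p.x + p.y;
}
```

In this sketch, a pass that looks at the uses of ``p`` rather than its declared type has no reason to materialize ``z`` or ``pad`` at all.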
-
-#. Branch weight metadata is preserved through more of the optimizer.
-#. ...
-
-MC Level Improvements
----------------------
-
-The LLVM Machine Code (aka MC) subsystem was created to solve a number of
-problems in the realm of assembly, disassembly, object file format handling,
-and a number of other related areas that CPU instruction-set level tools work
-in. For more information, please see the `Intro to the LLVM MC Project Blog
-Post <http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html>`_.
-
-#. ...
-
-.. _codegen:
-
-Target Independent Code Generator Improvements
-----------------------------------------------
-
-Stack Coloring - We have implemented a new optimization pass to merge stack
-objects that are used in disjoint areas of the code. This optimization reduces
-the required stack space significantly, in cases where it is clear to the
-optimizer that the stack slot is not shared. We use the lifetime markers to
-tell the codegen that a certain alloca is used within a region.
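The following sketch shows the kind of source pattern this targets: two buffers whose scopes never overlap. The names are hypothetical, and whether the two buffers actually end up sharing one stack slot depends on the compiler and optimization level.

```cpp
#include <cstddef>

// Helper that fills a buffer and returns the sum of its bytes.
static int fill_and_sum(char *buf, std::size_t n, char value) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    buf[i] = value;
    total += buf[i];
  }
  return total;
}

int disjoint_buffers() {
  int total = 0;
  {
    char a[256];  // live only inside this scope
    total += fill_and_sum(a, sizeof a, 1);
  }
  {
    char b[256];  // never live at the same time as a
    total += fill_and_sum(b, sizeof b, 2);
  }
  return total;
}
```

Because ``a`` and ``b`` are never live together, a frame of roughly 256 bytes could serve both, instead of 512.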
-
-We now merge consecutive loads and stores.
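A typical candidate is a field-by-field copy of adjacent small members, as in the sketch below. The ``Pixel`` type and ``copy_pixel`` function are invented for illustration; whether the four byte-sized accesses are actually combined into wider ones depends on the target and on what the compiler can prove about aliasing.

```cpp
#include <cstdint>

// Hypothetical struct with four adjacent one-byte fields.
struct Pixel {
  std::uint8_t r, g, b, a;
};

// Four consecutive byte loads and four consecutive byte stores.
void copy_pixel(Pixel *dst, const Pixel *src) {
  dst->r = src->r;
  dst->g = src->g;
  dst->b = src->b;
  dst->a = src->a;
}
```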
-
-We have put a significant amount of work into the code generator
-infrastructure, which allows us to implement more aggressive algorithms and
-make it run faster:
-
-#. ...
-
-We added new TableGen infrastructure to support bundling for Very Long
-Instruction Word (VLIW) architectures. TableGen can now automatically generate
-a deterministic finite automaton from a VLIW target's schedule description
-which can be queried to determine legal groupings of instructions in a bundle.
-
-We have added a new target-independent VLIW packetizer based on the DFA
-infrastructure to group machine instructions into bundles.
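The toy model below gives a feel for how such a query works. It is purely illustrative: the ``Unit`` values and the ``ToyPacketizer`` type are hypothetical and are not LLVM's generated automaton or its API. The automaton state here is simply the set of functional units already claimed in the current bundle.

```cpp
#include <cstdint>

// Hypothetical functional units, one bit each.
enum Unit : std::uint8_t { ALU = 1, MUL = 2, MEM = 4 };

struct ToyPacketizer {
  std::uint8_t state = 0;  // units reserved in the current bundle

  // Query: is there a legal transition for an instruction needing unit u?
  bool can_reserve(Unit u) const { return (state & u) == 0; }

  // Take the transition: add the instruction to the bundle.
  void reserve(Unit u) { state = static_cast<std::uint8_t>(state | u); }

  // Close the bundle and return to the start state.
  void end_bundle() { state = 0; }
};
```

A packetizer built on this idea would call ``can_reserve`` for each candidate instruction and start a new bundle whenever the answer is no.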
-
-Basic Block Placement
-^^^^^^^^^^^^^^^^^^^^^
-
-A probability-based block placement and code layout algorithm was added to
-LLVM's code generator. This layout pass supports probabilities derived from
-static heuristics as well as source code annotations such as
-``__builtin_expect``.
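As a small example of such an annotation, the function below marks its error path as unlikely. ``__builtin_expect`` is a GCC/Clang builtin; ``checked_div`` is a hypothetical name, and the resulting block layout is up to the compiler.

```cpp
// Returns a / b, or -1 when b is zero. The zero case is annotated as
// unlikely, which a layout pass may use to keep the division on the
// fall-through path.
int checked_div(int a, int b) {
  if (__builtin_expect(b == 0, 0))
    return -1;
  return a / b;
}
```

The annotation only supplies a hint; the function behaves the same with or without it.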
-
-X86-32 and X86-64 Target Improvements
--------------------------------------
-
-New features and major changes in the X86 target include:
-
-#. ...
-
-.. _ARM:
-
-ARM Target Improvements
------------------------
-
-New features of the ARM target include:
-
-#. ...
-
-.. _armintegratedassembler:
-
-ARM Integrated Assembler
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ARM target now includes a full-featured macro assembler, including
-direct-to-object module support for clang. The assembler is currently enabled
-by default for Darwin only, pending testing and any additional necessary
-platform-specific support for Linux.
-
-Full support is included for Thumb1, Thumb2 and ARM modes, along with subtarget
-and CPU-specific extensions for VFP2, VFP3 and NEON.
-
-The assembler is Unified Syntax only (see ARM Architectural Reference Manual for
-details). While there is some, and growing, support for pre-unified (divided)
-syntax, there are still significant gaps in that support.
-
-MIPS Target Improvements
-------------------------
-
-New features and major changes in the MIPS target include:
-
-#. ...
-
-PowerPC Target Improvements
----------------------------
-
-Many fixes and changes have been made across LLVM (and Clang) for better
-compliance with the 64-bit PowerPC ELF Application Binary Interface,
-interoperability with GCC, and overall 64-bit PowerPC support. Some highlights
-include:
-
-#. MCJIT support added.
-#. PPC64 relocation support and (small code model) TOC handling added.
-#. Parameter passing and return value fixes (alignment issues, padding, varargs
-   support, proper register usage, odd-sized structure support, float support,
-   extension of return values for i32 return values).
-#. Fixes in spill and reload code for vector registers.
-#. C++ exception handling enabled.
-#. Changes to remediate double-rounding compatibility issues with respect to
-   GCC behavior.
-#. Refactoring to disentangle ``ppc64-elf-linux`` ABI from Darwin ppc64 ABI
-   support.
-#. Assorted new test cases and test case fixes (endian and word size issues).
-#. Fixes for big-endian codegen bugs, instruction encodings, and instruction
-   constraints.
-#. Implemented ``-integrated-as`` support.
-#. Additional support for Altivec compare operations.
-#. IBM long double support.
-
-There have also been code generation improvements for both 32- and 64-bit code.
-Instruction scheduling support for the Freescale e500mc and e5500 cores has
-been added.
-
-PTX/NVPTX Target Improvements
------------------------------
-
-The PTX back-end has been replaced by the NVPTX back-end, which is based on the
-LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler. Some
-highlights include:
-
-#. Compatibility with PTX 3.1 and SM 3.5.
-#. Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK.
-#. Full compatibility with old PTX back-end, with much greater coverage of LLVM
-   IR.
-
-Please submit any back-end bugs to the LLVM Bugzilla site.
-
-Other Target Specific Improvements
-----------------------------------
-
-#. ...
-
-Major Changes and Removed Features
-----------------------------------
-
-If you're already an LLVM user or developer with out-of-tree changes based on
-LLVM 3.2, this section lists some "gotchas" that you may run into upgrading
-from the previous release.
-
-#. The CellSPU port has been removed. It can still be found in older versions.
-#. ...
-
-Internal API Changes
---------------------
-
-In addition, many APIs have changed in this release. Some of the major LLVM
-API changes are:
-
-We've added a new interface for allowing IR-level passes to access
-target-specific information. A new IR-level pass, called
-``TargetTransformInfo``, provides a number of low-level interfaces. LSR and
-LowerInvoke already use the new interface.
-
-The ``TargetData`` structure has been renamed to ``DataLayout`` and moved to
-``VMCore`` to remove a dependency on ``Target``.
-
-#. ...
-
-Tools Changes
--------------
-
-In addition, some tools have changed in this release. Some of the changes are:
-
-#. ...
-
-Python Bindings
----------------
-
-Officially supported Python bindings have been added! Feature support is far
-from complete. The current bindings support interfaces to:
-
-#. ...
-
-Known Problems
-==============
-
-LLVM is generally a production-quality compiler that is used by a broad range
-of applications and ships in many products. That said, not every subsystem
-is as mature as the aggregate, particularly the more obscure targets. If you
-run into a problem, please check the `LLVM bug database
-<http://llvm.org/bugs/>`_ and submit a bug if there isn't already one or ask on
-the `LLVMdev list <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_.