From 32e89f2b922aae3d6a99428ae3d1c96304f05e1d Mon Sep 17 00:00:00 2001
From: Chris Lattner
Date: Sun, 16 Oct 2005 18:31:08 +0000
Subject: [PATCH] Fill this out some more. Add description of MBB/MF. Fix some broken links, turn some broken into 's.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23762 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/CodeGenerator.html | 144 +++++++++++++++++++++++++++++++---------
 1 file changed, 113 insertions(+), 31 deletions(-)

diff --git a/docs/CodeGenerator.html b/docs/CodeGenerator.html
index 6bf38134deb..1b699dd9822 100644
--- a/docs/CodeGenerator.html
+++ b/docs/CodeGenerator.html
@@ -35,6 +35,9 @@
  • Machine code description classes
  • Target-independent code generation algorithms @@ -50,14 +53,19 @@
  • SelectionDAG Optimization Phase: the DAG Combiner
  • SelectionDAG Select Phase
  • -
  • SelectionDAG Scheduling and Emission +
  • SelectionDAG Scheduling and Formation Phase
  • Future directions for the SelectionDAG
  • +
  • Code Emission +
  • -
  • Target description implementations +
  • Target-specific Implementation Notes @@ -163,7 +171,7 @@ LLVM machine description model: programmable FPGAs for example.

    Important Note: For historical reasons, the LLVM SparcV9 code generator uses almost entirely different code paths than described in this document. For this reason, there are some deprecated interfaces (such as -TargetRegInfo and TargetSchedInfo), which are only used by the +TargetSchedInfo), which are only used by the V9 backend and should not be used by any other targets. Also, all code in the lib/Target/SparcV9 directory and subdirectories should be considered deprecated, and should not be used as the basis for future code generator work. @@ -185,36 +193,44 @@ quality code generation for standard register-based microprocessors. Code generation in this model is divided into the following stages:

      -
    1. Instruction Selection - Determining an -efficient implementation of the input LLVM code in the target instruction set. +
    2. Instruction Selection - This phase +determines an efficient way to express the input LLVM code in the target +instruction set. This stage produces the initial code for the program in the target instruction set, then makes use of virtual registers in SSA form and physical registers that represent any required register assignments due to target constraints or calling -conventions.
    3. +conventions. This step turns the LLVM code into a DAG of target +instructions. + +
    4. Scheduling and Formation - This +phase takes the DAG of target instructions produced by the instruction selection +phase, determines an ordering of the instructions, then emits the instructions +as MachineInstrs with that ordering. +
    5. SSA-based Machine Code Optimizations - This optional stage consists of a series of machine-code optimizations that operate on the SSA-form produced by the instruction selector. Optimizations -like modulo-scheduling, normal scheduling, or peephole optimization work here. +like modulo-scheduling or peephole optimization work here.
    6. -
    7. Register Allocation - The +
    8. Register Allocation - The target code is transformed from an infinite virtual register file in SSA form to the concrete register file used by the target. This phase introduces spill code and eliminates all virtual register references from the program.
    9. -
    10. Prolog/Epilog Code Insertion - Once the +
    11. Prolog/Epilog Code Insertion - Once the machine code has been generated for the function and the amount of stack space required is known (used for LLVM alloca's and spill slots), the prolog and epilog code for the function can be inserted and "abstract stack location references" can be eliminated. This stage is responsible for implementing optimizations like frame-pointer elimination and stack packing.
    12. -
    13. Late Machine Code Optimizations - Optimizations +
    14. Late Machine Code Optimizations - Optimizations that operate on "final" machine code can go here, such as spill code scheduling and peephole optimizations.
    15. -
    16. Code Emission - The final stage actually +
    17. Code Emission - The final stage actually puts out the code for the current function, either in the target assembler format or in machine code.
    18. @@ -259,6 +275,16 @@ domain-specific and target-specific abstractions to reduce the amount of repetition.

      +

      As LLVM continues to be developed and refined, we plan to move more and more +of the target description to be in .td form. Doing so gives us a +number of advantages. The most important is that it makes it easier to port +LLVM, because it reduces the amount of C++ code that has to be written and the +surface area of the code generator that needs to be understood before someone +can get in and get something working. Second, it is also important to us because +it makes it easier to change things: in particular, if tables and other things +are all emitted by tblgen, we only need to change one place (tblgen) to update +all of the targets to a new interface.

      + @@ -274,8 +300,7 @@ repetition. target machine; independent of any particular client. These classes are designed to capture the abstract properties of the target (such as the instructions and registers it has), and do not incorporate any particular pieces -of code generation algorithms. These interfaces do not take interference graphs -as inputs or other algorithm-specific data structures.

      +of code generation algorithms.

      All of the target description classes (except the TargetData class) are designed to be subclassed by @@ -315,8 +340,8 @@ implemented as well.

      The TargetData class is the only required target description class, -and it is the only class that is not extensible. You cannot derived a new -class from it. TargetData specifies information about how the target +and it is the only class that is not extensible (you cannot derive a new +class from it). TargetData specifies information about how the target lays out memory for structures, the alignment requirements for various data types, the size of pointers in the target, and whether the target is little-endian or big-endian.

      @@ -333,18 +358,16 @@ little-endian or big-endian.

      The TargetLowering class is used by SelectionDAG based instruction selectors primarily to describe how LLVM code should be lowered to SelectionDAG operations. Among other things, this class indicates: -

      • an initial register class to use for various ValueTypes,
      • -
      • which operations are natively supported by the target machine,
      • -
      • the return type of setcc operations, and
      • -
      • the type to use for shift amounts, etc
      • . +
        • an initial register class to use for various ValueTypes
        • +
        • which operations are natively supported by the target machine
        • +
        • the return type of setcc operations
        • +
        • the type to use for shift amounts
        • +
        • various high-level characteristics, like whether it is profitable to turn + division by a constant into a multiplication sequence

    - - - -
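The last item in the list above, turning division by a constant into a multiplication sequence, can be illustrated with a small worked example. This is a minimal sketch and not code from LLVM: the names `mulShift` and `divideBy3` are hypothetical, and the example only covers unsigned 32-bit division by 3, where the multiplier 0xAAAAAAAB equals ceil(2^33 / 3).

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical helper: multiply by a 32-bit "magic" constant in 64-bit
// arithmetic, then shift right to recover the quotient.
inline std::uint32_t mulShift(std::uint32_t X, std::uint64_t Magic,
                              unsigned Shift) {
  assert(Magic <= 0xFFFFFFFFull && "magic constant must fit in 32 bits");
  assert(Shift < 64 && "shift must stay within the 64-bit product");
  return static_cast<std::uint32_t>(
      (static_cast<std::uint64_t>(X) * Magic) >> Shift);
}

// Unsigned division by 3 without a divide instruction.
// 0xAAAAAAAB == ceil(2^33 / 3), so (X * 0xAAAAAAAB) >> 33 == X / 3
// for every 32-bit unsigned X.
inline std::uint32_t divideBy3(std::uint32_t X) {
  return mulShift(X, 0xAAAAAAABull, 33);
}

} // namespace sketch
```

Whether this rewrite is profitable depends on the relative cost of multiply and divide on the target, which is why a per-target hook is a natural place for the decision.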
    The MRegisterInfo class @@ -359,7 +382,7 @@ target and any interactions between the registers.

    Registers in the code generator are represented in the code generator by unsigned numbers. Physical registers (those that actually exist in the target description) are unique small numbers, and virtual registers are generally -large.

    +large. Note that register #0 is reserved as a flag value.
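The numbering scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual interface: the namespace, the function names, and in particular the boundary value 1024 are hypothetical choices made for the example.

```cpp
#include <cassert>

namespace sketch {

// Register #0 is reserved as a flag value and names no register.
constexpr unsigned NoRegister = 0;

// Assumed boundary between physical and virtual numbers (illustrative only).
constexpr unsigned FirstVirtualRegister = 1024;

// Physical registers are the small, nonzero numbers below the boundary.
inline bool isPhysicalRegister(unsigned Reg) {
  return Reg != NoRegister && Reg < FirstVirtualRegister;
}

// Virtual registers are the large numbers at or above the boundary.
inline bool isVirtualRegister(unsigned Reg) {
  return Reg >= FirstVirtualRegister;
}

// Dense zero-based index for a virtual register, useful for table lookups.
inline unsigned virtualRegIndex(unsigned Reg) {
  assert(isVirtualRegister(Reg) && "expected a virtual register number");
  return Reg - FirstVirtualRegister;
}

} // namespace sketch
```

Keeping both kinds of register in one unsigned number space lets an operand hold either kind without a separate tag, at the cost of reserving a range for each.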

    Each register in the processor description has an associated TargetRegisterDesc entry, which provides a textual name for the register @@ -438,7 +461,8 @@ href="TableGenFundamentals.html">TableGen description of the register file.

    At the high-level, LLVM code is translated to a machine specific representation -formed out of MachineFunction, MachineBasicBlock, and MachineFunction, +MachineBasicBlock, and MachineInstr instances (defined in include/llvm/CodeGen). This representation is completely target agnostic, representing instructions in their most abstract form: an opcode and a @@ -624,6 +648,43 @@ are no virtual registers left in the code.

    + + + +
    + +

    The MachineBasicBlock class contains a list of machine instructions +(MachineInstr instances). It roughly corresponds to +the LLVM code input to the instruction selector, but there can be a one-to-many +mapping (i.e. one LLVM basic block can map to multiple machine basic blocks). +The MachineBasicBlock class has a "getBasicBlock" method, which returns +the LLVM basic block that it comes from. +

    + +
    + + + + +
    + +

    The MachineFunction class contains a list of machine basic blocks +(MachineBasicBlock instances). It corresponds +one-to-one with the LLVM function input to the instruction selector. In +addition to a list of basic blocks, the MachineFunction contains +the MachineConstantPool, MachineFrameInfo, MachineFunctionInfo, +SSARegMap, and a set of live in and live out registers for the function. See +MachineFunction.h for more information. +
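The containment relationship described in the two sections above can be sketched with simplified stand-in types. These structs are hypothetical and share only their names with the real classes; members such as the constant pool and frame info are omitted, and `addBlockFor` and `countBlocksFor` are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for an LLVM basic block (hypothetical, for illustration).
struct BasicBlock {
  std::string Name;
};

// An instruction in its most abstract form: an opcode and operands.
struct MachineInstr {
  unsigned Opcode;
  std::vector<unsigned> Operands;
};

// A machine basic block holds a list of machine instructions and
// remembers the LLVM basic block it came from.
struct MachineBasicBlock {
  const BasicBlock *Source;
  std::list<MachineInstr> Instrs;

  const BasicBlock *getBasicBlock() const { return Source; }
};

// A machine function holds a list of machine basic blocks.
struct MachineFunction {
  std::list<MachineBasicBlock> Blocks;

  // Append a machine block derived from BB; one LLVM block may be
  // passed several times, giving the one-to-many mapping.
  MachineBasicBlock &addBlockFor(const BasicBlock *BB) {
    assert(BB && "this sketch assumes every machine block has a source");
    Blocks.push_back(MachineBasicBlock{BB, {}});
    return Blocks.back();
  }

  // Count how many machine blocks map back to the given LLVM block.
  std::size_t countBlocksFor(const BasicBlock &BB) const {
    std::size_t Count = 0;
    for (const MachineBasicBlock &MBB : Blocks)
      if (MBB.getBasicBlock() == &BB)
        ++Count;
    return Count;
  }
};

} // namespace sketch
```

Calling `addBlockFor` twice with the same `BasicBlock` models the case where one LLVM basic block is lowered to several machine basic blocks.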

    + +
    + + +
    Target-independent code generation algorithms @@ -633,7 +694,7 @@ are no virtual registers left in the code.

    This section documents the phases described in the high-level design of the code generator. It +href="#high-level-design">high-level design of the code generator. It explains how they work and some of the rationale behind their design.

    @@ -755,7 +816,7 @@ SelectionDAG-based instruction selection consists of the following steps: the target instruction selector matches the DAG operations to target instructions. This process translates the target-independent input DAG into another DAG of target instructions.
  • -
  • SelectionDAG Scheduling and Emission +
  • SelectionDAG Scheduling and Formation - The last phase assigns a linear order to the instructions in the target-instruction DAG and emits them into the MachineFunction being compiled. This step uses traditional prepass scheduling techniques.
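Assigning a linear order to a DAG can be sketched with a plain topological sort. The code below uses Kahn's algorithm as a stand-in and is not the scheduler described here: a real scheduler also weighs latencies and register pressure when several nodes are ready. The function name `linearize` and the adjacency-list encoding are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

namespace sketch {

// Users[i] lists the nodes that consume a value produced by node i.
// Returns the nodes in an order where every producer precedes its users.
inline std::vector<std::size_t>
linearize(const std::vector<std::vector<std::size_t>> &Users) {
  const std::size_t N = Users.size();

  // Count, for each node, how many producers it is still waiting on.
  std::vector<std::size_t> Pending(N, 0);
  for (const std::vector<std::size_t> &Out : Users)
    for (std::size_t U : Out) {
      assert(U < N && "edge points outside the graph");
      ++Pending[U];
    }

  // Nodes with no unscheduled producers are ready to be emitted.
  std::queue<std::size_t> Ready;
  for (std::size_t I = 0; I < N; ++I)
    if (Pending[I] == 0)
      Ready.push(I);

  std::vector<std::size_t> Order;
  while (!Ready.empty()) {
    std::size_t Node = Ready.front();
    Ready.pop();
    Order.push_back(Node);
    // Emitting Node may make some of its users ready.
    for (std::size_t U : Users[Node])
      if (--Pending[U] == 0)
        Ready.push(U);
  }

  assert(Order.size() == N && "input graph must be acyclic");
  return Order;
}

} // namespace sketch
```

The first-in, first-out queue is the simplest possible tie-breaking policy; replacing it with a priority queue is where a scheduling heuristic would plug in.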
  • @@ -892,7 +953,7 @@ want to make the Select phase as simple and mechanical as possible.

    - SelectionDAG Scheduling and Emission Phase + SelectionDAG Scheduling and Formation Phase
    @@ -944,12 +1005,33 @@ Selection DAG is destroyed.

    To Be Written

    + + + + +
    + +
    + + + + + +
    +

    For the JIT or .o file writer

    +
    + + @@ -995,7 +1077,7 @@ that people test.
  • i386-unknown-freebsd5.3 - FreeBSD 5.3
  • i686-pc-cygwin - Cygwin on Win32
  • i686-pc-mingw32 - MingW on Win32
  • -
  • i686-apple-darwin* - Apple Darwin
  • +
  • i686-apple-darwin* - Apple Darwin on X86
  • -- 2.34.1