docs/WritingAnLLVMBackend.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3 <html>
   4 <head>
   5   <title>Writing an LLVM Compiler Backend</title>
   6   <link rel="stylesheet" href="llvm.css" type="text/css">
   7 </head>
   8
   9 <body>
  10
  11 <div class="doc_title">
  12   Writing an LLVM Compiler Backend
  13 </div>
  14
  15 <ol>
  16   <li><a href="#intro">Introduction</a>
  17   <ul>
  18     <li><a href="#Audience">Audience</a></li>
  19     <li><a href="#Prerequisite">Prerequisite Reading</a></li>
  20     <li><a href="#Basic">Basic Steps</a></li>
  21     <li><a href="#Preliminaries">Preliminaries</a></li>
  22   </ul>
  23   <li><a href="#TargetMachine">Target Machine</a></li>
  24   <li><a href="#RegisterSet">Register Set and Register Classes</a>
  25   <ul>
  26     <li><a href="#RegisterDef">Defining a Register</a></li>
  27     <li><a href="#RegisterClassDef">Defining a Register Class</a></li>
  28     <li><a href="#implementRegister">Implement a subclass of TargetRegisterInfo</a></li>
  29   </ul></li>
  30   <li><a href="#InstructionSet">Instruction Set</a>
  31   <ul>
  32     <li><a href="#operandMapping">Instruction Operand Mapping</a></li>
  33     <li><a href="#implementInstr">Implement a subclass of TargetInstrInfo</a></li>
  34     <li><a href="#branchFolding">Branch Folding and If Conversion</a></li>
  35   </ul></li>
  36   <li><a href="#InstructionSelector">Instruction Selector</a>
  37   <ul>
  38     <li><a href="#LegalizePhase">The SelectionDAG Legalize Phase</a>
  39     <ul>
  40       <li><a href="#promote">Promote</a></li>
  41       <li><a href="#expand">Expand</a></li>
  42       <li><a href="#custom">Custom</a></li>
  43       <li><a href="#legal">Legal</a></li>
  44     </ul></li>
  45     <li><a href="#callingConventions">Calling Conventions</a></li>
  46   </ul></li>
  47   <li><a href="#assemblyPrinter">Assembly Printer</a></li>
  48   <li><a href="#subtargetSupport">Subtarget Support</a></li>
  49   <li><a href="#jitSupport">JIT Support</a>
  50   <ul>
  51     <li><a href="#mce">Machine Code Emitter</a></li>
  52     <li><a href="#targetJITInfo">Target JIT Info</a></li>
  53   </ul></li>
  54 </ol>
  55
  56 <div class="doc_author">
  57   <p>Written by <a href="http://www.woo.com">Mason Woo</a> and
  58                 <a href="http://misha.brukman.net">Misha Brukman</a></p>
  59 </div>
  60
  61 <!-- *********************************************************************** -->
  62 <div class="doc_section">
  63   <a name="intro">Introduction</a>
  64 </div>
  65 <!-- *********************************************************************** -->
  66
  67 <div class="doc_text">
  68
  69 <p>
  70 This document describes techniques for writing compiler backends that convert
  71 the LLVM Intermediate Representation (IR) to code for a specified machine or
  72 other languages. Code intended for a specific machine can take the form of
  73 either assembly code or binary code (usable for a JIT compiler).
  74 </p>
  75
  76 <p>
  77 The backend of LLVM features a target-independent code generator that may create
  78 output for several types of target CPUs &mdash; including X86, PowerPC, Alpha,
  79 and SPARC. The backend may also be used to generate code targeted at SPUs of the
  80 Cell processor or GPUs to support the execution of compute kernels.
  81 </p>
  82
  83 <p>
  84 The document focuses on existing examples found in subdirectories
  85 of <tt>llvm/lib/Target</tt> in a downloaded LLVM release. In particular, this
  86 document focuses on the example of creating a static compiler (one that emits
  87 text assembly) for a SPARC target, because SPARC has fairly standard
  88 characteristics, such as a RISC instruction set and straightforward calling
  89 conventions.
  90 </p>
  91
  92 </div>
  93
  94 <div class="doc_subsection">
  95   <a name="Audience">Audience</a>
  96 </div>
  97
  98 <div class="doc_text">
  99
 100 <p>
 101 The audience for this document is anyone who needs to write an LLVM backend to
 102 generate code for a specific hardware or software target.
 103 </p>
 104
 105 </div>
 106
 107 <div class="doc_subsection">
 108   <a name="Prerequisite">Prerequisite Reading</a>
 109 </div>
 110
 111 <div class="doc_text">
 112
 113 <p>
  114 Read these essential documents before reading this document:
 115 </p>
 116
 117 <ul>
 118 <li><i><a href="http://www.llvm.org/docs/LangRef.html">LLVM Language Reference
 119     Manual</a></i> &mdash; a reference manual for the LLVM assembly language.</li>
 120
 121 <li><i><a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM
 122     Target-Independent Code Generator</a></i> &mdash; a guide to the components
 123     (classes and code generation algorithms) for translating the LLVM internal
 124     representation into machine code for a specified target.  Pay particular
 125     attention to the descriptions of code generation stages: Instruction
 126     Selection, Scheduling and Formation, SSA-based Optimization, Register
 127     Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations,
 128     and Code Emission.</li>
 129
 130 <li><i><a href="http://www.llvm.org/docs/TableGenFundamentals.html">TableGen
  131     Fundamentals</a></i> &mdash; a document that describes the TableGen
 132     (<tt>tblgen</tt>) application that manages domain-specific information to
 133     support LLVM code generation. TableGen processes input from a target
 134     description file (<tt>.td</tt> suffix) and generates C++ code that can be
 135     used for code generation.</li>
 136
 137 <li><i><a href="http://www.llvm.org/docs/WritingAnLLVMPass.html">Writing an LLVM
 138     Pass</a></i> &mdash; The assembly printer is a <tt>FunctionPass</tt>, as are
 139     several SelectionDAG processing steps.</li>
 140 </ul>
 141
 142 <p>
 143 To follow the SPARC examples in this document, have a copy of
 144 <i><a href="http://www.sparc.org/standards/V8.pdf">The SPARC Architecture
 145 Manual, Version 8</a></i> for reference. For details about the ARM instruction
 146 set, refer to the <i><a href="http://infocenter.arm.com/">ARM Architecture
 147 Reference Manual</a></i>. For more about the GNU Assembler format
 148 (<tt>GAS</tt>), see
 149 <i><a href="http://sourceware.org/binutils/docs/as/index.html">Using As</a></i>,
 150 especially for the assembly printer. <i>Using As</i> contains a list of target
 151 machine dependent features.
 152 </p>
 153
 154 </div>
 155
 156 <div class="doc_subsection">
 157   <a name="Basic">Basic Steps</a>
 158 </div>
 159
 160 <div class="doc_text">
 161
 162 <p>
 163 To write a compiler backend for LLVM that converts the LLVM IR to code for a
 164 specified target (machine or other language), follow these steps:
 165 </p>
 166
 167 <ul>
 168 <li>Create a subclass of the TargetMachine class that describes characteristics
 169     of your target machine. Copy existing examples of specific TargetMachine
 170     class and header files; for example, start with
 171     <tt>SparcTargetMachine.cpp</tt> and <tt>SparcTargetMachine.h</tt>, but
 172     change the file names for your target. Similarly, change code that
 173     references "Sparc" to reference your target. </li>
 174
 175 <li>Describe the register set of the target. Use TableGen to generate code for
 176     register definition, register aliases, and register classes from a
 177     target-specific <tt>RegisterInfo.td</tt> input file. You should also write
 178     additional code for a subclass of the TargetRegisterInfo class that
  179     represents the register file data used for register allocation and that
  180     describes the interactions between registers.</li>
 181
 182 <li>Describe the instruction set of the target. Use TableGen to generate code
 183     for target-specific instructions from target-specific versions of
 184     <tt>TargetInstrFormats.td</tt> and <tt>TargetInstrInfo.td</tt>. You should
 185     write additional code for a subclass of the TargetInstrInfo class to
 186     represent machine instructions supported by the target machine. </li>
 187
 188 <li>Describe the selection and conversion of the LLVM IR from a Directed Acyclic
 189     Graph (DAG) representation of instructions to native target-specific
 190     instructions. Use TableGen to generate code that matches patterns and
 191     selects instructions based on additional information in a target-specific
 192     version of <tt>TargetInstrInfo.td</tt>. Write code
 193     for <tt>XXXISelDAGToDAG.cpp</tt>, where XXX identifies the specific target,
 194     to perform pattern matching and DAG-to-DAG instruction selection. Also write
 195     code in <tt>XXXISelLowering.cpp</tt> to replace or remove operations and
 196     data types that are not supported natively in a SelectionDAG. </li>
 197
 198 <li>Write code for an assembly printer that converts LLVM IR to a GAS format for
 199     your target machine.  You should add assembly strings to the instructions
 200     defined in your target-specific version of <tt>TargetInstrInfo.td</tt>. You
 201     should also write code for a subclass of AsmPrinter that performs the
 202     LLVM-to-assembly conversion and a trivial subclass of TargetAsmInfo.</li>
 203
 204 <li>Optionally, add support for subtargets (i.e., variants with different
 205     capabilities). You should also write code for a subclass of the
 206     TargetSubtarget class, which allows you to use the <tt>-mcpu=</tt>
 207     and <tt>-mattr=</tt> command-line options.</li>
 208
 209 <li>Optionally, add JIT support and create a machine code emitter (subclass of
 210     TargetJITInfo) that is used to emit binary code directly into memory. </li>
 211 </ul>
 212
 213 <p>
  214 In the <tt>.cpp</tt> and <tt>.h</tt> files, initially stub up these methods and
  215 then implement them later. Initially, you may not know which private members
  216 the class will need and which components will need to be subclassed.
 217 </p>
 218
 219 </div>
 220
 221 <div class="doc_subsection">
 222   <a name="Preliminaries">Preliminaries</a>
 223 </div>
 224
 225 <div class="doc_text">
 226
 227 <p>
 228 To actually create your compiler backend, you need to create and modify a few
 229 files. The absolute minimum is discussed here. But to actually use the LLVM
 230 target-independent code generator, you must perform the steps described in
 231 the <a href="http://www.llvm.org/docs/CodeGenerator.html">LLVM
 232 Target-Independent Code Generator</a> document.
 233 </p>
 234
 235 <p>
 236 First, you should create a subdirectory under <tt>lib/Target</tt> to hold all
 237 the files related to your target. If your target is called "Dummy," create the
 238 directory <tt>lib/Target/Dummy</tt>.
 239 </p>
 240
 241 <p>
 242 In this new
 243 directory, create a <tt>Makefile</tt>. It is easiest to copy a
 244 <tt>Makefile</tt> of another target and modify it. It should at least contain
 245 the <tt>LEVEL</tt>, <tt>LIBRARYNAME</tt> and <tt>TARGET</tt> variables, and then
 246 include <tt>$(LEVEL)/Makefile.common</tt>. The library can be
 247 named <tt>LLVMDummy</tt> (for example, see the MIPS target). Alternatively, you
 248 can split the library into <tt>LLVMDummyCodeGen</tt>
 249 and <tt>LLVMDummyAsmPrinter</tt>, the latter of which should be implemented in a
 250 subdirectory below <tt>lib/Target/Dummy</tt> (for example, see the PowerPC
 251 target).
 252 </p>
 253
 254 <p>
 255 Note that these two naming schemes are hardcoded into <tt>llvm-config</tt>.
 256 Using any other naming scheme will confuse <tt>llvm-config</tt> and produce a
 257 lot of (seemingly unrelated) linker errors when linking <tt>llc</tt>.
 258 </p>
 259
 260 <p>
 261 To make your target actually do something, you need to implement a subclass of
 262 <tt>TargetMachine</tt>. This implementation should typically be in the file
  263 <tt>lib/Target/Dummy/DummyTargetMachine.cpp</tt>, but any file in
  264 the <tt>lib/Target/Dummy</tt> directory will be built and should work. To use LLVM's
  265 target-independent code generator, you should do what all current machine
 266 backends do: create a subclass of <tt>LLVMTargetMachine</tt>. (To create a
 267 target from scratch, create a subclass of <tt>TargetMachine</tt>.)
 268 </p>
 269
 270 <p>
 271 To get LLVM to actually build and link your target, you need to add it to
 272 the <tt>TARGETS_TO_BUILD</tt> variable. To do this, you modify the configure
 273 script to know about your target when parsing the <tt>--enable-targets</tt>
 274 option. Search the configure script for <tt>TARGETS_TO_BUILD</tt>, add your
 275 target to the lists there (some creativity required), and then
  276 reconfigure. Alternatively, you can change <tt>autoconf/configure.ac</tt> and
 277 regenerate configure by running <tt>./autoconf/AutoRegen.sh</tt>.
 278 </p>
 279
 280 </div>
 281
 282 <!-- *********************************************************************** -->
 283 <div class="doc_section">
 284   <a name="TargetMachine">Target Machine</a>
 285 </div>
 286 <!-- *********************************************************************** -->
 287
 288 <div class="doc_text">
 289
 290 <p>
 291 <tt>LLVMTargetMachine</tt> is designed as a base class for targets implemented
 292 with the LLVM target-independent code generator. The <tt>LLVMTargetMachine</tt>
 293 class should be specialized by a concrete target class that implements the
 294 various virtual methods. <tt>LLVMTargetMachine</tt> is defined as a subclass of
 295 <tt>TargetMachine</tt> in <tt>include/llvm/Target/TargetMachine.h</tt>. The
 296 <tt>TargetMachine</tt> class implementation (<tt>TargetMachine.cpp</tt>) also
 297 processes numerous command-line options.
 298 </p>
 299
 300 <p>
 301 To create a concrete target-specific subclass of <tt>LLVMTargetMachine</tt>,
 302 start by copying an existing <tt>TargetMachine</tt> class and header.  You
 303 should name the files that you create to reflect your specific target. For
 304 instance, for the SPARC target, name the files <tt>SparcTargetMachine.h</tt> and
 305 <tt>SparcTargetMachine.cpp</tt>.
 306 </p>
 307
 308 <p>
 309 For a target machine <tt>XXX</tt>, the implementation of
 310 <tt>XXXTargetMachine</tt> must have access methods to obtain objects that
 311 represent target components.  These methods are named <tt>get*Info</tt>, and are
 312 intended to obtain the instruction set (<tt>getInstrInfo</tt>), register set
 313 (<tt>getRegisterInfo</tt>), stack frame layout (<tt>getFrameInfo</tt>), and
 314 similar information. <tt>XXXTargetMachine</tt> must also implement the
 315 <tt>getTargetData</tt> method to access an object with target-specific data
 316 characteristics, such as data type size and alignment requirements.
 317 </p>
 318
 319 <p>
 320 For instance, for the SPARC target, the header file
 321 <tt>SparcTargetMachine.h</tt> declares prototypes for several <tt>get*Info</tt>
 322 and <tt>getTargetData</tt> methods that simply return a class member.
 323 </p>
 324
 325 <div class="doc_code">
 326 <pre>
 327 namespace llvm {
 328
 329 class Module;
 330
 331 class SparcTargetMachine : public LLVMTargetMachine {
 332   const TargetData DataLayout;       // Calculates type size &amp; alignment
 333   SparcSubtarget Subtarget;
 334   SparcInstrInfo InstrInfo;
 335   TargetFrameInfo FrameInfo;
 336
 337 protected:
 338   virtual const TargetAsmInfo *createTargetAsmInfo() const;
 339
 340 public:
 341   SparcTargetMachine(const Module &amp;M, const std::string &amp;FS);
 342
  343   virtual const SparcInstrInfo *getInstrInfo() const { return &amp;InstrInfo; }
  344   virtual const TargetFrameInfo *getFrameInfo() const { return &amp;FrameInfo; }
  345   virtual const TargetSubtarget *getSubtargetImpl() const { return &amp;Subtarget; }
 346   virtual const TargetRegisterInfo *getRegisterInfo() const {
 347     return &amp;InstrInfo.getRegisterInfo();
 348   }
 349   virtual const TargetData *getTargetData() const { return &amp;DataLayout; }
 350   static unsigned getModuleMatchQuality(const Module &amp;M);
 351
 352   // Pass Pipeline Configuration
 353   virtual bool addInstSelector(PassManagerBase &amp;PM, bool Fast);
 354   virtual bool addPreEmitPass(PassManagerBase &amp;PM, bool Fast);
 355   virtual bool addAssemblyEmitter(PassManagerBase &amp;PM, bool Fast,
 356                                   std::ostream &amp;Out);
 357 };
 358
 359 } // end namespace llvm
 360 </pre>
 361 </div>
 362
 363 </div>
 364
 365
 366 <div class="doc_text">
  367 <p>The accessor methods declared in this header are:</p>
 368 <ul>
 369 <li><tt>getInstrInfo()</tt></li>
 370 <li><tt>getRegisterInfo()</tt></li>
 371 <li><tt>getFrameInfo()</tt></li>
 372 <li><tt>getTargetData()</tt></li>
 373 <li><tt>getSubtargetImpl()</tt></li>
 374 </ul>
 375
 376 <p>For some targets, you also need to support the following methods:</p>
 377
 378 <ul>
 379 <li><tt>getTargetLowering()</tt></li>
 380 <li><tt>getJITInfo()</tt></li>
 381 </ul>
 382
 383 <p>
 384 In addition, the <tt>XXXTargetMachine</tt> constructor should specify a
 385 <tt>TargetDescription</tt> string that determines the data layout for the target
 386 machine, including characteristics such as pointer size, alignment, and
 387 endianness. For example, the constructor for SparcTargetMachine contains the
 388 following:
 389 </p>
 390
 391 <div class="doc_code">
 392 <pre>
 393 SparcTargetMachine::SparcTargetMachine(const Module &amp;M, const std::string &amp;FS)
 394   : DataLayout("E-p:32:32-f128:128:128"),
 395     Subtarget(M, FS), InstrInfo(Subtarget),
 396     FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
 397 }
 398 </pre>
 399 </div>
 400
 401 </div>
 402
 403 <div class="doc_text">
 404
 405 <p>Hyphens separate portions of the <tt>TargetDescription</tt> string.</p>
 406
 407 <ul>
 408 <li>An upper-case "<tt>E</tt>" in the string indicates a big-endian target data
  409     model. A lower-case "<tt>e</tt>" indicates little-endian.</li>
 410
 411 <li>"<tt>p:</tt>" is followed by pointer information: size, ABI alignment, and
 412     preferred alignment. If only two figures follow "<tt>p:</tt>", then the
 413     first value is pointer size, and the second value is both ABI and preferred
 414     alignment.</li>
 415
  416 <li>The remaining fields describe type alignment. Each starts with a letter:
  417     "<tt>i</tt>", "<tt>f</tt>", "<tt>v</tt>", or "<tt>a</tt>" (corresponding to
  418     integer, floating point, vector, or aggregate). The letter is followed by
  419     the size of the type in bits, then the ABI alignment, and then the preferred
  420     alignment. In the example, "<tt>f128:128:128</tt>" describes a 128-bit
  421     floating-point type (a long double) aligned to 128 bits.</li>
 422 </ul>
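<p>
As a rough illustration of how such a string breaks down, the following
standalone C++ sketch splits a description at hyphens and reads only the
endianness flag and the "<tt>p:</tt>" field. The names <tt>LayoutSummary</tt>,
<tt>PointerSpec</tt>, and <tt>summarizeLayout</tt> are hypothetical and are not
part of LLVM, which handles the string itself in the <tt>TargetData</tt> class.
</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;
#include &lt;sstream&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical helper types for illustration; not part of LLVM.
struct PointerSpec {
  unsigned SizeInBits = 0;
  unsigned ABIAlign = 0;
  unsigned PrefAlign = 0;
};

struct LayoutSummary {
  bool BigEndian = false;
  PointerSpec Pointer;
};

// Split Text at every occurrence of Sep.
static std::vector&lt;std::string&gt; split(const std::string &amp;Text, char Sep) {
  std::vector&lt;std::string&gt; Parts;
  std::stringstream Stream(Text);
  std::string Part;
  while (std::getline(Stream, Part, Sep))
    Parts.push_back(Part);
  return Parts;
}

// Read the endianness flag and the "p:" field; all other fields are ignored.
LayoutSummary summarizeLayout(const std::string &amp;Description) {
  LayoutSummary Summary;
  for (const std::string &amp;Field : split(Description, '-')) {
    if (Field == "E") {
      Summary.BigEndian = true;
    } else if (Field == "e") {
      Summary.BigEndian = false;
    } else if (!Field.empty() &amp;&amp; Field[0] == 'p') {
      std::vector&lt;std::string&gt; Values = split(Field, ':');
      // Values[0] is "p"; at least a size and an ABI alignment must follow.
      assert(Values.size() &gt;= 3 &amp;&amp; "pointer field needs size and alignment");
      Summary.Pointer.SizeInBits = std::stoul(Values[1]);
      Summary.Pointer.ABIAlign = std::stoul(Values[2]);
      // With only two figures, the second is both ABI and preferred alignment.
      Summary.Pointer.PrefAlign =
          Values.size() &gt; 3 ? std::stoul(Values[3]) : Summary.Pointer.ABIAlign;
    }
  }
  return Summary;
}
</pre>
</div>

<p>
Under these assumptions, passing the SPARC string
"<tt>E-p:32:32-f128:128:128</tt>" yields a big-endian summary with a 32-bit
pointer whose ABI and preferred alignment are both 32.
</p>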
 423
 424 <p>
 425 You must also register your target using the <tt>RegisterTarget</tt>
 426 template. (See the <tt>TargetMachineRegistry</tt> class.) For example,
 427 in <tt>SparcTargetMachine.cpp</tt>, the target is registered with:
 428 </p>
 429
 430 <div class="doc_code">
 431 <pre>
 432 namespace {
 433   // Register the target.
  434   RegisterTarget&lt;SparcTargetMachine&gt; X("sparc", "SPARC");
 435 }
 436 </pre>
 437 </div>
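<p>
The registration shown here is a namespace-scope object, so its constructor
runs before <tt>main()</tt>. The standalone C++ sketch below illustrates that
general idiom only. The names <tt>targetRegistry</tt>, <tt>TargetEntry</tt>,
<tt>RegisterDummyTarget</tt>, and <tt>lookupTarget</tt> are hypothetical, and
the sketch does not reproduce how LLVM implements <tt>RegisterTarget</tt>.
</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;
#include &lt;map&gt;
#include &lt;string&gt;

// Hypothetical registry entry for illustration; LLVM's real classes differ.
struct TargetEntry {
  std::string Description;
};

// A function-local static avoids depending on the initialization order
// of objects in different translation units.
std::map&lt;std::string, TargetEntry&gt; &amp;targetRegistry() {
  static std::map&lt;std::string, TargetEntry&gt; Registry;
  return Registry;
}

// Constructing an object of this type records the target in the registry.
struct RegisterDummyTarget {
  RegisterDummyTarget(const std::string &amp;Name, const std::string &amp;Desc) {
    targetRegistry()[Name] = TargetEntry{Desc};
  }
};

// Look up a target by its short name; the name must have been registered.
const TargetEntry &amp;lookupTarget(const std::string &amp;Name) {
  auto It = targetRegistry().find(Name);
  assert(It != targetRegistry().end() &amp;&amp; "unknown target name");
  return It-&gt;second;
}

namespace {
  // Registers the hypothetical target during static initialization.
  RegisterDummyTarget X("dummy", "Hypothetical Dummy target");
}
</pre>
</div>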
 438
 439 </div>
 440
 441 <!-- *********************************************************************** -->
 442 <div class="doc_section">
 443   <a name="RegisterSet">Register Set and Register Classes</a>
 444 </div>
 445 <!-- *********************************************************************** -->
 446
 447 <div class="doc_text">
 448
 449 <p>
 450 You should describe a concrete target-specific class that represents the
 451 register file of a target machine. This class is called <tt>XXXRegisterInfo</tt>
  452 (where <tt>XXX</tt> identifies the target) and represents the register
  453 file data that is used for register allocation. It also describes the
 454 interactions between registers.
 455 </p>
 456
 457 <p>
 458 You also need to define register classes to categorize related registers. A
 459 register class should be added for groups of registers that are all treated the
 460 same way for some instruction. Typical examples are register classes for
  461 integer, floating-point, or vector registers. The register allocator may assign
  462 any register in the specified register class to an instruction operand, because
  463 all members of the class behave alike for that instruction. Virtual registers
  464 are allocated to instructions from these sets, and register classes let the
  465 target-independent register allocator automatically choose the actual registers.
 466 </p>
 467
 468 <p>
 469 Much of the code for registers, including register definition, register aliases,
 470 and register classes, is generated by TableGen from <tt>XXXRegisterInfo.td</tt>
 471 input files and placed in <tt>XXXGenRegisterInfo.h.inc</tt> and
 472 <tt>XXXGenRegisterInfo.inc</tt> output files. Some of the code in the
 473 implementation of <tt>XXXRegisterInfo</tt> requires hand-coding.
 474 </p>
 475
 476 </div>
 477
 478 <!-- ======================================================================= -->
 479 <div class="doc_subsection">
 480   <a name="RegisterDef">Defining a Register</a>
 481 </div>
 482
 483 <div class="doc_text">
 484
 485 <p>
 486 The <tt>XXXRegisterInfo.td</tt> file typically starts with register definitions
 487 for a target machine. The <tt>Register</tt> class (specified
 488 in <tt>Target.td</tt>) is used to define an object for each register. The
 489 specified string <tt>n</tt> becomes the <tt>Name</tt> of the register. The
 490 basic <tt>Register</tt> object does not have any subregisters and does not
 491 specify any aliases.
 492 </p>
 493
 494 <div class="doc_code">
 495 <pre>
 496 class Register&lt;string n&gt; {
 497   string Namespace = "";
 498   string AsmName = n;
 499   string Name = n;
 500   int SpillSize = 0;
 501   int SpillAlignment = 0;
 502   list&lt;Register&gt; Aliases = [];
 503   list&lt;Register&gt; SubRegs = [];
 504   list&lt;int&gt; DwarfNumbers = [];
 505 }
 506 </pre>
 507 </div>
 508
 509 <p>
 510 For example, in the <tt>X86RegisterInfo.td</tt> file, there are register
 511 definitions that utilize the Register class, such as:
 512 </p>
 513
 514 <div class="doc_code">
 515 <pre>
 516 def AL : Register&lt;"AL"&gt;, DwarfRegNum&lt;[0, 0, 0]&gt;;
 517 </pre>
 518 </div>
 519
 520 <p>
 521 This defines the register <tt>AL</tt> and assigns it values (with
 522 <tt>DwarfRegNum</tt>) that are used by <tt>gcc</tt>, <tt>gdb</tt>, or a debug
 523 information writer (such as <tt>DwarfWriter</tt>
 524 in <tt>llvm/lib/CodeGen/AsmPrinter</tt>) to identify a register. For register
 525 <tt>AL</tt>, <tt>DwarfRegNum</tt> takes an array of 3 values representing 3
 526 different modes: the first element is for X86-64, the second for exception
 527 handling (EH) on X86-32, and the third is generic. -1 is a special Dwarf number
 528 that indicates the gcc number is undefined, and -2 indicates the register number
 529 is invalid for this mode.
 530 </p>
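<p>
To make the three-mode lookup and the two sentinel values concrete, here is a
small standalone C++ sketch. The enumeration, the constants, and
<tt>getDwarfNumber</tt> are hypothetical names chosen for illustration; they
are not the interface that LLVM or TableGen provides.
</p>

<div class="doc_code">
<pre>
#include &lt;array&gt;
#include &lt;cassert&gt;

// Hypothetical mode indices, in the order described in the text.
enum DwarfFlavour { X86_64 = 0, X86_32_EH = 1, Generic = 2 };

const int DwarfUndefined = -1; // the gcc number is undefined
const int DwarfInvalid = -2;   // the register number is invalid for this mode

// One Dwarf number per mode for a single register.
struct DwarfNumbers {
  std::array&lt;int, 3&gt; PerFlavour;
};

// Return true and set Number only when the register has a usable number
// in the requested mode; Number is left untouched otherwise.
bool getDwarfNumber(const DwarfNumbers &amp;Reg, DwarfFlavour Flavour,
                    int &amp;Number) {
  int Candidate = Reg.PerFlavour[Flavour];
  if (Candidate == DwarfUndefined || Candidate == DwarfInvalid)
    return false;
  assert(Candidate &gt;= 0 &amp;&amp; "unexpected negative Dwarf number");
  Number = Candidate;
  return true;
}
</pre>
</div>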
 531
 532 <p>
 533 From the previously described line in the <tt>X86RegisterInfo.td</tt> file,
 534 TableGen generates this code in the <tt>X86GenRegisterInfo.inc</tt> file:
 535 </p>
 536
 537 <div class="doc_code">
 538 <pre>
 539 static const unsigned GR8[] = { X86::AL, ... };
 540
 541 const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };
 542
 543 const TargetRegisterDesc RegisterDescriptors[] = {
 544   ...
 545 { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
 546 </pre>
 547 </div>
 548
 549 <p>
 550 From the register info file, TableGen generates a <tt>TargetRegisterDesc</tt>
 551 object for each register. <tt>TargetRegisterDesc</tt> is defined in
 552 <tt>include/llvm/Target/TargetRegisterInfo.h</tt> with the following fields:
 553 </p>
 554
 555 <div class="doc_code">
 556 <pre>
 557 struct TargetRegisterDesc {
 558   const char     *AsmName;      // Assembly language name for the register
 559   const char     *Name;         // Printable name for the reg (for debugging)
 560   const unsigned *AliasSet;     // Register Alias Set
 561   const unsigned *SubRegs;      // Sub-register set
 562   const unsigned *ImmSubRegs;   // Immediate sub-register set
 563   const unsigned *SuperRegs;    // Super-register set
 564 };</pre>
 565 </div>
 566
 567 <p>
 568 TableGen uses the entire target description file (<tt>.td</tt>) to determine
 569 text names for the register (in the <tt>AsmName</tt> and <tt>Name</tt> fields of
 570 <tt>TargetRegisterDesc</tt>) and the relationships of other registers to the
 571 defined register (in the other <tt>TargetRegisterDesc</tt> fields). In this
 572 example, other definitions establish the registers "<tt>AX</tt>",
 573 "<tt>EAX</tt>", and "<tt>RAX</tt>" as aliases for one another, so TableGen
 574 generates a null-terminated array (<tt>AL_AliasSet</tt>) for this register alias
 575 set.
 576 </p>
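<p>
A zero-terminated array such as <tt>AL_AliasSet</tt> is walked until the
terminating zero is reached. The standalone C++ sketch below shows the pattern
with made-up register numbers. The constants and the helper functions
<tt>isInAliasSet</tt> and <tt>aliasSetSize</tt> are hypothetical, and real
register numbers come from the TableGen output.
</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// Hypothetical register numbers; zero is reserved as the list terminator.
const unsigned AL = 1;
const unsigned AX = 2;
const unsigned EAX = 3;
const unsigned RAX = 4;

// A zero-terminated list in the style of the generated AL_AliasSet.
const unsigned ExampleAliasSet[] = { AX, EAX, RAX, 0 };

// Walk the list until the terminating zero and report whether Reg appears.
bool isInAliasSet(const unsigned *AliasSet, unsigned Reg) {
  assert(Reg != 0 &amp;&amp; "zero is reserved as the terminator");
  for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
    if (*Alias == Reg)
      return true;
  return false;
}

// Count the entries that precede the terminator.
std::size_t aliasSetSize(const unsigned *AliasSet) {
  std::size_t Count = 0;
  while (AliasSet[Count] != 0)
    ++Count;
  return Count;
}
</pre>
</div>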
 577
 578 <p>
 579 The <tt>Register</tt> class is commonly used as a base class for more complex
 580 classes. In <tt>Target.td</tt>, the <tt>Register</tt> class is the base for the
 581 <tt>RegisterWithSubRegs</tt> class that is used to define registers that need to
 582 specify subregisters in the <tt>SubRegs</tt> list, as shown here:
 583 </p>
 584
 585 <div class="doc_code">
 586 <pre>
 587 class RegisterWithSubRegs&lt;string n,
  588                           list&lt;Register&gt; subregs&gt; : Register&lt;n&gt; {
 589   let SubRegs = subregs;
 590 }
 591 </pre>
 592 </div>
 593
 594 <p>
 595 In <tt>SparcRegisterInfo.td</tt>, additional register classes are defined for
  596 SPARC: a <tt>Register</tt> subclass, <tt>SparcReg</tt>, and further subclasses: <tt>Ri</tt>,
  597 <tt>Rf</tt>, and <tt>Rd</tt>. SPARC registers are identified by 5-bit ID
  598 numbers, which is a feature common to these subclasses. Note the use of
  599 '<tt>let</tt>' expressions to override values that are initially defined in a
  600 superclass (such as the <tt>SubRegs</tt> field in the <tt>Rd</tt> class).
 601 </p>
 602
 603 <div class="doc_code">
 604 <pre>
 605 class SparcReg&lt;string n&gt; : Register&lt;n&gt; {
 606   field bits&lt;5&gt; Num;
 607   let Namespace = "SP";
 608 }
 609 // Ri - 32-bit integer registers
  610 class Ri&lt;bits&lt;5&gt; num, string n&gt; :
  611     SparcReg&lt;n&gt; {
  612   let Num = num;
  613 }
  614 // Rf - 32-bit floating-point registers
  615 class Rf&lt;bits&lt;5&gt; num, string n&gt; :
  616     SparcReg&lt;n&gt; {
  617   let Num = num;
  618 }
  619 // Rd - Slots in the FP register file for 64-bit
  620 // floating-point values.
  621 class Rd&lt;bits&lt;5&gt; num, string n,
  622          list&lt;Register&gt; subregs&gt; : SparcReg&lt;n&gt; {
 623   let Num = num;
 624   let SubRegs = subregs;
 625 }
 626 </pre>
 627 </div>
 628
 629 <p>
 630 In the <tt>SparcRegisterInfo.td</tt> file, there are register definitions that
 631 utilize these subclasses of <tt>Register</tt>, such as:
 632 </p>
 633
 634 <div class="doc_code">
 635 <pre>
  636 def G0 : Ri&lt; 0, "G0"&gt;,
  637          DwarfRegNum&lt;[0]&gt;;
  638 def G1 : Ri&lt; 1, "G1"&gt;, DwarfRegNum&lt;[1]&gt;;
  639 ...
  640 def F0 : Rf&lt; 0, "F0"&gt;,
  641          DwarfRegNum&lt;[32]&gt;;
  642 def F1 : Rf&lt; 1, "F1"&gt;,
  643          DwarfRegNum&lt;[33]&gt;;
  644 ...
  645 def D0 : Rd&lt; 0, "F0", [F0, F1]&gt;,
  646          DwarfRegNum&lt;[32]&gt;;
  647 def D1 : Rd&lt; 2, "F2", [F2, F3]&gt;,
  648          DwarfRegNum&lt;[34]&gt;;
 649 </pre>
 650 </div>
 651
 652 <p>
 653 The last two registers shown above (<tt>D0</tt> and <tt>D1</tt>) are
 654 double-precision floating-point registers that are aliases for pairs of
 655 single-precision floating-point sub-registers. In addition to aliases, the
 656 sub-register and super-register relationships of the defined register are in
 657 fields of a register's TargetRegisterDesc.
 658 </p>
 659
 660 </div>
 661
 662 <!-- ======================================================================= -->
 663 <div class="doc_subsection">
 664   <a name="RegisterClassDef">Defining a Register Class</a>
 665 </div>
 666
 667 <div class="doc_text">
 668
 669 <p>
 670 The <tt>RegisterClass</tt> class (specified in <tt>Target.td</tt>) is used to
 671 define an object that represents a group of related registers and also defines
 672 the default allocation order of the registers. A target description file
 673 <tt>XXXRegisterInfo.td</tt> that uses <tt>Target.td</tt> can construct register
 674 classes using the following class:
 675 </p>
 676
 677 <div class="doc_code">
 678 <pre>
 679 class RegisterClass&lt;string namespace,
  680                     list&lt;ValueType&gt; regTypes, int alignment,
 681                     list&lt;Register&gt; regList&gt; {
 682   string Namespace = namespace;
 683   list&lt;ValueType&gt; RegTypes = regTypes;
 684   int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
 685   int Alignment = alignment;
 686
 687   // CopyCost is the cost of copying a value between two registers
 688   // default value 1 means a single instruction
 689   // A negative value means copying is extremely expensive or impossible
 690   int CopyCost = 1;
 691   list&lt;Register&gt; MemberList = regList;
 692
 693   // for register classes that are subregisters of this class
 694   list&lt;RegisterClass&gt; SubRegClassList = [];
 695
 696   code MethodProtos = [{}];  // to insert arbitrary code
 697   code MethodBodies = [{}];
 698 }
 699 </pre>
 700 </div>
 701
 702 <p>To define a RegisterClass, use the following 4 arguments:</p>
 703
 704 <ul>
 705 <li>The first argument of the definition is the name of the namespace.</li>
 706
 707 <li>The second argument is a list of <tt>ValueType</tt> register type values
 708     that are defined in <tt>include/llvm/CodeGen/ValueTypes.td</tt>. Defined
 709     values include integer types (such as <tt>i16</tt>, <tt>i32</tt>,
 710     and <tt>i1</tt> for Boolean), floating-point types
 711     (<tt>f32</tt>, <tt>f64</tt>), and vector types (for example, <tt>v8i16</tt>
 712     for an <tt>8 x i16</tt> vector). All registers in a <tt>RegisterClass</tt>
 713     must have the same <tt>ValueType</tt>, but some registers may store vector
  714     data in different configurations. For example, a register that can process a
 715     128-bit vector may be able to handle 16 8-bit integer elements, 8 16-bit
 716     integers, 4 32-bit integers, and so on. </li>
 717
 718 <li>The third argument of the <tt>RegisterClass</tt> definition specifies the
  719     alignment required of the registers when they are stored to or loaded from
  720     memory.</li>
 721
 722 <li>The final argument, <tt>regList</tt>, specifies which registers are in this
 723     class.  If an <tt>allocation_order_*</tt> method is not specified,
 724     then <tt>regList</tt> also defines the order of allocation used by the
 725     register allocator.</li>
 726 </ul>
 727
 728 <p>
 729 In <tt>SparcRegisterInfo.td</tt>, three RegisterClass objects are defined:
 730 <tt>FPRegs</tt>, <tt>DFPRegs</tt>, and <tt>IntRegs</tt>. For all three register
 731 classes, the first argument defines the namespace with the string
 732 '<tt>SP</tt>'. <tt>FPRegs</tt> defines a group of 32 single-precision
 733 floating-point registers (<tt>F0</tt> to <tt>F31</tt>); <tt>DFPRegs</tt> defines
 734 a group of 16 double-precision registers
 735 (<tt>D0-D15</tt>). For <tt>IntRegs</tt>, the <tt>MethodProtos</tt>
  736 and <tt>MethodBodies</tt> fields are used by TableGen to insert the specified
 737 code into generated output.
 738 </p>
 739
 740 <div class="doc_code">
 741 <pre>
 742 def FPRegs : RegisterClass&lt;"SP", [f32], 32,
 743   [F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15,
 744    F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31]&gt;;
 745
 746 def DFPRegs : RegisterClass&lt;"SP", [f64], 64,
 747   [D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, D10, D11, D12, D13, D14, D15]&gt;;
  748
 749 def IntRegs : RegisterClass&lt;"SP", [i32], 32,
 750     [L0, L1, L2, L3, L4, L5, L6, L7,
 751      I0, I1, I2, I3, I4, I5,
 752      O0, O1, O2, O3, O4, O5, O7,
 753      G1,
 754      // Non-allocatable regs:
 755      G2, G3, G4,
 756      O6,        // stack ptr
     I6,        // frame ptr
 758      I7,        // return address
 759      G0,        // constant zero
 760      G5, G6, G7 // reserved for kernel
 761     ]&gt; {
 762   let MethodProtos = [{
 763     iterator allocation_order_end(const MachineFunction &amp;MF) const;
 764   }];
 765   let MethodBodies = [{
 766     IntRegsClass::iterator
 767     IntRegsClass::allocation_order_end(const MachineFunction &amp;MF) const {
 768       return end() - 10  // Don't allocate special registers
 769          -1;
 770     }
 771   }];
 772 }
 773 </pre>
 774 </div>
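
<p>
The effect of <tt>end() - 10 - 1</tt> can be checked with a small standalone
model. The sketch below is not LLVM code: <tt>IntRegsOrder</tt>,
<tt>allocatableCount</tt>, and <tt>lastAllocatable</tt> are hypothetical names,
and the array only copies the register order from the <tt>IntRegs</tt>
definition above.
</p>

```cpp
#include <array>
#include <cstddef>
#include <string>

// Register names in the same order as the IntRegs definition above.
// This is a plain model of the list, not the TableGen-generated array.
static const std::array<std::string, 32> IntRegsOrder = {
    "L0", "L1", "L2", "L3", "L4", "L5", "L6", "L7",
    "I0", "I1", "I2", "I3", "I4", "I5",
    "O0", "O1", "O2", "O3", "O4", "O5", "O7",
    "G1",
    // Listed after the "Non-allocatable regs" comment:
    "G2", "G3", "G4", "O6", "I6", "I7", "G0", "G5", "G6", "G7"};

// Mirrors "return end() - 10 - 1;" from the MethodBodies entry and
// reports how many registers fall inside the allocation order.
std::size_t allocatableCount() {
  auto End = IntRegsOrder.end() - 10 - 1;
  return static_cast<std::size_t>(End - IntRegsOrder.begin());
}

// Name of the last register inside the allocation order.
std::string lastAllocatable() {
  return IntRegsOrder[allocatableCount() - 1];
}
```

<p>
With the order shown, the range covers 21 of the 32 registers and ends at
<tt>O7</tt>, so <tt>G1</tt> is excluded along with the ten registers listed
after the "Non-allocatable regs" comment.
</p>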
 775
 776 <p>
 777 Using <tt>SparcRegisterInfo.td</tt> with TableGen generates several output files
 778 that are intended for inclusion in other source code that you write.
 779 <tt>SparcRegisterInfo.td</tt> generates <tt>SparcGenRegisterInfo.h.inc</tt>,
 780 which should be included in the header file for the implementation of the SPARC
 781 register implementation that you write (<tt>SparcRegisterInfo.h</tt>). In
 782 <tt>SparcGenRegisterInfo.h.inc</tt> a new structure is defined called
 783 <tt>SparcGenRegisterInfo</tt> that uses <tt>TargetRegisterInfo</tt> as its
 784 base. It also specifies types, based upon the defined register
 785 classes: <tt>DFPRegsClass</tt>, <tt>FPRegsClass</tt>, and <tt>IntRegsClass</tt>.
 786 </p>
 787
 788 <p>
 789 <tt>SparcRegisterInfo.td</tt> also generates <tt>SparcGenRegisterInfo.inc</tt>,
 790 which is included at the bottom of <tt>SparcRegisterInfo.cpp</tt>, the SPARC
 791 register implementation. The code below shows only the generated integer
 792 registers and associated register classes. The order of registers
 793 in <tt>IntRegs</tt> reflects the order in the definition of <tt>IntRegs</tt> in
 794 the target description file. Take special note of the use
 795 of <tt>MethodBodies</tt> in <tt>SparcRegisterInfo.td</tt> to create code in
 796 <tt>SparcGenRegisterInfo.inc</tt>. <tt>MethodProtos</tt> generates similar code
 797 in <tt>SparcGenRegisterInfo.h.inc</tt>.
 798 </p>
 799
 800 <div class="doc_code">
 801 <pre>  // IntRegs Register Class...
 802   static const unsigned IntRegs[] = {
 803     SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
 804     SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
 805     SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
 806     SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
 807     SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
 808     SP::G6, SP::G7,
 809   };
 810
 811   // IntRegsVTs Register Class Value Types...
 812   static const MVT::ValueType IntRegsVTs[] = {
 813     MVT::i32, MVT::Other
 814   };
 815
 816 namespace SP {   // Register class instances
 817   DFPRegsClass&nbsp;&nbsp;&nbsp; DFPRegsRegClass;
 818   FPRegsClass&nbsp;&nbsp;&nbsp;&nbsp; FPRegsRegClass;
 819   IntRegsClass&nbsp;&nbsp;&nbsp; IntRegsRegClass;
 820 ...
 821   // IntRegs Sub-register Classess...
 822   static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
 823     NULL
 824   };
 825 ...
 826   // IntRegs Super-register Classess...
 827   static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
 828     NULL
 829   };
 830 ...
 831   // IntRegs Register Class sub-classes...
 832   static const TargetRegisterClass* const IntRegsSubclasses [] = {
 833     NULL
 834   };
 835 ...
 836   // IntRegs Register Class super-classes...
 837   static const TargetRegisterClass* const IntRegsSuperclasses [] = {
 838     NULL
 839   };
 840 ...
 841   IntRegsClass::iterator
 842   IntRegsClass::allocation_order_end(const MachineFunction &amp;MF) const {
 843      return end()-10  // Don't allocate special registers
 844          -1;
 845   }
 846
 847   IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
 848     IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
 849     IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
 850 }
 851 </pre>
 852 </div>
 853
 854 </div>
 855
 856 <!-- ======================================================================= -->
 857 <div class="doc_subsection">
 858   <a name="implementRegister">Implement a subclass of</a>
 859   <a href="http://www.llvm.org/docs/CodeGenerator.html#targetregisterinfo">TargetRegisterInfo</a>
 860 </div>
 861
 862 <div class="doc_text">
 863
 864 <p>
The final step is to hand-code portions of <tt>XXXRegisterInfo</tt>, which
implements the interface described in <tt>TargetRegisterInfo.h</tt>. These
functions return <tt>0</tt>, <tt>NULL</tt>, or <tt>false</tt>, unless
overridden. The following functions are overridden for the SPARC
implementation in <tt>SparcRegisterInfo.cpp</tt>:
 870 </p>
 871
 872 <ul>
 873 <li><tt>getCalleeSavedRegs</tt> &mdash; Returns a list of callee-saved registers
 874     in the order of the desired callee-save stack frame offset.</li>
 875
<li><tt>getCalleeSavedRegClasses</tt> &mdash; Returns a list of preferred
    register classes with which to spill each callee-saved register.</li>

<li><tt>getReservedRegs</tt> &mdash; Returns a bitset indexed by physical
    register numbers, indicating whether each register is unavailable.</li>

<li><tt>hasFP</tt> &mdash; Returns a Boolean indicating whether a function
    should have a dedicated frame pointer register.</li>
 884
 885 <li><tt>eliminateCallFramePseudoInstr</tt> &mdash; If call frame setup or
 886     destroy pseudo instructions are used, this can be called to eliminate
 887     them.</li>
 888
 889 <li><tt>eliminateFrameIndex</tt> &mdash; Eliminate abstract frame indices from
 890     instructions that may use them.</li>
 891
 892 <li><tt>emitPrologue</tt> &mdash; Insert prologue code into the function.</li>
 893
 894 <li><tt>emitEpilogue</tt> &mdash; Insert epilogue code into the function.</li>
 895 </ul>
 896
 897 </div>
 898
 899 <!-- *********************************************************************** -->
 900 <div class="doc_section">
 901   <a name="InstructionSet">Instruction Set</a>
 902 </div>
 903
 904 <!-- *********************************************************************** -->
 905 <div class="doc_text">
 906
 907 <p>
During the early stages of code generation, the LLVM IR code is converted to a
<tt>SelectionDAG</tt> with nodes that are instances of the <tt>SDNode</tt> class
containing target instructions. An <tt>SDNode</tt> has an opcode, operands, type
requirements, and operation properties. The properties record, for example,
whether an operation is commutative or whether it loads from memory. The various
operation node types are described in the
<tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt> file (values of the
<tt>NodeType</tt> enum in the <tt>ISD</tt> namespace).
 915 </p>
 916
 917 <p>
 918 TableGen uses the following target description (<tt>.td</tt>) input files to
 919 generate much of the code for instruction definition:
 920 </p>
 921
 922 <ul>
 923 <li><tt>Target.td</tt> &mdash; Where the <tt>Instruction</tt>, <tt>Operand</tt>,
 924     <tt>InstrInfo</tt>, and other fundamental classes are defined.</li>
 925
<li><tt>TargetSelectionDAG.td</tt> &mdash; Used by <tt>SelectionDAG</tt>
    instruction selection generators. It contains the <tt>SDTC*</tt> classes
    (selection DAG type constraints), definitions of <tt>SelectionDAG</tt> nodes
    (such as <tt>imm</tt>, <tt>cond</tt>, <tt>bb</tt>, <tt>add</tt>,
    <tt>fadd</tt>, <tt>sub</tt>), and pattern support (<tt>Pattern</tt>,
    <tt>Pat</tt>, <tt>PatFrag</tt>, <tt>PatLeaf</tt>,
    <tt>ComplexPattern</tt>).</li>
 932
 933 <li><tt>XXXInstrFormats.td</tt> &mdash; Patterns for definitions of
 934     target-specific instructions.</li>
 935
<li><tt>XXXInstrInfo.td</tt> &mdash; Target-specific definitions of instruction
    templates, condition codes, and instructions of an instruction set. For
    architecture modifications, a different file name may be used. For example,
    for Pentium with SSE instructions, this file is <tt>X86InstrSSE.td</tt>, and
    for Pentium with MMX, this file is <tt>X86InstrMMX.td</tt>.</li>
 941 </ul>
 942
 943 <p>
 944 There is also a target-specific <tt>XXX.td</tt> file, where <tt>XXX</tt> is the
 945 name of the target. The <tt>XXX.td</tt> file includes the other <tt>.td</tt>
 946 input files, but its contents are only directly important for subtargets.
 947 </p>
 948
 949 <p>
 950 You should describe a concrete target-specific class <tt>XXXInstrInfo</tt> that
 951 represents machine instructions supported by a target machine.
 952 <tt>XXXInstrInfo</tt> contains an array of <tt>XXXInstrDescriptor</tt> objects,
 953 each of which describes one instruction. An instruction descriptor defines:</p>
 954
 955 <ul>
 956 <li>Opcode mnemonic</li>
 957
 958 <li>Number of operands</li>
 959
 960 <li>List of implicit register definitions and uses</li>
 961
 962 <li>Target-independent properties (such as memory access, is commutable)</li>
 963
 964 <li>Target-specific flags </li>
 965 </ul>
 966
 967 <p>
The <tt>Instruction</tt> class (defined in <tt>Target.td</tt>) is mostly used as a base
for more complex instruction classes.
 970 </p>
 971
 972 <div class="doc_code">
 973 <pre>class Instruction {
 974   string Namespace = "";
  dag OutOperandList;       // A dag containing the MI def operand list.
  dag InOperandList;        // A dag containing the MI use operand list.
 977   string AsmString = "";    // The .s format to print the instruction with.
 978   list&lt;dag&gt; Pattern;  // Set to the DAG pattern for this instruction
 979   list&lt;Register&gt; Uses = [];
 980   list&lt;Register&gt; Defs = [];
 981   list&lt;Predicate&gt; Predicates = [];  // predicates turned into isel match code
 982   ... remainder not shown for space ...
 983 }
 984 </pre>
 985 </div>
 986
 987 <p>
 988 A <tt>SelectionDAG</tt> node (<tt>SDNode</tt>) should contain an object
 989 representing a target-specific instruction that is defined
 990 in <tt>XXXInstrInfo.td</tt>. The instruction objects should represent
 991 instructions from the architecture manual of the target machine (such as the
 992 SPARC Architecture Manual for the SPARC target).
 993 </p>
 994
 995 <p>
 996 A single instruction from the architecture manual is often modeled as multiple
 997 target instructions, depending upon its operands. For example, a manual might
 998 describe an add instruction that takes a register or an immediate operand. An
 999 LLVM target could model this with two instructions named <tt>ADDri</tt> and
1000 <tt>ADDrr</tt>.
1001 </p>
1002
1003 <p>
1004 You should define a class for each instruction category and define each opcode
1005 as a subclass of the category with appropriate parameters such as the fixed
1006 binary encoding of opcodes and extended opcodes. You should map the register
1007 bits to the bits of the instruction in which they are encoded (for the
1008 JIT). Also you should specify how the instruction should be printed when the
1009 automatic assembly printer is used.
1010 </p>
1011
1012 <p>
1013 As is described in the SPARC Architecture Manual, Version 8, there are three
1014 major 32-bit formats for instructions. Format 1 is only for the <tt>CALL</tt>
1015 instruction. Format 2 is for branch on condition codes and <tt>SETHI</tt> (set
1016 high bits of a register) instructions.  Format 3 is for other instructions.
1017 </p>
1018
1019 <p>
Each of these formats has corresponding classes in <tt>SparcInstrFormats.td</tt>.
<tt>InstSP</tt> is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example,
in <tt>SparcInstrFormats.td</tt>, <tt>F2_1</tt> is for <tt>SETHI</tt>,
and <tt>F2_2</tt> is for branches. There are three other base
classes: <tt>F3_1</tt> for register/register operations, <tt>F3_2</tt> for
register/immediate operations, and <tt>F3_3</tt> for floating-point
operations. <tt>SparcInstrInfo.td</tt> also adds the base class <tt>Pseudo</tt> for
synthetic SPARC instructions.
1029 </p>
1030
1031 <p>
1032 <tt>SparcInstrInfo.td</tt> largely consists of operand and instruction
1033 definitions for the SPARC target. In <tt>SparcInstrInfo.td</tt>, the following
1034 target description file entry, <tt>LDrr</tt>, defines the Load Integer
1035 instruction for a Word (the <tt>LD</tt> SPARC opcode) from a memory address to a
1036 register. The first parameter, the value 3 (<tt>11<sub>2</sub></tt>), is the
operation value for this category of operation. The second parameter
(<tt>000000<sub>2</sub></tt>) is the specific operation value
for <tt>LD</tt>/Load Word. The third parameter is the output destination, a
register operand whose class (<tt>IntRegs</tt>) is defined in the register
target description file.
1042 </p>
1043
1044 <div class="doc_code">
1045 <pre>def LDrr : F3_1 &lt;3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
1046                  "ld [$addr], $dst",
1047                  [(set IntRegs:$dst, (load ADDRrr:$addr))]&gt;;
1048 </pre>
1049 </div>
1050
1051 <p>
1052 The fourth parameter is the input source, which uses the address
1053 operand <tt>MEMrr</tt> that is defined earlier in <tt>SparcInstrInfo.td</tt>:
1054 </p>
1055
1056 <div class="doc_code">
1057 <pre>def MEMrr : Operand&lt;i32&gt; {
1058   let PrintMethod = "printMemOperand";
1059   let MIOperandInfo = (ops IntRegs, IntRegs);
1060 }
1061 </pre>
1062 </div>
1063
1064 <p>
The fifth parameter is a string that is used by the assembly printer and can be
left as an empty string until the assembly printer interface is implemented. The
sixth and final parameter is the pattern used to match the instruction during
the SelectionDAG Select Phase described in
<a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM
Target-Independent Code Generator</a>.  This parameter is detailed in the next
section, <a href="#InstructionSelector">Instruction Selector</a>.
1072 </p>
1073
1074 <p>
Instruction class definitions are not overloaded for different operand types, so
separate versions of instructions are needed for register, memory, or immediate
value operands. For example, to load a word into a register from an address
formed from a register and an immediate operand, the following instruction
class is defined:
1080 </p>
1081
1082 <div class="doc_code">
1083 <pre>def LDri : F3_2 &lt;3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
1084                  "ld [$addr], $dst",
1085                  [(set IntRegs:$dst, (load ADDRri:$addr))]&gt;;
1086 </pre>
1087 </div>
1088
1089 <p>
Writing these definitions for so many similar instructions can involve a lot of
cut and paste. In <tt>.td</tt> files, the <tt>multiclass</tt> directive enables the
creation of templates to define several instruction classes at once (using
the <tt>defm</tt> directive). For example, in <tt>SparcInstrInfo.td</tt>, the
<tt>multiclass</tt> pattern <tt>F3_12</tt> is defined to create two instruction
classes each time <tt>F3_12</tt> is invoked:
1096 </p>
1097
1098 <div class="doc_code">
1099 <pre>multiclass F3_12 &lt;string OpcStr, bits&lt;6&gt; Op3Val, SDNode OpNode&gt; {
1100   def rr  : F3_1 &lt;2, Op3Val,
1101                  (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
1102                  !strconcat(OpcStr, " $b, $c, $dst"),
1103                  [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]&gt;;
1104   def ri  : F3_2 &lt;2, Op3Val,
1105                  (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
1106                  !strconcat(OpcStr, " $b, $c, $dst"),
1107                  [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]&gt;;
1108 }
1109 </pre>
1110 </div>
1111
1112 <p>
1113 So when the <tt>defm</tt> directive is used for the <tt>XOR</tt>
1114 and <tt>ADD</tt> instructions, as seen below, it creates four instruction
1115 objects: <tt>XORrr</tt>, <tt>XORri</tt>, <tt>ADDrr</tt>, and <tt>ADDri</tt>.
1116 </p>
1117
1118 <div class="doc_code">
1119 <pre>
1120 defm XOR   : F3_12&lt;"xor", 0b000011, xor&gt;;
1121 defm ADD   : F3_12&lt;"add", 0b000000, add&gt;;
1122 </pre>
1123 </div>
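
<p>
The naming rule behind <tt>defm</tt> can be illustrated outside TableGen. The
following sketch is only a model: <tt>expandF3_12</tt> and <tt>fitsSimm13</tt>
are hypothetical helpers, and the range check assumes that <tt>simm13</tt>
denotes a 13-bit signed immediate.
</p>

```cpp
#include <string>
#include <vector>

// Models what "defm NAME : F3_12<...>" produces: one record per "def"
// inside the multiclass, named by appending the inner def name ("rr" or
// "ri") to the defm name. Hypothetical helper, not part of TableGen.
std::vector<std::string> expandF3_12(const std::string &DefmName) {
  return {DefmName + "rr", DefmName + "ri"};
}

// Range check for the immediate of the "ri" form, assuming simm13 means
// a 13-bit two's-complement value: -4096 through 4095 inclusive.
bool fitsSimm13(long Value) {
  return Value >= -4096 && Value <= 4095;
}
```

<p>
Calling <tt>expandF3_12("XOR")</tt> and <tt>expandF3_12("ADD")</tt> yields the
four names listed above. Under the stated assumption, a constant outside the
<tt>simm13</tt> range cannot use the <tt>ri</tt> form directly.
</p>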
1124
1125 <p>
<tt>SparcInstrInfo.td</tt> also includes definitions for condition codes that
are referenced by branch instructions. The following definitions
in <tt>SparcInstrInfo.td</tt> assign a numeric value to each SPARC condition
code. For example, the value 10 represents the 'greater than' condition for
integers, and the value 22 represents the 'greater than' condition for floats.
1132 </p>
1133
1134 <div class="doc_code">
1135 <pre>
1136 def ICC_NE  : ICC_VAL&lt; 9&gt;;  // Not Equal
1137 def ICC_E   : ICC_VAL&lt; 1&gt;;  // Equal
1138 def ICC_G   : ICC_VAL&lt;10&gt;;  // Greater
1139 ...
1140 def FCC_U   : FCC_VAL&lt;23&gt;;  // Unordered
1141 def FCC_G   : FCC_VAL&lt;22&gt;;  // Greater
1142 def FCC_UG  : FCC_VAL&lt;21&gt;;  // Unordered or Greater
1143 ...
1144 </pre>
1145 </div>
1146
1147 <p>
(Note that <tt>Sparc.h</tt> also defines enums that correspond to the same SPARC
condition codes. Care must be taken to ensure the values in <tt>Sparc.h</tt>
correspond to the values in <tt>SparcInstrInfo.td</tt>; that is,
<tt>SPCC::ICC_NE = 9</tt>, <tt>SPCC::FCC_U = 23</tt>, and so on.)
1152 </p>
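
<p>
One way to guard against the two sets of values drifting apart is a
compile-time check. The sketch below is an assumption about how such a check
could look, not the contents of <tt>Sparc.h</tt>: the enum lists only the six
values quoted above, and the <tt>Td*</tt> constants and <tt>valuesMatch</tt>
are hypothetical names holding numbers copied by hand from the
<tt>ICC_VAL</tt> and <tt>FCC_VAL</tt> definitions.
</p>

```cpp
namespace SPCC {
// Only the condition codes quoted in this section are reproduced here.
enum CondCodes {
  ICC_NE = 9,   // Not Equal
  ICC_E = 1,    // Equal
  ICC_G = 10,   // Greater
  FCC_U = 23,   // Unordered
  FCC_G = 22,   // Greater
  FCC_UG = 21   // Unordered or Greater
};
} // namespace SPCC

// Hypothetical constants copied by hand from the .td definitions.
constexpr int TdICC_NE = 9, TdICC_E = 1, TdICC_G = 10;
constexpr int TdFCC_U = 23, TdFCC_G = 22, TdFCC_UG = 21;

// True when every enum value agrees with its hand-copied .td value.
constexpr bool valuesMatch() {
  return SPCC::ICC_NE == TdICC_NE && SPCC::ICC_E == TdICC_E &&
         SPCC::ICC_G == TdICC_G && SPCC::FCC_U == TdFCC_U &&
         SPCC::FCC_G == TdFCC_G && SPCC::FCC_UG == TdFCC_UG;
}

// Fails the build, rather than producing wrong code, on a mismatch.
static_assert(valuesMatch(), "enum values and .td values differ");
```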
1153
1154 </div>
1155
1156 <!-- ======================================================================= -->
1157 <div class="doc_subsection">
1158   <a name="operandMapping">Instruction Operand Mapping</a>
1159 </div>
1160
1161 <div class="doc_text">
1162
1163 <p>
1164 The code generator backend maps instruction operands to fields in the
1165 instruction.  Operands are assigned to unbound fields in the instruction in the
1166 order they are defined. Fields are bound when they are assigned a value.  For
example, the SPARC target defines the <tt>XNORrr</tt> instruction as
an <tt>F3_1</tt> format instruction having three operands.
1169 </p>
1170
1171 <div class="doc_code">
1172 <pre>
1173 def XNORrr  : F3_1&lt;2, 0b000111,
1174                    (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
1175                    "xnor $b, $c, $dst",
1176                    [(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]&gt;;
1177 </pre>
1178 </div>
1179
1180 <p>
1181 The instruction templates in <tt>SparcInstrFormats.td</tt> show the base class
1182 for <tt>F3_1</tt> is <tt>InstSP</tt>.
1183 </p>
1184
1185 <div class="doc_code">
1186 <pre>
1187 class InstSP&lt;dag outs, dag ins, string asmstr, list&lt;dag&gt; pattern&gt; : Instruction {
1188   field bits&lt;32&gt; Inst;
1189   let Namespace = "SP";
1190   bits&lt;2&gt; op;
1191   let Inst{31-30} = op;
1192   dag OutOperandList = outs;
1193   dag InOperandList = ins;
1194   let AsmString   = asmstr;
1195   let Pattern = pattern;
1196 }
1197 </pre>
1198 </div>
1199
1200 <p><tt>InstSP</tt> leaves the <tt>op</tt> field unbound.</p>
1201
1202 <div class="doc_code">
1203 <pre>
1204 class F3&lt;dag outs, dag ins, string asmstr, list&lt;dag&gt; pattern&gt;
1205     : InstSP&lt;outs, ins, asmstr, pattern&gt; {
1206   bits&lt;5&gt; rd;
1207   bits&lt;6&gt; op3;
1208   bits&lt;5&gt; rs1;
1209   let op{1} = 1;   // Op = 2 or 3
1210   let Inst{29-25} = rd;
1211   let Inst{24-19} = op3;
1212   let Inst{18-14} = rs1;
1213 }
1214 </pre>
1215 </div>
1216
1217 <p>
<tt>F3</tt> binds part of the <tt>op</tt> field (bit 1) and defines the
<tt>rd</tt>, <tt>op3</tt>, and <tt>rs1</tt> fields.  <tt>F3</tt> format
instructions will bind operands to the <tt>rd</tt>, <tt>op3</tt>, and
<tt>rs1</tt> fields.
1221 </p>
1222
1223 <div class="doc_code">
1224 <pre>
1225 class F3_1&lt;bits&lt;2&gt; opVal, bits&lt;6&gt; op3val, dag outs, dag ins,
1226            string asmstr, list&lt;dag&gt; pattern&gt; : F3&lt;outs, ins, asmstr, pattern&gt; {
1227   bits&lt;8&gt; asi = 0; // asi not currently used
1228   bits&lt;5&gt; rs2;
1229   let op         = opVal;
1230   let op3        = op3val;
1231   let Inst{13}   = 0;     // i field = 0
1232   let Inst{12-5} = asi;   // address space identifier
1233   let Inst{4-0}  = rs2;
1234 }
1235 </pre>
1236 </div>
1237
1238 <p>
<tt>F3_1</tt> binds the <tt>op</tt> and <tt>op3</tt> fields and defines the
<tt>rs2</tt> field.  <tt>F3_1</tt> format instructions will bind the operands to
the <tt>rd</tt>, <tt>rs1</tt>, and <tt>rs2</tt> fields. This results in the
<tt>XNORrr</tt> instruction binding the <tt>$dst</tt>, <tt>$b</tt>, and
<tt>$c</tt> operands to the <tt>rd</tt>, <tt>rs1</tt>, and <tt>rs2</tt> fields
respectively.
1244 </p>
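
<p>
The bit assignments in <tt>InstSP</tt>, <tt>F3</tt>, and <tt>F3_1</tt> can be
followed by packing the fields by hand. This is a minimal sketch rather than
the code TableGen generates: <tt>encodeF3_1</tt> and <tt>encodeXNORrr</tt> are
hypothetical helpers, and the register arguments are assumed to be the 5-bit
hardware register numbers.
</p>

```cpp
#include <cstdint>

// Packs the fields of an F3_1-format instruction, following the
// "let Inst{...} = ..." assignments in InstSP, F3, and F3_1.
std::uint32_t encodeF3_1(unsigned Op, unsigned Op3, unsigned Rd,
                         unsigned Rs1, unsigned Rs2, unsigned Asi = 0) {
  std::uint32_t Inst = 0;
  Inst |= (Op & 0x3u) << 30;    // Inst{31-30} = op
  Inst |= (Rd & 0x1Fu) << 25;   // Inst{29-25} = rd
  Inst |= (Op3 & 0x3Fu) << 19;  // Inst{24-19} = op3
  Inst |= (Rs1 & 0x1Fu) << 14;  // Inst{18-14} = rs1
                                // Inst{13}    = 0 (i field)
  Inst |= (Asi & 0xFFu) << 5;   // Inst{12-5}  = asi
  Inst |= (Rs2 & 0x1Fu);        // Inst{4-0}   = rs2
  return Inst;
}

// XNORrr uses op = 2 and op3 = 0b000111; $dst, $b, and $c land in the
// rd, rs1, and rs2 fields respectively.
std::uint32_t encodeXNORrr(unsigned Dst, unsigned B, unsigned C) {
  return encodeF3_1(2, 0x07, Dst, B, C);
}
```

<p>
For example, with the assumed register numbers 3, 1, and 2 for <tt>$dst</tt>,
<tt>$b</tt>, and <tt>$c</tt>, <tt>encodeXNORrr(3, 1, 2)</tt> evaluates to
<tt>0x86384002</tt>.
</p>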
1245
1246 </div>
1247
1248 <!-- ======================================================================= -->
1249 <div class="doc_subsection">
1250   <a name="implementInstr">Implement a subclass of </a>
1251   <a href="http://www.llvm.org/docs/CodeGenerator.html#targetinstrinfo">TargetInstrInfo</a>
1252 </div>
1253
1254 <div class="doc_text">
1255
1256 <p>
The final step is to hand-code portions of <tt>XXXInstrInfo</tt>, which
implements the interface described in <tt>TargetInstrInfo.h</tt>. These
functions return <tt>0</tt> or a Boolean or they assert, unless
overridden. The following functions are overridden for the SPARC
implementation in <tt>SparcInstrInfo.cpp</tt>:
1262 </p>
1263
1264 <ul>
1265 <li><tt>isMoveInstr</tt> &mdash; Return true if the instruction is a register to
1266     register move; false, otherwise.</li>
1267
1268 <li><tt>isLoadFromStackSlot</tt> &mdash; If the specified machine instruction is
1269     a direct load from a stack slot, return the register number of the
1270     destination and the <tt>FrameIndex</tt> of the stack slot.</li>
1271
<li><tt>isStoreToStackSlot</tt> &mdash; If the specified machine instruction is
    a direct store to a stack slot, return the register number of the
    source register and the <tt>FrameIndex</tt> of the stack slot.</li>
1275
1276 <li><tt>copyRegToReg</tt> &mdash; Copy values between a pair of registers.</li>
1277
1278 <li><tt>storeRegToStackSlot</tt> &mdash; Store a register value to a stack
1279     slot.</li>
1280
1281 <li><tt>loadRegFromStackSlot</tt> &mdash; Load a register value from a stack
1282     slot.</li>
1283
1284 <li><tt>storeRegToAddr</tt> &mdash; Store a register value to memory.</li>
1285
1286 <li><tt>loadRegFromAddr</tt> &mdash; Load a register value from memory.</li>
1287
<li><tt>foldMemoryOperand</tt> &mdash; Attempt to fold a load or store into
    the instruction for the specified operand(s).</li>
1290 </ul>
1291
1292 </div>
1293
1294 <!-- ======================================================================= -->
1295 <div class="doc_subsection">
1296   <a name="branchFolding">Branch Folding and If Conversion</a>
1297 </div>
1298 <div class="doc_text">
1299
1300 <p>
1301 Performance can be improved by combining instructions or by eliminating
1302 instructions that are never reached. The <tt>AnalyzeBranch</tt> method
1303 in <tt>XXXInstrInfo</tt> may be implemented to examine conditional instructions
1304 and remove unnecessary instructions. <tt>AnalyzeBranch</tt> looks at the end of
1305 a machine basic block (MBB) for opportunities for improvement, such as branch
1306 folding and if conversion. The <tt>BranchFolder</tt> and <tt>IfConverter</tt>
1307 machine function passes (see the source files <tt>BranchFolding.cpp</tt> and
1308 <tt>IfConversion.cpp</tt> in the <tt>lib/CodeGen</tt> directory) call
1309 <tt>AnalyzeBranch</tt> to improve the control flow graph that represents the
1310 instructions.
1311 </p>
1312
1313 <p>
1314 Several implementations of <tt>AnalyzeBranch</tt> (for ARM, Alpha, and X86) can
1315 be examined as models for your own <tt>AnalyzeBranch</tt> implementation. Since
1316 SPARC does not implement a useful <tt>AnalyzeBranch</tt>, the ARM target
1317 implementation is shown below.
1318 </p>
1319
1320 <p><tt>AnalyzeBranch</tt> returns a Boolean value and takes four parameters:</p>
1321
1322 <ul>
1323 <li><tt>MachineBasicBlock &amp;MBB</tt> &mdash; The incoming block to be
1324     examined.</li>
1325
1326 <li><tt>MachineBasicBlock *&amp;TBB</tt> &mdash; A destination block that is
1327     returned. For a conditional branch that evaluates to true, <tt>TBB</tt> is
1328     the destination.</li>
1329
1330 <li><tt>MachineBasicBlock *&amp;FBB</tt> &mdash; For a conditional branch that
1331     evaluates to false, <tt>FBB</tt> is returned as the destination.</li>
1332
1333 <li><tt>std::vector&lt;MachineOperand&gt; &amp;Cond</tt> &mdash; List of
1334     operands to evaluate a condition for a conditional branch.</li>
1335 </ul>
1336
1337 <p>
In the simplest case, if a block ends without a branch, then it falls through to
the successor block. No destination blocks are specified for either <tt>TBB</tt>
or <tt>FBB</tt>, so both parameters return <tt>NULL</tt>. The start of
<tt>AnalyzeBranch</tt> (see code below for the ARM target) shows the
function parameters and the code for the simplest case.
1343 </p>
1344
1345 <div class="doc_code">
1346 <pre>bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &amp;MBB,
1347         MachineBasicBlock *&amp;TBB, MachineBasicBlock *&amp;FBB,
1348         std::vector&lt;MachineOperand&gt; &amp;Cond) const
1349 {
1350   MachineBasicBlock::iterator I = MBB.end();
1351   if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
1352     return false;
1353 </pre>
1354 </div>
1355
1356 <p>
1357 If a block ends with a single unconditional branch instruction, then
1358 <tt>AnalyzeBranch</tt> (shown below) should return the destination of that
1359 branch in the <tt>TBB</tt> parameter.
1360 </p>
1361
1362 <div class="doc_code">
1363 <pre>
1364   if (LastOpc == ARM::B || LastOpc == ARM::tB) {
1365     TBB = LastInst-&gt;getOperand(0).getMBB();
1366     return false;
1367   }
1368 </pre>
1369 </div>
1370
1371 <p>
1372 If a block ends with two unconditional branches, then the second branch is never
1373 reached. In that situation, as shown below, remove the last branch instruction
1374 and return the penultimate branch in the <tt>TBB</tt> parameter.
1375 </p>
1376
1377 <div class="doc_code">
1378 <pre>
  if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &amp;&amp;
1380       (LastOpc == ARM::B || LastOpc == ARM::tB)) {
1381     TBB = SecondLastInst-&gt;getOperand(0).getMBB();
1382     I = LastInst;
1383     I-&gt;eraseFromParent();
1384     return false;
1385   }
1386 </pre>
1387 </div>
1388
1389 <p>
A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false. In that case,
1392 <tt>AnalyzeBranch</tt> (shown below) should return the destination of that
1393 conditional branch in the <tt>TBB</tt> parameter and a list of operands in
1394 the <tt>Cond</tt> parameter to evaluate the condition.
1395 </p>
1396
1397 <div class="doc_code">
1398 <pre>
1399   if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
1400     // Block ends with fall-through condbranch.
1401     TBB = LastInst-&gt;getOperand(0).getMBB();
1402     Cond.push_back(LastInst-&gt;getOperand(1));
1403     Cond.push_back(LastInst-&gt;getOperand(2));
1404     return false;
1405   }
1406 </pre>
1407 </div>
1408
1409 <p>
1410 If a block ends with both a conditional branch and an ensuing unconditional
1411 branch, then <tt>AnalyzeBranch</tt> (shown below) should return the conditional
1412 branch destination (assuming it corresponds to a conditional evaluation of
1413 '<tt>true</tt>') in the <tt>TBB</tt> parameter and the unconditional branch
1414 destination in the <tt>FBB</tt> (corresponding to a conditional evaluation of
1415 '<tt>false</tt>').  A list of operands to evaluate the condition should be
1416 returned in the <tt>Cond</tt> parameter.
1417 </p>
1418
1419 <div class="doc_code">
1420 <pre>
1421   unsigned SecondLastOpc = SecondLastInst-&gt;getOpcode();
1422
1423   if ((SecondLastOpc == ARM::Bcc &amp;&amp; LastOpc == ARM::B) ||
1424       (SecondLastOpc == ARM::tBcc &amp;&amp; LastOpc == ARM::tB)) {
    TBB = SecondLastInst-&gt;getOperand(0).getMBB();
1426     Cond.push_back(SecondLastInst-&gt;getOperand(1));
1427     Cond.push_back(SecondLastInst-&gt;getOperand(2));
1428     FBB = LastInst-&gt;getOperand(0).getMBB();
1429     return false;
1430   }
1431 </pre>
1432 </div>
1433
1434 <p>
1435 For the last two cases (ending with a single conditional branch or ending with
1436 one conditional and one unconditional branch), the operands returned in
1437 the <tt>Cond</tt> parameter can be passed to methods of other instructions to
1438 create new branches or perform other operations. An implementation
1439 of <tt>AnalyzeBranch</tt> requires the helper methods <tt>RemoveBranch</tt>
1440 and <tt>InsertBranch</tt> to manage subsequent operations.
1441 </p>
1442
1443 <p>
<tt>AnalyzeBranch</tt> should return false, indicating success, in most
circumstances. It should return true only when the method is stumped about what
to do, for example, if a block has three terminating branches.
<tt>AnalyzeBranch</tt> may also return true if it encounters a terminator it
cannot handle, such as an indirect branch.
1449 </p>
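
<p>
The cases described in this section can be collected into one toy model. The
sketch below does not use LLVM types: <tt>TermKind</tt>, <tt>Terminator</tt>,
<tt>BranchInfo</tt>, and <tt>analyzeBranch</tt> are hypothetical stand-ins,
blocks are identified by integers, <tt>-1</tt> plays the role of
<tt>NULL</tt>, and a single integer stands in for the condition operands.
</p>

```cpp
#include <vector>

// Toy stand-ins for LLVM types; every name here is hypothetical.
enum class TermKind { Uncond, Cond, Other };

struct Terminator {
  TermKind Kind;
  int Target;    // id of the destination block
  int CondCode;  // only meaningful when Kind == TermKind::Cond
};

struct BranchInfo {
  int TBB = -1;           // -1 plays the role of NULL
  int FBB = -1;
  std::vector<int> Cond;  // stand-in for the condition operands
};

// Terms holds the terminators that end a block, in program order.
// Returns false on success and true when the block cannot be analyzed.
bool analyzeBranch(std::vector<Terminator> &Terms, BranchInfo &Out) {
  Out = BranchInfo();
  if (Terms.empty())
    return false;  // no branch: the block falls through
  if (Terms.size() > 2)
    return true;   // three or more terminators: give up

  const Terminator Last = Terms.back();
  if (Terms.size() == 1) {
    if (Last.Kind == TermKind::Uncond) {
      Out.TBB = Last.Target;
      return false;
    }
    if (Last.Kind == TermKind::Cond) {  // fall-through conditional branch
      Out.TBB = Last.Target;
      Out.Cond.push_back(Last.CondCode);
      return false;
    }
    return true;  // a terminator this model cannot handle
  }

  const Terminator SecondLast = Terms.front();
  if (SecondLast.Kind == TermKind::Cond && Last.Kind == TermKind::Uncond) {
    Out.TBB = SecondLast.Target;
    Out.Cond.push_back(SecondLast.CondCode);
    Out.FBB = Last.Target;
    return false;
  }
  if (SecondLast.Kind == TermKind::Uncond && Last.Kind == TermKind::Uncond) {
    Out.TBB = SecondLast.Target;
    Terms.pop_back();  // the second branch is never reached
    return false;
  }
  return true;
}
```

<p>
As in the ARM code, the model reports success with <tt>false</tt> and fills in
only the outputs that apply to the case it recognized.
</p>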
1450
1451 </div>
1452
1453 <!-- *********************************************************************** -->
1454 <div class="doc_section">
1455   <a name="InstructionSelector">Instruction Selector</a>
1456 </div>
1457 <!-- *********************************************************************** -->
1458
1459 <div class="doc_text">
1460
1461 <p>
1462 LLVM uses a <tt>SelectionDAG</tt> to represent LLVM IR instructions, and nodes
1463 of the <tt>SelectionDAG</tt> ideally represent native target
1464 instructions. During code generation, instruction selection passes are performed
1465 to convert non-native DAG instructions into native target-specific
1466 instructions. The pass described in <tt>XXXISelDAGToDAG.cpp</tt> is used to
1467 match patterns and perform DAG-to-DAG instruction selection. Optionally, a pass
1468 may be defined (in <tt>XXXBranchSelector.cpp</tt>) to perform similar DAG-to-DAG
1469 operations for branch instructions. Later, the code in
1470 <tt>XXXISelLowering.cpp</tt> replaces or removes operations and data types not
1471 supported natively (legalizes) in a <tt>SelectionDAG</tt>.
1472 </p>
1473
1474 <p>
1475 TableGen generates code for instruction selection using the following target
1476 description input files:
1477 </p>
1478
1479 <ul>
1480 <li><tt>XXXInstrInfo.td</tt> &mdash; Contains definitions of instructions in a
1481     target-specific instruction set, generates <tt>XXXGenDAGISel.inc</tt>, which
1482     is included in <tt>XXXISelDAGToDAG.cpp</tt>.</li>
1483
1484 <li><tt>XXXCallingConv.td</tt> &mdash; Contains the calling and return value
1485     conventions for the target architecture, and it generates
1486     <tt>XXXGenCallingConv.inc</tt>, which is included in
1487     <tt>XXXISelLowering.cpp</tt>.</li>
1488 </ul>
1489
1490 <p>
1491 The implementation of an instruction selection pass must include a header that
1492 declares the <tt>FunctionPass</tt> class or a subclass of <tt>FunctionPass</tt>. In
1493 <tt>XXXTargetMachine.cpp</tt>, a Pass Manager (PM) should add each instruction
1494 selection pass into the queue of passes to run.
1495 </p>
1496
1497 <p>
1498 The LLVM static compiler (<tt>llc</tt>) is an excellent tool for visualizing the
1499 contents of DAGs. To display the <tt>SelectionDAG</tt> before or after specific
1500 processing phases, use the command line options for <tt>llc</tt>, described
1501 at <a href="http://llvm.org/docs/CodeGenerator.html#selectiondag_process">
1502 SelectionDAG Instruction Selection Process</a>.
1503 </p>
1504
1505 <p>
1506 To describe instruction selector behavior, you should add patterns for lowering
1507 LLVM code into a <tt>SelectionDAG</tt> as the last parameter of the instruction
1508 definitions in <tt>XXXInstrInfo.td</tt>. For example, in
1509 <tt>SparcInstrInfo.td</tt>, this entry defines a register store operation, and
1510 the last parameter describes a pattern with the store DAG operator.
1511 </p>
1512
1513 <div class="doc_code">
1514 <pre>
1515 def STrr  : F3_1&lt; 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
1516                  "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]&gt;;
1517 </pre>
1518 </div>
1519
1520 <p>
1521 <tt>ADDRrr</tt> is a memory mode that is also defined in
1522 <tt>SparcInstrInfo.td</tt>:
1523 </p>
1524
1525 <div class="doc_code">
1526 <pre>
1527 def ADDRrr : ComplexPattern&lt;i32, 2, "SelectADDRrr", [], []&gt;;
1528 </pre>
1529 </div>
1530
1531 <p>
The definition of <tt>ADDRrr</tt> refers to <tt>SelectADDRrr</tt>, which is a
function defined in an implementation of the Instruction Selector (such
as <tt>SparcISelDAGToDAG.cpp</tt>).
1535 </p>
1536
1537 <p>
1538 In <tt>lib/Target/TargetSelectionDAG.td</tt>, the DAG operator for store is
1539 defined below:
1540 </p>
1541
1542 <div class="doc_code">
1543 <pre>
1544 def store : PatFrag&lt;(ops node:$val, node:$ptr),
1545                     (st node:$val, node:$ptr), [{
1546   if (StoreSDNode *ST = dyn_cast&lt;StoreSDNode&gt;(N))
1547     return !ST-&gt;isTruncatingStore() &amp;&amp;
1548            ST-&gt;getAddressingMode() == ISD::UNINDEXED;
1549   return false;
1550 }]&gt;;
1551 </pre>
1552 </div>
1553
1554 <p>
1555 <tt>XXXInstrInfo.td</tt> also generates (in <tt>XXXGenDAGISel.inc</tt>) the
1556 <tt>SelectCode</tt> method that is used to call the appropriate processing
1557 method for an instruction. In this example, <tt>SelectCode</tt>
1558 calls <tt>Select_ISD_STORE</tt> for the <tt>ISD::STORE</tt> opcode.
1559 </p>
1560
1561 <div class="doc_code">
1562 <pre>
1563 SDNode *SelectCode(SDValue N) {
1564   ...
1565   MVT::ValueType NVT = N.getNode()-&gt;getValueType(0);
1566   switch (N.getOpcode()) {
1567   case ISD::STORE: {
1568     switch (NVT) {
1569     default:
1570       return Select_ISD_STORE(N);
1571       break;
1572     }
1573     break;
1574   }
1575   ...
1576 </pre>
1577 </div>
1578
1579 <p>
1580 The pattern for <tt>STrr</tt> is matched, so elsewhere in
1581 <tt>XXXGenDAGISel.inc</tt>, code for <tt>STrr</tt> is created for
1582 <tt>Select_ISD_STORE</tt>. The <tt>Emit_22</tt> method is also generated
1583 in <tt>XXXGenDAGISel.inc</tt> to complete the processing of this
1584 instruction.
1585 </p>
1586
1587 <div class="doc_code">
1588 <pre>
1589 SDNode *Select_ISD_STORE(const SDValue &amp;N) {
1590   SDValue Chain = N.getOperand(0);
1591   if (Predicate_store(N.getNode())) {
1592     SDValue N1 = N.getOperand(1);
1593     SDValue N2 = N.getOperand(2);
1594     SDValue CPTmp0;
1595     SDValue CPTmp1;
1596
1597     // Pattern: (st:void IntRegs:i32:$src,
1598     //           ADDRrr:i32:$addr)&lt;&lt;P:Predicate_store&gt;&gt;
1599     // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
1600     // Pattern complexity = 13  cost = 1  size = 0
1601     if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &amp;&amp;
1602         N1.getNode()-&gt;getValueType(0) == MVT::i32 &amp;&amp;
1603         N2.getNode()-&gt;getValueType(0) == MVT::i32) {
1604       return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
1605     }
1606 ...
1607 </pre>
1608 </div>
1609
1610 </div>
1611
1612 <!-- ======================================================================= -->
1613 <div class="doc_subsection">
1614   <a name="LegalizePhase">The SelectionDAG Legalize Phase</a>
1615 </div>
1616
1617 <div class="doc_text">
1618
1619 <p>
1620 The Legalize phase converts a DAG to use types and operations that are natively
1621 supported by the target. For natively unsupported types and operations, you need
1622 to add code to the target-specific XXXTargetLowering implementation to convert
1623 unsupported types and operations to supported ones.
1624 </p>
1625
1626 <p>
In the constructor for the <tt>XXXTargetLowering</tt> class, first use the
<tt>addRegisterClass</tt> method to specify which types are supported and which
register classes are associated with them. The code for the register classes is
generated by TableGen from <tt>XXXRegisterInfo.td</tt> and placed
in <tt>XXXGenRegisterInfo.h.inc</tt>. For example, the implementation of the
constructor for the <tt>SparcTargetLowering</tt> class (in
<tt>SparcISelLowering.cpp</tt>) starts with the following code:
1634 </p>
1635
1636 <div class="doc_code">
1637 <pre>
1638 addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
1639 addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
1640 addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
1641 </pre>
1642 </div>
1643
1644 <p>
1645 You should examine the node types in the <tt>ISD</tt> namespace
1646 (<tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt>) and determine which
1647 operations the target natively supports. For operations that do <b>not</b> have
1648 native support, add a callback to the constructor for the XXXTargetLowering
1649 class, so the instruction selection process knows what to do. The TargetLowering
1650 class callback methods (declared in <tt>llvm/Target/TargetLowering.h</tt>) are:
1651 </p>
1652
1653 <ul>
1654 <li><tt>setOperationAction</tt> &mdash; General operation.</li>
1655
1656 <li><tt>setLoadExtAction</tt> &mdash; Load with extension.</li>
1657
1658 <li><tt>setTruncStoreAction</tt> &mdash; Truncating store.</li>
1659
1660 <li><tt>setIndexedLoadAction</tt> &mdash; Indexed load.</li>
1661
1662 <li><tt>setIndexedStoreAction</tt> &mdash; Indexed store.</li>
1663
1664 <li><tt>setConvertAction</tt> &mdash; Type conversion.</li>
1665
1666 <li><tt>setCondCodeAction</tt> &mdash; Support for a given condition code.</li>
1667 </ul>
1668
1669 <p>
1670 Note: on older releases, <tt>setLoadXAction</tt> is used instead
1671 of <tt>setLoadExtAction</tt>.  Also, on older releases,
1672 <tt>setCondCodeAction</tt> may not be supported. Examine your release
1673 to see what methods are specifically supported.
1674 </p>
1675
1676 <p>
These callbacks are used to specify whether an operation works with a given
type (or types). In all cases, the third parameter is
a <tt>LegalizeAction</tt> enum value: <tt>Promote</tt>, <tt>Expand</tt>,
<tt>Custom</tt>, or <tt>Legal</tt>. <tt>SparcISelLowering.cpp</tt>
contains examples of all four <tt>LegalizeAction</tt> values.
1682 </p>
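<p>
The bookkeeping behind these callbacks can be pictured as a table keyed by
operation and type, where any pair that was never set counts as <tt>Legal</tt>.
The following is a minimal standalone sketch of that idea, not LLVM code: the
<tt>Op</tt>, <tt>Ty</tt>, and <tt>Action</tt> enums, the <tt>ActionTable</tt>
class, and <tt>makeToyTable</tt> are hypothetical stand-ins invented for
illustration.
</p>

```cpp
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes and MVT value types; these are
// not LLVM's enums.
enum class Op { FSIN, FCOS, FP_TO_SINT, CTPOP, ADD };
enum class Ty { i1, i32, f32, f64 };
enum class Action { Legal, Promote, Expand, Custom };

// Toy action table: any (operation, type) pair that was never set is Legal.
class ActionTable {
  std::map<std::pair<Op, Ty>, Action> Table;

public:
  void setOperationAction(Op O, Ty T, Action A) {
    Table[std::make_pair(O, T)] = A;
  }

  Action getOperationAction(Op O, Ty T) const {
    auto It = Table.find(std::make_pair(O, T));
    return It == Table.end() ? Action::Legal : It->second;
  }
};

// Mirrors the kind of calls a target lowering constructor makes. A later
// call for the same (operation, type) pair overrides an earlier one.
inline ActionTable makeToyTable(bool IsV9) {
  ActionTable T;
  T.setOperationAction(Op::FSIN, Ty::f32, Action::Expand);
  T.setOperationAction(Op::FCOS, Ty::f32, Action::Expand);
  T.setOperationAction(Op::FP_TO_SINT, Ty::i32, Action::Custom);
  T.setOperationAction(Op::CTPOP, Ty::i32, Action::Expand);
  if (IsV9)
    T.setOperationAction(Op::CTPOP, Ty::i32, Action::Legal);
  return T;
}
```

<p>
In this sketch, querying <tt>ADD</tt> on <tt>i32</tt> yields <tt>Legal</tt>
because no action was recorded for it, which is why explicit <tt>Legal</tt>
settings are only needed to override an earlier setting.
</p>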
1683
1684 </div>
1685
1686 <!-- _______________________________________________________________________ -->
1687 <div class="doc_subsubsection">
1688   <a name="promote">Promote</a>
1689 </div>
1690
1691 <div class="doc_text">
1692
1693 <p>
For an operation without native support for a given type, the specified type may
be promoted to a larger type that is supported. For example, SPARC does not
support a sign-extending load for Boolean values (<tt>i1</tt> type), so
in <tt>SparcISelLowering.cpp</tt> the third parameter below, <tt>Promote</tt>,
changes <tt>i1</tt> type values to a larger type before loading.
1699 </p>
1700
1701 <div class="doc_code">
1702 <pre>
1703 setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
1704 </pre>
1705 </div>
1706
1707 </div>
1708
1709 <!-- _______________________________________________________________________ -->
1710 <div class="doc_subsubsection">
1711   <a name="expand">Expand</a>
1712 </div>
1713
1714 <div class="doc_text">
1715
1716 <p>
1717 For a type without native support, a value may need to be broken down further,
1718 rather than promoted. For an operation without native support, a combination of
1719 other operations may be used to similar effect. In SPARC, the floating-point
1720 sine and cosine trig operations are supported by expansion to other operations,
1721 as indicated by the third parameter, <tt>Expand</tt>, to
1722 <tt>setOperationAction</tt>:
1723 </p>
1724
1725 <div class="doc_code">
1726 <pre>
1727 setOperationAction(ISD::FSIN, MVT::f32, Expand);
1728 setOperationAction(ISD::FCOS, MVT::f32, Expand);
1729 </pre>
1730 </div>
1731
1732 </div>
1733
1734 <!-- _______________________________________________________________________ -->
1735 <div class="doc_subsubsection">
1736   <a name="custom">Custom</a>
1737 </div>
1738
1739 <div class="doc_text">
1740
1741 <p>
1742 For some operations, simple type promotion or operation expansion may be
1743 insufficient. In some cases, a special intrinsic function must be implemented.
1744 </p>
1745
1746 <p>
1747 For example, a constant value may require special treatment, or an operation may
1748 require spilling and restoring registers in the stack and working with register
1749 allocators.
1750 </p>
1751
1752 <p>
1753 As seen in <tt>SparcISelLowering.cpp</tt> code below, to perform a type
1754 conversion from a floating point value to a signed integer, first the
1755 <tt>setOperationAction</tt> should be called with <tt>Custom</tt> as the third
1756 parameter:
1757 </p>
1758
1759 <div class="doc_code">
1760 <pre>
1761 setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
1762 </pre>
1763 </div>
1764
1765 <p>
1766 In the <tt>LowerOperation</tt> method, for each <tt>Custom</tt> operation, a
1767 case statement should be added to indicate what function to call. In the
1768 following code, an <tt>FP_TO_SINT</tt> opcode will call
1769 the <tt>LowerFP_TO_SINT</tt> method:
1770 </p>
1771
1772 <div class="doc_code">
1773 <pre>
1774 SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &amp;DAG) {
1775   switch (Op.getOpcode()) {
1776   case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
1777   ...
1778   }
1779 }
1780 </pre>
1781 </div>
1782
1783 <p>
1784 Finally, the <tt>LowerFP_TO_SINT</tt> method is implemented, using an FP
1785 register to convert the floating-point value to an integer.
1786 </p>
1787
1788 <div class="doc_code">
1789 <pre>
1790 static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &amp;DAG) {
1791   assert(Op.getValueType() == MVT::i32);
1792   Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
1793   return DAG.getNode(ISD::BIT_CONVERT, MVT::i32, Op);
1794 }
1795 </pre>
1796 </div>
1797
1798 </div>
1799
1800 <!-- _______________________________________________________________________ -->
1801 <div class="doc_subsubsection">
1802   <a name="legal">Legal</a>
1803 </div>
1804
1805 <div class="doc_text">
1806
1807 <p>
1808 The <tt>Legal</tt> LegalizeAction enum value simply indicates that an
1809 operation <b>is</b> natively supported. <tt>Legal</tt> represents the default
1810 condition, so it is rarely used. In <tt>SparcISelLowering.cpp</tt>, the action
1811 for <tt>CTPOP</tt> (an operation to count the bits set in an integer) is
1812 natively supported only for SPARC v9. The following code enables
1813 the <tt>Expand</tt> conversion technique for non-v9 SPARC implementations.
1814 </p>
1815
1816 <div class="doc_code">
1817 <pre>
1818 setOperationAction(ISD::CTPOP, MVT::i32, Expand);
1819 ...
1820 if (TM.getSubtarget&lt;SparcSubtarget&gt;().isV9())
1821   setOperationAction(ISD::CTPOP, MVT::i32, Legal);
1828 </pre>
1829 </div>
1830
1831 </div>
1832
1833 <!-- ======================================================================= -->
1834 <div class="doc_subsection">
1835   <a name="callingConventions">Calling Conventions</a>
1836 </div>
1837
1838 <div class="doc_text">
1839
1840 <p>
To support target-specific calling conventions, <tt>XXXCallingConv.td</tt>
uses interfaces (such as <tt>CCIfType</tt> and <tt>CCAssignToReg</tt>) that are
defined in <tt>lib/Target/TargetCallingConv.td</tt>. TableGen can take the target
descriptor file <tt>XXXCallingConv.td</tt> and generate the header
file <tt>XXXGenCallingConv.inc</tt>, which is typically included
in <tt>XXXISelLowering.cpp</tt>. You can use the interfaces in
<tt>TargetCallingConv.td</tt> to specify:
1848 </p>
1849
1850 <ul>
1851 <li>The order of parameter allocation.</li>
1852
1853 <li>Where parameters and return values are placed (that is, on the stack or in
1854     registers).</li>
1855
1856 <li>Which registers may be used.</li>
1857
1858 <li>Whether the caller or callee unwinds the stack.</li>
1859 </ul>
1860
1861 <p>
1862 The following example demonstrates the use of the <tt>CCIfType</tt> and
1863 <tt>CCAssignToReg</tt> interfaces. If the <tt>CCIfType</tt> predicate is true
1864 (that is, if the current argument is of type <tt>f32</tt> or <tt>f64</tt>), then
1865 the action is performed. In this case, the <tt>CCAssignToReg</tt> action assigns
1866 the argument value to the first available register: either <tt>R0</tt>
1867 or <tt>R1</tt>.
1868 </p>
1869
1870 <div class="doc_code">
1871 <pre>
1872 CCIfType&lt;[f32,f64], CCAssignToReg&lt;[R0, R1]&gt;&gt;
1873 </pre>
1874 </div>
1875
1876 <p>
<tt>SparcCallingConv.td</tt> contains definitions for a target-specific
return-value calling convention (<tt>RetCC_Sparc32</tt>) and a basic 32-bit C
calling convention (<tt>CC_Sparc32</tt>). The definition of <tt>RetCC_Sparc32</tt>
(shown below) indicates which registers are used for specified scalar return
types. A single-precision float is returned in register <tt>F0</tt>, and a
double-precision float is returned in register <tt>D0</tt>. A 32-bit integer is
returned in register <tt>I0</tt> or <tt>I1</tt>.
1884 </p>
1885
1886 <div class="doc_code">
1887 <pre>
1888 def RetCC_Sparc32 : CallingConv&lt;[
1889   CCIfType&lt;[i32], CCAssignToReg&lt;[I0, I1]&gt;&gt;,
1890   CCIfType&lt;[f32], CCAssignToReg&lt;[F0]&gt;&gt;,
1891   CCIfType&lt;[f64], CCAssignToReg&lt;[D0]&gt;&gt;
1892 ]&gt;;
1893 </pre>
1894 </div>
1895
1896 <p>
1897 The definition of <tt>CC_Sparc32</tt> in <tt>SparcCallingConv.td</tt> introduces
1898 <tt>CCAssignToStack</tt>, which assigns the value to a stack slot with the
1899 specified size and alignment. In the example below, the first parameter, 4,
1900 indicates the size of the slot, and the second parameter, also 4, indicates the
1901 stack alignment along 4-byte units. (Special cases: if size is zero, then the
1902 ABI size is used; if alignment is zero, then the ABI alignment is used.)
1903 </p>
1904
1905 <div class="doc_code">
1906 <pre>
1907 def CC_Sparc32 : CallingConv&lt;[
1908   // All arguments get passed in integer registers if there is space.
1909   CCIfType&lt;[i32, f32, f64], CCAssignToReg&lt;[I0, I1, I2, I3, I4, I5]&gt;&gt;,
1910   CCAssignToStack&lt;4, 4&gt;
1911 ]&gt;;
1912 </pre>
1913 </div>
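<p>
To make the order of evaluation concrete, the following standalone sketch walks
a list of argument types through rules shaped like <tt>CC_Sparc32</tt>:
arguments of a matching type take the next free register, and everything else
takes the next 4-byte, 4-byte-aligned stack slot. It is a simplified model, not
the code TableGen generates. The <tt>Ty</tt> enum (including its
<tt>Other</tt> member), the <tt>Loc</tt> struct, and
<tt>assignToyCCSparc32</tt> are hypothetical names, and the model assumes every
argument occupies exactly one register or one slot.
</p>

```cpp
#include <string>
#include <vector>

// Hypothetical value types. Other stands for any type that the CCIfType
// list in the toy convention does not mention.
enum class Ty { i32, f32, f64, Other };

// Where one argument ended up.
struct Loc {
  bool InReg;      // true: Reg is valid; false: Offset is valid
  std::string Reg; // register name when InReg is true
  unsigned Offset; // stack offset in bytes when InReg is false
};

// Toy model of:
//   CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
//   CCAssignToStack<4, 4>
// Rules are tried in order; the first rule that succeeds places the value.
inline std::vector<Loc> assignToyCCSparc32(const std::vector<Ty> &Args) {
  static const char *const Regs[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
  const unsigned NumRegs = 6, SlotSize = 4, SlotAlign = 4;
  std::vector<Loc> Locs;
  unsigned NextReg = 0, StackOffset = 0;
  for (Ty T : Args) {
    const bool TypeMatches = T == Ty::i32 || T == Ty::f32 || T == Ty::f64;
    if (TypeMatches && NextReg < NumRegs) {
      // First rule: the type matches and a register is still free.
      Locs.push_back(Loc{true, Regs[NextReg++], 0u});
      continue;
    }
    // Second rule: round the offset up to the alignment, then take a slot.
    StackOffset = (StackOffset + SlotAlign - 1) / SlotAlign * SlotAlign;
    Locs.push_back(Loc{false, "", StackOffset});
    StackOffset += SlotSize;
  }
  return Locs;
}
```

<p>
With eight <tt>i32</tt> arguments, this model places the first six in
<tt>I0</tt> through <tt>I5</tt> and the last two at stack offsets 0 and 4.
</p>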
1914
1915 <p>
<tt>CCDelegateTo</tt> is another commonly used interface, which tries to find a
specified sub-calling convention, and, if a match is found, it is invoked. In
the following example (in <tt>X86CallingConv.td</tt>), the definition of
<tt>RetCC_X86_32_C</tt> ends with <tt>CCDelegateTo</tt>. A value that the
preceding rules do not assign to the register <tt>ST0</tt> or <tt>ST1</tt> is
handed to <tt>RetCC_X86Common</tt>.
1922 </p>
1923
1924 <div class="doc_code">
1925 <pre>
1926 def RetCC_X86_32_C : CallingConv&lt;[
1927   CCIfType&lt;[f32], CCAssignToReg&lt;[ST0, ST1]&gt;&gt;,
1928   CCIfType&lt;[f64], CCAssignToReg&lt;[ST0, ST1]&gt;&gt;,
1929   CCDelegateTo&lt;RetCC_X86Common&gt;
1930 ]&gt;;
1931 </pre>
1932 </div>
1933
1934 <p>
1935 <tt>CCIfCC</tt> is an interface that attempts to match the given name to the
1936 current calling convention. If the name identifies the current calling
1937 convention, then a specified action is invoked. In the following example (in
1938 <tt>X86CallingConv.td</tt>), if the <tt>Fast</tt> calling convention is in use,
1939 then <tt>RetCC_X86_32_Fast</tt> is invoked. If the <tt>SSECall</tt> calling
1940 convention is in use, then <tt>RetCC_X86_32_SSE</tt> is invoked.
1941 </p>
1942
1943 <div class="doc_code">
1944 <pre>
1945 def RetCC_X86_32 : CallingConv&lt;[
1946   CCIfCC&lt;"CallingConv::Fast", CCDelegateTo&lt;RetCC_X86_32_Fast&gt;&gt;,
1947   CCIfCC&lt;"CallingConv::X86_SSECall", CCDelegateTo&lt;RetCC_X86_32_SSE&gt;&gt;,
1948   CCDelegateTo&lt;RetCC_X86_32_C&gt;
1949 ]&gt;;
1950 </pre>
1951 </div>
1952
1953 <p>Other calling convention interfaces include:</p>
1954
1955 <ul>
1956 <li><tt>CCIf &lt;predicate, action&gt;</tt> &mdash; If the predicate matches,
1957     apply the action.</li>
1958
1959 <li><tt>CCIfInReg &lt;action&gt;</tt> &mdash; If the argument is marked with the
1960     '<tt>inreg</tt>' attribute, then apply the action.</li>
1961
<li><tt>CCIfNest &lt;action&gt;</tt> &mdash; If the argument is marked with the
    '<tt>nest</tt>' attribute, then apply the action.</li>
1964
1965 <li><tt>CCIfNotVarArg &lt;action&gt;</tt> &mdash; If the current function does
1966     not take a variable number of arguments, apply the action.</li>
1967
1968 <li><tt>CCAssignToRegWithShadow &lt;registerList, shadowList&gt;</tt> &mdash;
1969     similar to <tt>CCAssignToReg</tt>, but with a shadow list of registers.</li>
1970
1971 <li><tt>CCPassByVal &lt;size, align&gt;</tt> &mdash; Assign value to a stack
1972     slot with the minimum specified size and alignment.</li>
1973
1974 <li><tt>CCPromoteToType &lt;type&gt;</tt> &mdash; Promote the current value to
1975     the specified type.</li>
1976
1977 <li><tt>CallingConv &lt;[actions]&gt;</tt> &mdash; Define each calling
1978     convention that is supported.</li>
1979 </ul>
1980
1981 </div>
1982
1983 <!-- *********************************************************************** -->
1984 <div class="doc_section">
1985   <a name="assemblyPrinter">Assembly Printer</a>
1986 </div>
1987 <!-- *********************************************************************** -->
1988
1989 <div class="doc_text">
1990
1991 <p>
During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, implement the code for a
printer that converts LLVM IR to a GAS-format assembly language for your target
machine, using the following steps:
1996 </p>
1997
1998 <ul>
1999 <li>Define all the assembly strings for your target, adding them to the
2000     instructions defined in the <tt>XXXInstrInfo.td</tt> file.
2001     (See <a href="#InstructionSet">Instruction Set</a>.)  TableGen will produce
2002     an output file (<tt>XXXGenAsmWriter.inc</tt>) with an implementation of
2003     the <tt>printInstruction</tt> method for the XXXAsmPrinter class.</li>
2004
2005 <li>Write <tt>XXXTargetAsmInfo.h</tt>, which contains the bare-bones declaration
2006     of the <tt>XXXTargetAsmInfo</tt> class (a subclass
2007     of <tt>TargetAsmInfo</tt>).</li>
2008
2009 <li>Write <tt>XXXTargetAsmInfo.cpp</tt>, which contains target-specific values
2010     for <tt>TargetAsmInfo</tt> properties and sometimes new implementations for
2011     methods.</li>
2012
2013 <li>Write <tt>XXXAsmPrinter.cpp</tt>, which implements the <tt>AsmPrinter</tt>
2014     class that performs the LLVM-to-assembly conversion.</li>
2015 </ul>
2016
2017 <p>
2018 The code in <tt>XXXTargetAsmInfo.h</tt> is usually a trivial declaration of the
2019 <tt>XXXTargetAsmInfo</tt> class for use in <tt>XXXTargetAsmInfo.cpp</tt>.
2020 Similarly, <tt>XXXTargetAsmInfo.cpp</tt> usually has a few declarations of
2021 <tt>XXXTargetAsmInfo</tt> replacement values that override the default values
2022 in <tt>TargetAsmInfo.cpp</tt>. For example in <tt>SparcTargetAsmInfo.cpp</tt>:
2023 </p>
2024
2025 <div class="doc_code">
2026 <pre>
2027 SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &amp;TM) {
2028   Data16bitsDirective = "\t.half\t";
2029   Data32bitsDirective = "\t.word\t";
2030   Data64bitsDirective = 0;  // .xword is only supported by V9.
2031   ZeroDirective = "\t.skip\t";
2032   CommentString = "!";
2033   ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
2034 }
2035 </pre>
2036 </div>
2037
2038 <p>
The X86 assembly printer implementation (<tt>X86TargetAsmInfo</tt>) is an
example where the target-specific <tt>TargetAsmInfo</tt> class overrides
methods: <tt>ExpandInlineAsm</tt> and <tt>PreferredEHDataFormat</tt>.
2042 </p>
2043
2044 <p>
2045 A target-specific implementation of AsmPrinter is written in
2046 <tt>XXXAsmPrinter.cpp</tt>, which implements the <tt>AsmPrinter</tt> class that
2047 converts the LLVM to printable assembly. The implementation must include the
2048 following headers that have declarations for the <tt>AsmPrinter</tt> and
2049 <tt>MachineFunctionPass</tt> classes. The <tt>MachineFunctionPass</tt> is a
2050 subclass of <tt>FunctionPass</tt>.
2051 </p>
2052
2053 <div class="doc_code">
2054 <pre>
2055 #include "llvm/CodeGen/AsmPrinter.h"
2056 #include "llvm/CodeGen/MachineFunctionPass.h"
2057 </pre>
2058 </div>
2059
2060 <p>
2061 As a <tt>FunctionPass</tt>, <tt>AsmPrinter</tt> first
2062 calls <tt>doInitialization</tt> to set up the <tt>AsmPrinter</tt>. In
2063 <tt>SparcAsmPrinter</tt>, a <tt>Mangler</tt> object is instantiated to process
2064 variable names.
2065 </p>
2066
2067 <p>
2068 In <tt>XXXAsmPrinter.cpp</tt>, the <tt>runOnMachineFunction</tt> method
2069 (declared in <tt>MachineFunctionPass</tt>) must be implemented
2070 for <tt>XXXAsmPrinter</tt>. In <tt>MachineFunctionPass</tt>,
2071 the <tt>runOnFunction</tt> method invokes <tt>runOnMachineFunction</tt>.
2072 Target-specific implementations of <tt>runOnMachineFunction</tt> differ, but
2073 generally do the following to process each machine function:
2074 </p>
2075
2076 <ul>
2077 <li>Call <tt>SetupMachineFunction</tt> to perform initialization.</li>
2078
2079 <li>Call <tt>EmitConstantPool</tt> to print out (to the output stream) constants
2080     which have been spilled to memory.</li>
2081
2082 <li>Call <tt>EmitJumpTableInfo</tt> to print out jump tables used by the current
2083     function.</li>
2084
2085 <li>Print out the label for the current function.</li>
2086
<li>Print out the code for the function, including basic block labels and the
    assembly for each instruction (using <tt>printInstruction</tt>).</li>
2089 </ul>
2090
2091 <p>
2092 The <tt>XXXAsmPrinter</tt> implementation must also include the code generated
2093 by TableGen that is output in the <tt>XXXGenAsmWriter.inc</tt> file. The code
2094 in <tt>XXXGenAsmWriter.inc</tt> contains an implementation of the
2095 <tt>printInstruction</tt> method that may call these methods:
2096 </p>
2097
2098 <ul>
2099 <li><tt>printOperand</tt></li>
2100
2101 <li><tt>printMemOperand</tt></li>
2102
<li><tt>printCCOperand</tt> (for conditional statements)</li>
2104
2105 <li><tt>printDataDirective</tt></li>
2106
2107 <li><tt>printDeclare</tt></li>
2108
2109 <li><tt>printImplicitDef</tt></li>
2110
2111 <li><tt>printInlineAsm</tt></li>
2112
2113 <li><tt>printLabel</tt></li>
2114
2115 <li><tt>printPICJumpTableEntry</tt></li>
2116
2117 <li><tt>printPICJumpTableSetLabel</tt></li>
2118 </ul>
2119
2120 <p>
2121 The implementations of <tt>printDeclare</tt>, <tt>printImplicitDef</tt>,
2122 <tt>printInlineAsm</tt>, and <tt>printLabel</tt> in <tt>AsmPrinter.cpp</tt> are
2123 generally adequate for printing assembly and do not need to be
2124 overridden. (<tt>printBasicBlockLabel</tt> is another method that is implemented
2125 in <tt>AsmPrinter.cpp</tt> that may be directly used in an implementation of
2126 <tt>XXXAsmPrinter</tt>.)
2127 </p>
2128
2129 <p>
2130 The <tt>printOperand</tt> method is implemented with a long switch/case
2131 statement for the type of operand: register, immediate, basic block, external
2132 symbol, global address, constant pool index, or jump table index. For an
2133 instruction with a memory address operand, the <tt>printMemOperand</tt> method
2134 should be implemented to generate the proper output. Similarly,
2135 <tt>printCCOperand</tt> should be used to print a conditional operand.
2136 </p>
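<p>
The shape of such a switch can be sketched in isolation. The following
standalone model is an illustration only: <tt>ToyOperand</tt>,
<tt>printToyOperand</tt>, and <tt>printToyMemOperand</tt> are hypothetical
names, the <tt>%</tt> register prefix is an assumption, and the real
<tt>MachineOperand</tt> class has more operand kinds than are shown here.
</p>

```cpp
#include <string>

// Hypothetical operand model with only a few of the operand kinds.
struct ToyOperand {
  enum Kind { Register, Immediate, BasicBlock, ExternalSymbol } K;
  std::string Name; // register, label, or symbol name
  long Imm;         // used only when K == Immediate
};

// One case per operand kind, each producing the text for that kind.
inline std::string printToyOperand(const ToyOperand &MO) {
  switch (MO.K) {
  case ToyOperand::Register:
    return "%" + MO.Name; // assumed register prefix for this sketch
  case ToyOperand::Immediate:
    return std::to_string(MO.Imm);
  case ToyOperand::BasicBlock:
  case ToyOperand::ExternalSymbol:
    return MO.Name; // labels and symbols are printed by name
  }
  return "<unknown operand>";
}

// Toy memory operand: a base register plus an optional constant offset.
inline std::string printToyMemOperand(const ToyOperand &Base, long Offset) {
  std::string S = printToyOperand(Base);
  if (Offset != 0)
    S += (Offset > 0 ? "+" : "") + std::to_string(Offset);
  return S;
}
```

<p>
The memory operand printer reuses the plain operand printer for the base
register, which keeps register formatting in one place.
</p>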
2137
2138 <p><tt>doFinalization</tt> should be overridden in <tt>XXXAsmPrinter</tt>, and
2139 it should be called to shut down the assembly printer. During
2140 <tt>doFinalization</tt>, global variables and constants are printed to
2141 output.
2142 </p>
2143
2144 </div>
2145
2146 <!-- *********************************************************************** -->
2147 <div class="doc_section">
2148   <a name="subtargetSupport">Subtarget Support</a>
2149 </div>
2150 <!-- *********************************************************************** -->
2151
2152 <div class="doc_text">
2153
2154 <p>
2155 Subtarget support is used to inform the code generation process of instruction
2156 set variations for a given chip set.  For example, the LLVM SPARC implementation
2157 provided covers three major versions of the SPARC microprocessor architecture:
2158 Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a 64-bit
2159 architecture), and the UltraSPARC architecture. V8 has 16 double-precision
2160 floating-point registers that are also usable as either 32 single-precision or 8
2161 quad-precision registers.  V8 is also purely big-endian. V9 has 32
2162 double-precision floating-point registers that are also usable as 16
2163 quad-precision registers, but cannot be used as single-precision registers. The
2164 UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
2165 extensions.
2166 </p>
2167
2168 <p>
2169 If subtarget support is needed, you should implement a target-specific
2170 XXXSubtarget class for your architecture. This class should process the
2171 command-line options <tt>-mcpu=</tt> and <tt>-mattr=</tt>.
2172 </p>
2173
2174 <p>
2175 TableGen uses definitions in the <tt>Target.td</tt> and <tt>Sparc.td</tt> files
2176 to generate code in <tt>SparcGenSubtarget.inc</tt>. In <tt>Target.td</tt>, shown
2177 below, the <tt>SubtargetFeature</tt> interface is defined. The first 4 string
2178 parameters of the <tt>SubtargetFeature</tt> interface are a feature name, an
2179 attribute set by the feature, the value of the attribute, and a description of
2180 the feature. (The fifth parameter is a list of features whose presence is
2181 implied, and its default value is an empty array.)
2182 </p>
2183
2184 <div class="doc_code">
2185 <pre>
2186 class SubtargetFeature&lt;string n, string a,  string v, string d,
2187                        list&lt;SubtargetFeature&gt; i = []&gt; {
2188   string Name = n;
2189   string Attribute = a;
2190   string Value = v;
2191   string Desc = d;
2192   list&lt;SubtargetFeature&gt; Implies = i;
2193 }
2194 </pre>
2195 </div>
2196
2197 <p>
In the <tt>Sparc.td</tt> file, the <tt>SubtargetFeature</tt> interface is used
to define the following features.
2200 </p>
2201
2202 <div class="doc_code">
2203 <pre>
2204 def FeatureV9 : SubtargetFeature&lt;"v9", "IsV9", "true",
2205                      "Enable SPARC-V9 instructions"&gt;;
2206 def FeatureV8Deprecated : SubtargetFeature&lt;"deprecated-v8",
2207                      "V8DeprecatedInsts", "true",
2208                      "Enable deprecated V8 instructions in V9 mode"&gt;;
2209 def FeatureVIS : SubtargetFeature&lt;"vis", "IsVIS", "true",
2210                      "Enable UltraSPARC Visual Instruction Set extensions"&gt;;
2211 </pre>
2212 </div>
2213
2214 <p>
2215 Elsewhere in <tt>Sparc.td</tt>, the Proc class is defined and then is used to
2216 define particular SPARC processor subtypes that may have the previously
2217 described features.
2218 </p>
2219
2220 <div class="doc_code">
2221 <pre>
2222 class Proc&lt;string Name, list&lt;SubtargetFeature&gt; Features&gt;
2223   : Processor&lt;Name, NoItineraries, Features&gt;;
2224 &nbsp;
2225 def : Proc&lt;"generic",         []&gt;;
2226 def : Proc&lt;"v8",              []&gt;;
2227 def : Proc&lt;"supersparc",      []&gt;;
2228 def : Proc&lt;"sparclite",       []&gt;;
2229 def : Proc&lt;"f934",            []&gt;;
2230 def : Proc&lt;"hypersparc",      []&gt;;
2231 def : Proc&lt;"sparclite86x",    []&gt;;
2232 def : Proc&lt;"sparclet",        []&gt;;
2233 def : Proc&lt;"tsc701",          []&gt;;
2234 def : Proc&lt;"v9",              [FeatureV9]&gt;;
2235 def : Proc&lt;"ultrasparc",      [FeatureV9, FeatureV8Deprecated]&gt;;
2236 def : Proc&lt;"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]&gt;;
2237 def : Proc&lt;"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]&gt;;
2238 </pre>
2239 </div>
2240
2241 <p>
From the <tt>Target.td</tt> and <tt>Sparc.td</tt> files, the resulting
<tt>SparcGenSubtarget.inc</tt> specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
<tt>ParseSubtargetFeatures</tt> method, which parses the features string and
sets the specified subtarget options. The generated
<tt>SparcGenSubtarget.inc</tt> file should be included in
<tt>SparcSubtarget.cpp</tt>. The target-specific implementation of the
<tt>XXXSubtarget</tt> constructor should follow this pseudocode:
2249 </p>
2250
2251 <div class="doc_code">
2252 <pre>
2253 XXXSubtarget::XXXSubtarget(const Module &amp;M, const std::string &amp;FS) {
2254   // Set the default features
2255   // Determine default and user specified characteristics of the CPU
2256   // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
2257   // Perform any additional operations
2258 }
2259 </pre>
2260 </div>
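<p>
As a rough illustration of the parsing step, the sketch below applies a
features string to three Boolean attributes named after those in the
<tt>Sparc.td</tt> feature definitions. It is a hand-written model, not the
generated <tt>ParseSubtargetFeatures</tt>: <tt>ToySubtarget</tt> and
<tt>parseFeatures</tt> are hypothetical names, CPU selection is left out, and
the string is assumed to be a comma-separated list in which each entry is a
feature name prefixed with <tt>+</tt> or <tt>-</tt>.
</p>

```cpp
#include <sstream>
#include <string>

// Toy subtarget holding the three attributes set by the example features.
struct ToySubtarget {
  bool IsV9 = false;
  bool V8DeprecatedInsts = false;
  bool IsVIS = false;

  // Hand-written stand-in for the generated feature parser.
  // Assumes FS looks like "+v9,-vis"; unknown names are ignored.
  void parseFeatures(const std::string &FS) {
    std::istringstream In(FS);
    std::string Item;
    while (std::getline(In, Item, ',')) {
      if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
        continue; // skip malformed entries in this sketch
      const bool Enable = Item[0] == '+';
      const std::string Name = Item.substr(1);
      if (Name == "v9")
        IsV9 = Enable;
      else if (Name == "deprecated-v8")
        V8DeprecatedInsts = Enable;
      else if (Name == "vis")
        IsVIS = Enable;
    }
  }
};
```

<p>
A constructor following the pseudocode above would set its defaults first and
then call a parser like this one, so that user-specified features override the
defaults.
</p>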
2261
2262 </div>
2263
2264 <!-- *********************************************************************** -->
2265 <div class="doc_section">
2266   <a name="jitSupport">JIT Support</a>
2267 </div>
2268 <!-- *********************************************************************** -->
2269
2270 <div class="doc_text">
2271
2272 <p>
2273 The implementation of a target machine optionally includes a Just-In-Time (JIT)
2274 code generator that emits machine code and auxiliary structures as binary output
2275 that can be written directly to memory.  To do this, implement JIT code
2276 generation by performing the following steps:
2277 </p>
2278
2279 <ul>
2280 <li>Write an <tt>XXXCodeEmitter.cpp</tt> file that contains a machine function
2281     pass that transforms target-machine instructions into relocatable machine
2282     code.</li>
2283
2284 <li>Write an <tt>XXXJITInfo.cpp</tt> file that implements the JIT interfaces for
2285     target-specific code-generation activities, such as emitting machine code
2286     and stubs.</li>
2287
2288 <li>Modify <tt>XXXTargetMachine</tt> so that it provides a
2289     <tt>TargetJITInfo</tt> object through its <tt>getJITInfo</tt> method.</li>
2290 </ul>
2291
2292 <p>
2293 There are several different approaches to writing the JIT support code. For
2294 instance, TableGen and target descriptor files may be used for creating a JIT
2295 code generator, but are not mandatory. For the Alpha and PowerPC target
2296 machines, TableGen is used to generate <tt>XXXGenCodeEmitter.inc</tt>, which
2297 contains the binary coding of machine instructions and the
2298 <tt>getBinaryCodeForInstr</tt> method to access those codes. Other JIT
2299 implementations do not.
2300 </p>
2301
2302 <p>
2303 Both <tt>XXXJITInfo.cpp</tt> and <tt>XXXCodeEmitter.cpp</tt> must include the
2304 <tt>llvm/CodeGen/MachineCodeEmitter.h</tt> header file that defines the
2305 <tt>MachineCodeEmitter</tt> class containing code for several callback functions
2306 that write data (in bytes, words, strings, etc.) to the output stream.
2307 </p>
2308
2309 </div>
2310
2311 <!-- ======================================================================= -->
2312 <div class="doc_subsection">
2313   <a name="mce">Machine Code Emitter</a>
2314 </div>
2315
2316 <div class="doc_text">
2317
2318 <p>
In <tt>XXXCodeEmitter.cpp</tt>, a target-specific version of the
<tt>Emitter</tt> class is implemented as a function pass (a subclass
of <tt>MachineFunctionPass</tt>). The target-specific implementation
of <tt>runOnMachineFunction</tt> (invoked by
<tt>runOnFunction</tt> in <tt>MachineFunctionPass</tt>) iterates through each
<tt>MachineBasicBlock</tt> and calls <tt>emitInstruction</tt> to process each
instruction and emit binary code. <tt>emitInstruction</tt> is largely
2326 implemented with case statements on the instruction types defined in
2327 <tt>XXXInstrInfo.h</tt>. For example, in <tt>X86CodeEmitter.cpp</tt>,
2328 the <tt>emitInstruction</tt> method is built around the following switch/case
2329 statements:
2330 </p>
2331
2332 <div class="doc_code">
2333 <pre>
switch (Desc-&gt;TSFlags &amp; X86II::FormMask) {
case X86II::Pseudo:  // for not yet implemented instructions
   ...               // or pseudo-instructions
   break;
case X86II::RawFrm:  // for instructions with a fixed opcode value
   ...
   break;
case X86II::AddRegFrm: // for instructions that have one register operand
   ...                 // added to their opcode
   break;
case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
   ...                 // to specify a destination (register)
   break;
case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
   ...                 // to specify a destination (memory)
   break;
case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
   ...                 // to specify a source (register)
   break;
case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
   ...                 // to specify a source (memory)
   break;
case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
   ...
   break;
case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
   ...
   break;
case X86II::MRMInitReg: // for instructions whose source and
   ...                  // destination are the same register
   break;
}
2372 </pre>
2373 </div>
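<p>
Several of the forms above depend on the x86 Mod/RM byte. As a rough
illustration of how that byte is laid out, the sketch below packs its three
fields. The helper names <tt>modRMByte</tt> and <tt>regModRMByte</tt> are
chosen for this example and are not claimed to match the helpers in
<tt>X86CodeEmitter.cpp</tt>.
</p>

```cpp
#include <cassert>
#include <cstdint>

// Pack the three fields of an x86 Mod/RM byte:
//   bits 7-6: Mod, bits 5-3: Reg (or opcode extension), bits 2-0: R/M.
inline uint8_t modRMByte(unsigned Mod, unsigned RegOpcode, unsigned RM) {
  assert(Mod < 4 && RegOpcode < 8 && RM < 8 && "field out of range");
  return static_cast<uint8_t>((Mod << 6) | (RegOpcode << 3) | RM);
}

// Register-direct form (Mod == 3): the R/M field names a register, and the
// Reg field holds either a second register or an extended opcode digit.
inline uint8_t regModRMByte(unsigned RMReg, unsigned RegOpcode) {
  return modRMByte(3, RegOpcode, RMReg);
}
```

<p>
With EAX numbered 0 and ECX numbered 1, a destination-register form such as
<tt>mov ecx, eax</tt> (opcode <tt>0x89</tt>) uses
<tt>regModRMByte(1, 0)</tt>, which is <tt>0xC1</tt>. An extended-opcode form
such as <tt>not ecx</tt> (opcode <tt>0xF7</tt> with digit 2) uses
<tt>regModRMByte(1, 2)</tt>, which is <tt>0xD1</tt>.
</p>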
2374
2375 <p>
The implementations of these case statements often first emit the opcode and
then get the operand(s). Then, depending upon the operand, helper methods may be
called to process the operand(s). For example, in <tt>X86CodeEmitter.cpp</tt>,
for the <tt>X86II::AddRegFrm</tt> case, the first data emitted
(by <tt>emitByte</tt>) is the opcode added to the register operand. Then an
object representing the machine operand, <tt>MO1</tt>, is extracted. Helper
methods such as <tt>isImmediate</tt>,
2383 <tt>isGlobalAddress</tt>, <tt>isExternalSymbol</tt>, <tt>isConstantPoolIndex</tt>, and
2384 <tt>isJumpTableIndex</tt> determine the operand
2385 type. (<tt>X86CodeEmitter.cpp</tt> also has private methods such
2386 as <tt>emitConstant</tt>, <tt>emitGlobalAddress</tt>,
2387 <tt>emitExternalSymbolAddress</tt>, <tt>emitConstPoolAddress</tt>,
2388 and <tt>emitJumpTableAddress</tt> that emit the data into the output stream.)
2389 </p>
2390
2391 <div class="doc_code">
2392 <pre>
case X86II::AddRegFrm:
  MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

  if (CurOp != NumOps) {
    const MachineOperand &amp;MO1 = MI.getOperand(CurOp++);
    unsigned Size = X86InstrInfo::sizeOfImm(Desc);
    if (MO1.isImmediate())
      emitConstant(MO1.getImm(), Size);
    else {
      unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
        : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
      if (Opcode == X86::MOV64ri)
        rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
      if (MO1.isGlobalAddress()) {
        bool NeedStub = isa&lt;Function&gt;(MO1.getGlobal());
        bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
        emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                          NeedStub, isLazy);
      } else if (MO1.isExternalSymbol())
        emitExternalSymbolAddress(MO1.getSymbolName(), rt);
      else if (MO1.isConstantPoolIndex())
        emitConstPoolAddress(MO1.getIndex(), rt);
      else if (MO1.isJumpTableIndex())
        emitJumpTableAddress(MO1.getIndex(), rt);
    }
  }
  break;
2420 </pre>
2421 </div>
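<p>
For the immediate-operand path of the <tt>AddRegFrm</tt> case, the resulting
bytes can be worked out by hand. The following self-contained sketch does so
for a 32-bit immediate. The function name <tt>encodeAddRegImm32</tt> is
hypothetical, and the sketch deliberately ignores prefixes, other immediate
sizes, and relocations.
</p>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode an AddRegFrm-style instruction with a 32-bit immediate:
// one byte holding (base opcode + register number), followed by the
// immediate in little-endian byte order.
inline std::vector<uint8_t> encodeAddRegImm32(uint8_t BaseOpcode,
                                              unsigned RegNum,
                                              uint32_t Imm) {
  assert(RegNum < 8 && "only three register bits fit in the opcode byte");
  std::vector<uint8_t> Out;
  Out.push_back(static_cast<uint8_t>(BaseOpcode + RegNum));
  for (int Shift = 0; Shift < 32; Shift += 8)
    Out.push_back(static_cast<uint8_t>(Imm >> Shift));
  return Out;
}
```

<p>
For instance, 32-bit <tt>mov ecx, 0x12345678</tt> has base opcode
<tt>0xB8</tt> and register number 1, so
<tt>encodeAddRegImm32(0xB8, 1, 0x12345678)</tt> yields the bytes
<tt>B9 78 56 34 12</tt>.
</p>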
2422
2423 <p>
In the previous example, <tt>XXXCodeEmitter.cpp</tt> uses the
variable <tt>rt</tt>, which holds a <tt>RelocationType</tt> enum value that may be used to
relocate addresses (for example, a global address with a PIC base offset). The
2427 <tt>RelocationType</tt> enum for that target is defined in the short
2428 target-specific <tt>XXXRelocations.h</tt> file. The <tt>RelocationType</tt> is used by
2429 the <tt>relocate</tt> method defined in <tt>XXXJITInfo.cpp</tt> to rewrite
2430 addresses for referenced global symbols.
2431 </p>
2432
2433 <p>
2434 For example, <tt>X86Relocations.h</tt> specifies the following relocation types
2435 for the X86 addresses. In all four cases, the relocated value is added to the
2436 value already in memory. For <tt>reloc_pcrel_word</tt>
2437 and <tt>reloc_picrel_word</tt>, there is an additional initial adjustment.
2438 </p>
2439
2440 <div class="doc_code">
2441 <pre>
enum RelocationType {
  reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
  reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
  reloc_absolute_word = 2, // absolute relocation; no additional adjustment
  reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
};
2448 </pre>
2449 </div>
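<p>
The "adjust, then add to the value already in memory" behavior can be modeled
with a small sketch. Everything here is an assumption made for illustration:
the <tt>ToyRelocType</tt> enum, the <tt>applyToyReloc</tt> function, the use of
32-bit simulated addresses, and the choice to treat the PC-relative base as the
address just past the 4-byte field. The real <tt>relocate</tt> method in
<tt>X86JITInfo.cpp</tt> works on actual pointers and may differ in detail.
</p>

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical relocation kinds, loosely modeled on the 32-bit X86 ones.
enum ToyRelocType { toy_pcrel_word, toy_picrel_word, toy_absolute_word };

// Apply one relocation to the 32-bit field stored at Field.
//   Target    - simulated address the field should refer to
//   FieldAddr - simulated run-time address of the field (used for pcrel)
//   PICBase   - simulated address of the PIC base (used for picrel)
inline void applyToyReloc(uint8_t *Field, ToyRelocType Type, uint32_t Target,
                          uint32_t FieldAddr, uint32_t PICBase) {
  uint32_t Value = Target;
  switch (Type) {
  case toy_pcrel_word:
    // Assumption: the field is the last part of the instruction, so the
    // reference point is the address just past the 4-byte field.
    Value -= FieldAddr + 4;
    break;
  case toy_picrel_word:
    Value -= PICBase; // make the value relative to the PIC base
    break;
  case toy_absolute_word:
    break; // no additional adjustment
  }
  // In every case, add the (adjusted) value to what is already in memory.
  uint32_t Old;
  std::memcpy(&Old, Field, sizeof(Old));
  Old += Value;
  std::memcpy(Field, &Old, sizeof(Old));
}
```

<p>
Under these assumptions, a field at simulated address <tt>0x1001</tt> that
initially holds 0 and is relocated PC-relative to target <tt>0x2000</tt> ends
up holding <tt>0x2000 - 0x1005 = 0xFFB</tt>.
</p>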
2450
2451 </div>
2452
2453 <!-- ======================================================================= -->
2454 <div class="doc_subsection">
2455   <a name="targetJITInfo">Target JIT Info</a>
2456 </div>
2457
2458 <div class="doc_text">
2459
2460 <p>
2461 <tt>XXXJITInfo.cpp</tt> implements the JIT interfaces for target-specific
2462 code-generation activities, such as emitting machine code and stubs. At minimum,
2463 a target-specific version of <tt>XXXJITInfo</tt> implements the following:
2464 </p>
2465
2466 <ul>
<li><tt>getLazyResolverFunction</tt> &mdash; Initializes the JIT and gives the
    target a function that is used for compilation.</li>
2469
2470 <li><tt>emitFunctionStub</tt> &mdash; Returns a native function with a specified
2471     address for a callback function.</li>
2472
2473 <li><tt>relocate</tt> &mdash; Changes the addresses of referenced globals, based
2474     on relocation types.</li>
2475
<li>Callback functions that are wrappers to a function stub that is used when the
    real target is not initially known.</li>
2478 </ul>
2479
2480 <p>
<tt>getLazyResolverFunction</tt> is generally trivial to implement. It stores the
incoming parameter in the global <tt>JITCompilerFunction</tt> and returns the
callback function that will be used as a function wrapper. For the Alpha target
2484 (in <tt>AlphaJITInfo.cpp</tt>), the <tt>getLazyResolverFunction</tt>
2485 implementation is simply:
2486 </p>
2487
2488 <div class="doc_code">
2489 <pre>
TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                            JITCompilerFn F) {
  JITCompilerFunction = F;
  return AlphaCompilationCallback;
}
2495 </pre>
2496 </div>
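<p>
The store-and-return pattern can be tried out in isolation with the following
sketch. All names in it (<tt>ToyCompilerFn</tt>, <tt>ToyResolverFn</tt>,
<tt>toyCompilationCallback</tt>, <tt>getToyLazyResolverFunction</tt>) and the
function-pointer signatures are hypothetical. A real target uses the typedefs
from <tt>TargetJITInfo</tt>, and its callback has to preserve machine state
rather than being plain C++.
</p>

```cpp
#include <cassert>

// Hypothetical signatures: both take the address of the stub that was hit
// and return the address of the compiled code to continue at.
typedef void *(*ToyCompilerFn)(void *StubAddr);
typedef void *(*ToyResolverFn)(void *StubAddr);

// Global slot that remembers the JIT's compile function.
static ToyCompilerFn ToyCompilerFunction = nullptr;

// Stand-in for the compilation callback: forwards to the saved function.
static void *toyCompilationCallback(void *StubAddr) {
  assert(ToyCompilerFunction && "resolver used before initialization");
  return ToyCompilerFunction(StubAddr);
}

// Save the JIT's compile function and hand back the callback wrapper.
static ToyResolverFn getToyLazyResolverFunction(ToyCompilerFn F) {
  ToyCompilerFunction = F;
  return toyCompilationCallback;
}
```

<p>
Calling the returned resolver with a stub address forwards that address to
whichever compile function was registered, which is the only job of the
wrapper in this simplified model.
</p>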
2497
2498 <p>
For the X86 target, the <tt>getLazyResolverFunction</tt> implementation is a
little more complicated, because it returns a different callback function for
processors with SSE instructions and XMM registers.
2502 </p>
2503
2504 <p>
2505 The callback function initially saves and later restores the callee register
2506 values, incoming arguments, and frame and return address. The callback function
needs low-level access to the registers or stack, so it is typically implemented
in assembly language.
2509 </p>
2510
2511 </div>
2512
2513 <!-- *********************************************************************** -->
2514
2515 <hr>
2516 <address>
2517   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
2518   src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
2519   <a href="http://validator.w3.org/check/referer"><img
2520   src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
2521
2522   <a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a><br>
2523   <a href="http://llvm.org">The LLVM Compiler Infrastructure</a>
2524   <br>
2525   Last modified: $Date$
2526 </address>
2527
2528 </body>
2529 </html>