docs/WritingAnLLVMBackend.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3 <html>
   4 <head>
   5   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   6   <title>Writing an LLVM Compiler Backend</title>
   7   <link rel="stylesheet" href="llvm.css" type="text/css">
   8 </head>
   9
  10 <body>
  11
  12 <div class="doc_title">
  13   Writing an LLVM Compiler Backend
  14 </div>
  15
  16 <ol>
  17   <li><a href="#intro">Introduction</a>
  18   <ul>
  19     <li><a href="#Audience">Audience</a></li>
  20     <li><a href="#Prerequisite">Prerequisite Reading</a></li>
  21     <li><a href="#Basic">Basic Steps</a></li>
  22     <li><a href="#Preliminaries">Preliminaries</a></li>
  23   </ul>
  24   <li><a href="#TargetMachine">Target Machine</a></li>
  25   <li><a href="#RegisterSet">Register Set and Register Classes</a>
  26   <ul>
  27     <li><a href="#RegisterDef">Defining a Register</a></li>
  28     <li><a href="#RegisterClassDef">Defining a Register Class</a></li>
  29     <li><a href="#implementRegister">Implement a subclass of TargetRegisterInfo</a></li>
  30   </ul></li>
  31   <li><a href="#InstructionSet">Instruction Set</a>
  32   <ul>
  33     <li><a href="#operandMapping">Instruction Operand Mapping</a></li>
  34     <li><a href="#implementInstr">Implement a subclass of TargetInstrInfo</a></li>
  35     <li><a href="#branchFolding">Branch Folding and If Conversion</a></li>
  36   </ul></li>
  37   <li><a href="#InstructionSelector">Instruction Selector</a>
  38   <ul>
  39     <li><a href="#LegalizePhase">The SelectionDAG Legalize Phase</a>
  40     <ul>
  41       <li><a href="#promote">Promote</a></li>
  42       <li><a href="#expand">Expand</a></li>
  43       <li><a href="#custom">Custom</a></li>
  44       <li><a href="#legal">Legal</a></li>
  45     </ul></li>
  46     <li><a href="#callingConventions">Calling Conventions</a></li>
  47   </ul></li>
  48   <li><a href="#assemblyPrinter">Assembly Printer</a></li>
  49   <li><a href="#subtargetSupport">Subtarget Support</a></li>
  50   <li><a href="#jitSupport">JIT Support</a>
  51   <ul>
  52     <li><a href="#mce">Machine Code Emitter</a></li>
  53     <li><a href="#targetJITInfo">Target JIT Info</a></li>
  54   </ul></li>
  55 </ol>
  56
  57 <div class="doc_author">
  58   <p>Written by <a href="http://www.woo.com">Mason Woo</a> and
  59                 <a href="http://misha.brukman.net">Misha Brukman</a></p>
  60 </div>
  61
  62 <!-- *********************************************************************** -->
  63 <div class="doc_section">
  64   <a name="intro">Introduction</a>
  65 </div>
  66 <!-- *********************************************************************** -->
  67
  68 <div class="doc_text">
  69
  70 <p>
  71 This document describes techniques for writing compiler backends that convert
  72 the LLVM Intermediate Representation (IR) to code for a specified machine or
  73 other languages. Code intended for a specific machine can take the form of
  74 either assembly code or binary code (usable for a JIT compiler).
  75 </p>
  76
  77 <p>
  78 The backend of LLVM features a target-independent code generator that may create
  79 output for several types of target CPUs &mdash; including X86, PowerPC, Alpha,
  80 and SPARC. The backend may also be used to generate code targeted at SPUs of the
  81 Cell processor or GPUs to support the execution of compute kernels.
  82 </p>
  83
  84 <p>
  85 The document focuses on existing examples found in subdirectories
  86 of <tt>llvm/lib/Target</tt> in a downloaded LLVM release. In particular, this
  87 document focuses on the example of creating a static compiler (one that emits
  88 text assembly) for a SPARC target, because SPARC has fairly standard
  89 characteristics, such as a RISC instruction set and straightforward calling
  90 conventions.
  91 </p>
  92
  93 </div>
  94
  95 <div class="doc_subsection">
  96   <a name="Audience">Audience</a>
  97 </div>
  98
  99 <div class="doc_text">
 100
 101 <p>
 102 The audience for this document is anyone who needs to write an LLVM backend to
 103 generate code for a specific hardware or software target.
 104 </p>
 105
 106 </div>
 107
 108 <div class="doc_subsection">
 109   <a name="Prerequisite">Prerequisite Reading</a>
 110 </div>
 111
 112 <div class="doc_text">
 113
 114 <p>
 115 These essential documents must be read before reading this document:
 116 </p>
 117
 118 <ul>
 119 <li><i><a href="http://www.llvm.org/docs/LangRef.html">LLVM Language Reference
 120     Manual</a></i> &mdash; a reference manual for the LLVM assembly language.</li>
 121
 122 <li><i><a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM
 123     Target-Independent Code Generator</a></i> &mdash; a guide to the components
 124     (classes and code generation algorithms) for translating the LLVM internal
 125     representation into machine code for a specified target.  Pay particular
 126     attention to the descriptions of code generation stages: Instruction
 127     Selection, Scheduling and Formation, SSA-based Optimization, Register
 128     Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations,
 129     and Code Emission.</li>
 130
 131 <li><i><a href="http://www.llvm.org/docs/TableGenFundamentals.html">TableGen
  132     Fundamentals</a></i> &mdash; a document that describes the TableGen
 133     (<tt>tblgen</tt>) application that manages domain-specific information to
 134     support LLVM code generation. TableGen processes input from a target
 135     description file (<tt>.td</tt> suffix) and generates C++ code that can be
 136     used for code generation.</li>
 137
 138 <li><i><a href="http://www.llvm.org/docs/WritingAnLLVMPass.html">Writing an LLVM
 139     Pass</a></i> &mdash; The assembly printer is a <tt>FunctionPass</tt>, as are
 140     several SelectionDAG processing steps.</li>
 141 </ul>
 142
 143 <p>
 144 To follow the SPARC examples in this document, have a copy of
 145 <i><a href="http://www.sparc.org/standards/V8.pdf">The SPARC Architecture
 146 Manual, Version 8</a></i> for reference. For details about the ARM instruction
 147 set, refer to the <i><a href="http://infocenter.arm.com/">ARM Architecture
 148 Reference Manual</a></i>. For more about the GNU Assembler format
 149 (<tt>GAS</tt>), see
 150 <i><a href="http://sourceware.org/binutils/docs/as/index.html">Using As</a></i>,
  151 especially for the assembly printer. <i>Using As</i> contains a list of
  152 target-machine-dependent features.
 153 </p>
 154
 155 </div>
 156
 157 <div class="doc_subsection">
 158   <a name="Basic">Basic Steps</a>
 159 </div>
 160
 161 <div class="doc_text">
 162
 163 <p>
 164 To write a compiler backend for LLVM that converts the LLVM IR to code for a
 165 specified target (machine or other language), follow these steps:
 166 </p>
 167
 168 <ul>
 169 <li>Create a subclass of the TargetMachine class that describes characteristics
 170     of your target machine. Copy existing examples of specific TargetMachine
 171     class and header files; for example, start with
 172     <tt>SparcTargetMachine.cpp</tt> and <tt>SparcTargetMachine.h</tt>, but
 173     change the file names for your target. Similarly, change code that
 174     references "Sparc" to reference your target. </li>
 175
 176 <li>Describe the register set of the target. Use TableGen to generate code for
 177     register definition, register aliases, and register classes from a
 178     target-specific <tt>RegisterInfo.td</tt> input file. You should also write
 179     additional code for a subclass of the TargetRegisterInfo class that
  180     represents the register file data used for register allocation and
 181     also describes the interactions between registers.</li>
 182
 183 <li>Describe the instruction set of the target. Use TableGen to generate code
 184     for target-specific instructions from target-specific versions of
 185     <tt>TargetInstrFormats.td</tt> and <tt>TargetInstrInfo.td</tt>. You should
 186     write additional code for a subclass of the TargetInstrInfo class to
 187     represent machine instructions supported by the target machine. </li>
 188
 189 <li>Describe the selection and conversion of the LLVM IR from a Directed Acyclic
 190     Graph (DAG) representation of instructions to native target-specific
 191     instructions. Use TableGen to generate code that matches patterns and
 192     selects instructions based on additional information in a target-specific
 193     version of <tt>TargetInstrInfo.td</tt>. Write code
 194     for <tt>XXXISelDAGToDAG.cpp</tt>, where XXX identifies the specific target,
 195     to perform pattern matching and DAG-to-DAG instruction selection. Also write
 196     code in <tt>XXXISelLowering.cpp</tt> to replace or remove operations and
 197     data types that are not supported natively in a SelectionDAG. </li>
 198
 199 <li>Write code for an assembly printer that converts LLVM IR to a GAS format for
 200     your target machine.  You should add assembly strings to the instructions
 201     defined in your target-specific version of <tt>TargetInstrInfo.td</tt>. You
 202     should also write code for a subclass of AsmPrinter that performs the
 203     LLVM-to-assembly conversion and a trivial subclass of TargetAsmInfo.</li>
 204
 205 <li>Optionally, add support for subtargets (i.e., variants with different
 206     capabilities). You should also write code for a subclass of the
 207     TargetSubtarget class, which allows you to use the <tt>-mcpu=</tt>
 208     and <tt>-mattr=</tt> command-line options.</li>
 209
 210 <li>Optionally, add JIT support and create a machine code emitter (subclass of
 211     TargetJITInfo) that is used to emit binary code directly into memory. </li>
 212 </ul>
 213
 214 <p>
  215 In the <tt>.cpp</tt> and <tt>.h</tt> files, initially stub up these methods and
  216 then implement them later. Initially, you may not know which private members
  217 the class will need and which components will need to be subclassed.
 218 </p>
 219
 220 </div>
 221
 222 <div class="doc_subsection">
 223   <a name="Preliminaries">Preliminaries</a>
 224 </div>
 225
 226 <div class="doc_text">
 227
 228 <p>
 229 To actually create your compiler backend, you need to create and modify a few
 230 files. The absolute minimum is discussed here. But to actually use the LLVM
 231 target-independent code generator, you must perform the steps described in
 232 the <a href="http://www.llvm.org/docs/CodeGenerator.html">LLVM
 233 Target-Independent Code Generator</a> document.
 234 </p>
 235
 236 <p>
 237 First, you should create a subdirectory under <tt>lib/Target</tt> to hold all
 238 the files related to your target. If your target is called "Dummy," create the
 239 directory <tt>lib/Target/Dummy</tt>.
 240 </p>
 241
 242 <p>
 243 In this new
 244 directory, create a <tt>Makefile</tt>. It is easiest to copy a
 245 <tt>Makefile</tt> of another target and modify it. It should at least contain
 246 the <tt>LEVEL</tt>, <tt>LIBRARYNAME</tt> and <tt>TARGET</tt> variables, and then
 247 include <tt>$(LEVEL)/Makefile.common</tt>. The library can be
 248 named <tt>LLVMDummy</tt> (for example, see the MIPS target). Alternatively, you
 249 can split the library into <tt>LLVMDummyCodeGen</tt>
 250 and <tt>LLVMDummyAsmPrinter</tt>, the latter of which should be implemented in a
 251 subdirectory below <tt>lib/Target/Dummy</tt> (for example, see the PowerPC
 252 target).
 253 </p>
 254
 255 <p>
 256 Note that these two naming schemes are hardcoded into <tt>llvm-config</tt>.
 257 Using any other naming scheme will confuse <tt>llvm-config</tt> and produce a
 258 lot of (seemingly unrelated) linker errors when linking <tt>llc</tt>.
 259 </p>
 260
 261 <p>
 262 To make your target actually do something, you need to implement a subclass of
 263 <tt>TargetMachine</tt>. This implementation should typically be in the file
  264 <tt>lib/Target/Dummy/DummyTargetMachine.cpp</tt>, but any file in
  265 the <tt>lib/Target/Dummy</tt> directory will be built and should work. To use LLVM's
  266 target-independent code generator, you should do what all current machine
 267 backends do: create a subclass of <tt>LLVMTargetMachine</tt>. (To create a
 268 target from scratch, create a subclass of <tt>TargetMachine</tt>.)
 269 </p>
 270
 271 <p>
 272 To get LLVM to actually build and link your target, you need to add it to
 273 the <tt>TARGETS_TO_BUILD</tt> variable. To do this, you modify the configure
 274 script to know about your target when parsing the <tt>--enable-targets</tt>
 275 option. Search the configure script for <tt>TARGETS_TO_BUILD</tt>, add your
 276 target to the lists there (some creativity required), and then
  277 reconfigure. Alternatively, you can change <tt>autoconf/configure.ac</tt> and
 278 regenerate configure by running <tt>./autoconf/AutoRegen.sh</tt>.
 279 </p>
 280
 281 </div>
 282
 283 <!-- *********************************************************************** -->
 284 <div class="doc_section">
 285   <a name="TargetMachine">Target Machine</a>
 286 </div>
 287 <!-- *********************************************************************** -->
 288
 289 <div class="doc_text">
 290
 291 <p>
 292 <tt>LLVMTargetMachine</tt> is designed as a base class for targets implemented
 293 with the LLVM target-independent code generator. The <tt>LLVMTargetMachine</tt>
 294 class should be specialized by a concrete target class that implements the
 295 various virtual methods. <tt>LLVMTargetMachine</tt> is defined as a subclass of
 296 <tt>TargetMachine</tt> in <tt>include/llvm/Target/TargetMachine.h</tt>. The
 297 <tt>TargetMachine</tt> class implementation (<tt>TargetMachine.cpp</tt>) also
 298 processes numerous command-line options.
 299 </p>
 300
 301 <p>
 302 To create a concrete target-specific subclass of <tt>LLVMTargetMachine</tt>,
 303 start by copying an existing <tt>TargetMachine</tt> class and header.  You
 304 should name the files that you create to reflect your specific target. For
 305 instance, for the SPARC target, name the files <tt>SparcTargetMachine.h</tt> and
 306 <tt>SparcTargetMachine.cpp</tt>.
 307 </p>
 308
 309 <p>
 310 For a target machine <tt>XXX</tt>, the implementation of
 311 <tt>XXXTargetMachine</tt> must have access methods to obtain objects that
 312 represent target components.  These methods are named <tt>get*Info</tt>, and are
 313 intended to obtain the instruction set (<tt>getInstrInfo</tt>), register set
 314 (<tt>getRegisterInfo</tt>), stack frame layout (<tt>getFrameInfo</tt>), and
 315 similar information. <tt>XXXTargetMachine</tt> must also implement the
 316 <tt>getTargetData</tt> method to access an object with target-specific data
 317 characteristics, such as data type size and alignment requirements.
 318 </p>
 319
 320 <p>
 321 For instance, for the SPARC target, the header file
 322 <tt>SparcTargetMachine.h</tt> declares prototypes for several <tt>get*Info</tt>
 323 and <tt>getTargetData</tt> methods that simply return a class member.
 324 </p>
 325
 326 <div class="doc_code">
 327 <pre>
 328 namespace llvm {
 329
 330 class Module;
 331
 332 class SparcTargetMachine : public LLVMTargetMachine {
 333   const TargetData DataLayout;       // Calculates type size &amp; alignment
 334   SparcSubtarget Subtarget;
 335   SparcInstrInfo InstrInfo;
 336   TargetFrameInfo FrameInfo;
 337
 338 protected:
 339   virtual const TargetAsmInfo *createTargetAsmInfo() const;
 340
 341 public:
 342   SparcTargetMachine(const Module &amp;M, const std::string &amp;FS);
 343
  344   virtual const SparcInstrInfo *getInstrInfo() const { return &amp;InstrInfo; }
  345   virtual const TargetFrameInfo *getFrameInfo() const { return &amp;FrameInfo; }
  346   virtual const TargetSubtarget *getSubtargetImpl() const { return &amp;Subtarget; }
 347   virtual const TargetRegisterInfo *getRegisterInfo() const {
 348     return &amp;InstrInfo.getRegisterInfo();
 349   }
 350   virtual const TargetData *getTargetData() const { return &amp;DataLayout; }
 351   static unsigned getModuleMatchQuality(const Module &amp;M);
 352
 353   // Pass Pipeline Configuration
 354   virtual bool addInstSelector(PassManagerBase &amp;PM, bool Fast);
 355   virtual bool addPreEmitPass(PassManagerBase &amp;PM, bool Fast);
 356   virtual bool addAssemblyEmitter(PassManagerBase &amp;PM, bool Fast,
 357                                   std::ostream &amp;Out);
 358 };
 359
 360 } // end namespace llvm
 361 </pre>
 362 </div>
 363
 364 </div>
 365
 366
 367 <div class="doc_text">
  368 <p>Each target machine class should provide the following accessor methods:</p>
 369 <ul>
 370 <li><tt>getInstrInfo()</tt></li>
 371 <li><tt>getRegisterInfo()</tt></li>
 372 <li><tt>getFrameInfo()</tt></li>
 373 <li><tt>getTargetData()</tt></li>
 374 <li><tt>getSubtargetImpl()</tt></li>
 375 </ul>
 376
 377 <p>For some targets, you also need to support the following methods:</p>
 378
 379 <ul>
 380 <li><tt>getTargetLowering()</tt></li>
 381 <li><tt>getJITInfo()</tt></li>
 382 </ul>
 383
 384 <p>
 385 In addition, the <tt>XXXTargetMachine</tt> constructor should specify a
 386 <tt>TargetDescription</tt> string that determines the data layout for the target
 387 machine, including characteristics such as pointer size, alignment, and
 388 endianness. For example, the constructor for SparcTargetMachine contains the
 389 following:
 390 </p>
 391
 392 <div class="doc_code">
 393 <pre>
 394 SparcTargetMachine::SparcTargetMachine(const Module &amp;M, const std::string &amp;FS)
 395   : DataLayout("E-p:32:32-f128:128:128"),
 396     Subtarget(M, FS), InstrInfo(Subtarget),
 397     FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
 398 }
 399 </pre>
 400 </div>
 401
 402 </div>
 403
 404 <div class="doc_text">
 405
 406 <p>Hyphens separate portions of the <tt>TargetDescription</tt> string.</p>
 407
 408 <ul>
 409 <li>An upper-case "<tt>E</tt>" in the string indicates a big-endian target data
  410     model. A lower-case "<tt>e</tt>" indicates little-endian.</li>
 411
 412 <li>"<tt>p:</tt>" is followed by pointer information: size, ABI alignment, and
 413     preferred alignment. If only two figures follow "<tt>p:</tt>", then the
 414     first value is pointer size, and the second value is both ABI and preferred
 415     alignment.</li>
 416
 417 <li>Then a letter for numeric type alignment: "<tt>i</tt>", "<tt>f</tt>",
 418     "<tt>v</tt>", or "<tt>a</tt>" (corresponding to integer, floating point,
 419     vector, or aggregate). "<tt>i</tt>", "<tt>v</tt>", or "<tt>a</tt>" are
 420     followed by ABI alignment and preferred alignment. "<tt>f</tt>" is followed
 421     by three values: the first indicates the size of a long double, then ABI
  422     alignment, and then preferred alignment.</li>
 423 </ul>
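
<p>
As a rough illustration of the rules above, the following sketch splits a
description string on hyphens and decodes only the endianness letter and the
"<tt>p:</tt>" field. The names <tt>LayoutSummary</tt> and
<tt>summarizeLayout</tt> are hypothetical helpers written for this example;
they are not part of LLVM, and the real <tt>TargetData</tt> class handles many
more cases.
</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper type for this sketch; not LLVM's TargetData.
struct LayoutSummary {
  bool BigEndian = false;
  unsigned PointerSize = 0;      // in bits
  unsigned PointerABIAlign = 0;  // in bits
  unsigned PointerPrefAlign = 0; // in bits
};

// Split the description on hyphens and decode the "E"/"e" and "p:" fields.
// Other fields (i, f, v, a) are ignored in this sketch.
inline LayoutSummary summarizeLayout(const std::string &Desc) {
  LayoutSummary S;
  std::istringstream Fields(Desc);
  std::string Field;
  while (std::getline(Fields, Field, '-')) {
    if (Field == "E") {
      S.BigEndian = true;
    } else if (Field == "e") {
      S.BigEndian = false;
    } else if (Field.compare(0, 2, "p:") == 0) {
      std::istringstream Nums(Field.substr(2));
      std::string Num;
      unsigned Values[3] = {0, 0, 0};
      unsigned Count = 0;
      while (Count < 3 && std::getline(Nums, Num, ':'))
        Values[Count++] = static_cast<unsigned>(std::stoul(Num));
      S.PointerSize = Values[0];
      S.PointerABIAlign = Values[1];
      // With only two figures, the second covers both alignments.
      S.PointerPrefAlign = (Count == 3) ? Values[2] : Values[1];
    }
  }
  return S;
}
```

<p>
Under these assumptions, calling <tt>summarizeLayout</tt> on the SPARC string
shown earlier would report a big-endian target whose pointers are 32 bits wide
with 32-bit ABI and preferred alignment.
</p>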
 424
 425 <p>
 426 You must also register your target using the <tt>RegisterTarget</tt>
 427 template. (See the <tt>TargetMachineRegistry</tt> class.) For example,
 428 in <tt>SparcTargetMachine.cpp</tt>, the target is registered with:
 429 </p>
 430
 431 <div class="doc_code">
 432 <pre>
 433 namespace {
 434   // Register the target.
  435   RegisterTarget&lt;SparcTargetMachine&gt; X("sparc", "SPARC");
 436 }
 437 </pre>
 438 </div>
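
<p>
The registration above relies on a common C++ idiom: a file-scope object whose
constructor records the target as a side effect. The sketch below imitates that
idiom in isolation. <tt>RegisterTargetSketch</tt>, <tt>targetRegistry</tt>, and
<tt>DummyTargetMachine</tt> are hypothetical names invented for this example;
the actual <tt>RegisterTarget</tt> template and
<tt>TargetMachineRegistry</tt> class are more involved.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry for this sketch; LLVM's TargetMachineRegistry differs.
// A function-local static avoids depending on initialization order.
inline std::map<std::string, std::string> &targetRegistry() {
  static std::map<std::string, std::string> Registry; // name -> description
  return Registry;
}

// Constructing one of these objects records the target as a side effect.
template <typename TargetMachineT> struct RegisterTargetSketch {
  RegisterTargetSketch(const char *Name, const char *Desc) {
    targetRegistry()[Name] = Desc;
  }
};

struct DummyTargetMachine {}; // placeholder target class

namespace {
// A file-scope object, so registration runs before main() starts.
RegisterTargetSketch<DummyTargetMachine> X("dummy", "Dummy target");
}
```

<p>
Because the object lives at file scope, linking the file into a tool is
enough to make the target visible in the registry; no explicit call is needed.
</p>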
 439
 440 </div>
 441
 442 <!-- *********************************************************************** -->
 443 <div class="doc_section">
 444   <a name="RegisterSet">Register Set and Register Classes</a>
 445 </div>
 446 <!-- *********************************************************************** -->
 447
 448 <div class="doc_text">
 449
 450 <p>
 451 You should describe a concrete target-specific class that represents the
 452 register file of a target machine. This class is called <tt>XXXRegisterInfo</tt>
  453 (where <tt>XXX</tt> identifies the target) and represents the register
  454 file data that is used for register allocation. It also describes the
 455 interactions between registers.
 456 </p>
 457
 458 <p>
 459 You also need to define register classes to categorize related registers. A
 460 register class should be added for groups of registers that are all treated the
 461 same way for some instruction. Typical examples are register classes for
  462 integer, floating-point, or vector registers. The register allocator may give an
  463 instruction any register in the specified register class, because each member
  464 can perform the instruction in the same way. Virtual registers are assigned to
  465 instructions from these sets, and register classes let the target-independent
  466 register allocator automatically choose the actual registers.
 467 </p>
 468
 469 <p>
 470 Much of the code for registers, including register definition, register aliases,
 471 and register classes, is generated by TableGen from <tt>XXXRegisterInfo.td</tt>
 472 input files and placed in <tt>XXXGenRegisterInfo.h.inc</tt> and
 473 <tt>XXXGenRegisterInfo.inc</tt> output files. Some of the code in the
 474 implementation of <tt>XXXRegisterInfo</tt> requires hand-coding.
 475 </p>
 476
 477 </div>
 478
 479 <!-- ======================================================================= -->
 480 <div class="doc_subsection">
 481   <a name="RegisterDef">Defining a Register</a>
 482 </div>
 483
 484 <div class="doc_text">
 485
 486 <p>
 487 The <tt>XXXRegisterInfo.td</tt> file typically starts with register definitions
 488 for a target machine. The <tt>Register</tt> class (specified
 489 in <tt>Target.td</tt>) is used to define an object for each register. The
 490 specified string <tt>n</tt> becomes the <tt>Name</tt> of the register. The
 491 basic <tt>Register</tt> object does not have any subregisters and does not
 492 specify any aliases.
 493 </p>
 494
 495 <div class="doc_code">
 496 <pre>
 497 class Register&lt;string n&gt; {
 498   string Namespace = "";
 499   string AsmName = n;
 500   string Name = n;
 501   int SpillSize = 0;
 502   int SpillAlignment = 0;
 503   list&lt;Register&gt; Aliases = [];
 504   list&lt;Register&gt; SubRegs = [];
 505   list&lt;int&gt; DwarfNumbers = [];
 506 }
 507 </pre>
 508 </div>
 509
 510 <p>
 511 For example, in the <tt>X86RegisterInfo.td</tt> file, there are register
 512 definitions that utilize the Register class, such as:
 513 </p>
 514
 515 <div class="doc_code">
 516 <pre>
 517 def AL : Register&lt;"AL"&gt;, DwarfRegNum&lt;[0, 0, 0]&gt;;
 518 </pre>
 519 </div>
 520
 521 <p>
 522 This defines the register <tt>AL</tt> and assigns it values (with
 523 <tt>DwarfRegNum</tt>) that are used by <tt>gcc</tt>, <tt>gdb</tt>, or a debug
 524 information writer (such as <tt>DwarfWriter</tt>
 525 in <tt>llvm/lib/CodeGen/AsmPrinter</tt>) to identify a register. For register
 526 <tt>AL</tt>, <tt>DwarfRegNum</tt> takes an array of 3 values representing 3
 527 different modes: the first element is for X86-64, the second for exception
 528 handling (EH) on X86-32, and the third is generic. -1 is a special Dwarf number
 529 that indicates the gcc number is undefined, and -2 indicates the register number
 530 is invalid for this mode.
 531 </p>
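
<p>
To make the per-mode numbering concrete, here is a small sketch of how a
consumer might test whether a register has a usable Dwarf number in a given
mode. The type and function names are hypothetical and do not correspond to
generated LLVM code; only the meaning of the three slots and of the values -1
and -2 comes from the description above.
</p>

```cpp
#include <cassert>

// Hypothetical names for this sketch; not LLVM's generated code.
enum DwarfFlavour { X86_64 = 0, X86_32_EH = 1, Generic = 2 };

const int DwarfUndefined = -1; // the gcc number is undefined
const int DwarfInvalid = -2;   // the register number is invalid for this mode

// One Dwarf number per mode, in the order listed in DwarfRegNum.
struct DwarfNumbers {
  int PerMode[3];
};

// A number is usable only if it is neither of the two special values.
inline bool hasUsableDwarfNumber(const DwarfNumbers &N, DwarfFlavour F) {
  return N.PerMode[F] != DwarfUndefined && N.PerMode[F] != DwarfInvalid;
}
```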
 532
 533 <p>
 534 From the previously described line in the <tt>X86RegisterInfo.td</tt> file,
 535 TableGen generates this code in the <tt>X86GenRegisterInfo.inc</tt> file:
 536 </p>
 537
 538 <div class="doc_code">
 539 <pre>
 540 static const unsigned GR8[] = { X86::AL, ... };
 541
 542 const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };
 543
 544 const TargetRegisterDesc RegisterDescriptors[] = {
 545   ...
 546 { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
 547 </pre>
 548 </div>
 549
 550 <p>
 551 From the register info file, TableGen generates a <tt>TargetRegisterDesc</tt>
 552 object for each register. <tt>TargetRegisterDesc</tt> is defined in
 553 <tt>include/llvm/Target/TargetRegisterInfo.h</tt> with the following fields:
 554 </p>
 555
 556 <div class="doc_code">
 557 <pre>
 558 struct TargetRegisterDesc {
 559   const char     *AsmName;      // Assembly language name for the register
 560   const char     *Name;         // Printable name for the reg (for debugging)
 561   const unsigned *AliasSet;     // Register Alias Set
 562   const unsigned *SubRegs;      // Sub-register set
 563   const unsigned *ImmSubRegs;   // Immediate sub-register set
 564   const unsigned *SuperRegs;    // Super-register set
 565 };</pre>
 566 </div>
 567
 568 <p>
 569 TableGen uses the entire target description file (<tt>.td</tt>) to determine
 570 text names for the register (in the <tt>AsmName</tt> and <tt>Name</tt> fields of
 571 <tt>TargetRegisterDesc</tt>) and the relationships of other registers to the
 572 defined register (in the other <tt>TargetRegisterDesc</tt> fields). In this
 573 example, other definitions establish the registers "<tt>AX</tt>",
 574 "<tt>EAX</tt>", and "<tt>RAX</tt>" as aliases for one another, so TableGen
 575 generates a null-terminated array (<tt>AL_AliasSet</tt>) for this register alias
 576 set.
 577 </p>
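
<p>
A null-terminated alias set can be walked without knowing its length. The
following sketch shows one way to do that. The register numbers in the
<tt>enum</tt> and the names <tt>AL_AliasSetSketch</tt> and
<tt>isInAliasSet</tt> are assumptions made for this example; the generated
<tt>X86</tt> enumeration uses different values.
</p>

```cpp
#include <cassert>

// Hypothetical register numbers for this sketch; the generated enum differs.
// Zero is kept free so that it can serve as the array terminator.
enum { NoRegister = 0, AL = 1, AX = 2, EAX = 3, RAX = 4, BL = 5 };

// Zero-terminated, like the generated AL_AliasSet array.
static const unsigned AL_AliasSetSketch[] = {AX, EAX, RAX, 0};

// Walk the set until the terminating 0 and report whether Reg is in it.
inline bool isInAliasSet(const unsigned *AliasSet, unsigned Reg) {
  for (const unsigned *Alias = AliasSet; *Alias != 0; ++Alias)
    if (*Alias == Reg)
      return true;
  return false;
}
```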
 578
 579 <p>
 580 The <tt>Register</tt> class is commonly used as a base class for more complex
 581 classes. In <tt>Target.td</tt>, the <tt>Register</tt> class is the base for the
 582 <tt>RegisterWithSubRegs</tt> class that is used to define registers that need to
 583 specify subregisters in the <tt>SubRegs</tt> list, as shown here:
 584 </p>
 585
 586 <div class="doc_code">
 587 <pre>
  588 class RegisterWithSubRegs&lt;string n, list&lt;Register&gt; subregs&gt; : Register&lt;n&gt; {
 590   let SubRegs = subregs;
 591 }
 592 </pre>
 593 </div>
 594
 595 <p>
 596 In <tt>SparcRegisterInfo.td</tt>, additional register classes are defined for
  597 SPARC: a <tt>Register</tt> subclass, <tt>SparcReg</tt>, and further subclasses: <tt>Ri</tt>,
  598 <tt>Rf</tt>, and <tt>Rd</tt>. SPARC registers are identified by 5-bit ID
  599 numbers, which is a feature common to these subclasses. Note the use of
  600 '<tt>let</tt>' expressions to override values that are initially defined in a
  601 superclass (such as the <tt>SubRegs</tt> field in the <tt>Rd</tt> class).
 602 </p>
 603
 604 <div class="doc_code">
 605 <pre>
 606 class SparcReg&lt;string n&gt; : Register&lt;n&gt; {
 607   field bits&lt;5&gt; Num;
 608   let Namespace = "SP";
 609 }
 610 // Ri - 32-bit integer registers
  611 class Ri&lt;bits&lt;5&gt; num, string n&gt; : SparcReg&lt;n&gt; {
  613   let Num = num;
  614 }
  615 // Rf - 32-bit floating-point registers
  616 class Rf&lt;bits&lt;5&gt; num, string n&gt; : SparcReg&lt;n&gt; {
  618   let Num = num;
  619 }
  620 // Rd - Slots in the FP register file for 64-bit floating-point values.
  622 class Rd&lt;bits&lt;5&gt; num, string n, list&lt;Register&gt; subregs&gt; : SparcReg&lt;n&gt; {
 624   let Num = num;
 625   let SubRegs = subregs;
 626 }
 627 </pre>
 628 </div>
 629
 630 <p>
 631 In the <tt>SparcRegisterInfo.td</tt> file, there are register definitions that
 632 utilize these subclasses of <tt>Register</tt>, such as:
 633 </p>
 634
 635 <div class="doc_code">
 636 <pre>
  637 def G0 : Ri&lt; 0, "G0"&gt;, DwarfRegNum&lt;[0]&gt;;
  639 def G1 : Ri&lt; 1, "G1"&gt;, DwarfRegNum&lt;[1]&gt;;
  640 ...
  641 def F0 : Rf&lt; 0, "F0"&gt;, DwarfRegNum&lt;[32]&gt;;
  643 def F1 : Rf&lt; 1, "F1"&gt;, DwarfRegNum&lt;[33]&gt;;
  645 ...
  646 def D0 : Rd&lt; 0, "F0", [F0, F1]&gt;, DwarfRegNum&lt;[32]&gt;;
  648 def D1 : Rd&lt; 2, "F2", [F2, F3]&gt;, DwarfRegNum&lt;[34]&gt;;
 650 </pre>
 651 </div>
 652
 653 <p>
 654 The last two registers shown above (<tt>D0</tt> and <tt>D1</tt>) are
 655 double-precision floating-point registers that are aliases for pairs of
  656 single-precision floating-point sub-registers. In addition to aliases, the
  657 sub-register and super-register relationships of the defined register are
  658 recorded in fields of the register's <tt>TargetRegisterDesc</tt>.
 659 </p>
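
<p>
The two definitions shown for <tt>D0</tt> and <tt>D1</tt> suggest a simple
pairing rule: double register <tt>D</tt><i>n</i> covers single registers
<tt>F</tt><i>2n</i> and <tt>F</tt><i>2n+1</i>. The sketch below encodes that
rule. It assumes the pattern continues for the remaining double registers,
which the excerpt above does not show, and <tt>subRegsOfD</tt> is a
hypothetical helper rather than an LLVM function.
</p>

```cpp
#include <cassert>
#include <utility>

// For double register D<Index>, return the indices of its two F sub-registers.
// Assumes the pairing seen for D0 and D1 holds for every double register.
inline std::pair<unsigned, unsigned> subRegsOfD(unsigned Index) {
  return std::make_pair(2 * Index, 2 * Index + 1);
}
```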
 660
 661 </div>
 662
 663 <!-- ======================================================================= -->
 664 <div class="doc_subsection">
 665   <a name="RegisterClassDef">Defining a Register Class</a>
 666 </div>
 667
 668 <div class="doc_text">
 669
 670 <p>
 671 The <tt>RegisterClass</tt> class (specified in <tt>Target.td</tt>) is used to
 672 define an object that represents a group of related registers and also defines
 673 the default allocation order of the registers. A target description file
 674 <tt>XXXRegisterInfo.td</tt> that uses <tt>Target.td</tt> can construct register
 675 classes using the following class:
 676 </p>
 677
 678 <div class="doc_code">
 679 <pre>
  680 class RegisterClass&lt;string namespace, list&lt;ValueType&gt; regTypes, int alignment,
  682                     list&lt;Register&gt; regList&gt; {
 683   string Namespace = namespace;
 684   list&lt;ValueType&gt; RegTypes = regTypes;
 685   int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
 686   int Alignment = alignment;
 687
 688   // CopyCost is the cost of copying a value between two registers
 689   // default value 1 means a single instruction
 690   // A negative value means copying is extremely expensive or impossible
 691   int CopyCost = 1;
 692   list&lt;Register&gt; MemberList = regList;
 693
 694   // for register classes that are subregisters of this class
 695   list&lt;RegisterClass&gt; SubRegClassList = [];
 696
 697   code MethodProtos = [{}];  // to insert arbitrary code
 698   code MethodBodies = [{}];
 699 }
 700 </pre>
 701 </div>
 702
 703 <p>To define a RegisterClass, use the following 4 arguments:</p>
 704
 705 <ul>
 706 <li>The first argument of the definition is the name of the namespace.</li>
 707
 708 <li>The second argument is a list of <tt>ValueType</tt> register type values
 709     that are defined in <tt>include/llvm/CodeGen/ValueTypes.td</tt>. Defined
 710     values include integer types (such as <tt>i16</tt>, <tt>i32</tt>,
 711     and <tt>i1</tt> for Boolean), floating-point types
 712     (<tt>f32</tt>, <tt>f64</tt>), and vector types (for example, <tt>v8i16</tt>
 713     for an <tt>8 x i16</tt> vector). All registers in a <tt>RegisterClass</tt>
 714     must have the same <tt>ValueType</tt>, but some registers may store vector
  715     data in different configurations. For example, a register that can process a
 716     128-bit vector may be able to handle 16 8-bit integer elements, 8 16-bit
 717     integers, 4 32-bit integers, and so on. </li>
 718
 719 <li>The third argument of the <tt>RegisterClass</tt> definition specifies the
 720     alignment required of the registers when they are stored or loaded to
 721     memory.</li>
 722
 723 <li>The final argument, <tt>regList</tt>, specifies which registers are in this
 724     class.  If an <tt>allocation_order_*</tt> method is not specified,
 725     then <tt>regList</tt> also defines the order of allocation used by the
 726     register allocator.</li>
 727 </ul>
 728
 729 <p>
 730 In <tt>SparcRegisterInfo.td</tt>, three RegisterClass objects are defined:
 731 <tt>FPRegs</tt>, <tt>DFPRegs</tt>, and <tt>IntRegs</tt>. For all three register
 732 classes, the first argument defines the namespace with the string
 733 '<tt>SP</tt>'. <tt>FPRegs</tt> defines a group of 32 single-precision
 734 floating-point registers (<tt>F0</tt> to <tt>F31</tt>); <tt>DFPRegs</tt> defines
 735 a group of 16 double-precision registers
 736 (<tt>D0-D15</tt>). For <tt>IntRegs</tt>, the <tt>MethodProtos</tt>
  737 and <tt>MethodBodies</tt> fields are used by TableGen to insert the specified
 738 code into generated output.
 739 </p>
 740
 741 <div class="doc_code">
 742 <pre>
 743 def FPRegs : RegisterClass&lt;"SP", [f32], 32,
 744   [F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15,
 745    F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31]&gt;;
 746
 747 def DFPRegs : RegisterClass&lt;"SP", [f64], 64,
 748   [D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, D10, D11, D12, D13, D14, D15]&gt;;
  749
 750 def IntRegs : RegisterClass&lt;"SP", [i32], 32,
 751     [L0, L1, L2, L3, L4, L5, L6, L7,
 752      I0, I1, I2, I3, I4, I5,
 753      O0, O1, O2, O3, O4, O5, O7,
 754      G1,
 755      // Non-allocatable regs:
 756      G2, G3, G4,
 757      O6,        // stack ptr
     I6,        // frame ptr
 759      I7,        // return address
 760      G0,        // constant zero
 761      G5, G6, G7 // reserved for kernel
 762     ]&gt; {
 763   let MethodProtos = [{
 764     iterator allocation_order_end(const MachineFunction &amp;MF) const;
 765   }];
 766   let MethodBodies = [{
 767     IntRegsClass::iterator
 768     IntRegsClass::allocation_order_end(const MachineFunction &amp;MF) const {
 769       return end() - 10  // Don't allocate special registers
 770          -1;
 771     }
 772   }];
 773 }
 774 </pre>
 775 </div>
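<p>
The effect of <tt>allocation_order_end</tt> can be checked with a small
standalone model. The sketch below is not LLVM code: the array and the
<tt>allocatableIntRegs</tt> helper are hypothetical names, and only the register
order and the <tt>end() - 10 - 1</tt> arithmetic come from the definition
above. Under those assumptions, the allocator would see the first 21 registers,
which leaves out the ten registers marked non-allocatable and also <tt>G1</tt>.
</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Register order copied from the IntRegs definition above.
static const std::array<const char *, 32> IntRegsOrder = {
    "L0", "L1", "L2", "L3", "L4", "L5", "L6", "L7",
    "I0", "I1", "I2", "I3", "I4", "I5",
    "O0", "O1", "O2", "O3", "O4", "O5", "O7",
    "G1",
    // Non-allocatable regs:
    "G2", "G3", "G4", "O6", "I6", "I7", "G0", "G5", "G6", "G7"};

// Hypothetical helper that mirrors "return end() - 10 - 1;":
// everything before the returned position is available for allocation.
std::vector<std::string> allocatableIntRegs() {
  const std::size_t Excluded = 10 + 1;
  assert(Excluded <= IntRegsOrder.size());
  auto AllocEnd = IntRegsOrder.end() - Excluded;
  return std::vector<std::string>(IntRegsOrder.begin(), AllocEnd);
}
```

<p>
Because the cut-off is positional, the order of <tt>regList</tt> matters:
moving a register within the list changes whether it falls before or after the
allocation end.
</p>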
 776
 777 <p>
 778 Using <tt>SparcRegisterInfo.td</tt> with TableGen generates several output files
 779 that are intended for inclusion in other source code that you write.
<tt>SparcRegisterInfo.td</tt> generates <tt>SparcGenRegisterInfo.h.inc</tt>,
which should be included in the header file of the SPARC register
implementation that you write (<tt>SparcRegisterInfo.h</tt>). In
 783 <tt>SparcGenRegisterInfo.h.inc</tt> a new structure is defined called
 784 <tt>SparcGenRegisterInfo</tt> that uses <tt>TargetRegisterInfo</tt> as its
 785 base. It also specifies types, based upon the defined register
 786 classes: <tt>DFPRegsClass</tt>, <tt>FPRegsClass</tt>, and <tt>IntRegsClass</tt>.
 787 </p>
 788
 789 <p>
 790 <tt>SparcRegisterInfo.td</tt> also generates <tt>SparcGenRegisterInfo.inc</tt>,
 791 which is included at the bottom of <tt>SparcRegisterInfo.cpp</tt>, the SPARC
 792 register implementation. The code below shows only the generated integer
 793 registers and associated register classes. The order of registers
 794 in <tt>IntRegs</tt> reflects the order in the definition of <tt>IntRegs</tt> in
 795 the target description file. Take special note of the use
 796 of <tt>MethodBodies</tt> in <tt>SparcRegisterInfo.td</tt> to create code in
 797 <tt>SparcGenRegisterInfo.inc</tt>. <tt>MethodProtos</tt> generates similar code
 798 in <tt>SparcGenRegisterInfo.h.inc</tt>.
 799 </p>
 800
 801 <div class="doc_code">
 802 <pre>  // IntRegs Register Class...
 803   static const unsigned IntRegs[] = {
 804     SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
 805     SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
 806     SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
 807     SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
 808     SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
 809     SP::G6, SP::G7,
 810   };
 811
 812   // IntRegsVTs Register Class Value Types...
 813   static const MVT::ValueType IntRegsVTs[] = {
 814     MVT::i32, MVT::Other
 815   };
 816
 817 namespace SP {   // Register class instances
 818   DFPRegsClass&nbsp;&nbsp;&nbsp; DFPRegsRegClass;
 819   FPRegsClass&nbsp;&nbsp;&nbsp;&nbsp; FPRegsRegClass;
 820   IntRegsClass&nbsp;&nbsp;&nbsp; IntRegsRegClass;
 821 ...
 822   // IntRegs Sub-register Classess...
 823   static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
 824     NULL
 825   };
 826 ...
 827   // IntRegs Super-register Classess...
 828   static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
 829     NULL
 830   };
 831 ...
 832   // IntRegs Register Class sub-classes...
 833   static const TargetRegisterClass* const IntRegsSubclasses [] = {
 834     NULL
 835   };
 836 ...
 837   // IntRegs Register Class super-classes...
 838   static const TargetRegisterClass* const IntRegsSuperclasses [] = {
 839     NULL
 840   };
 841 ...
 842   IntRegsClass::iterator
 843   IntRegsClass::allocation_order_end(const MachineFunction &amp;MF) const {
 844      return end()-10  // Don't allocate special registers
 845          -1;
 846   }
 847
 848   IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
 849     IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
 850     IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
 851 }
 852 </pre>
 853 </div>
 854
 855 </div>
 856
 857 <!-- ======================================================================= -->
 858 <div class="doc_subsection">
 859   <a name="implementRegister">Implement a subclass of</a>
 860   <a href="http://www.llvm.org/docs/CodeGenerator.html#targetregisterinfo">TargetRegisterInfo</a>
 861 </div>
 862
 863 <div class="doc_text">
 864
 865 <p>
 866 The final step is to hand code portions of <tt>XXXRegisterInfo</tt>, which
 867 implements the interface described in <tt>TargetRegisterInfo.h</tt>. These
 868 functions return <tt>0</tt>, <tt>NULL</tt>, or <tt>false</tt>, unless
 869 overridden. Here is a list of functions that are overridden for the SPARC
 870 implementation in <tt>SparcRegisterInfo.cpp</tt>:
 871 </p>
 872
 873 <ul>
 874 <li><tt>getCalleeSavedRegs</tt> &mdash; Returns a list of callee-saved registers
 875     in the order of the desired callee-save stack frame offset.</li>
 876
 877 <li><tt>getCalleeSavedRegClasses</tt> &mdash; Returns a list of preferred
 878     register classes with which to spill each callee saved register.</li>
 879
 880 <li><tt>getReservedRegs</tt> &mdash; Returns a bitset indexed by physical
 881     register numbers, indicating if a particular register is unavailable.</li>
 882
<li><tt>hasFP</tt> &mdash; Returns a Boolean indicating if a function should have
    a dedicated frame pointer register.</li>
 885
 886 <li><tt>eliminateCallFramePseudoInstr</tt> &mdash; If call frame setup or
 887     destroy pseudo instructions are used, this can be called to eliminate
 888     them.</li>
 889
 890 <li><tt>eliminateFrameIndex</tt> &mdash; Eliminate abstract frame indices from
 891     instructions that may use them.</li>
 892
 893 <li><tt>emitPrologue</tt> &mdash; Insert prologue code into the function.</li>
 894
 895 <li><tt>emitEpilogue</tt> &mdash; Insert epilogue code into the function.</li>
 896 </ul>
 897
 898 </div>
 899
 900 <!-- *********************************************************************** -->
 901 <div class="doc_section">
 902   <a name="InstructionSet">Instruction Set</a>
 903 </div>
 904
 905 <!-- *********************************************************************** -->
 906 <div class="doc_text">
 907
 908 <p>
 909 During the early stages of code generation, the LLVM IR code is converted to a
 910 <tt>SelectionDAG</tt> with nodes that are instances of the <tt>SDNode</tt> class
containing target instructions. An <tt>SDNode</tt> has an opcode, operands, type
requirements, and operation properties (for example, whether an operation is
commutative or whether it loads from memory). The various operation node
 914 types are described in the <tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt>
 915 file (values of the <tt>NodeType</tt> enum in the <tt>ISD</tt> namespace).
 916 </p>
 917
 918 <p>
 919 TableGen uses the following target description (<tt>.td</tt>) input files to
 920 generate much of the code for instruction definition:
 921 </p>
 922
 923 <ul>
 924 <li><tt>Target.td</tt> &mdash; Where the <tt>Instruction</tt>, <tt>Operand</tt>,
 925     <tt>InstrInfo</tt>, and other fundamental classes are defined.</li>
 926
<li><tt>TargetSelectionDAG.td</tt> &mdash; Used by <tt>SelectionDAG</tt>
    instruction selection generators; contains <tt>SDTC*</tt> classes (selection
    DAG type constraints), definitions of <tt>SelectionDAG</tt> nodes (such as
    <tt>imm</tt>, <tt>cond</tt>, <tt>bb</tt>, <tt>add</tt>, <tt>fadd</tt>,
    <tt>sub</tt>), and pattern support (<tt>Pattern</tt>, <tt>Pat</tt>,
    <tt>PatFrag</tt>, <tt>PatLeaf</tt>, <tt>ComplexPattern</tt>).</li>
 933
 934 <li><tt>XXXInstrFormats.td</tt> &mdash; Patterns for definitions of
 935     target-specific instructions.</li>
 936
 937 <li><tt>XXXInstrInfo.td</tt> &mdash; Target-specific definitions of instruction
 938     templates, condition codes, and instructions of an instruction set. For
 939     architecture modifications, a different file name may be used. For example,
    for Pentium with SSE instructions, this file is <tt>X86InstrSSE.td</tt>, and
 941     for Pentium with MMX, this file is <tt>X86InstrMMX.td</tt>.</li>
 942 </ul>
 943
 944 <p>
 945 There is also a target-specific <tt>XXX.td</tt> file, where <tt>XXX</tt> is the
 946 name of the target. The <tt>XXX.td</tt> file includes the other <tt>.td</tt>
 947 input files, but its contents are only directly important for subtargets.
 948 </p>
 949
 950 <p>
 951 You should describe a concrete target-specific class <tt>XXXInstrInfo</tt> that
 952 represents machine instructions supported by a target machine.
 953 <tt>XXXInstrInfo</tt> contains an array of <tt>XXXInstrDescriptor</tt> objects,
 954 each of which describes one instruction. An instruction descriptor defines:</p>
 955
 956 <ul>
 957 <li>Opcode mnemonic</li>
 958
 959 <li>Number of operands</li>
 960
 961 <li>List of implicit register definitions and uses</li>
 962
 963 <li>Target-independent properties (such as memory access, is commutable)</li>
 964
 965 <li>Target-specific flags </li>
 966 </ul>
 967
 968 <p>
The <tt>Instruction</tt> class (defined in <tt>Target.td</tt>) is mostly used as a base
 970 for more complex instruction classes.
 971 </p>
 972
 973 <div class="doc_code">
 974 <pre>class Instruction {
 975   string Namespace = "";
  dag OutOperandList;       // A dag containing the MI def operand list.
  dag InOperandList;        // A dag containing the MI use operand list.
 978   string AsmString = "";    // The .s format to print the instruction with.
 979   list&lt;dag&gt; Pattern;  // Set to the DAG pattern for this instruction
 980   list&lt;Register&gt; Uses = [];
 981   list&lt;Register&gt; Defs = [];
 982   list&lt;Predicate&gt; Predicates = [];  // predicates turned into isel match code
 983   ... remainder not shown for space ...
 984 }
 985 </pre>
 986 </div>
 987
 988 <p>
 989 A <tt>SelectionDAG</tt> node (<tt>SDNode</tt>) should contain an object
 990 representing a target-specific instruction that is defined
 991 in <tt>XXXInstrInfo.td</tt>. The instruction objects should represent
 992 instructions from the architecture manual of the target machine (such as the
 993 SPARC Architecture Manual for the SPARC target).
 994 </p>
 995
 996 <p>
 997 A single instruction from the architecture manual is often modeled as multiple
 998 target instructions, depending upon its operands. For example, a manual might
 999 describe an add instruction that takes a register or an immediate operand. An
1000 LLVM target could model this with two instructions named <tt>ADDri</tt> and
1001 <tt>ADDrr</tt>.
1002 </p>
1003
1004 <p>
1005 You should define a class for each instruction category and define each opcode
1006 as a subclass of the category with appropriate parameters such as the fixed
1007 binary encoding of opcodes and extended opcodes. You should map the register
1008 bits to the bits of the instruction in which they are encoded (for the
1009 JIT). Also you should specify how the instruction should be printed when the
1010 automatic assembly printer is used.
1011 </p>
1012
1013 <p>
1014 As is described in the SPARC Architecture Manual, Version 8, there are three
1015 major 32-bit formats for instructions. Format 1 is only for the <tt>CALL</tt>
1016 instruction. Format 2 is for branch on condition codes and <tt>SETHI</tt> (set
1017 high bits of a register) instructions.  Format 3 is for other instructions.
1018 </p>
1019
1020 <p>
Each of these formats has corresponding classes in <tt>SparcInstrFormats.td</tt>.
<tt>InstSP</tt> is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example,
in <tt>SparcInstrFormats.td</tt>, <tt>F2_1</tt> is for <tt>SETHI</tt>,
and <tt>F2_2</tt> is for branches. There are three other base
classes: <tt>F3_1</tt> for register/register operations, <tt>F3_2</tt> for
register/immediate operations, and <tt>F3_3</tt> for floating-point
operations. <tt>SparcInstrInfo.td</tt> also adds the base class <tt>Pseudo</tt> for
synthetic SPARC instructions.
1030 </p>
1031
1032 <p>
1033 <tt>SparcInstrInfo.td</tt> largely consists of operand and instruction
1034 definitions for the SPARC target. In <tt>SparcInstrInfo.td</tt>, the following
1035 target description file entry, <tt>LDrr</tt>, defines the Load Integer
1036 instruction for a Word (the <tt>LD</tt> SPARC opcode) from a memory address to a
1037 register. The first parameter, the value 3 (<tt>11<sub>2</sub></tt>), is the
1038 operation value for this category of operation. The second parameter
1039 (<tt>000000<sub>2</sub></tt>) is the specific operation value
1040 for <tt>LD</tt>/Load Word. The third parameter is the output destination, which
1041 is a register operand and defined in the <tt>Register</tt> target description
1042 file (<tt>IntRegs</tt>).
1043 </p>
1044
1045 <div class="doc_code">
1046 <pre>def LDrr : F3_1 &lt;3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
1047                  "ld [$addr], $dst",
1048                  [(set IntRegs:$dst, (load ADDRrr:$addr))]&gt;;
1049 </pre>
1050 </div>
1051
1052 <p>
1053 The fourth parameter is the input source, which uses the address
1054 operand <tt>MEMrr</tt> that is defined earlier in <tt>SparcInstrInfo.td</tt>:
1055 </p>
1056
1057 <div class="doc_code">
1058 <pre>def MEMrr : Operand&lt;i32&gt; {
1059   let PrintMethod = "printMemOperand";
1060   let MIOperandInfo = (ops IntRegs, IntRegs);
1061 }
1062 </pre>
1063 </div>
1064
1065 <p>
1066 The fifth parameter is a string that is used by the assembly printer and can be
1067 left as an empty string until the assembly printer interface is implemented. The
1068 sixth and final parameter is the pattern used to match the instruction during
1069 the SelectionDAG Select Phase described in
1070 (<a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM
1071 Target-Independent Code Generator</a>).  This parameter is detailed in the next
1072 section, <a href="#InstructionSelector">Instruction Selector</a>.
1073 </p>
1074
1075 <p>
1076 Instruction class definitions are not overloaded for different operand types, so
1077 separate versions of instructions are needed for register, memory, or immediate
1078 value operands. For example, to perform a Load Integer instruction for a Word
1079 from an immediate operand to a register, the following instruction class is
1080 defined:
1081 </p>
1082
1083 <div class="doc_code">
1084 <pre>def LDri : F3_2 &lt;3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
1085                  "ld [$addr], $dst",
1086                  [(set IntRegs:$dst, (load ADDRri:$addr))]&gt;;
1087 </pre>
1088 </div>
1089
1090 <p>
Writing these definitions for so many similar instructions can involve a lot of
cut and paste. In <tt>.td</tt> files, the <tt>multiclass</tt> directive enables the
creation of templates to define several instruction classes at once (using
the <tt>defm</tt> directive). For example, in <tt>SparcInstrInfo.td</tt>, the
<tt>multiclass</tt> pattern <tt>F3_12</tt> is defined to create two instruction
classes each time <tt>F3_12</tt> is invoked:
1097 </p>
1098
1099 <div class="doc_code">
1100 <pre>multiclass F3_12 &lt;string OpcStr, bits&lt;6&gt; Op3Val, SDNode OpNode&gt; {
1101   def rr  : F3_1 &lt;2, Op3Val,
1102                  (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
1103                  !strconcat(OpcStr, " $b, $c, $dst"),
1104                  [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]&gt;;
1105   def ri  : F3_2 &lt;2, Op3Val,
1106                  (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
1107                  !strconcat(OpcStr, " $b, $c, $dst"),
1108                  [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]&gt;;
1109 }
1110 </pre>
1111 </div>
1112
1113 <p>
1114 So when the <tt>defm</tt> directive is used for the <tt>XOR</tt>
1115 and <tt>ADD</tt> instructions, as seen below, it creates four instruction
1116 objects: <tt>XORrr</tt>, <tt>XORri</tt>, <tt>ADDrr</tt>, and <tt>ADDri</tt>.
1117 </p>
1118
1119 <div class="doc_code">
1120 <pre>
1121 defm XOR   : F3_12&lt;"xor", 0b000011, xor&gt;;
1122 defm ADD   : F3_12&lt;"add", 0b000000, add&gt;;
1123 </pre>
1124 </div>
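<p>
To make the expansion concrete, the following sketch imitates in plain C++ what
<tt>defm</tt> does with the <tt>F3_12</tt> multiclass. It is an illustration
only: <tt>InstrRecord</tt> and <tt>expandF3_12</tt> are hypothetical names, and
TableGen itself produces records, not C++ objects. The name suffixes, formats,
and assembly string are taken from the multiclass definition above.
</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one instruction record produced by TableGen.
struct InstrRecord {
  std::string Name;       // defm prefix plus the inner def name
  std::string Format;     // base class used by the inner def
  unsigned Op3;           // the Op3Val template argument
  std::string AsmString;  // result of !strconcat(OpcStr, " $b, $c, $dst")
};

// Models "defm Prefix : F3_12<OpcStr, Op3Val, ...>": each use yields an
// "rr" record based on F3_1 and an "ri" record based on F3_2.
std::vector<InstrRecord> expandF3_12(const std::string &Prefix,
                                     const std::string &OpcStr,
                                     unsigned Op3Val) {
  assert(Op3Val < 64 && "Op3Val is declared as bits<6>");
  const std::string Asm = OpcStr + " $b, $c, $dst";
  return {{Prefix + "rr", "F3_1", Op3Val, Asm},
          {Prefix + "ri", "F3_2", Op3Val, Asm}};
}
```

<p>
Calling <tt>expandF3_12("XOR", "xor", 0x03)</tt> in this model yields records
named <tt>XORrr</tt> and <tt>XORri</tt>, which matches the naming that
<tt>defm</tt> produces.
</p>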
1125
1126 <p>
<tt>SparcInstrInfo.td</tt> also includes definitions for condition codes that
are referenced by branch instructions. The following definitions
in <tt>SparcInstrInfo.td</tt> assign a numeric value to each SPARC condition
code. For example, the value 10 represents the 'greater than'
condition for integers, and the value 22 represents the 'greater
than' condition for floats.
1133 </p>
1134
1135 <div class="doc_code">
1136 <pre>
1137 def ICC_NE  : ICC_VAL&lt; 9&gt;;  // Not Equal
1138 def ICC_E   : ICC_VAL&lt; 1&gt;;  // Equal
1139 def ICC_G   : ICC_VAL&lt;10&gt;;  // Greater
1140 ...
1141 def FCC_U   : FCC_VAL&lt;23&gt;;  // Unordered
1142 def FCC_G   : FCC_VAL&lt;22&gt;;  // Greater
1143 def FCC_UG  : FCC_VAL&lt;21&gt;;  // Unordered or Greater
1144 ...
1145 </pre>
1146 </div>
1147
1148 <p>
(Note that <tt>Sparc.h</tt> also defines enums that correspond to the same SPARC
condition codes. Care must be taken to ensure the values in <tt>Sparc.h</tt>
correspond to the values in <tt>SparcInstrInfo.td</tt>: for example,
<tt>SPCC::ICC_NE = 9</tt>, <tt>SPCC::FCC_U = 23</tt>, and so on.)
1153 </p>
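<p>
One way to guard against the two sets of values drifting apart is to pin them
with compile-time checks. The sketch below assumes a hypothetical
<tt>TdValues</tt> namespace that stands in for the numbers written in
<tt>SparcInstrInfo.td</tt>; the <tt>SPCC</tt> enumerators use the values shown
above, and <tt>condCodeName</tt> is an illustrative helper, not part of
<tt>Sparc.h</tt>.
</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the ICC_VAL/FCC_VAL arguments in the .td excerpt.
namespace TdValues {
constexpr int ICC_NE = 9, ICC_E = 1, ICC_G = 10;
constexpr int FCC_U = 23, FCC_G = 22, FCC_UG = 21;
} // namespace TdValues

namespace SPCC {
enum CondCodes {
  ICC_NE = 9,   // Not Equal
  ICC_E  = 1,   // Equal
  ICC_G  = 10,  // Greater
  FCC_U  = 23,  // Unordered
  FCC_G  = 22,  // Greater
  FCC_UG = 21   // Unordered or Greater
};
} // namespace SPCC

// The build fails if an enumerator and its .td value disagree.
static_assert(SPCC::ICC_NE == TdValues::ICC_NE, "ICC_NE out of sync");
static_assert(SPCC::ICC_E == TdValues::ICC_E, "ICC_E out of sync");
static_assert(SPCC::ICC_G == TdValues::ICC_G, "ICC_G out of sync");
static_assert(SPCC::FCC_U == TdValues::FCC_U, "FCC_U out of sync");
static_assert(SPCC::FCC_G == TdValues::FCC_G, "FCC_G out of sync");
static_assert(SPCC::FCC_UG == TdValues::FCC_UG, "FCC_UG out of sync");

// Illustrative helper that maps a condition code to its name.
std::string condCodeName(SPCC::CondCodes CC) {
  switch (CC) {
  case SPCC::ICC_NE: return "ICC_NE";
  case SPCC::ICC_E:  return "ICC_E";
  case SPCC::ICC_G:  return "ICC_G";
  case SPCC::FCC_U:  return "FCC_U";
  case SPCC::FCC_G:  return "FCC_G";
  case SPCC::FCC_UG: return "FCC_UG";
  }
  assert(false && "unknown condition code");
  return "";
}
```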
1154
1155 </div>
1156
1157 <!-- ======================================================================= -->
1158 <div class="doc_subsection">
1159   <a name="operandMapping">Instruction Operand Mapping</a>
1160 </div>
1161
1162 <div class="doc_text">
1163
1164 <p>
1165 The code generator backend maps instruction operands to fields in the
1166 instruction.  Operands are assigned to unbound fields in the instruction in the
1167 order they are defined. Fields are bound when they are assigned a value.  For
example, the SPARC target defines the <tt>XNORrr</tt> instruction as
1169 a <tt>F3_1</tt> format instruction having three operands.
1170 </p>
1171
1172 <div class="doc_code">
1173 <pre>
1174 def XNORrr  : F3_1&lt;2, 0b000111,
1175                    (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
1176                    "xnor $b, $c, $dst",
1177                    [(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]&gt;;
1178 </pre>
1179 </div>
1180
1181 <p>
1182 The instruction templates in <tt>SparcInstrFormats.td</tt> show the base class
1183 for <tt>F3_1</tt> is <tt>InstSP</tt>.
1184 </p>
1185
1186 <div class="doc_code">
1187 <pre>
1188 class InstSP&lt;dag outs, dag ins, string asmstr, list&lt;dag&gt; pattern&gt; : Instruction {
1189   field bits&lt;32&gt; Inst;
1190   let Namespace = "SP";
1191   bits&lt;2&gt; op;
1192   let Inst{31-30} = op;
1193   dag OutOperandList = outs;
1194   dag InOperandList = ins;
1195   let AsmString   = asmstr;
1196   let Pattern = pattern;
1197 }
1198 </pre>
1199 </div>
1200
1201 <p><tt>InstSP</tt> leaves the <tt>op</tt> field unbound.</p>
1202
1203 <div class="doc_code">
1204 <pre>
1205 class F3&lt;dag outs, dag ins, string asmstr, list&lt;dag&gt; pattern&gt;
1206     : InstSP&lt;outs, ins, asmstr, pattern&gt; {
1207   bits&lt;5&gt; rd;
1208   bits&lt;6&gt; op3;
1209   bits&lt;5&gt; rs1;
1210   let op{1} = 1;   // Op = 2 or 3
1211   let Inst{29-25} = rd;
1212   let Inst{24-19} = op3;
1213   let Inst{18-14} = rs1;
1214 }
1215 </pre>
1216 </div>
1217
1218 <p>
<tt>F3</tt> binds the high bit of the <tt>op</tt> field (so <tt>op</tt> is 2 or
3) and defines the <tt>rd</tt>, <tt>op3</tt>, and <tt>rs1</tt> fields.
<tt>F3</tt> format instructions will bind operands to the <tt>rd</tt>,
<tt>op3</tt>, and <tt>rs1</tt> fields.
1222 </p>
1223
1224 <div class="doc_code">
1225 <pre>
1226 class F3_1&lt;bits&lt;2&gt; opVal, bits&lt;6&gt; op3val, dag outs, dag ins,
1227            string asmstr, list&lt;dag&gt; pattern&gt; : F3&lt;outs, ins, asmstr, pattern&gt; {
1228   bits&lt;8&gt; asi = 0; // asi not currently used
1229   bits&lt;5&gt; rs2;
1230   let op         = opVal;
1231   let op3        = op3val;
1232   let Inst{13}   = 0;     // i field = 0
1233   let Inst{12-5} = asi;   // address space identifier
1234   let Inst{4-0}  = rs2;
1235 }
1236 </pre>
1237 </div>
1238
1239 <p>
<tt>F3_1</tt> binds the <tt>op</tt> and <tt>op3</tt> fields and defines the
<tt>rs2</tt> field.  <tt>F3_1</tt> format instructions will bind the operands
to the <tt>rd</tt>, <tt>rs1</tt>, and <tt>rs2</tt> fields. This results in the
<tt>XNORrr</tt> instruction binding the <tt>$dst</tt>, <tt>$b</tt>, and
<tt>$c</tt> operands to the <tt>rd</tt>, <tt>rs1</tt>, and <tt>rs2</tt> fields
respectively.
1245 </p>
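<p>
The bit layout described by <tt>InstSP</tt>, <tt>F3</tt>, and <tt>F3_1</tt> can
be written out as a small packing function. This is a sketch for checking the
layout by hand, not the code TableGen emits; <tt>encodeF3_1</tt> is a
hypothetical name, and the register numbers used with it are arbitrary
illustrative encodings.
</p>

```cpp
#include <cassert>
#include <cstdint>

// Packs the fields of an F3_1 format instruction into a 32-bit word,
// following the "let Inst{...}" assignments in InstSP, F3, and F3_1.
std::uint32_t encodeF3_1(unsigned Op, unsigned Op3, unsigned Rd,
                         unsigned Rs1, unsigned Rs2, unsigned Asi = 0) {
  assert(Op < 4 && Op3 < 64 && Rd < 32 && Rs1 < 32 && Rs2 < 32 && Asi < 256);
  std::uint32_t Inst = 0;
  Inst |= static_cast<std::uint32_t>(Op) << 30;   // Inst{31-30} = op
  Inst |= static_cast<std::uint32_t>(Rd) << 25;   // Inst{29-25} = rd
  Inst |= static_cast<std::uint32_t>(Op3) << 19;  // Inst{24-19} = op3
  Inst |= static_cast<std::uint32_t>(Rs1) << 14;  // Inst{18-14} = rs1
  // Inst{13} is the i field and stays 0 for register/register forms.
  Inst |= static_cast<std::uint32_t>(Asi) << 5;   // Inst{12-5}  = asi
  Inst |= static_cast<std::uint32_t>(Rs2);        // Inst{4-0}   = rs2
  return Inst;
}
```

<p>
For <tt>XNORrr</tt> (<tt>op</tt> = 2, <tt>op3</tt> = <tt>0b000111</tt>), and
assuming <tt>$dst</tt>, <tt>$b</tt>, and <tt>$c</tt> are encoded as 1, 2, and
3, the call <tt>encodeF3_1(2, 0x07, 1, 2, 3)</tt> places those three values in
the <tt>rd</tt>, <tt>rs1</tt>, and <tt>rs2</tt> bit ranges.
</p>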
1246
1247 </div>
1248
1249 <!-- ======================================================================= -->
1250 <div class="doc_subsection">
1251   <a name="implementInstr">Implement a subclass of </a>
1252   <a href="http://www.llvm.org/docs/CodeGenerator.html#targetinstrinfo">TargetInstrInfo</a>
1253 </div>
1254
1255 <div class="doc_text">
1256
1257 <p>
1258 The final step is to hand code portions of <tt>XXXInstrInfo</tt>, which
1259 implements the interface described in <tt>TargetInstrInfo.h</tt>. These
1260 functions return <tt>0</tt> or a Boolean or they assert, unless
overridden. Here is a list of functions that are overridden for the SPARC
1262 implementation in <tt>SparcInstrInfo.cpp</tt>:
1263 </p>
1264
1265 <ul>
1266 <li><tt>isMoveInstr</tt> &mdash; Return true if the instruction is a register to
1267     register move; false, otherwise.</li>
1268
1269 <li><tt>isLoadFromStackSlot</tt> &mdash; If the specified machine instruction is
1270     a direct load from a stack slot, return the register number of the
1271     destination and the <tt>FrameIndex</tt> of the stack slot.</li>
1272
<li><tt>isStoreToStackSlot</tt> &mdash; If the specified machine instruction is
    a direct store to a stack slot, return the register number of the
    source and the <tt>FrameIndex</tt> of the stack slot.</li>
1276
1277 <li><tt>copyRegToReg</tt> &mdash; Copy values between a pair of registers.</li>
1278
1279 <li><tt>storeRegToStackSlot</tt> &mdash; Store a register value to a stack
1280     slot.</li>
1281
1282 <li><tt>loadRegFromStackSlot</tt> &mdash; Load a register value from a stack
1283     slot.</li>
1284
1285 <li><tt>storeRegToAddr</tt> &mdash; Store a register value to memory.</li>
1286
1287 <li><tt>loadRegFromAddr</tt> &mdash; Load a register value from memory.</li>
1288
<li><tt>foldMemoryOperand</tt> &mdash; Attempt to fold a load or store into the
    specified instruction for the specified operand(s).</li>
1291 </ul>
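<p>
As an illustration of the stack-slot queries listed above, here is a toy model
of <tt>isLoadFromStackSlot</tt>. The <tt>Operand</tt> and <tt>MachineInstr</tt>
types are simplified stand-ins, not the LLVM classes, and the operand layout
(destination register, frame index, zero offset) is an assumption made for
this sketch. A return value of 0 means the instruction is not a direct stack
load.
</p>

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for LLVM's operand and instruction classes.
struct Operand {
  enum Kind { Reg, FI, Imm } K;
  int Value;  // register number, frame index, or immediate
};

enum Opcode { LDri, STri, ADDrr };

struct MachineInstr {
  Opcode Opc;
  std::vector<Operand> Ops;
};

// Returns the destination register number and sets FrameIndex when MI is a
// direct load from a stack slot; otherwise returns 0 and leaves FrameIndex
// untouched.
unsigned isLoadFromStackSlot(const MachineInstr &MI, int &FrameIndex) {
  if (MI.Opc == LDri && MI.Ops.size() == 3 &&
      MI.Ops[0].K == Operand::Reg &&
      MI.Ops[1].K == Operand::FI &&
      MI.Ops[2].K == Operand::Imm && MI.Ops[2].Value == 0) {
    assert(MI.Ops[0].Value > 0 && "register 0 means 'no register' here");
    FrameIndex = MI.Ops[1].Value;
    return static_cast<unsigned>(MI.Ops[0].Value);
  }
  return 0;
}
```

<p>
A matching <tt>isStoreToStackSlot</tt> would follow the same shape, returning
the register that supplies the stored value.
</p>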
1292
1293 </div>
1294
1295 <!-- ======================================================================= -->
1296 <div class="doc_subsection">
1297   <a name="branchFolding">Branch Folding and If Conversion</a>
1298 </div>
1299 <div class="doc_text">
1300
1301 <p>
1302 Performance can be improved by combining instructions or by eliminating
1303 instructions that are never reached. The <tt>AnalyzeBranch</tt> method
1304 in <tt>XXXInstrInfo</tt> may be implemented to examine conditional instructions
1305 and remove unnecessary instructions. <tt>AnalyzeBranch</tt> looks at the end of
1306 a machine basic block (MBB) for opportunities for improvement, such as branch
1307 folding and if conversion. The <tt>BranchFolder</tt> and <tt>IfConverter</tt>
1308 machine function passes (see the source files <tt>BranchFolding.cpp</tt> and
1309 <tt>IfConversion.cpp</tt> in the <tt>lib/CodeGen</tt> directory) call
1310 <tt>AnalyzeBranch</tt> to improve the control flow graph that represents the
1311 instructions.
1312 </p>
1313
1314 <p>
1315 Several implementations of <tt>AnalyzeBranch</tt> (for ARM, Alpha, and X86) can
1316 be examined as models for your own <tt>AnalyzeBranch</tt> implementation. Since
1317 SPARC does not implement a useful <tt>AnalyzeBranch</tt>, the ARM target
1318 implementation is shown below.
1319 </p>
1320
1321 <p><tt>AnalyzeBranch</tt> returns a Boolean value and takes four parameters:</p>
1322
1323 <ul>
1324 <li><tt>MachineBasicBlock &amp;MBB</tt> &mdash; The incoming block to be
1325     examined.</li>
1326
1327 <li><tt>MachineBasicBlock *&amp;TBB</tt> &mdash; A destination block that is
1328     returned. For a conditional branch that evaluates to true, <tt>TBB</tt> is
1329     the destination.</li>
1330
1331 <li><tt>MachineBasicBlock *&amp;FBB</tt> &mdash; For a conditional branch that
1332     evaluates to false, <tt>FBB</tt> is returned as the destination.</li>
1333
1334 <li><tt>std::vector&lt;MachineOperand&gt; &amp;Cond</tt> &mdash; List of
1335     operands to evaluate a condition for a conditional branch.</li>
1336 </ul>
1337
1338 <p>
1339 In the simplest case, if a block ends without a branch, then it falls through to
1340 the successor block. No destination blocks are specified for either <tt>TBB</tt>
1341 or <tt>FBB</tt>, so both parameters return <tt>NULL</tt>. The start of
1342 the <tt>AnalyzeBranch</tt> (see code below for the ARM target) shows the
1343 function parameters and the code for the simplest case.
1344 </p>
1345
1346 <div class="doc_code">
1347 <pre>bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &amp;MBB,
1348         MachineBasicBlock *&amp;TBB, MachineBasicBlock *&amp;FBB,
1349         std::vector&lt;MachineOperand&gt; &amp;Cond) const
1350 {
1351   MachineBasicBlock::iterator I = MBB.end();
1352   if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
1353     return false;
1354 </pre>
1355 </div>
1356
1357 <p>
1358 If a block ends with a single unconditional branch instruction, then
1359 <tt>AnalyzeBranch</tt> (shown below) should return the destination of that
1360 branch in the <tt>TBB</tt> parameter.
1361 </p>
1362
1363 <div class="doc_code">
1364 <pre>
1365   if (LastOpc == ARM::B || LastOpc == ARM::tB) {
1366     TBB = LastInst-&gt;getOperand(0).getMBB();
1367     return false;
1368   }
1369 </pre>
1370 </div>
1371
1372 <p>
1373 If a block ends with two unconditional branches, then the second branch is never
reached. In that situation, as shown below, remove the last branch instruction
and return the destination of the penultimate branch in the <tt>TBB</tt> parameter.
1376 </p>
1377
1378 <div class="doc_code">
1379 <pre>
1380   if ((SecondLastOpc == ARM::B || SecondLastOpc==ARM::tB) &amp;&amp;
1381       (LastOpc == ARM::B || LastOpc == ARM::tB)) {
1382     TBB = SecondLastInst-&gt;getOperand(0).getMBB();
1383     I = LastInst;
1384     I-&gt;eraseFromParent();
1385     return false;
1386   }
1387 </pre>
1388 </div>
1389
1390 <p>
A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false. In that case,
1393 <tt>AnalyzeBranch</tt> (shown below) should return the destination of that
1394 conditional branch in the <tt>TBB</tt> parameter and a list of operands in
1395 the <tt>Cond</tt> parameter to evaluate the condition.
1396 </p>
1397
1398 <div class="doc_code">
1399 <pre>
1400   if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
1401     // Block ends with fall-through condbranch.
1402     TBB = LastInst-&gt;getOperand(0).getMBB();
1403     Cond.push_back(LastInst-&gt;getOperand(1));
1404     Cond.push_back(LastInst-&gt;getOperand(2));
1405     return false;
1406   }
1407 </pre>
1408 </div>
1409
1410 <p>
1411 If a block ends with both a conditional branch and an ensuing unconditional
1412 branch, then <tt>AnalyzeBranch</tt> (shown below) should return the conditional
1413 branch destination (assuming it corresponds to a conditional evaluation of
1414 '<tt>true</tt>') in the <tt>TBB</tt> parameter and the unconditional branch
1415 destination in the <tt>FBB</tt> (corresponding to a conditional evaluation of
1416 '<tt>false</tt>').  A list of operands to evaluate the condition should be
1417 returned in the <tt>Cond</tt> parameter.
1418 </p>
1419
1420 <div class="doc_code">
1421 <pre>
1422   unsigned SecondLastOpc = SecondLastInst-&gt;getOpcode();
1423
1424   if ((SecondLastOpc == ARM::Bcc &amp;&amp; LastOpc == ARM::B) ||
1425       (SecondLastOpc == ARM::tBcc &amp;&amp; LastOpc == ARM::tB)) {
1426     TBB =  SecondLastInst-&gt;getOperand(0).getMBB();
1427     Cond.push_back(SecondLastInst-&gt;getOperand(1));
1428     Cond.push_back(SecondLastInst-&gt;getOperand(2));
1429     FBB = LastInst-&gt;getOperand(0).getMBB();
1430     return false;
1431   }
1432 </pre>
1433 </div>
1434
1435 <p>
1436 For the last two cases (ending with a single conditional branch or ending with
1437 one conditional and one unconditional branch), the operands returned in
1438 the <tt>Cond</tt> parameter can be passed to methods of other instructions to
1439 create new branches or perform other operations. An implementation
1440 of <tt>AnalyzeBranch</tt> requires the helper methods <tt>RemoveBranch</tt>
1441 and <tt>InsertBranch</tt> to manage subsequent operations.
1442 </p>
1443
1444 <p>
<tt>AnalyzeBranch</tt> should return false, indicating success, in most
circumstances. It should return true only when it cannot determine what to do:
for example, if a block has three terminating branches, or if it encounters a
terminator it cannot handle, such as an indirect branch.
1450 </p>
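<p>
The cases above can be collected into one self-contained model. The following
sketch is not the ARM implementation: it uses hypothetical <tt>Terminator</tt>
and <tt>Block</tt> types, integer block ids in place of
<tt>MachineBasicBlock</tt> pointers (with <tt>-1</tt> playing the role of
<tt>NULL</tt>), and a single integer condition in place of the operand list.
It is meant only to show the return-value convention and which outputs each
case fills in.
</p>

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for LLVM's machine-level classes (hypothetical).
enum class TermKind { Uncond, Cond, Indirect };

struct Terminator {
  TermKind Kind;
  int Target;    // destination block id (ignored for Indirect)
  int CondCode;  // condition operand (only meaningful for Cond)
};

struct Block {
  std::vector<Terminator> Terms;  // terminators ending the block, in order
};

constexpr int NoBlock = -1;  // plays the role of a NULL block pointer

// Returns false on success and true when the block cannot be analyzed.
bool analyzeBranch(Block &MBB, int &TBB, int &FBB, std::vector<int> &Cond) {
  assert(TBB == NoBlock && FBB == NoBlock && Cond.empty());
  std::vector<Terminator> &T = MBB.Terms;

  if (T.empty())
    return false;  // no branch: the block falls through

  if (T.size() > 2)
    return true;   // three or more terminators: give up

  const Terminator Last = T.back();
  if (T.size() == 1) {
    if (Last.Kind == TermKind::Uncond) {
      TBB = Last.Target;
      return false;
    }
    if (Last.Kind == TermKind::Cond) {  // fall-through conditional branch
      TBB = Last.Target;
      Cond.push_back(Last.CondCode);
      return false;
    }
    return true;  // e.g. an indirect branch
  }

  const Terminator SecondLast = T.front();
  if (SecondLast.Kind == TermKind::Cond && Last.Kind == TermKind::Uncond) {
    TBB = SecondLast.Target;
    Cond.push_back(SecondLast.CondCode);
    FBB = Last.Target;
    return false;
  }
  if (SecondLast.Kind == TermKind::Uncond && Last.Kind == TermKind::Uncond) {
    TBB = SecondLast.Target;
    T.pop_back();  // the second unconditional branch is never reached
    return false;
  }
  return true;  // any other combination is not understood
}
```

<p>
Note that the two-unconditional-branch case modifies the block, so a caller in
this model has to pass a mutable <tt>Block</tt>.
</p>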
1451
1452 </div>
1453
1454 <!-- *********************************************************************** -->
1455 <div class="doc_section">
1456   <a name="InstructionSelector">Instruction Selector</a>
1457 </div>
1458 <!-- *********************************************************************** -->
1459
1460 <div class="doc_text">
1461
1462 <p>
1463 LLVM uses a <tt>SelectionDAG</tt> to represent LLVM IR instructions, and nodes
1464 of the <tt>SelectionDAG</tt> ideally represent native target
1465 instructions. During code generation, instruction selection passes are performed
1466 to convert non-native DAG instructions into native target-specific
1467 instructions. The pass described in <tt>XXXISelDAGToDAG.cpp</tt> is used to
1468 match patterns and perform DAG-to-DAG instruction selection. Optionally, a pass
1469 may be defined (in <tt>XXXBranchSelector.cpp</tt>) to perform similar DAG-to-DAG
1470 operations for branch instructions. Later, the code in
1471 <tt>XXXISelLowering.cpp</tt> replaces or removes operations and data types not
1472 supported natively (legalizes) in a <tt>SelectionDAG</tt>.
1473 </p>
1474
1475 <p>
1476 TableGen generates code for instruction selection using the following target
1477 description input files:
1478 </p>
1479
1480 <ul>
1481 <li><tt>XXXInstrInfo.td</tt> &mdash; Contains definitions of instructions in a
1482     target-specific instruction set, generates <tt>XXXGenDAGISel.inc</tt>, which
1483     is included in <tt>XXXISelDAGToDAG.cpp</tt>.</li>
1484
1485 <li><tt>XXXCallingConv.td</tt> &mdash; Contains the calling and return value
1486     conventions for the target architecture, and it generates
1487     <tt>XXXGenCallingConv.inc</tt>, which is included in
1488     <tt>XXXISelLowering.cpp</tt>.</li>
1489 </ul>
1490
1491 <p>
1492 The implementation of an instruction selection pass must include a header that
1493 declares the <tt>FunctionPass</tt> class or a subclass of <tt>FunctionPass</tt>. In
1494 <tt>XXXTargetMachine.cpp</tt>, a Pass Manager (PM) should add each instruction
1495 selection pass into the queue of passes to run.
1496 </p>
1497
1498 <p>
1499 The LLVM static compiler (<tt>llc</tt>) is an excellent tool for visualizing the
1500 contents of DAGs. To display the <tt>SelectionDAG</tt> before or after specific
1501 processing phases, use the command line options for <tt>llc</tt>, described
1502 at <a href="http://llvm.org/docs/CodeGenerator.html#selectiondag_process">
1503 SelectionDAG Instruction Selection Process</a>.
1504 </p>
1505
1506 <p>
1507 To describe instruction selector behavior, you should add patterns for lowering
1508 LLVM code into a <tt>SelectionDAG</tt> as the last parameter of the instruction
1509 definitions in <tt>XXXInstrInfo.td</tt>. For example, in
1510 <tt>SparcInstrInfo.td</tt>, this entry defines a register store operation, and
1511 the last parameter describes a pattern with the store DAG operator.
1512 </p>
1513
1514 <div class="doc_code">
1515 <pre>
1516 def STrr  : F3_1&lt; 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
1517                  "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]&gt;;
1518 </pre>
1519 </div>
1520
1521 <p>
1522 <tt>ADDRrr</tt> is a memory mode that is also defined in
1523 <tt>SparcInstrInfo.td</tt>:
1524 </p>
1525
1526 <div class="doc_code">
1527 <pre>
1528 def ADDRrr : ComplexPattern&lt;i32, 2, "SelectADDRrr", [], []&gt;;
1529 </pre>
1530 </div>
1531
1532 <p>
1533 The definition of <tt>ADDRrr</tt> refers to <tt>SelectADDRrr</tt>, which is a
function defined in an implementation of the Instruction Selector (such
1535 as <tt>SparcISelDAGToDAG.cpp</tt>).
1536 </p>
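
<p>
As a rough illustration of what a <tt>ComplexPattern</tt> selector function does,
the following self-contained sketch splits an address expression into two
register operands. The <tt>Node</tt> type, the <tt>selectAddrRR</tt> function,
and the use of <tt>g0</tt> as a zero register are assumptions made for this
example; the real <tt>SelectADDRrr</tt> operates on <tt>SDValue</tt> objects and
may apply further checks.
</p>

```cpp
#include <string>

// Hypothetical stand-in for a DAG node; real LLVM code operates on SDValue.
struct Node {
  enum Kind { Reg, Add } kind;
  std::string name;   // register name when kind == Reg
  const Node *lhs;    // left operand when kind == Add, otherwise nullptr
  const Node *rhs;    // right operand when kind == Add, otherwise nullptr
};

// Splits an address into two register operands, in the spirit of a
// register+register addressing-mode selector. Returns false when the
// address does not have a shape this sketch understands.
bool selectAddrRR(const Node &addr, std::string &base, std::string &index) {
  if (addr.kind == Node::Add) {
    if (addr.lhs->kind != Node::Reg || addr.rhs->kind != Node::Reg)
      return false;
    base = addr.lhs->name;
    index = addr.rhs->name;
    return true;
  }
  // A lone register is treated as register + zero register (assumed "g0").
  base = addr.name;
  index = "g0";
  return true;
}
```

<p>
The two output parameters play the role of the two operands declared in the
<tt>ComplexPattern</tt> definition (the <tt>2</tt> in its parameter list).
</p>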
1537
1538 <p>
1539 In <tt>lib/Target/TargetSelectionDAG.td</tt>, the DAG operator for store is
1540 defined below:
1541 </p>
1542
1543 <div class="doc_code">
1544 <pre>
1545 def store : PatFrag&lt;(ops node:$val, node:$ptr),
1546                     (st node:$val, node:$ptr), [{
1547   if (StoreSDNode *ST = dyn_cast&lt;StoreSDNode&gt;(N))
1548     return !ST-&gt;isTruncatingStore() &amp;&amp;
1549            ST-&gt;getAddressingMode() == ISD::UNINDEXED;
1550   return false;
1551 }]&gt;;
1552 </pre>
1553 </div>
1554
1555 <p>
1556 <tt>XXXInstrInfo.td</tt> also generates (in <tt>XXXGenDAGISel.inc</tt>) the
1557 <tt>SelectCode</tt> method that is used to call the appropriate processing
1558 method for an instruction. In this example, <tt>SelectCode</tt>
1559 calls <tt>Select_ISD_STORE</tt> for the <tt>ISD::STORE</tt> opcode.
1560 </p>
1561
1562 <div class="doc_code">
1563 <pre>
1564 SDNode *SelectCode(SDValue N) {
1565   ...
1566   MVT::ValueType NVT = N.getNode()-&gt;getValueType(0);
1567   switch (N.getOpcode()) {
1568   case ISD::STORE: {
1569     switch (NVT) {
1570     default:
1571       return Select_ISD_STORE(N);
1572       break;
1573     }
1574     break;
1575   }
1576   ...
1577 </pre>
1578 </div>
1579
1580 <p>
1581 The pattern for <tt>STrr</tt> is matched, so elsewhere in
1582 <tt>XXXGenDAGISel.inc</tt>, code for <tt>STrr</tt> is created for
1583 <tt>Select_ISD_STORE</tt>. The <tt>Emit_22</tt> method is also generated
1584 in <tt>XXXGenDAGISel.inc</tt> to complete the processing of this
1585 instruction.
1586 </p>
1587
1588 <div class="doc_code">
1589 <pre>
1590 SDNode *Select_ISD_STORE(const SDValue &amp;N) {
1591   SDValue Chain = N.getOperand(0);
1592   if (Predicate_store(N.getNode())) {
1593     SDValue N1 = N.getOperand(1);
1594     SDValue N2 = N.getOperand(2);
1595     SDValue CPTmp0;
1596     SDValue CPTmp1;
1597
1598     // Pattern: (st:void IntRegs:i32:$src,
1599     //           ADDRrr:i32:$addr)&lt;&lt;P:Predicate_store&gt;&gt;
1600     // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
1601     // Pattern complexity = 13  cost = 1  size = 0
1602     if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &amp;&amp;
1603         N1.getNode()-&gt;getValueType(0) == MVT::i32 &amp;&amp;
1604         N2.getNode()-&gt;getValueType(0) == MVT::i32) {
1605       return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
1606     }
1607 ...
1608 </pre>
1609 </div>
1610
1611 </div>
1612
1613 <!-- ======================================================================= -->
1614 <div class="doc_subsection">
1615   <a name="LegalizePhase">The SelectionDAG Legalize Phase</a>
1616 </div>
1617
1618 <div class="doc_text">
1619
1620 <p>
1621 The Legalize phase converts a DAG to use types and operations that are natively
1622 supported by the target. For natively unsupported types and operations, you need
1623 to add code to the target-specific XXXTargetLowering implementation to convert
1624 unsupported types and operations to supported ones.
1625 </p>
1626
1627 <p>
In the constructor for the <tt>XXXTargetLowering</tt> class, first use the
<tt>addRegisterClass</tt> method to specify which types are supported and which
register classes are associated with them. The code for the register classes is
generated by TableGen from <tt>XXXRegisterInfo.td</tt> and placed
in <tt>XXXGenRegisterInfo.h.inc</tt>. For example, the implementation of the
constructor for the <tt>SparcTargetLowering</tt> class (in
<tt>SparcISelLowering.cpp</tt>) starts with the following code:
1635 </p>
1636
1637 <div class="doc_code">
1638 <pre>
1639 addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
1640 addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
1641 addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
1642 </pre>
1643 </div>
1644
1645 <p>
1646 You should examine the node types in the <tt>ISD</tt> namespace
1647 (<tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt>) and determine which
1648 operations the target natively supports. For operations that do <b>not</b> have
1649 native support, add a callback to the constructor for the XXXTargetLowering
1650 class, so the instruction selection process knows what to do. The TargetLowering
1651 class callback methods (declared in <tt>llvm/Target/TargetLowering.h</tt>) are:
1652 </p>
1653
1654 <ul>
1655 <li><tt>setOperationAction</tt> &mdash; General operation.</li>
1656
1657 <li><tt>setLoadExtAction</tt> &mdash; Load with extension.</li>
1658
1659 <li><tt>setTruncStoreAction</tt> &mdash; Truncating store.</li>
1660
1661 <li><tt>setIndexedLoadAction</tt> &mdash; Indexed load.</li>
1662
1663 <li><tt>setIndexedStoreAction</tt> &mdash; Indexed store.</li>
1664
1665 <li><tt>setConvertAction</tt> &mdash; Type conversion.</li>
1666
1667 <li><tt>setCondCodeAction</tt> &mdash; Support for a given condition code.</li>
1668 </ul>
1669
1670 <p>
1671 Note: on older releases, <tt>setLoadXAction</tt> is used instead
1672 of <tt>setLoadExtAction</tt>.  Also, on older releases,
1673 <tt>setCondCodeAction</tt> may not be supported. Examine your release
1674 to see what methods are specifically supported.
1675 </p>
1676
1677 <p>
These callbacks specify whether an operation works with a given type (or
types). In all cases, the third parameter is a <tt>LegalizeAction</tt> enum
value: <tt>Promote</tt>, <tt>Expand</tt>, <tt>Custom</tt>, or <tt>Legal</tt>.
<tt>SparcISelLowering.cpp</tt> contains examples of all four
<tt>LegalizeAction</tt> values.
1683 </p>
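
<p>
The bookkeeping behind these callbacks can be pictured as a table keyed by
operation and type. The sketch below is a simplified model, not the
<tt>TargetLowering</tt> implementation: the <tt>ActionTable</tt> class and the
small <tt>Op</tt> and <tt>Ty</tt> enums are hypothetical names chosen for the
example. It shows only that an entry that was never set is treated as
<tt>Legal</tt>.
</p>

```cpp
#include <map>
#include <utility>

// Hypothetical, reduced versions of the action, opcode, and type enums.
enum class Action { Legal, Promote, Expand, Custom };
enum class Op { FSin, FCos, FpToSInt, CtPop, Add };
enum class Ty { i1, i32, f32, f64 };

class ActionTable {
  std::map<std::pair<Op, Ty>, Action> table_;

public:
  // Records how the legalizer should treat (op, ty).
  void setOperationAction(Op op, Ty ty, Action action) {
    table_[std::make_pair(op, ty)] = action;
  }

  // Looks up the action; combinations that were never set default to Legal.
  Action getOperationAction(Op op, Ty ty) const {
    auto it = table_.find(std::make_pair(op, ty));
    return it == table_.end() ? Action::Legal : it->second;
  }
};
```

<p>
With this model, a constructor only has to mention the combinations that are
<b>not</b> natively supported, which matches the advice above.
</p>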
1684
1685 </div>
1686
1687 <!-- _______________________________________________________________________ -->
1688 <div class="doc_subsubsection">
1689   <a name="promote">Promote</a>
1690 </div>
1691
1692 <div class="doc_text">
1693
1694 <p>
1695 For an operation without native support for a given type, the specified type may
1696 be promoted to a larger type that is supported. For example, SPARC does not
1697 support a sign-extending load for Boolean values (<tt>i1</tt> type), so
1698 in <tt>SparcISelLowering.cpp</tt> the third parameter below, <tt>Promote</tt>,
changes <tt>i1</tt> type values to a larger type before loading.
1700 </p>
1701
1702 <div class="doc_code">
1703 <pre>
1704 setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
1705 </pre>
1706 </div>
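
<p>
The effect of promoting a sign-extending <tt>i1</tt> load can be modeled with
plain integer arithmetic. The function below is only an arithmetic sketch of
the idea, assuming the Boolean is stored in the low bit of a byte; it is not
the code the legalizer produces, and <tt>sextLoadI1</tt> is a hypothetical name.
</p>

```cpp
#include <cstdint>

// Loads a 1-bit value stored in the low bit of a byte and sign-extends it
// to 32 bits, the way a promoted sign-extending i1 load would behave.
int32_t sextLoadI1(const uint8_t *addr) {
  uint8_t wide = *addr;    // promoted load: read a whole byte
  int32_t bit = wide & 1;  // keep only the i1 payload
  return -bit;             // sign-extend: 0 stays 0, 1 becomes -1 (all ones)
}
```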
1707
1708 </div>
1709
1710 <!-- _______________________________________________________________________ -->
1711 <div class="doc_subsubsection">
1712   <a name="expand">Expand</a>
1713 </div>
1714
1715 <div class="doc_text">
1716
1717 <p>
1718 For a type without native support, a value may need to be broken down further,
1719 rather than promoted. For an operation without native support, a combination of
1720 other operations may be used to similar effect. In SPARC, the floating-point
1721 sine and cosine trig operations are supported by expansion to other operations,
1722 as indicated by the third parameter, <tt>Expand</tt>, to
1723 <tt>setOperationAction</tt>:
1724 </p>
1725
1726 <div class="doc_code">
1727 <pre>
1728 setOperationAction(ISD::FSIN, MVT::f32, Expand);
1729 setOperationAction(ISD::FCOS, MVT::f32, Expand);
1730 </pre>
1731 </div>
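
<p>
For the type case mentioned in the first sentence above, expansion means
splitting a wide value into narrower pieces. As a minimal sketch, assume a
target that has only 32-bit integer registers: a 64-bit add can then be
expressed as two 32-bit adds plus a carry. The <tt>Pair32</tt> type and
<tt>expandedAdd64</tt> function are hypothetical and are not taken from LLVM.
</p>

```cpp
#include <cstdint>

// A 64-bit value broken into two 32-bit halves.
struct Pair32 {
  uint32_t lo;
  uint32_t hi;
};

// Adds two 64-bit values using only 32-bit arithmetic.
Pair32 expandedAdd64(Pair32 a, Pair32 b) {
  Pair32 r;
  r.lo = a.lo + b.lo;                       // low halves, wraps modulo 2^32
  uint32_t carry = (r.lo < a.lo) ? 1u : 0u; // wraparound signals a carry
  r.hi = a.hi + b.hi + carry;               // high halves plus carry
  return r;
}
```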
1732
1733 </div>
1734
1735 <!-- _______________________________________________________________________ -->
1736 <div class="doc_subsubsection">
1737   <a name="custom">Custom</a>
1738 </div>
1739
1740 <div class="doc_text">
1741
1742 <p>
1743 For some operations, simple type promotion or operation expansion may be
1744 insufficient. In some cases, a special intrinsic function must be implemented.
1745 </p>
1746
1747 <p>
1748 For example, a constant value may require special treatment, or an operation may
1749 require spilling and restoring registers in the stack and working with register
1750 allocators.
1751 </p>
1752
1753 <p>
1754 As seen in <tt>SparcISelLowering.cpp</tt> code below, to perform a type
1755 conversion from a floating point value to a signed integer, first the
1756 <tt>setOperationAction</tt> should be called with <tt>Custom</tt> as the third
1757 parameter:
1758 </p>
1759
1760 <div class="doc_code">
1761 <pre>
1762 setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
1763 </pre>
1764 </div>
1765
1766 <p>
1767 In the <tt>LowerOperation</tt> method, for each <tt>Custom</tt> operation, a
1768 case statement should be added to indicate what function to call. In the
1769 following code, an <tt>FP_TO_SINT</tt> opcode will call
1770 the <tt>LowerFP_TO_SINT</tt> method:
1771 </p>
1772
1773 <div class="doc_code">
1774 <pre>
1775 SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &amp;DAG) {
1776   switch (Op.getOpcode()) {
1777   case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
1778   ...
1779   }
1780 }
1781 </pre>
1782 </div>
1783
1784 <p>
1785 Finally, the <tt>LowerFP_TO_SINT</tt> method is implemented, using an FP
1786 register to convert the floating-point value to an integer.
1787 </p>
1788
1789 <div class="doc_code">
1790 <pre>
1791 static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &amp;DAG) {
1792   assert(Op.getValueType() == MVT::i32);
1793   Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
1794   return DAG.getNode(ISD::BIT_CONVERT, MVT::i32, Op);
1795 }
1796 </pre>
1797 </div>
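
<p>
The two-step shape of <tt>LowerFP_TO_SINT</tt> (convert inside an FP register,
then reinterpret the bits as an integer) can be imitated in portable C++. The
sketch below is a model under stated assumptions: <tt>FPReg</tt> is a
hypothetical stand-in for a 32-bit floating-point register, and the function
names are invented for the example.
</p>

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical model of a 32-bit FP register: just a bag of bits.
struct FPReg {
  uint32_t bits;
};

// Step 1 (models SPISD::FTOI): convert to an integer, but leave the
// resulting bit pattern in an FP register.
FPReg ftoi(float x) {
  int32_t i = static_cast<int32_t>(x);  // truncates toward zero
  FPReg r;
  std::memcpy(&r.bits, &i, sizeof r.bits);
  return r;
}

// Step 2 (models ISD::BIT_CONVERT): reinterpret the same bits as i32.
int32_t bitConvertToI32(FPReg r) {
  int32_t i;
  std::memcpy(&i, &r.bits, sizeof i);
  return i;
}

int32_t lowerFpToSInt(float x) { return bitConvertToI32(ftoi(x)); }
```

<p>
The second step changes no bits; it only changes the type under which the
value is viewed, which is why a bit conversion rather than a second numeric
conversion is used.
</p>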
1798
1799 </div>
1800
1801 <!-- _______________________________________________________________________ -->
1802 <div class="doc_subsubsection">
1803   <a name="legal">Legal</a>
1804 </div>
1805
1806 <div class="doc_text">
1807
1808 <p>
The <tt>Legal</tt> <tt>LegalizeAction</tt> enum value simply indicates that an
operation <b>is</b> natively supported. <tt>Legal</tt> represents the default
condition, so it is rarely used. In <tt>SparcISelLowering.cpp</tt>,
<tt>CTPOP</tt> (an operation to count the bits set in an integer) is
natively supported only for SPARC v9. The following code first selects
the <tt>Expand</tt> action for <tt>CTPOP</tt>, and then restores <tt>Legal</tt>
when the subtarget is v9.
1815 </p>
1816
1817 <div class="doc_code">
1818 <pre>
1819 setOperationAction(ISD::CTPOP, MVT::i32, Expand);
1820 ...
if (TM.getSubtarget&lt;SparcSubtarget&gt;().isV9())
  setOperationAction(ISD::CTPOP, MVT::i32, Legal);
1829 </pre>
1830 </div>
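
<p>
When <tt>CTPOP</tt> is marked <tt>Expand</tt>, the bit count has to be built
from simpler operations. The function below shows one well-known
shift-and-mask formulation of a 32-bit population count. It is offered as an
illustration of what such an expansion computes, not as the sequence the
legalizer actually emits; <tt>expandedCtpop</tt> is a hypothetical name.
</p>

```cpp
#include <cstdint>

// Counts the set bits of a 32-bit value using only shifts, masks, adds,
// and one multiply.
uint32_t expandedCtpop(uint32_t x) {
  x = x - ((x >> 1) & 0x55555555u);                 // 2-bit partial sums
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); // 4-bit partial sums
  x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
  return (x * 0x01010101u) >> 24;                   // add the four bytes
}
```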
1831
1832 </div>
1833
1834 <!-- ======================================================================= -->
1835 <div class="doc_subsection">
1836   <a name="callingConventions">Calling Conventions</a>
1837 </div>
1838
1839 <div class="doc_text">
1840
1841 <p>
To support target-specific calling conventions, <tt>XXXCallingConv.td</tt>
uses interfaces (such as <tt>CCIfType</tt> and <tt>CCAssignToReg</tt>) that are
defined in <tt>lib/Target/TargetCallingConv.td</tt>. TableGen takes the target
descriptor file <tt>XXXCallingConv.td</tt> and generates the header
file <tt>XXXGenCallingConv.inc</tt>, which is typically included
in <tt>XXXISelLowering.cpp</tt>. You can use the interfaces in
<tt>TargetCallingConv.td</tt> to specify:
1849 </p>
1850
1851 <ul>
1852 <li>The order of parameter allocation.</li>
1853
1854 <li>Where parameters and return values are placed (that is, on the stack or in
1855     registers).</li>
1856
1857 <li>Which registers may be used.</li>
1858
1859 <li>Whether the caller or callee unwinds the stack.</li>
1860 </ul>
1861
1862 <p>
1863 The following example demonstrates the use of the <tt>CCIfType</tt> and
1864 <tt>CCAssignToReg</tt> interfaces. If the <tt>CCIfType</tt> predicate is true
1865 (that is, if the current argument is of type <tt>f32</tt> or <tt>f64</tt>), then
1866 the action is performed. In this case, the <tt>CCAssignToReg</tt> action assigns
1867 the argument value to the first available register: either <tt>R0</tt>
1868 or <tt>R1</tt>.
1869 </p>
1870
1871 <div class="doc_code">
1872 <pre>
1873 CCIfType&lt;[f32,f64], CCAssignToReg&lt;[R0, R1]&gt;&gt;
1874 </pre>
1875 </div>
1876
1877 <p>
<tt>SparcCallingConv.td</tt> contains definitions for a target-specific
return-value calling convention (<tt>RetCC_Sparc32</tt>) and a basic 32-bit C
calling convention (<tt>CC_Sparc32</tt>). The definition of
<tt>RetCC_Sparc32</tt> (shown below) indicates which registers are used for
specified scalar return types. A single-precision float is returned in register
<tt>F0</tt>, and a double-precision float is returned in register <tt>D0</tt>.
A 32-bit integer is returned in register <tt>I0</tt> or <tt>I1</tt>.
1885 </p>
1886
1887 <div class="doc_code">
1888 <pre>
1889 def RetCC_Sparc32 : CallingConv&lt;[
1890   CCIfType&lt;[i32], CCAssignToReg&lt;[I0, I1]&gt;&gt;,
1891   CCIfType&lt;[f32], CCAssignToReg&lt;[F0]&gt;&gt;,
1892   CCIfType&lt;[f64], CCAssignToReg&lt;[D0]&gt;&gt;
1893 ]&gt;;
1894 </pre>
1895 </div>
1896
1897 <p>
1898 The definition of <tt>CC_Sparc32</tt> in <tt>SparcCallingConv.td</tt> introduces
1899 <tt>CCAssignToStack</tt>, which assigns the value to a stack slot with the
1900 specified size and alignment. In the example below, the first parameter, 4,
1901 indicates the size of the slot, and the second parameter, also 4, indicates the
1902 stack alignment along 4-byte units. (Special cases: if size is zero, then the
1903 ABI size is used; if alignment is zero, then the ABI alignment is used.)
1904 </p>
1905
1906 <div class="doc_code">
1907 <pre>
1908 def CC_Sparc32 : CallingConv&lt;[
1909   // All arguments get passed in integer registers if there is space.
1910   CCIfType&lt;[i32, f32, f64], CCAssignToReg&lt;[I0, I1, I2, I3, I4, I5]&gt;&gt;,
1911   CCAssignToStack&lt;4, 4&gt;
1912 ]&gt;;
1913 </pre>
1914 </div>
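
<p>
The allocation order described by <tt>CC_Sparc32</tt> (registers first, then
4-byte stack slots) can be modeled in a few lines. This is a simplified
sketch: it assumes every argument is a single 32-bit value, and the
<tt>Loc</tt> type and <tt>assignArgs</tt> function are hypothetical names
rather than generated code.
</p>

```cpp
#include <string>
#include <vector>

// Where one argument ends up: in a register or at a stack offset.
struct Loc {
  bool inReg;
  std::string reg;       // register name when inReg is true
  unsigned stackOffset;  // byte offset when inReg is false
};

// Assigns numArgs 32-bit arguments: I0..I5 first, then stack slots with
// size 4 and alignment 4.
std::vector<Loc> assignArgs(unsigned numArgs) {
  static const char *const regs[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
  const unsigned numRegs = 6, slotSize = 4, slotAlign = 4;
  std::vector<Loc> locs;
  unsigned nextReg = 0, stackOffset = 0;
  for (unsigned i = 0; i < numArgs; ++i) {
    if (nextReg < numRegs) {
      locs.push_back(Loc{true, regs[nextReg], 0u});
      ++nextReg;
    } else {
      // Round the offset up to the slot alignment before using it.
      stackOffset = (stackOffset + slotAlign - 1) / slotAlign * slotAlign;
      locs.push_back(Loc{false, "", stackOffset});
      stackOffset += slotSize;
    }
  }
  return locs;
}
```

<p>
With eight arguments, the first six land in <tt>I0</tt> through <tt>I5</tt>
and the last two occupy stack offsets 0 and 4.
</p>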
1915
1916 <p>
<tt>CCDelegateTo</tt> is another commonly used interface, which hands the
current value to a specified sub-calling convention. In
the following example (in <tt>X86CallingConv.td</tt>), the definition of
<tt>RetCC_X86_32_C</tt> ends with <tt>CCDelegateTo</tt>. A value that is not
assigned to the register <tt>ST0</tt> or <tt>ST1</tt> by the preceding rules
is passed on to <tt>RetCC_X86Common</tt>.
1923 </p>
1924
1925 <div class="doc_code">
1926 <pre>
1927 def RetCC_X86_32_C : CallingConv&lt;[
1928   CCIfType&lt;[f32], CCAssignToReg&lt;[ST0, ST1]&gt;&gt;,
1929   CCIfType&lt;[f64], CCAssignToReg&lt;[ST0, ST1]&gt;&gt;,
1930   CCDelegateTo&lt;RetCC_X86Common&gt;
1931 ]&gt;;
1932 </pre>
1933 </div>
1934
1935 <p>
1936 <tt>CCIfCC</tt> is an interface that attempts to match the given name to the
1937 current calling convention. If the name identifies the current calling
1938 convention, then a specified action is invoked. In the following example (in
1939 <tt>X86CallingConv.td</tt>), if the <tt>Fast</tt> calling convention is in use,
1940 then <tt>RetCC_X86_32_Fast</tt> is invoked. If the <tt>SSECall</tt> calling
1941 convention is in use, then <tt>RetCC_X86_32_SSE</tt> is invoked.
1942 </p>
1943
1944 <div class="doc_code">
1945 <pre>
1946 def RetCC_X86_32 : CallingConv&lt;[
1947   CCIfCC&lt;"CallingConv::Fast", CCDelegateTo&lt;RetCC_X86_32_Fast&gt;&gt;,
1948   CCIfCC&lt;"CallingConv::X86_SSECall", CCDelegateTo&lt;RetCC_X86_32_SSE&gt;&gt;,
1949   CCDelegateTo&lt;RetCC_X86_32_C&gt;
1950 ]&gt;;
1951 </pre>
1952 </div>
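
<p>
The way <tt>CCIfCC</tt> and <tt>CCDelegateTo</tt> chain conventions together
resembles ordinary function dispatch. The sketch below models that control
flow only. The enums, function names, and the strings returned by the
<tt>Fast</tt>, <tt>SSECall</tt>, and common handlers are placeholders invented
for the example; they do not describe the real X86 register assignments.
</p>

```cpp
#include <string>

enum class CallConv { C, Fast, SSECall };
enum class ValTy { i32, f32, f64 };

// Placeholder handlers: each returns a label naming the rule that fired.
std::string retCommon(ValTy) { return "common"; }
std::string retFast(ValTy) { return "fast"; }
std::string retSSE(ValTy) { return "sse"; }

// Models RetCC_X86_32_C: try the CCIfType rules, then delegate.
std::string retC(ValTy ty) {
  if (ty == ValTy::f32 || ty == ValTy::f64)
    return "ST0";        // CCIfType<[f32]/[f64], CCAssignToReg<...>>
  return retCommon(ty);  // CCDelegateTo<RetCC_X86Common>
}

// Models RetCC_X86_32: pick a sub-convention by calling convention.
std::string ret32(CallConv cc, ValTy ty) {
  if (cc == CallConv::Fast)
    return retFast(ty);  // CCIfCC<"CallingConv::Fast", ...>
  if (cc == CallConv::SSECall)
    return retSSE(ty);   // CCIfCC<"CallingConv::X86_SSECall", ...>
  return retC(ty);       // CCDelegateTo<RetCC_X86_32_C>
}
```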
1953
1954 <p>Other calling convention interfaces include:</p>
1955
1956 <ul>
1957 <li><tt>CCIf &lt;predicate, action&gt;</tt> &mdash; If the predicate matches,
1958     apply the action.</li>
1959
1960 <li><tt>CCIfInReg &lt;action&gt;</tt> &mdash; If the argument is marked with the
1961     '<tt>inreg</tt>' attribute, then apply the action.</li>
1962
<li><tt>CCIfNest &lt;action&gt;</tt> &mdash; If the argument is marked with the
1964     '<tt>nest</tt>' attribute, then apply the action.</li>
1965
1966 <li><tt>CCIfNotVarArg &lt;action&gt;</tt> &mdash; If the current function does
1967     not take a variable number of arguments, apply the action.</li>
1968
1969 <li><tt>CCAssignToRegWithShadow &lt;registerList, shadowList&gt;</tt> &mdash;
1970     similar to <tt>CCAssignToReg</tt>, but with a shadow list of registers.</li>
1971
1972 <li><tt>CCPassByVal &lt;size, align&gt;</tt> &mdash; Assign value to a stack
1973     slot with the minimum specified size and alignment.</li>
1974
1975 <li><tt>CCPromoteToType &lt;type&gt;</tt> &mdash; Promote the current value to
1976     the specified type.</li>
1977
1978 <li><tt>CallingConv &lt;[actions]&gt;</tt> &mdash; Define each calling
1979     convention that is supported.</li>
1980 </ul>
1981
1982 </div>
1983
1984 <!-- *********************************************************************** -->
1985 <div class="doc_section">
1986   <a name="assemblyPrinter">Assembly Printer</a>
1987 </div>
1988 <!-- *********************************************************************** -->
1989
1990 <div class="doc_text">
1991
1992 <p>
During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, implement the code for a
printer that converts LLVM IR to a GAS-format assembly language for your target
machine, using the following steps:
1997 </p>
1998
1999 <ul>
2000 <li>Define all the assembly strings for your target, adding them to the
2001     instructions defined in the <tt>XXXInstrInfo.td</tt> file.
2002     (See <a href="#InstructionSet">Instruction Set</a>.)  TableGen will produce
2003     an output file (<tt>XXXGenAsmWriter.inc</tt>) with an implementation of
2004     the <tt>printInstruction</tt> method for the XXXAsmPrinter class.</li>
2005
2006 <li>Write <tt>XXXTargetAsmInfo.h</tt>, which contains the bare-bones declaration
2007     of the <tt>XXXTargetAsmInfo</tt> class (a subclass
2008     of <tt>TargetAsmInfo</tt>).</li>
2009
2010 <li>Write <tt>XXXTargetAsmInfo.cpp</tt>, which contains target-specific values
2011     for <tt>TargetAsmInfo</tt> properties and sometimes new implementations for
2012     methods.</li>
2013
2014 <li>Write <tt>XXXAsmPrinter.cpp</tt>, which implements the <tt>AsmPrinter</tt>
2015     class that performs the LLVM-to-assembly conversion.</li>
2016 </ul>
2017
2018 <p>
2019 The code in <tt>XXXTargetAsmInfo.h</tt> is usually a trivial declaration of the
2020 <tt>XXXTargetAsmInfo</tt> class for use in <tt>XXXTargetAsmInfo.cpp</tt>.
2021 Similarly, <tt>XXXTargetAsmInfo.cpp</tt> usually has a few declarations of
2022 <tt>XXXTargetAsmInfo</tt> replacement values that override the default values
2023 in <tt>TargetAsmInfo.cpp</tt>. For example in <tt>SparcTargetAsmInfo.cpp</tt>:
2024 </p>
2025
2026 <div class="doc_code">
2027 <pre>
2028 SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &amp;TM) {
2029   Data16bitsDirective = "\t.half\t";
2030   Data32bitsDirective = "\t.word\t";
2031   Data64bitsDirective = 0;  // .xword is only supported by V9.
2032   ZeroDirective = "\t.skip\t";
2033   CommentString = "!";
2034   ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
2035 }
2036 </pre>
2037 </div>
2038
2039 <p>
2040 The X86 assembly printer implementation (<tt>X86TargetAsmInfo</tt>) is an
2041 example where the target specific <tt>TargetAsmInfo</tt> class uses overridden
2042 methods: <tt>ExpandInlineAsm</tt> and <tt>PreferredEHDataFormat</tt>.
2043 </p>
2044
2045 <p>
A target-specific implementation of <tt>AsmPrinter</tt> is written in
<tt>XXXAsmPrinter.cpp</tt>, which implements the <tt>AsmPrinter</tt> class that
converts LLVM code to printable assembly. The implementation must include the
following headers, which declare the <tt>AsmPrinter</tt> and
<tt>MachineFunctionPass</tt> classes. <tt>MachineFunctionPass</tt> is a
subclass of <tt>FunctionPass</tt>.
2052 </p>
2053
2054 <div class="doc_code">
2055 <pre>
2056 #include "llvm/CodeGen/AsmPrinter.h"
2057 #include "llvm/CodeGen/MachineFunctionPass.h"
2058 </pre>
2059 </div>
2060
2061 <p>
2062 As a <tt>FunctionPass</tt>, <tt>AsmPrinter</tt> first
2063 calls <tt>doInitialization</tt> to set up the <tt>AsmPrinter</tt>. In
2064 <tt>SparcAsmPrinter</tt>, a <tt>Mangler</tt> object is instantiated to process
2065 variable names.
2066 </p>
2067
2068 <p>
2069 In <tt>XXXAsmPrinter.cpp</tt>, the <tt>runOnMachineFunction</tt> method
2070 (declared in <tt>MachineFunctionPass</tt>) must be implemented
2071 for <tt>XXXAsmPrinter</tt>. In <tt>MachineFunctionPass</tt>,
2072 the <tt>runOnFunction</tt> method invokes <tt>runOnMachineFunction</tt>.
2073 Target-specific implementations of <tt>runOnMachineFunction</tt> differ, but
2074 generally do the following to process each machine function:
2075 </p>
2076
2077 <ul>
2078 <li>Call <tt>SetupMachineFunction</tt> to perform initialization.</li>
2079
2080 <li>Call <tt>EmitConstantPool</tt> to print out (to the output stream) constants
2081     which have been spilled to memory.</li>
2082
2083 <li>Call <tt>EmitJumpTableInfo</tt> to print out jump tables used by the current
2084     function.</li>
2085
2086 <li>Print out the label for the current function.</li>
2087
2088 <li>Print out the code for the function, including basic block labels and the
    assembly for each instruction (using <tt>printInstruction</tt>).</li>
2090 </ul>
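
<p>
The last two steps in the list above (print the function label, then the
basic block labels and instructions) can be sketched with plain data
structures. The <tt>Block</tt> and <tt>Function</tt> types and the
<tt>printFunction</tt> helper are hypothetical; a real printer works on
<tt>MachineFunction</tt> objects and also emits the constant pool and jump
tables, which this sketch omits.
</p>

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, already-formatted view of a machine function.
struct Block {
  std::string label;
  std::vector<std::string> instrs;  // one assembly string per instruction
};

struct Function {
  std::string name;
  std::vector<Block> blocks;
};

// Emits the function label, then each block label followed by its
// tab-indented instructions.
std::string printFunction(const Function &f) {
  std::ostringstream out;
  out << f.name << ":\n";
  for (const Block &b : f.blocks) {
    out << b.label << ":\n";
    for (const std::string &instr : b.instrs)
      out << "\t" << instr << "\n";
  }
  return out.str();
}
```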
2091
2092 <p>
2093 The <tt>XXXAsmPrinter</tt> implementation must also include the code generated
2094 by TableGen that is output in the <tt>XXXGenAsmWriter.inc</tt> file. The code
2095 in <tt>XXXGenAsmWriter.inc</tt> contains an implementation of the
2096 <tt>printInstruction</tt> method that may call these methods:
2097 </p>
2098
2099 <ul>
2100 <li><tt>printOperand</tt></li>
2101
2102 <li><tt>printMemOperand</tt></li>
2103
<li><tt>printCCOperand</tt> (for conditional statements)</li>
2105
2106 <li><tt>printDataDirective</tt></li>
2107
2108 <li><tt>printDeclare</tt></li>
2109
2110 <li><tt>printImplicitDef</tt></li>
2111
2112 <li><tt>printInlineAsm</tt></li>
2113
2114 <li><tt>printLabel</tt></li>
2115
2116 <li><tt>printPICJumpTableEntry</tt></li>
2117
2118 <li><tt>printPICJumpTableSetLabel</tt></li>
2119 </ul>
2120
2121 <p>
2122 The implementations of <tt>printDeclare</tt>, <tt>printImplicitDef</tt>,
2123 <tt>printInlineAsm</tt>, and <tt>printLabel</tt> in <tt>AsmPrinter.cpp</tt> are
2124 generally adequate for printing assembly and do not need to be
2125 overridden. (<tt>printBasicBlockLabel</tt> is another method that is implemented
2126 in <tt>AsmPrinter.cpp</tt> that may be directly used in an implementation of
2127 <tt>XXXAsmPrinter</tt>.)
2128 </p>
2129
2130 <p>
2131 The <tt>printOperand</tt> method is implemented with a long switch/case
2132 statement for the type of operand: register, immediate, basic block, external
2133 symbol, global address, constant pool index, or jump table index. For an
2134 instruction with a memory address operand, the <tt>printMemOperand</tt> method
2135 should be implemented to generate the proper output. Similarly,
2136 <tt>printCCOperand</tt> should be used to print a conditional operand.
2137 </p>
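
<p>
The switch-on-operand-kind structure of <tt>printOperand</tt> might look
roughly like the following. This is a reduced sketch, not the SPARC
implementation: the <tt>Operand</tt> type covers only four of the kinds named
above, and the output syntax (a <tt>%</tt> prefix for registers, an
<tt>.LBB</tt> prefix for blocks) is an assumption made for the example.
</p>

```cpp
#include <string>

// Hypothetical operand with a reduced set of kinds.
struct Operand {
  enum Kind { Register, Immediate, BasicBlock, GlobalAddress } kind;
  std::string name;  // register, block, or symbol name
  long imm;          // value when kind == Immediate
};

// Returns the assembly text for one operand, chosen by operand kind.
std::string printOperand(const Operand &op) {
  switch (op.kind) {
  case Operand::Register:
    return "%" + op.name;
  case Operand::Immediate:
    return std::to_string(op.imm);
  case Operand::BasicBlock:
    return ".LBB" + op.name;
  case Operand::GlobalAddress:
    return op.name;
  }
  return "<unknown>";  // unreachable for valid kinds
}
```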
2138
2139 <p><tt>doFinalization</tt> should be overridden in <tt>XXXAsmPrinter</tt>, and
2140 it should be called to shut down the assembly printer. During
2141 <tt>doFinalization</tt>, global variables and constants are printed to
2142 output.
2143 </p>
2144
2145 </div>
2146
2147 <!-- *********************************************************************** -->
2148 <div class="doc_section">
2149   <a name="subtargetSupport">Subtarget Support</a>
2150 </div>
2151 <!-- *********************************************************************** -->
2152
2153 <div class="doc_text">
2154
2155 <p>
2156 Subtarget support is used to inform the code generation process of instruction
2157 set variations for a given chip set.  For example, the LLVM SPARC implementation
2158 provided covers three major versions of the SPARC microprocessor architecture:
2159 Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a 64-bit
2160 architecture), and the UltraSPARC architecture. V8 has 16 double-precision
2161 floating-point registers that are also usable as either 32 single-precision or 8
2162 quad-precision registers.  V8 is also purely big-endian. V9 has 32
2163 double-precision floating-point registers that are also usable as 16
2164 quad-precision registers, but cannot be used as single-precision registers. The
2165 UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
2166 extensions.
2167 </p>
2168
2169 <p>
2170 If subtarget support is needed, you should implement a target-specific
2171 XXXSubtarget class for your architecture. This class should process the
2172 command-line options <tt>-mcpu=</tt> and <tt>-mattr=</tt>.
2173 </p>
2174
2175 <p>
2176 TableGen uses definitions in the <tt>Target.td</tt> and <tt>Sparc.td</tt> files
2177 to generate code in <tt>SparcGenSubtarget.inc</tt>. In <tt>Target.td</tt>, shown
2178 below, the <tt>SubtargetFeature</tt> interface is defined. The first 4 string
2179 parameters of the <tt>SubtargetFeature</tt> interface are a feature name, an
2180 attribute set by the feature, the value of the attribute, and a description of
2181 the feature. (The fifth parameter is a list of features whose presence is
2182 implied, and its default value is an empty array.)
2183 </p>
2184
2185 <div class="doc_code">
2186 <pre>
2187 class SubtargetFeature&lt;string n, string a,  string v, string d,
2188                        list&lt;SubtargetFeature&gt; i = []&gt; {
2189   string Name = n;
2190   string Attribute = a;
2191   string Value = v;
2192   string Desc = d;
2193   list&lt;SubtargetFeature&gt; Implies = i;
2194 }
2195 </pre>
2196 </div>
2197
2198 <p>
In the <tt>Sparc.td</tt> file, <tt>SubtargetFeature</tt> is used to define the
following features.
2201 </p>
2202
2203 <div class="doc_code">
2204 <pre>
2205 def FeatureV9 : SubtargetFeature&lt;"v9", "IsV9", "true",
2206                      "Enable SPARC-V9 instructions"&gt;;
2207 def FeatureV8Deprecated : SubtargetFeature&lt;"deprecated-v8",
2208                      "V8DeprecatedInsts", "true",
2209                      "Enable deprecated V8 instructions in V9 mode"&gt;;
2210 def FeatureVIS : SubtargetFeature&lt;"vis", "IsVIS", "true",
2211                      "Enable UltraSPARC Visual Instruction Set extensions"&gt;;
2212 </pre>
2213 </div>
2214
2215 <p>
2216 Elsewhere in <tt>Sparc.td</tt>, the Proc class is defined and then is used to
2217 define particular SPARC processor subtypes that may have the previously
2218 described features.
2219 </p>
2220
2221 <div class="doc_code">
2222 <pre>
2223 class Proc&lt;string Name, list&lt;SubtargetFeature&gt; Features&gt;
2224   : Processor&lt;Name, NoItineraries, Features&gt;;
2225 &nbsp;
2226 def : Proc&lt;"generic",         []&gt;;
2227 def : Proc&lt;"v8",              []&gt;;
2228 def : Proc&lt;"supersparc",      []&gt;;
2229 def : Proc&lt;"sparclite",       []&gt;;
2230 def : Proc&lt;"f934",            []&gt;;
2231 def : Proc&lt;"hypersparc",      []&gt;;
2232 def : Proc&lt;"sparclite86x",    []&gt;;
2233 def : Proc&lt;"sparclet",        []&gt;;
2234 def : Proc&lt;"tsc701",          []&gt;;
2235 def : Proc&lt;"v9",              [FeatureV9]&gt;;
2236 def : Proc&lt;"ultrasparc",      [FeatureV9, FeatureV8Deprecated]&gt;;
2237 def : Proc&lt;"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]&gt;;
2238 def : Proc&lt;"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]&gt;;
2239 </pre>
2240 </div>
2241
2242 <p>
From the <tt>Target.td</tt> and <tt>Sparc.td</tt> files, the resulting
<tt>SparcGenSubtarget.inc</tt> specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
<tt>ParseSubtargetFeatures</tt> method, which parses the features string that
sets specified subtarget options. The generated <tt>SparcGenSubtarget.inc</tt>
file should be included in <tt>SparcSubtarget.cpp</tt>. The target-specific
implementation of the <tt>XXXSubtarget</tt> constructor should follow this
pseudocode:
2250 </p>
2251
2252 <div class="doc_code">
2253 <pre>
2254 XXXSubtarget::XXXSubtarget(const Module &amp;M, const std::string &amp;FS) {
2255   // Set the default features
2256   // Determine default and user specified characteristics of the CPU
2257   // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
2258   // Perform any additional operations
2259 }
2260 </pre>
2261 </div>
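
<p>
To make the role of the features string concrete, the sketch below parses a
comma-separated list such as <tt>+v9,-vis</tt> into Boolean attributes. It is
a hand-written approximation of what the generated
<tt>ParseSubtargetFeatures</tt> method accomplishes, not the generated code
itself: the <tt>ToySubtarget</tt> class and its <tt>parseFeatures</tt> method
are hypothetical, CPU names and implied features are ignored, and malformed
entries are silently skipped.
</p>

```cpp
#include <sstream>
#include <string>

// Hypothetical subtarget holding the three SPARC attributes defined above.
struct ToySubtarget {
  bool isV9 = false;
  bool v8Deprecated = false;
  bool isVIS = false;

  // Applies entries of the form "+name" (enable) or "-name" (disable).
  void parseFeatures(const std::string &fs) {
    std::istringstream in(fs);
    std::string item;
    while (std::getline(in, item, ',')) {
      if (item.size() < 2 || (item[0] != '+' && item[0] != '-'))
        continue;  // skip malformed entries in this sketch
      bool enable = (item[0] == '+');
      std::string name = item.substr(1);
      if (name == "v9")
        isV9 = enable;
      else if (name == "deprecated-v8")
        v8Deprecated = enable;
      else if (name == "vis")
        isVIS = enable;
    }
  }
};
```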
2262
2263 </div>
2264
2265 <!-- *********************************************************************** -->
2266 <div class="doc_section">
2267   <a name="jitSupport">JIT Support</a>
2268 </div>
2269 <!-- *********************************************************************** -->
2270
2271 <div class="doc_text">
2272
2273 <p>
2274 The implementation of a target machine optionally includes a Just-In-Time (JIT)
2275 code generator that emits machine code and auxiliary structures as binary output
2276 that can be written directly to memory.  To do this, implement JIT code
2277 generation by performing the following steps:
2278 </p>
2279
2280 <ul>
2281 <li>Write an <tt>XXXCodeEmitter.cpp</tt> file that contains a machine function
2282     pass that transforms target-machine instructions into relocatable machine
2283     code.</li>
2284
2285 <li>Write an <tt>XXXJITInfo.cpp</tt> file that implements the JIT interfaces for
2286     target-specific code-generation activities, such as emitting machine code
2287     and stubs.</li>
2288
2289 <li>Modify <tt>XXXTargetMachine</tt> so that it provides a
2290     <tt>TargetJITInfo</tt> object through its <tt>getJITInfo</tt> method.</li>
2291 </ul>
2292
2293 <p>
2294 There are several different approaches to writing the JIT support code. For
2295 instance, TableGen and target descriptor files may be used for creating a JIT
2296 code generator, but are not mandatory. For the Alpha and PowerPC target
2297 machines, TableGen is used to generate <tt>XXXGenCodeEmitter.inc</tt>, which
2298 contains the binary coding of machine instructions and the
2299 <tt>getBinaryCodeForInstr</tt> method to access those codes. Other JIT
2300 implementations do not.
2301 </p>
2302
2303 <p>
2304 Both <tt>XXXJITInfo.cpp</tt> and <tt>XXXCodeEmitter.cpp</tt> must include the
2305 <tt>llvm/CodeGen/MachineCodeEmitter.h</tt> header file that defines the
2306 <tt>MachineCodeEmitter</tt> class containing code for several callback functions
2307 that write data (in bytes, words, strings, etc.) to the output stream.
2308 </p>
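
<p>
The kind of callbacks described here (writing bytes, words, and strings to an
output buffer) can be illustrated with a small stand-alone class. The
<tt>ToyEmitter</tt> class below is a hypothetical model and does not derive
from <tt>MachineCodeEmitter</tt>; the method names and the choice to terminate
strings with a zero byte are assumptions made for this sketch.
</p>

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical byte-buffer emitter.
class ToyEmitter {
  std::vector<uint8_t> buf_;

public:
  void emitByte(uint8_t b) { buf_.push_back(b); }

  // Writes a 32-bit word, least significant byte first.
  void emitWordLE(uint32_t w) {
    for (int i = 0; i < 4; ++i)
      emitByte(static_cast<uint8_t>((w >> (8 * i)) & 0xFF));
  }

  // Writes a 32-bit word, most significant byte first.
  void emitWordBE(uint32_t w) {
    for (int i = 3; i >= 0; --i)
      emitByte(static_cast<uint8_t>((w >> (8 * i)) & 0xFF));
  }

  // Writes the characters of s followed by a terminating zero byte.
  void emitString(const std::string &s) {
    for (char c : s)
      emitByte(static_cast<uint8_t>(c));
    emitByte(0);
  }

  const std::vector<uint8_t> &bytes() const { return buf_; }
};
```

<p>
A big-endian target such as SPARC would use the big-endian word writer, while
a little-endian target such as X86 would use the little-endian one.
</p>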
2309
2310 </div>
2311
2312 <!-- ======================================================================= -->
2313 <div class="doc_subsection">
2314   <a name="mce">Machine Code Emitter</a>
2315 </div>
2316
2317 <div class="doc_text">
2318
2319 <p>
In <tt>XXXCodeEmitter.cpp</tt>, a target-specific <tt>Emitter</tt> class
is implemented as a function pass (subclass
of <tt>MachineFunctionPass</tt>). The target-specific implementation
of <tt>runOnMachineFunction</tt> (invoked by
<tt>runOnFunction</tt> in <tt>MachineFunctionPass</tt>) iterates through each
<tt>MachineBasicBlock</tt> and calls <tt>emitInstruction</tt> to process each
2326 instruction and emit binary code. <tt>emitInstruction</tt> is largely
2327 implemented with case statements on the instruction types defined in
2328 <tt>XXXInstrInfo.h</tt>. For example, in <tt>X86CodeEmitter.cpp</tt>,
2329 the <tt>emitInstruction</tt> method is built around the following switch/case
2330 statements:
2331 </p>
2332
2333 <div class="doc_code">
2334 <pre>
2335 switch (Desc-&gt;TSFlags &amp; X86II::FormMask) {
2336 case X86II::Pseudo:  // for not yet implemented instructions
2337    ...               // or pseudo-instructions
2338    break;
2339 case X86II::RawFrm:  // for instructions with a fixed opcode value
2340    ...
2341    break;
2342 case X86II::AddRegFrm: // for instructions that have one register operand
2343    ...                 // added to their opcode
2344    break;
2345 case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
2346    ...                 // to specify a destination (register)
2347    break;
2348 case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
2349    ...                 // to specify a destination (memory)
2350    break;
2351 case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
2352    ...                 // to specify a source (register)
2353    break;
2354 case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
2355    ...                 // to specify a source (memory)
2356    break;
2357 case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
2358 case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
2359 case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
2360 case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
2361    ...
2362    break;
2363 case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
2364 case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
2365 case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
2366 case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
2367    ...
2368    break;
2369 case X86II::MRMInitReg: // for instructions whose source and
2370    ...                  // destination are the same register
2371    break;
2372 }
2373 </pre>
2374 </div>
2375
2376 <p>
2377 The implementations of these case statements often first emit the opcode and
2378 then fetch the operand(s). Depending on the operand type, helper methods may be
2379 called to process the operand(s). For example, in <tt>X86CodeEmitter.cpp</tt>,
2380 for the <tt>X86II::AddRegFrm</tt> case, the first data emitted
2381 (by <tt>emitByte</tt>) is the opcode added to the register operand. Then an
2382 object representing the machine operand, <tt>MO1</tt>, is extracted. Helper
2383 methods such as <tt>isImmediate</tt>,
2384 <tt>isGlobalAddress</tt>, <tt>isExternalSymbol</tt>, <tt>isConstantPoolIndex</tt>, and
2385 <tt>isJumpTableIndex</tt> determine the operand
2386 type. (<tt>X86CodeEmitter.cpp</tt> also has private methods such
2387 as <tt>emitConstant</tt>, <tt>emitGlobalAddress</tt>,
2388 <tt>emitExternalSymbolAddress</tt>, <tt>emitConstPoolAddress</tt>,
2389 and <tt>emitJumpTableAddress</tt> that emit the data into the output stream.)
2390 </p>
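<p>
As a rough illustration of "the opcode added to the register operand", the
sketch below encodes the x86 instruction <tt>MOV r32, imm32</tt>, whose opcode
byte is <tt>0xB8</tt> plus the hardware register number, followed by a 32-bit
little-endian immediate. The function name <tt>encodeMovRegImm32</tt> is
hypothetical and the code is independent of LLVM. It only mirrors the shape of
the <tt>AddRegFrm</tt> case for an immediate operand.
</p>

```cpp
#include <cstdint>
#include <vector>

// Encodes "MOV r32, imm32": one opcode byte (0xB8 + register number),
// then the immediate, least significant byte first.
// RegNum uses the hardware numbering (EAX=0, ECX=1, EDX=2, EBX=3, ...).
inline std::vector<uint8_t> encodeMovRegImm32(unsigned RegNum, uint32_t Imm) {
  const uint8_t BaseOpcode = 0xB8;
  std::vector<uint8_t> Out;

  // The register number is folded into the opcode byte itself.
  Out.push_back(static_cast<uint8_t>(BaseOpcode + (RegNum & 7)));

  // The immediate operand follows as four little-endian bytes.
  for (int Shift = 0; Shift < 32; Shift += 8)
    Out.push_back(static_cast<uint8_t>(Imm >> Shift));
  return Out;
}
```

<p>
For instance, <tt>encodeMovRegImm32(1, 0x12345678)</tt> yields the five bytes
<tt>B9 78 56 34 12</tt>, which is <tt>mov ecx, 0x12345678</tt>.
</p>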
2391
2392 <div class="doc_code">
2393 <pre>
2394 case X86II::AddRegFrm:
2395   MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));
2396
2397   if (CurOp != NumOps) {
2398     const MachineOperand &amp;MO1 = MI.getOperand(CurOp++);
2399     unsigned Size = X86InstrInfo::sizeOfImm(Desc);
2400     if (MO1.isImmediate())
2401       emitConstant(MO1.getImm(), Size);
2402     else {
2403       unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
2404         : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
2405       if (Opcode == X86::MOV64ri)
2406         rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
2407       if (MO1.isGlobalAddress()) {
2408         bool NeedStub = isa&lt;Function&gt;(MO1.getGlobal());
2409         bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
2410         emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
2411                           NeedStub, isLazy);
2412       } else if (MO1.isExternalSymbol())
2413         emitExternalSymbolAddress(MO1.getSymbolName(), rt);
2414       else if (MO1.isConstantPoolIndex())
2415         emitConstPoolAddress(MO1.getIndex(), rt);
2416       else if (MO1.isJumpTableIndex())
2417         emitJumpTableAddress(MO1.getIndex(), rt);
2418     }
2419   }
2420   break;
2421 </pre>
2422 </div>
2423
2424 <p>
2425 In the previous example, <tt>X86CodeEmitter.cpp</tt> uses the
2426 variable <tt>rt</tt>, which holds a <tt>RelocationType</tt> value that is used to
2427 relocate addresses (for example, a global address with a PIC base offset). The
2428 <tt>RelocationType</tt> enum for each target is defined in the short
2429 target-specific <tt>XXXRelocations.h</tt> file. The <tt>RelocationType</tt> is used by
2430 the <tt>relocate</tt> method defined in <tt>XXXJITInfo.cpp</tt> to rewrite
2431 addresses for referenced global symbols.
2432 </p>
2433
2434 <p>
2435 For example, <tt>X86Relocations.h</tt> specifies the following relocation types
2436 for the X86 addresses. In all four cases, the relocated value is added to the
2437 value already in memory. For <tt>reloc_pcrel_word</tt>
2438 and <tt>reloc_picrel_word</tt>, there is an additional initial adjustment.
2439 </p>
2440
2441 <div class="doc_code">
2442 <pre>
2443 enum RelocationType {
2444   reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
2445   reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
2446   reloc_absolute_word = 2, // absolute relocation; no additional adjustment
2447   reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
2448 };
2449 </pre>
2450 </div>
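<p>
The effect of these relocation types on a 32-bit field can be sketched as
follows. This is a simplified model, not the X86 <tt>relocate</tt>
implementation: the names <tt>RelocKind</tt> and <tt>applyReloc</tt> are
hypothetical, addresses are passed in as plain integers, and the PC-relative
case assumes the displacement is measured from the end of the 4-byte field.
</p>

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical relocation kinds modeled on the 32-bit cases above.
enum RelocKind { PCRelWord, PICRelWord, AbsoluteWord };

// Patches the 32-bit value stored at Field.
//   FieldAddr: run-time address of the field being patched (assumed known)
//   Target:    resolved address of the referenced symbol
//   PICBase:   PIC base address, only used for PICRelWord
inline void applyReloc(uint8_t *Field, RelocKind Kind, uint32_t FieldAddr,
                       uint32_t Target, uint32_t PICBase) {
  uint32_t Value;
  std::memcpy(&Value, Field, sizeof(Value)); // value already in memory

  switch (Kind) {
  case PCRelWord:   // adjust for the PC location, then add
    Value += Target - (FieldAddr + 4);
    break;
  case PICRelWord:  // adjust for the PIC base, then add
    Value += Target - PICBase;
    break;
  case AbsoluteWord: // no additional adjustment
    Value += Target;
    break;
  }

  std::memcpy(Field, &Value, sizeof(Value)); // write the patched value back
}
```

<p>
In each case the computed value is added to what is already stored in the
field, so an addend emitted earlier by the code emitter is preserved.
</p>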
2451
2452 </div>
2453
2454 <!-- ======================================================================= -->
2455 <div class="doc_subsection">
2456   <a name="targetJITInfo">Target JIT Info</a>
2457 </div>
2458
2459 <div class="doc_text">
2460
2461 <p>
2462 <tt>XXXJITInfo.cpp</tt> implements the JIT interfaces for target-specific
2463 code-generation activities, such as emitting machine code and stubs. At minimum,
2464 a target-specific version of <tt>XXXJITInfo</tt> implements the following:
2465 </p>
2466
2467 <ul>
2468 <li><tt>getLazyResolverFunction</tt> &mdash; Initializes the JIT and gives the
2469     target a function that is used for compilation.</li>
2470
2471 <li><tt>emitFunctionStub</tt> &mdash; Returns a native function with a specified
2472     address for a callback function.</li>
2473
2474 <li><tt>relocate</tt> &mdash; Changes the addresses of referenced globals, based
2475     on relocation types.</li>
2476
2477 <li>Callback functions that are wrappers to a function stub that is used when the
2478     real target is not initially known.</li>
2479 </ul>
2480
2481 <p>
2482 <tt>getLazyResolverFunction</tt> is generally trivial to implement. It stores the
2483 incoming parameter in the global <tt>JITCompilerFunction</tt> and returns the
2484 callback function that will be used as a function wrapper. For the Alpha target
2485 (in <tt>AlphaJITInfo.cpp</tt>), the <tt>getLazyResolverFunction</tt>
2486 implementation is simply:
2487 </p>
2488
2489 <div class="doc_code">
2490 <pre>
2491 TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
2492                                             JITCompilerFn F) {
2493   JITCompilerFunction = F;
2494   return AlphaCompilationCallback;
2495 }
2496 </pre>
2497 </div>
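<p>
The same "remember the compiler function, hand back a callback" pattern can be
shown without any LLVM dependencies. In the sketch below, the types
<tt>CompilerFn</tt> and <tt>ResolverFn</tt> and the names
<tt>SavedCompiler</tt>, <tt>compilationCallback</tt>, and
<tt>getLazyResolver</tt> are all hypothetical, and the signatures are simplified
assumptions rather than the ones declared in <tt>TargetJITInfo</tt>. A real
callback would also save and restore registers before calling into the JIT.
</p>

```cpp
// Hypothetical, simplified function-pointer types for this sketch.
typedef void *(*CompilerFn)(void *StubAddr);
typedef void *(*ResolverFn)(void *StubAddr);

// Plays the role of the global JITCompilerFunction.
static CompilerFn SavedCompiler = nullptr;

// Plays the role of the target's compilation callback: it forwards to the
// compiler function that was registered earlier.
static void *compilationCallback(void *StubAddr) {
  return SavedCompiler ? SavedCompiler(StubAddr) : nullptr;
}

// Stores the incoming compiler function and returns the callback that
// stubs will jump to.
static ResolverFn getLazyResolver(CompilerFn F) {
  SavedCompiler = F;
  return compilationCallback;
}
```

<p>
The caller keeps the returned pointer and installs it as the target of each
function stub. The first call through a stub then reaches the registered
compiler function.
</p>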
2498
2499 <p>
2500 For the X86 target, the <tt>getLazyResolverFunction</tt> implementation is a
2501 little more complicated, because it returns a different callback function for
2502 processors with SSE instructions and XMM registers.
2503 </p>
2504
2505 <p>
2506 The callback function initially saves and later restores the callee register
2507 values, incoming arguments, and frame and return address. The callback function
2508 needs low-level access to the registers or stack, so it is typically implemented
2509 in assembly language.
2510 </p>
2511
2512 </div>
2513
2514 <!-- *********************************************************************** -->
2515
2516 <hr>
2517 <address>
2518   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
2519   src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
2520   <a href="http://validator.w3.org/check/referer"><img
2521   src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
2522
2523   <a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a><br>
2524   <a href="http://llvm.org">The LLVM Compiler Infrastructure</a>
2525   <br>
2526   Last modified: $Date$
2527 </address>
2528
2529 </body>
2530 </html>