docs/tutorial/LangImpl7.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3
   4 <html>
   5 <head>
   6   <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
   7          construction</title>
   8   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   9   <meta name="author" content="Chris Lattner">
  10   <link rel="stylesheet" href="../llvm.css" type="text/css">
  11 </head>
  12
  13 <body>
  14
  15 <div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>
  16
  17 <ul>
  18 <li><a href="index.html">Up to Tutorial Index</a></li>
  19 <li>Chapter 7
  20   <ol>
  21     <li><a href="#intro">Chapter 7 Introduction</a></li>
  22     <li><a href="#why">Why is this a hard problem?</a></li>
  23     <li><a href="#memory">Memory in LLVM</a></li>
  24     <li><a href="#kalvars">Mutable Variables in Kaleidoscope</a></li>
  25     <li><a href="#adjustments">Adjusting Existing Variables for
  26      Mutation</a></li>
  27     <li><a href="#assignment">New Assignment Operator</a></li>
  28     <li><a href="#localvars">User-defined Local Variables</a></li>
  29     <li><a href="#code">Full Code Listing</a></li>
  30   </ol>
  31 </li>
  32 <li><a href="LangImpl8.html">Chapter 8</a>: Conclusion and other useful LLVM
  33  tidbits</li>
  34 </ul>
  35
  36 <div class="doc_author">
  37   <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
  38 </div>
  39
  40 <!-- *********************************************************************** -->
  41 <div class="doc_section"><a name="intro">Chapter 7 Introduction</a></div>
  42 <!-- *********************************************************************** -->
  43
  44 <div class="doc_text">
  45
  46 <p>Welcome to Chapter 7 of the "<a href="index.html">Implementing a language
  47 with LLVM</a>" tutorial.  In chapters 1 through 6, we've built a very
  48 respectable, albeit simple, <a
  49 href="http://en.wikipedia.org/wiki/Functional_programming">functional
  50 programming language</a>.  In our journey, we learned some parsing techniques,
  51 how to build and represent an AST, how to build LLVM IR, and how to optimize
  52 the resultant code and JIT compile it.</p>
  53
  54 <p>While Kaleidoscope is interesting as a functional language, this makes it
  55 "too easy" to generate LLVM IR for it.  In particular, a functional language
  56 makes it very easy to build LLVM IR directly in <a
  57 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
  58 Since LLVM requires that the input code be in SSA form, this is a very nice
  59 property and it is often unclear to newcomers how to generate code for an
  60 imperative language with mutable variables.</p>
  61
  62 <p>The short (and happy) summary of this chapter is that there is no need for
  63 your front-end to build SSA form: LLVM provides highly tuned and well tested
  64 support for this, though the way it works is a bit unexpected for some.</p>
  65
  66 </div>
  67
  68 <!-- *********************************************************************** -->
  69 <div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
  70 <!-- *********************************************************************** -->
  71
  72 <div class="doc_text">
  73
  74 <p>
  75 To understand why mutable variables cause complexities in SSA construction,
  76 consider this extremely simple C example:
  77 </p>
  78
  79 <div class="doc_code">
  80 <pre>
  81 int G, H;
  82 int test(_Bool Condition) {
  83   int X;
  84   if (Condition)
  85     X = G;
  86   else
  87     X = H;
  88   return X;
  89 }
  90 </pre>
  91 </div>
  92
  93 <p>In this case, we have the variable "X", whose value depends on the path
  94 executed in the program.  Because there are two different possible values for X
  95 before the return instruction, a PHI node is inserted to merge the two values.
  96 The LLVM IR that we want for this example looks like this:</p>
  97
  98 <div class="doc_code">
  99 <pre>
 100 @G = weak global i32 0   ; type of @G is i32*
 101 @H = weak global i32 0   ; type of @H is i32*
 102
 103 define i32 @test(i1 %Condition) {
 104 entry:
 105         br i1 %Condition, label %cond_true, label %cond_false
 106
 107 cond_true:
 108         %X.0 = load i32* @G
 109         br label %cond_next
 110
 111 cond_false:
 112         %X.1 = load i32* @H
 113         br label %cond_next
 114
 115 cond_next:
 116         %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
 117         ret i32 %X.2
 118 }
 119 </pre>
 120 </div>
 121
 122 <p>In this example, the loads from the G and H global variables are explicit in
 123 the LLVM IR, and they live in the then/else branches of the if statement
 124 (cond_true/cond_false).  In order to merge the incoming values, the X.2 phi node
 125 in the cond_next block selects the right value to use based on where control
 126 flow is coming from: if control flow comes from the cond_false block, X.2 gets
 127 the value of X.1.  Alternatively, if control flow comes from cond_tree, it gets
 128 the value of X.0.  The intent of this chapter is not to explain the details of
 129 SSA form.  For more information, see one of the many <a
 130 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
 131 references</a>.</p>
 132
 133 <p>The question for this article is "who places phi nodes when lowering
 134 assignments to mutable variables?".  The issue here is that LLVM
 135 <em>requires</em> that its IR be in SSA form: there is no "non-ssa" mode for it.
 136 However, SSA construction requires non-trivial algorithms and data structures,
 137 so it is inconvenient and wasteful for every front-end to have to reproduce this
 138 logic.</p>
 139
 140 </div>
 141
 142 <!-- *********************************************************************** -->
 143 <div class="doc_section"><a name="memory">Memory in LLVM</a></div>
 144 <!-- *********************************************************************** -->
 145
 146 <div class="doc_text">
 147
 148 <p>The 'trick' here is that while LLVM does require all register values to be
 149 in SSA form, it does not require (or permit) memory objects to be in SSA form.
 150 In the example above, note that the loads from G and H are direct accesses to
 151 G and H: they are not renamed or versioned.  This differs from some other
 152 compiler systems, which do try to version memory objects.  In LLVM, instead of
 153 encoding dataflow analysis of memory into the LLVM IR, it is handled with <a
 154 href="../WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
 155 demand.</p>
 156
 157 <p>
 158 With this in mind, the high-level idea is that we want to make a stack variable
 159 (which lives in memory, because it is on the stack) for each mutable object in
 160 a function.  To take advantage of this trick, we need to talk about how LLVM
 161 represents stack variables.
 162 </p>
 163
 164 <p>In LLVM, all memory accesses are explicit with load/store instructions, and
 165 it is carefully designed to not have (or need) an "address-of" operator.  Notice
 166 how the type of the @G/@H global variables is actually "i32*" even though the
 167 variable is defined as "i32".  What this means is that @G defines <em>space</em>
 168 for an i32 in the global data area, but its <em>name</em> actually refers to the
 169 address for that space.  Stack variables work the same way, but instead of being
 170 declared with global variable definitions, they are declared with the
 171 <a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
 172
 173 <div class="doc_code">
 174 <pre>
 175 define i32 @test(i1 %Condition) {
 176 entry:
 177         %X = alloca i32           ; type of %X is i32*.
 178         ...
 179         %tmp = load i32* %X       ; load the stack value %X from the stack.
 180         %tmp2 = add i32 %tmp, 1   ; increment it
 181         store i32 %tmp2, i32* %X  ; store it back
 182         ...
 183 </pre>
 184 </div>
 185
 186 <p>This code shows an example of how you can declare and manipulate a stack
 187 variable in the LLVM IR.  Stack memory allocated with the alloca instruction is
 188 fully general: you can pass the address of the stack slot to functions, you can
 189 store it in other variables, etc.  In our example above, we could rewrite the
 190 example to use the alloca technique to avoid using a PHI node:</p>
 191
 192 <div class="doc_code">
 193 <pre>
 194 @G = weak global i32 0   ; type of @G is i32*
 195 @H = weak global i32 0   ; type of @H is i32*
 196
 197 define i32 @test(i1 %Condition) {
 198 entry:
 199         %X = alloca i32           ; type of %X is i32*.
 200         br i1 %Condition, label %cond_true, label %cond_false
 201
 202 cond_true:
 203         %X.0 = load i32* @G
 204         store i32 %X.0, i32* %X   ; Update X
 205         br label %cond_next
 206
 207 cond_false:
 208         %X.1 = load i32* @H
 209         store i32 %X.1, i32* %X   ; Update X
 210         br label %cond_next
 211
 212 cond_next:
 213         %X.2 = load i32* %X       ; Read X
 214         ret i32 %X.2
 215 }
 216 </pre>
 217 </div>
 218
 219 <p>With this, we have discovered a way to handle arbitrary mutable variables
 220 without the need to create Phi nodes at all:</p>
 221
 222 <ol>
 223 <li>Each mutable variable becomes a stack allocation.</li>
 224 <li>Each read of the variable becomes a load from the stack.</li>
 225 <li>Each update of the variable becomes a store to the stack.</li>
 226 <li>Taking the address of a variable just uses the stack address directly.</li>
 227 </ol>
 228
 229 <p>While this solution has solved our immediate problem, it introduced another
 230 one: we have now apparently introduced a lot of stack traffic for very simple
 231 and common operations, a major performance problem.  Fortunately for us, the
 232 LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
 233 this case, promoting allocas like this into SSA registers, inserting Phi nodes
 234 as appropriate.  If you run this example through the pass, for example, you'll
 235 get:</p>
 236
 237 <div class="doc_code">
 238 <pre>
 239 $ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
 240 @G = weak global i32 0
 241 @H = weak global i32 0
 242
 243 define i32 @test(i1 %Condition) {
 244 entry:
 245         br i1 %Condition, label %cond_true, label %cond_false
 246
 247 cond_true:
 248         %X.0 = load i32* @G
 249         br label %cond_next
 250
 251 cond_false:
 252         %X.1 = load i32* @H
 253         br label %cond_next
 254
 255 cond_next:
 256         %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
 257         ret i32 %X.01
 258 }
 259 </pre>
 260 </div>
 261
 262 <p>The mem2reg pass implements the standard "iterated dominator frontier"
 263 algorithm for constructing SSA form and has a number of optimizations that speed
 264 up very common degenerate cases.  mem2reg really is the answer for dealing with
 265 mutable variables, and we highly recommend that you depend on it.  Note that
 266 mem2reg only works on variables in certain circumstances:</p>
 267
 268 <ol>
 269 <li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
 270 promotes them.  It does not apply to global variables or heap allocations.</li>
 271
 272 <li>mem2reg only looks for alloca instructions in the entry block of the
 273 function.  Being in the entry block guarantees that the alloca is only executed
 274 once, which makes analysis simpler.</li>
 275
 276 <li>mem2reg only promotes allocas whose uses are direct loads and stores.  If
 277 the address of the stack object is passed to a function, or if any funny pointer
 278 arithmetic is involved, the alloca will not be promoted.</li>
 279
 280 <li>mem2reg only works on allocas of <a
 281 href="../LangRef.html#t_classifications">first class</a>
 282 values (such as pointers, scalars and vectors), and only if the array size
 283 of the allocation is 1 (or missing in the .ll file).  mem2reg is not capable of
 284 promoting structs or arrays to registers.  Note that the "scalarrepl" pass is
 285 more powerful and can promote structs, "unions", and arrays in many cases.</li>
 286
 287 </ol>
 288
 289 <p>
 290 All of these properties are easy to satisfy for most imperative languages, and
 291 we'll illustrate this below with Kaleidoscope.  The final question you may be
 292 asking is: should I bother with this nonsense for my front-end?  Wouldn't it be
 293 better if I just did SSA construction directly, avoiding use of the mem2reg
 294 optimization pass?  In short, we strongly recommend that use you this technique
 295 for building SSA form, unless there is an extremely good reason not to.  Using
 296 this technique is:</p>
 297
 298 <ul>
 299 <li>Proven and well tested: llvm-gcc and clang both use this technique for local
 300 mutable variables.  As such, the most common clients of LLVM are using this to
 301 handle a bulk of their variables.  You can be sure that bugs are found fast and
 302 fixed early.</li>
 303
 304 <li>Extremely Fast: mem2reg has a number of special cases that make it fast in
 305 common cases as well as fully general.  For example, it has fast-paths for
 306 variables that are only used in a single block, variables that only have one
 307 assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
 308 </li>
 309
 310 <li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
 311 Debug information in LLVM</a> relies on having the address of the variable
 312 exposed to attach debug info to it.  This technique dovetails very naturally
 313 with this style of debug info.</li>
 314 </ul>
 315
 316 <p>If nothing else, this makes it much easier to get your front-end up and
 317 running, and is very simple to implement.  Lets extend Kaleidoscope with mutable
 318 variables now!
 319 </p>
 320
 321 </div>
 322
 323 <!-- *********************************************************************** -->
 324 <div class="doc_section"><a name="kalvars">Mutable Variables in
 325 Kaleidoscope</a></div>
 326 <!-- *********************************************************************** -->
 327
 328 <div class="doc_text">
 329
 330 <p>Now that we know the sort of problem we want to tackle, lets see what this
 331 looks like in the context of our little Kaleidoscope language.  We're going to
 332 add two features:</p>
 333
 334 <ol>
 335 <li>The ability to mutate variables with the '=' operator.</li>
 336 <li>The ability to define new variables.</li>
 337 </ol>
 338
 339 <p>While the first item is really what this is about, we only have variables
 340 for incoming arguments and for induction variables, and redefining them only
 341 goes so far :).  Also, the ability to define new variables is a
 342 useful thing regardless of whether you will be mutating them.  Here's a
 343 motivating example that shows how we could use these:</p>
 344
 345 <div class="doc_code">
 346 <pre>
 347 # Define ':' for sequencing: as a low-precedence operator that ignores operands
 348 # and just returns the RHS.
 349 def binary : 1 (x y) y;
 350
 351 # Recursive fib, we could do this before.
 352 def fib(x)
 353   if (x &lt; 3) then
 354     1
 355   else
 356     fib(x-1)+fib(x-2);
 357
 358 # Iterative fib.
 359 def fibi(x)
 360   <b>var a = 1, b = 1, c in</b>
 361   (for i = 3, i &;t; x in
 362      <b>c = a + b</b> :
 363      <b>a = b</b> :
 364      <b>b = c</b>) :
 365   b;
 366
 367 # Call it.
 368 fibi(10);
 369 </pre>
 370 </div>
 371
 372 <p>
 373 In order to mutate variables, we have to change our existing variables to use
 374 the "alloca trick".  Once we have that, we'll add our new operator, then extend
 375 Kaleidoscope to support new variable definitions.
 376 </p>
 377
 378 </div>
 379
 380 <!-- *********************************************************************** -->
 381 <div class="doc_section"><a name="adjustments">Adjusting Existing Variables for
 382 Mutation</a></div>
 383 <!-- *********************************************************************** -->
 384
 385 <div class="doc_text">
 386
 387 <p>
 388 The symbol table in Kaleidoscope is managed at code generation time by the
 389 '<tt>NamedValues</tt>' map.  This map currently keeps track of the LLVM "Value*"
 390 that holds the double value for the named variable.  In order to support
 391 mutation, we need to change this slightly, so that it <tt>NamedValues</tt> holds
 392 the <em>memory location</em> of the variable in question.  Note that this
 393 change is a refactoring: it changes the structure of the code, but does not
 394 (by itself) change the behavior of the compiler.  All of these changes are
 395 isolated in the Kaleidoscope code generator.</p>
 396
 397 <p>
 398 At this point in Kaleidoscope's development, it only supports variables for two
 399 things: incoming arguments to functions and the induction variable of 'for'
 400 loops.  For consistency, we'll allow mutation of these variables in addition to
 401 other user-defined variables.  This means that these will both need memory
 402 locations.
 403 </p>
 404
 405 <p>To start our transformation of Kaleidoscope, we'll change the NamedValues
 406 map to map to AllocaInst* instead of Value*.  Once we do this, the C++ compiler
 407 will tell use what parts of the code we need to update:</p>
 408
 409 <div class="doc_code">
 410 <pre>
 411 static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
 412 </pre>
 413 </div>
 414
 415 <p>Also, since we will need to create these alloca's, we'll use a helper
 416 function that ensures that the allocas are created in the entry block of the
 417 function:</p>
 418
 419 <div class="doc_code">
 420 <pre>
 421 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
 422 /// the function.  This is used for mutable variables etc.
 423 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
 424                                           const std::string &amp;VarName) {
 425   LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
 426                    TheFunction-&gt;getEntryBlock().begin());
 427   return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
 428 }
 429 </pre>
 430 </div>
 431
 432 <p>This funny looking code creates an LLVMBuilder object that is pointing at
 433 the first instruction (.begin()) of the entry block.  It then creates an alloca
 434 with the expected name and returns it.  Because all values in Kaleidoscope are
 435 doubles, there is no need to pass in a type to use.</p>
 436
 437 <p>With this in place, the first functionality change we want to make is to
 438 variable references.  In our new scheme, variables live on the stack, so code
 439 generating a reference to them actually needs to produce a load from the stack
 440 slot:</p>
 441
 442 <div class="doc_code">
 443 <pre>
 444 Value *VariableExprAST::Codegen() {
 445   // Look this variable up in the function.
 446   Value *V = NamedValues[Name];
 447   if (V == 0) return ErrorV("Unknown variable name");
 448
 449   // Load the value.
 450   return Builder.CreateLoad(V, Name.c_str());
 451 }
 452 </pre>
 453 </div>
 454
 455 <p>As you can see, this is pretty straight-forward.  Next we need to update the
 456 things that define the variables to set up the alloca.  We'll start with
 457 <tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
 458 the unabridged code):</p>
 459
 460 <div class="doc_code">
 461 <pre>
 462   Function *TheFunction = Builder.GetInsertBlock()->getParent();
 463
 464   <b>// Create an alloca for the variable in the entry block.
 465   AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>
 466
 467     // Emit the start code first, without 'variable' in scope.
 468   Value *StartVal = Start-&gt;Codegen();
 469   if (StartVal == 0) return 0;
 470
 471   <b>// Store the value into the alloca.
 472   Builder.CreateStore(StartVal, Alloca);</b>
 473   ...
 474
 475   // Compute the end condition.
 476   Value *EndCond = End-&gt;Codegen();
 477   if (EndCond == 0) return EndCond;
 478
 479   <b>// Reload, increment, and restore the alloca.  This handles the case where
 480   // the body of the loop mutates the variable.
 481   Value *CurVar = Builder.CreateLoad(Alloca);
 482   Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
 483   Builder.CreateStore(NextVar, Alloca);</b>
 484   ...
 485 </pre>
 486 </div>
 487
 488 <p>This code is virtually identical to the code <a
 489 href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>.  The
 490 big difference is that we no longer have to construct a PHI node, and we use
 491 load/store to access the variable as needed.</p>
 492
 493 <p>To support mutable argument variables, we need to also make allocas for them.
 494 The code for this is also pretty simple:</p>
 495
 496 <div class="doc_code">
 497 <pre>
 498 /// CreateArgumentAllocas - Create an alloca for each argument and register the
 499 /// argument in the symbol table so that references to it will succeed.
 500 void PrototypeAST::CreateArgumentAllocas(Function *F) {
 501   Function::arg_iterator AI = F-&gt;arg_begin();
 502   for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
 503     // Create an alloca for this variable.
 504     AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
 505
 506     // Store the initial value into the alloca.
 507     Builder.CreateStore(AI, Alloca);
 508
 509     // Add arguments to variable symbol table.
 510     NamedValues[Args[Idx]] = Alloca;
 511   }
 512 }
 513 </pre>
 514 </div>
 515
 516 <p>For each argument, we make an alloca, store the input value to the function
 517 into the alloca, and register the alloca as the memory location for the
 518 argument.  This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
 519 it sets up the entry block for the function.</p>
 520
 521 <p>The final missing piece is adding the 'mem2reg' pass, which allows us to get
 522 good codegen once again:</p>
 523
 524 <div class="doc_code">
 525 <pre>
 526     // Set up the optimizer pipeline.  Start with registering info about how the
 527     // target lays out data structures.
 528     OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
 529     <b>// Promote allocas to registers.
 530     OurFPM.add(createPromoteMemoryToRegisterPass());</b>
 531     // Do simple "peephole" optimizations and bit-twiddling optzns.
 532     OurFPM.add(createInstructionCombiningPass());
 533     // Reassociate expressions.
 534     OurFPM.add(createReassociatePass());
 535 </pre>
 536 </div>
 537
 538 <p>It is interesting to see what the code looks like before and after the
 539 mem2reg optimization runs.  For example, this is the before/after code for our
 540 recursive fib.  Before the optimization:</p>
 541
 542 <div class="doc_code">
 543 <pre>
 544 define double @fib(double %x) {
 545 entry:
 546         <b>%x1 = alloca double
 547         store double %x, double* %x1
 548         %x2 = load double* %x1</b>
 549         %multmp = fcmp ult double %x2, 3.000000e+00
 550         %booltmp = uitofp i1 %multmp to double
 551         %ifcond = fcmp one double %booltmp, 0.000000e+00
 552         br i1 %ifcond, label %then, label %else
 553
 554 then:           ; preds = %entry
 555         br label %ifcont
 556
 557 else:           ; preds = %entry
 558         <b>%x3 = load double* %x1</b>
 559         %subtmp = sub double %x3, 1.000000e+00
 560         %calltmp = call double @fib( double %subtmp )
 561         <b>%x4 = load double* %x1</b>
 562         %subtmp5 = sub double %x4, 2.000000e+00
 563         %calltmp6 = call double @fib( double %subtmp5 )
 564         %addtmp = add double %calltmp, %calltmp6
 565         br label %ifcont
 566
 567 ifcont:         ; preds = %else, %then
 568         %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 569         ret double %iftmp
 570 }
 571 </pre>
 572 </div>
 573
 574 <p>Here there is only one variable (x, the input argument) but you can still
 575 see the extremely simple-minded code generation strategy we are using.  In the
 576 entry block, an alloca is created, and the initial input value is stored into
 577 it.  Each reference to the variable does a reload from the stack.  Also, note
 578 that we didn't modify the if/then/else expression, so it still inserts a PHI
 579 node.  While we could make an alloca for it, it is actually easier to create a
 580 PHI node for it, so we still just make the PHI.</p>
 581
 582 <p>Here is the code after the mem2reg pass runs:</p>
 583
 584 <div class="doc_code">
 585 <pre>
 586 define double @fib(double %x) {
 587 entry:
 588         %multmp = fcmp ult double <b>%x</b>, 3.000000e+00
 589         %booltmp = uitofp i1 %multmp to double
 590         %ifcond = fcmp one double %booltmp, 0.000000e+00
 591         br i1 %ifcond, label %then, label %else
 592
 593 then:
 594         br label %ifcont
 595
 596 else:
 597         %subtmp = sub double <b>%x</b>, 1.000000e+00
 598         %calltmp = call double @fib( double %subtmp )
 599         %subtmp5 = sub double <b>%x</b>, 2.000000e+00
 600         %calltmp6 = call double @fib( double %subtmp5 )
 601         %addtmp = add double %calltmp, %calltmp6
 602         br label %ifcont
 603
 604 ifcont:         ; preds = %else, %then
 605         %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 606         ret double %iftmp
 607 }
 608 </pre>
 609 </div>
 610
 611 <p>This is a trivial case for mem2reg, since there are no redefinitions of the
 612 variable.  The point of showing this is to calm your tension about inserting
 613 such blatent inefficiencies :).</p>
 614
 615 <p>After the rest of the optimizers run, we get:</p>
 616
 617 <div class="doc_code">
 618 <pre>
 619 define double @fib(double %x) {
 620 entry:
 621         %multmp = fcmp ult double %x, 3.000000e+00
 622         %booltmp = uitofp i1 %multmp to double
 623         %ifcond = fcmp ueq double %booltmp, 0.000000e+00
 624         br i1 %ifcond, label %else, label %ifcont
 625
 626 else:
 627         %subtmp = sub double %x, 1.000000e+00
 628         %calltmp = call double @fib( double %subtmp )
 629         %subtmp5 = sub double %x, 2.000000e+00
 630         %calltmp6 = call double @fib( double %subtmp5 )
 631         %addtmp = add double %calltmp, %calltmp6
 632         ret double %addtmp
 633
 634 ifcont:
 635         ret double 1.000000e+00
 636 }
 637 </pre>
 638 </div>
 639
 640 <p>Here we see that the simplifycfg pass decided to clone the return instruction
 641 into the end of the 'else' block.  This allowed it to eliminate some branches
 642 and the PHI node.</p>
 643
 644 <p>Now that all symbol table references are updated to use stack variables,
 645 we'll add the assignment operator.</p>
 646
 647 </div>
 648
 649 <!-- *********************************************************************** -->
 650 <div class="doc_section"><a name="assignment">New Assignment Operator</a></div>
 651 <!-- *********************************************************************** -->
 652
 653 <div class="doc_text">
 654
 655 <p>With our current framework, adding a new assignment operator is really
 656 simple.  We will parse it just like any other binary operator, but handle it
 657 internally (instead of allowing the user to define it).  The first step is to
 658 set a precedence:</p>
 659
 660 <div class="doc_code">
 661 <pre>
 662  int main() {
 663    // Install standard binary operators.
 664    // 1 is lowest precedence.
 665    <b>BinopPrecedence['='] = 2;</b>
 666    BinopPrecedence['&lt;'] = 10;
 667    BinopPrecedence['+'] = 20;
 668    BinopPrecedence['-'] = 20;
 669 </pre>
 670 </div>
 671
 672 <p>Now that the parser knows the precedence of the binary operator, it takes
 673 care of all the parsing and AST generation.  We just need to implement codegen
 674 for the assignment operator.  This looks like:</p>
 675
 676 <div class="doc_code">
 677 <pre>
 678 Value *BinaryExprAST::Codegen() {
 679   // Special case '=' because we don't want to emit the LHS as an expression.
 680   if (Op == '=') {
 681     // Assignment requires the LHS to be an identifier.
 682     VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
 683     if (!LHSE)
 684       return ErrorV("destination of '=' must be a variable");
 685 </pre>
 686 </div>
 687
 688 <p>Unlike the rest of the binary operators, our assignment operator doesn't
 689 follow the "emit LHS, emit RHS, do computation" model.  As such, it is handled
 690 as a special case before the other binary operators are handled.  The other
 691 strange thing about it is that it requires the LHS to be a variable directly.
 692 </p>
 693
 694 <div class="doc_code">
 695 <pre>
 696     // Codegen the RHS.
 697     Value *Val = RHS-&gt;Codegen();
 698     if (Val == 0) return 0;
 699
 700     // Look up the name.
 701     Value *Variable = NamedValues[LHSE-&gt;getName()];
 702     if (Variable == 0) return ErrorV("Unknown variable name");
 703
 704     Builder.CreateStore(Val, Variable);
 705     return Val;
 706   }
 707   ...
 708 </pre>
 709 </div>
 710
 711 <p>Once it has the variable, codegen'ing the assignment is straight-forward:
 712 we emit the RHS of the assignment, create a store, and return the computed
 713 value.  Returning a value allows for chained assignments like "X = (Y = Z)".</p>
 714
 715 <p>Now that we have an assignment operator, we can mutate loop variables and
 716 arguments.  For example, we can now run code like this:</p>
 717
 718 <div class="doc_code">
 719 <pre>
 720 # Function to print a double.
 721 extern printd(x);
 722
 723 # Define ':' for sequencing: as a low-precedence operator that ignores operands
 724 # and just returns the RHS.
 725 def binary : 1 (x y) y;
 726
 727 def test(x)
 728   printd(x) :
 729   x = 4 :
 730   printd(x);
 731
 732 test(123);
 733 </pre>
 734 </div>
 735
 736 <p>When run, this example prints "123" and then "4", showing that we did
 737 actually mutate the value!  Okay, we have now officially implemented our goal:
 738 getting this to work requires SSA construction in the general case.  However,
 739 to be really useful, we want the ability to define our own local variables, lets
 740 add this next!
 741 </p>
 742
 743 </div>
 744
 745 <!-- *********************************************************************** -->
 746 <div class="doc_section"><a name="localvars">User-defined Local
 747 Variables</a></div>
 748 <!-- *********************************************************************** -->
 749
 750 <div class="doc_text">
 751
 752 <p>Adding var/in is just like any other other extensions we made to
 753 Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
 754 The first step for adding our new 'var/in' construct is to extend the lexer.
 755 As before, this is pretty trivial, the code looks like this:</p>
 756
 757 <div class="doc_code">
 758 <pre>
 759 enum Token {
 760   ...
 761   <b>// var definition
 762   tok_var = -13</b>
 763 ...
 764 }
 765 ...
 766 static int gettok() {
 767 ...
 768     if (IdentifierStr == "in") return tok_in;
 769     if (IdentifierStr == "binary") return tok_binary;
 770     if (IdentifierStr == "unary") return tok_unary;
 771     <b>if (IdentifierStr == "var") return tok_var;</b>
 772     return tok_identifier;
 773 ...
 774 </pre>
 775 </div>
 776
 777 <p>The next step is to define the AST node that we will construct.  For var/in,
 778 it will look like this:</p>
 779
 780 <div class="doc_code">
 781 <pre>
 782 /// VarExprAST - Expression class for var/in
 783 class VarExprAST : public ExprAST {
 784   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
 785   ExprAST *Body;
 786 public:
 787   VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
 788              ExprAST *body)
 789   : VarNames(varnames), Body(body) {}
 790
 791   virtual Value *Codegen();
 792 };
 793 </pre>
 794 </div>
 795
 796 <p>var/in allows a list of names to be defined all at once, and each name can
 797 optionally have an initializer value.  As such, we capture this information in
 798 the VarNames vector.  Also, var/in has a body, this body is allowed to access
 799 the variables defined by the let/in.</p>
 800
 801 <p>With this ready, we can define the parser pieces.  First thing we do is add
 802 it as a primary expression:</p>
 803
 804 <div class="doc_code">
 805 <pre>
 806 /// primary
 807 ///   ::= identifierexpr
 808 ///   ::= numberexpr
 809 ///   ::= parenexpr
 810 ///   ::= ifexpr
 811 ///   ::= forexpr
 812 <b>///   ::= varexpr</b>
 813 static ExprAST *ParsePrimary() {
 814   switch (CurTok) {
 815   default: return Error("unknown token when expecting an expression");
 816   case tok_identifier: return ParseIdentifierExpr();
 817   case tok_number:     return ParseNumberExpr();
 818   case '(':            return ParseParenExpr();
 819   case tok_if:         return ParseIfExpr();
 820   case tok_for:        return ParseForExpr();
 821   <b>case tok_var:        return ParseVarExpr();</b>
 822   }
 823 }
 824 </pre>
 825 </div>
 826
 827 <p>Next we define ParseVarExpr:</p>
 828
 829 <div class="doc_code">
 830 <pre>
 831 /// varexpr ::= 'var' identifier ('=' expression)?
 832 //                    (',' identifier ('=' expression)?)* 'in' expression
 833 static ExprAST *ParseVarExpr() {
 834   getNextToken();  // eat the var.
 835
 836   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
 837
 838   // At least one variable name is required.
 839   if (CurTok != tok_identifier)
 840     return Error("expected identifier after var");
 841 </pre>
 842 </div>
 843
 844 <p>The first part of this code parses the list of identifier/expr pairs into the
 845 local <tt>VarNames</tt> vector.
 846
 847 <div class="doc_code">
 848 <pre>
 849   while (1) {
 850     std::string Name = IdentifierStr;
 851     getNextToken();  // eat identifier.
 852
 853     // Read the optional initializer.
 854     ExprAST *Init = 0;
 855     if (CurTok == '=') {
 856       getNextToken(); // eat the '='.
 857
 858       Init = ParseExpression();
 859       if (Init == 0) return 0;
 860     }
 861
 862     VarNames.push_back(std::make_pair(Name, Init));
 863
 864     // End of var list, exit loop.
 865     if (CurTok != ',') break;
 866     getNextToken(); // eat the ','.
 867
 868     if (CurTok != tok_identifier)
 869       return Error("expected identifier list after var");
 870   }
 871 </pre>
 872 </div>
 873
 874 <p>Once all the variables are parsed, we then parse the body and create the
 875 AST node:</p>
 876
 877 <div class="doc_code">
 878 <pre>
 879   // At this point, we have to have 'in'.
 880   if (CurTok != tok_in)
 881     return Error("expected 'in' keyword after 'var'");
 882   getNextToken();  // eat 'in'.
 883
 884   ExprAST *Body = ParseExpression();
 885   if (Body == 0) return 0;
 886
 887   return new VarExprAST(VarNames, Body);
 888 }
 889 </pre>
 890 </div>
 891
 892 <p>Now that we can parse and represent the code, we need to support emission of
 893 LLVM IR for it.  This code starts out with:</p>
 894
 895 <div class="doc_code">
 896 <pre>
 897 Value *VarExprAST::Codegen() {
 898   std::vector&lt;AllocaInst *&gt; OldBindings;
 899
 900   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
 901
 902   // Register all variables and emit their initializer.
 903   for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
 904     const std::string &amp;VarName = VarNames[i].first;
 905     ExprAST *Init = VarNames[i].second;
 906 </pre>
 907 </div>
 908
 909 <p>Basically it loops over all the variables, installing them one at a time.
 910 For each variable we put into the symbol table, we remember the previous value
 911 that we replace in OldBindings.</p>
 912
 913 <div class="doc_code">
 914 <pre>
 915     // Emit the initializer before adding the variable to scope, this prevents
 916     // the initializer from referencing the variable itself, and permits stuff
 917     // like this:
 918     //  var a = 1 in
 919     //    var a = a in ...   # refers to outer 'a'.
 920     Value *InitVal;
 921     if (Init) {
 922       InitVal = Init-&gt;Codegen();
 923       if (InitVal == 0) return 0;
 924     } else { // If not specified, use 0.0.
 925       InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
 926     }
 927
 928     AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
 929     Builder.CreateStore(InitVal, Alloca);
 930
 931     // Remember the old variable binding so that we can restore the binding when
 932     // we unrecurse.
 933     OldBindings.push_back(NamedValues[VarName]);
 934
 935     // Remember this binding.
 936     NamedValues[VarName] = Alloca;
 937   }
 938 </pre>
 939 </div>
 940
 941 <p>There are more comments here than code.  The basic idea is that we emit the
 942 initializer, create the alloca, then update the symbol table to point to it.
 943 Once all the variables are installed in the symbol table, we evaluate the body
 944 of the var/in expression:</p>
 945
 946 <div class="doc_code">
 947 <pre>
 948   // Codegen the body, now that all vars are in scope.
 949   Value *BodyVal = Body-&gt;Codegen();
 950   if (BodyVal == 0) return 0;
 951 </pre>
 952 </div>
 953
 954 <p>Finally, before returning, we restore the previous variable bindings:</p>
 955
 956 <div class="doc_code">
 957 <pre>
 958   // Pop all our variables from scope.
 959   for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
 960     NamedValues[VarNames[i].first] = OldBindings[i];
 961
 962   // Return the body computation.
 963   return BodyVal;
 964 }
 965 </pre>
 966 </div>
 967
 968 <p>The end result of all of this is that we get properly scoped variable
 969 definitions, and we even (trivially) allow mutation of them :).</p>
 970
 971 <p>With this, we completed what we set out to do.  Our nice iterative fib
 972 example from the intro compiles and runs just fine.  The mem2reg pass optimizes
 973 all of our stack variables into SSA registers, inserting PHI nodes where needed,
 974 and our front-end remains simple: no iterated dominator frontier computation
 975 anywhere in sight.</p>
 976
 977 </div>
 978
 979 <!-- *********************************************************************** -->
 980 <div class="doc_section"><a name="code">Full Code Listing</a></div>
 981 <!-- *********************************************************************** -->
 982
 983 <div class="doc_text">
 984
 985 <p>
 986 Here is the complete code listing for our running example, enhanced with mutable
 987 variables and var/in support.  To build this example, use:
 988 </p>
 989
 990 <div class="doc_code">
 991 <pre>
 992    # Compile
 993    g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
 994    # Run
 995    ./toy
 996 </pre>
 997 </div>
 998
 999 <p>Here is the code:</p>
1000
1001 <div class="doc_code">
1002 <pre>
1003 #include "llvm/DerivedTypes.h"
1004 #include "llvm/ExecutionEngine/ExecutionEngine.h"
1005 #include "llvm/Module.h"
1006 #include "llvm/ModuleProvider.h"
1007 #include "llvm/PassManager.h"
1008 #include "llvm/Analysis/Verifier.h"
1009 #include "llvm/Target/TargetData.h"
1010 #include "llvm/Transforms/Scalar.h"
1011 #include "llvm/Support/LLVMBuilder.h"
1012 #include &lt;cstdio&gt;
1013 #include &lt;string&gt;
1014 #include &lt;map&gt;
1015 #include &lt;vector&gt;
1016 using namespace llvm;
1017
1018 //===----------------------------------------------------------------------===//
1019 // Lexer
1020 //===----------------------------------------------------------------------===//
1021
1022 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
1023 // of these for known things.
1024 enum Token {
1025   tok_eof = -1,
1026
1027   // commands
1028   tok_def = -2, tok_extern = -3,
1029
1030   // primary
1031   tok_identifier = -4, tok_number = -5,
1032
1033   // control
1034   tok_if = -6, tok_then = -7, tok_else = -8,
1035   tok_for = -9, tok_in = -10,
1036
1037   // operators
1038   tok_binary = -11, tok_unary = -12,
1039
1040   // var definition
1041   tok_var = -13
1042 };
1043
1044 static std::string IdentifierStr;  // Filled in if tok_identifier
1045 static double NumVal;              // Filled in if tok_number
1046
1047 /// gettok - Return the next token from standard input.
1048 static int gettok() {
1049   static int LastChar = ' ';
1050
1051   // Skip any whitespace.
1052   while (isspace(LastChar))
1053     LastChar = getchar();
1054
1055   if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
1056     IdentifierStr = LastChar;
1057     while (isalnum((LastChar = getchar())))
1058       IdentifierStr += LastChar;
1059
1060     if (IdentifierStr == "def") return tok_def;
1061     if (IdentifierStr == "extern") return tok_extern;
1062     if (IdentifierStr == "if") return tok_if;
1063     if (IdentifierStr == "then") return tok_then;
1064     if (IdentifierStr == "else") return tok_else;
1065     if (IdentifierStr == "for") return tok_for;
1066     if (IdentifierStr == "in") return tok_in;
1067     if (IdentifierStr == "binary") return tok_binary;
1068     if (IdentifierStr == "unary") return tok_unary;
1069     if (IdentifierStr == "var") return tok_var;
1070     return tok_identifier;
1071   }
1072
1073   if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
1074     std::string NumStr;
1075     do {
1076       NumStr += LastChar;
1077       LastChar = getchar();
1078     } while (isdigit(LastChar) || LastChar == '.');
1079
1080     NumVal = strtod(NumStr.c_str(), 0);
1081     return tok_number;
1082   }
1083
1084   if (LastChar == '#') {
1085     // Comment until end of line.
1086     do LastChar = getchar();
1087     while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
1088
1089     if (LastChar != EOF)
1090       return gettok();
1091   }
1092
1093   // Check for end of file.  Don't eat the EOF.
1094   if (LastChar == EOF)
1095     return tok_eof;
1096
1097   // Otherwise, just return the character as its ascii value.
1098   int ThisChar = LastChar;
1099   LastChar = getchar();
1100   return ThisChar;
1101 }
1102
1103 //===----------------------------------------------------------------------===//
1104 // Abstract Syntax Tree (aka Parse Tree)
1105 //===----------------------------------------------------------------------===//
1106
1107 /// ExprAST - Base class for all expression nodes.
1108 class ExprAST {
1109 public:
1110   virtual ~ExprAST() {}
1111   virtual Value *Codegen() = 0;
1112 };
1113
1114 /// NumberExprAST - Expression class for numeric literals like "1.0".
1115 class NumberExprAST : public ExprAST {
1116   double Val;
1117 public:
1118   NumberExprAST(double val) : Val(val) {}
1119   virtual Value *Codegen();
1120 };
1121
1122 /// VariableExprAST - Expression class for referencing a variable, like "a".
1123 class VariableExprAST : public ExprAST {
1124   std::string Name;
1125 public:
1126   VariableExprAST(const std::string &amp;name) : Name(name) {}
1127   const std::string &amp;getName() const { return Name; }
1128   virtual Value *Codegen();
1129 };
1130
1131 /// UnaryExprAST - Expression class for a unary operator.
1132 class UnaryExprAST : public ExprAST {
1133   char Opcode;
1134   ExprAST *Operand;
1135 public:
1136   UnaryExprAST(char opcode, ExprAST *operand)
1137     : Opcode(opcode), Operand(operand) {}
1138   virtual Value *Codegen();
1139 };
1140
1141 /// BinaryExprAST - Expression class for a binary operator.
1142 class BinaryExprAST : public ExprAST {
1143   char Op;
1144   ExprAST *LHS, *RHS;
1145 public:
1146   BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
1147     : Op(op), LHS(lhs), RHS(rhs) {}
1148   virtual Value *Codegen();
1149 };
1150
1151 /// CallExprAST - Expression class for function calls.
1152 class CallExprAST : public ExprAST {
1153   std::string Callee;
1154   std::vector&lt;ExprAST*&gt; Args;
1155 public:
1156   CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1157     : Callee(callee), Args(args) {}
1158   virtual Value *Codegen();
1159 };
1160
1161 /// IfExprAST - Expression class for if/then/else.
1162 class IfExprAST : public ExprAST {
1163   ExprAST *Cond, *Then, *Else;
1164 public:
1165   IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1166   : Cond(cond), Then(then), Else(_else) {}
1167   virtual Value *Codegen();
1168 };
1169
1170 /// ForExprAST - Expression class for for/in.
1171 class ForExprAST : public ExprAST {
1172   std::string VarName;
1173   ExprAST *Start, *End, *Step, *Body;
1174 public:
1175   ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1176              ExprAST *step, ExprAST *body)
1177     : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1178   virtual Value *Codegen();
1179 };
1180
1181 /// VarExprAST - Expression class for var/in
1182 class VarExprAST : public ExprAST {
1183   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1184   ExprAST *Body;
1185 public:
1186   VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
1187              ExprAST *body)
1188   : VarNames(varnames), Body(body) {}
1189
1190   virtual Value *Codegen();
1191 };
1192
1193 /// PrototypeAST - This class represents the "prototype" for a function,
1194 /// which captures its argument names as well as if it is an operator.
1195 class PrototypeAST {
1196   std::string Name;
1197   std::vector&lt;std::string&gt; Args;
1198   bool isOperator;
1199   unsigned Precedence;  // Precedence if a binary op.
1200 public:
1201   PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1202                bool isoperator = false, unsigned prec = 0)
1203   : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1204
1205   bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1206   bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1207
1208   char getOperatorName() const {
1209     assert(isUnaryOp() || isBinaryOp());
1210     return Name[Name.size()-1];
1211   }
1212
1213   unsigned getBinaryPrecedence() const { return Precedence; }
1214
1215   Function *Codegen();
1216
1217   void CreateArgumentAllocas(Function *F);
1218 };
1219
1220 /// FunctionAST - This class represents a function definition itself.
1221 class FunctionAST {
1222   PrototypeAST *Proto;
1223   ExprAST *Body;
1224 public:
1225   FunctionAST(PrototypeAST *proto, ExprAST *body)
1226     : Proto(proto), Body(body) {}
1227
1228   Function *Codegen();
1229 };
1230
1231 //===----------------------------------------------------------------------===//
1232 // Parser
1233 //===----------------------------------------------------------------------===//
1234
1235 /// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
1236 /// token the parser it looking at.  getNextToken reads another token from the
1237 /// lexer and updates CurTok with its results.
1238 static int CurTok;
1239 static int getNextToken() {
1240   return CurTok = gettok();
1241 }
1242
1243 /// BinopPrecedence - This holds the precedence for each binary operator that is
1244 /// defined.
1245 static std::map&lt;char, int&gt; BinopPrecedence;
1246
1247 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
1248 static int GetTokPrecedence() {
1249   if (!isascii(CurTok))
1250     return -1;
1251
1252   // Make sure it's a declared binop.
1253   int TokPrec = BinopPrecedence[CurTok];
1254   if (TokPrec &lt;= 0) return -1;
1255   return TokPrec;
1256 }
1257
1258 /// Error* - These are little helper functions for error handling.
1259 ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1260 PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1261 FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1262
1263 static ExprAST *ParseExpression();
1264
1265 /// identifierexpr
1266 ///   ::= identifier
1267 ///   ::= identifier '(' expression* ')'
1268 static ExprAST *ParseIdentifierExpr() {
1269   std::string IdName = IdentifierStr;
1270
1271   getNextToken();  // eat identifier.
1272
1273   if (CurTok != '(') // Simple variable ref.
1274     return new VariableExprAST(IdName);
1275
1276   // Call.
1277   getNextToken();  // eat (
1278   std::vector&lt;ExprAST*&gt; Args;
1279   if (CurTok != ')') {
1280     while (1) {
1281       ExprAST *Arg = ParseExpression();
1282       if (!Arg) return 0;
1283       Args.push_back(Arg);
1284
1285       if (CurTok == ')') break;
1286
1287       if (CurTok != ',')
1288         return Error("Expected ')'");
1289       getNextToken();
1290     }
1291   }
1292
1293   // Eat the ')'.
1294   getNextToken();
1295
1296   return new CallExprAST(IdName, Args);
1297 }
1298
1299 /// numberexpr ::= number
1300 static ExprAST *ParseNumberExpr() {
1301   ExprAST *Result = new NumberExprAST(NumVal);
1302   getNextToken(); // consume the number
1303   return Result;
1304 }
1305
1306 /// parenexpr ::= '(' expression ')'
1307 static ExprAST *ParseParenExpr() {
1308   getNextToken();  // eat (.
1309   ExprAST *V = ParseExpression();
1310   if (!V) return 0;
1311
1312   if (CurTok != ')')
1313     return Error("expected ')'");
1314   getNextToken();  // eat ).
1315   return V;
1316 }
1317
1318 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
1319 static ExprAST *ParseIfExpr() {
1320   getNextToken();  // eat the if.
1321
1322   // condition.
1323   ExprAST *Cond = ParseExpression();
1324   if (!Cond) return 0;
1325
1326   if (CurTok != tok_then)
1327     return Error("expected then");
1328   getNextToken();  // eat the then
1329
1330   ExprAST *Then = ParseExpression();
1331   if (Then == 0) return 0;
1332
1333   if (CurTok != tok_else)
1334     return Error("expected else");
1335
1336   getNextToken();
1337
1338   ExprAST *Else = ParseExpression();
1339   if (!Else) return 0;
1340
1341   return new IfExprAST(Cond, Then, Else);
1342 }
1343
1344 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1345 static ExprAST *ParseForExpr() {
1346   getNextToken();  // eat the for.
1347
1348   if (CurTok != tok_identifier)
1349     return Error("expected identifier after for");
1350
1351   std::string IdName = IdentifierStr;
1352   getNextToken();  // eat identifier.
1353
1354   if (CurTok != '=')
1355     return Error("expected '=' after for");
1356   getNextToken();  // eat '='.
1357
1358
1359   ExprAST *Start = ParseExpression();
1360   if (Start == 0) return 0;
1361   if (CurTok != ',')
1362     return Error("expected ',' after for start value");
1363   getNextToken();
1364
1365   ExprAST *End = ParseExpression();
1366   if (End == 0) return 0;
1367
1368   // The step value is optional.
1369   ExprAST *Step = 0;
1370   if (CurTok == ',') {
1371     getNextToken();
1372     Step = ParseExpression();
1373     if (Step == 0) return 0;
1374   }
1375
1376   if (CurTok != tok_in)
1377     return Error("expected 'in' after for");
1378   getNextToken();  // eat 'in'.
1379
1380   ExprAST *Body = ParseExpression();
1381   if (Body == 0) return 0;
1382
1383   return new ForExprAST(IdName, Start, End, Step, Body);
1384 }
1385
1386 /// varexpr ::= 'var' identifier ('=' expression)?
1387 //                    (',' identifier ('=' expression)?)* 'in' expression
1388 static ExprAST *ParseVarExpr() {
1389   getNextToken();  // eat the var.
1390
1391   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1392
1393   // At least one variable name is required.
1394   if (CurTok != tok_identifier)
1395     return Error("expected identifier after var");
1396
1397   while (1) {
1398     std::string Name = IdentifierStr;
1399     getNextToken();  // eat identifier.
1400
1401     // Read the optional initializer.
1402     ExprAST *Init = 0;
1403     if (CurTok == '=') {
1404       getNextToken(); // eat the '='.
1405
1406       Init = ParseExpression();
1407       if (Init == 0) return 0;
1408     }
1409
1410     VarNames.push_back(std::make_pair(Name, Init));
1411
1412     // End of var list, exit loop.
1413     if (CurTok != ',') break;
1414     getNextToken(); // eat the ','.
1415
1416     if (CurTok != tok_identifier)
1417       return Error("expected identifier list after var");
1418   }
1419
1420   // At this point, we have to have 'in'.
1421   if (CurTok != tok_in)
1422     return Error("expected 'in' keyword after 'var'");
1423   getNextToken();  // eat 'in'.
1424
1425   ExprAST *Body = ParseExpression();
1426   if (Body == 0) return 0;
1427
1428   return new VarExprAST(VarNames, Body);
1429 }
1430
1431
1432 /// primary
1433 ///   ::= identifierexpr
1434 ///   ::= numberexpr
1435 ///   ::= parenexpr
1436 ///   ::= ifexpr
1437 ///   ::= forexpr
1438 ///   ::= varexpr
1439 static ExprAST *ParsePrimary() {
1440   switch (CurTok) {
1441   default: return Error("unknown token when expecting an expression");
1442   case tok_identifier: return ParseIdentifierExpr();
1443   case tok_number:     return ParseNumberExpr();
1444   case '(':            return ParseParenExpr();
1445   case tok_if:         return ParseIfExpr();
1446   case tok_for:        return ParseForExpr();
1447   case tok_var:        return ParseVarExpr();
1448   }
1449 }
1450
1451 /// unary
1452 ///   ::= primary
1453 ///   ::= '!' unary
1454 static ExprAST *ParseUnary() {
1455   // If the current token is not an operator, it must be a primary expr.
1456   if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1457     return ParsePrimary();
1458
1459   // If this is a unary operator, read it.
1460   int Opc = CurTok;
1461   getNextToken();
1462   if (ExprAST *Operand = ParseUnary())
1463     return new UnaryExprAST(Opc, Operand);
1464   return 0;
1465 }
1466
1467 /// binoprhs
1468 ///   ::= ('+' unary)*
1469 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1470   // If this is a binop, find its precedence.
1471   while (1) {
1472     int TokPrec = GetTokPrecedence();
1473
1474     // If this is a binop that binds at least as tightly as the current binop,
1475     // consume it, otherwise we are done.
1476     if (TokPrec &lt; ExprPrec)
1477       return LHS;
1478
1479     // Okay, we know this is a binop.
1480     int BinOp = CurTok;
1481     getNextToken();  // eat binop
1482
1483     // Parse the unary expression after the binary operator.
1484     ExprAST *RHS = ParseUnary();
1485     if (!RHS) return 0;
1486
1487     // If BinOp binds less tightly with RHS than the operator after RHS, let
1488     // the pending operator take RHS as its LHS.
1489     int NextPrec = GetTokPrecedence();
1490     if (TokPrec &lt; NextPrec) {
1491       RHS = ParseBinOpRHS(TokPrec+1, RHS);
1492       if (RHS == 0) return 0;
1493     }
1494
1495     // Merge LHS/RHS.
1496     LHS = new BinaryExprAST(BinOp, LHS, RHS);
1497   }
1498 }
1499
1500 /// expression
1501 ///   ::= unary binoprhs
1502 ///
1503 static ExprAST *ParseExpression() {
1504   ExprAST *LHS = ParseUnary();
1505   if (!LHS) return 0;
1506
1507   return ParseBinOpRHS(0, LHS);
1508 }
1509
1510 /// prototype
1511 ///   ::= id '(' id* ')'
1512 ///   ::= binary LETTER number? (id, id)
1513 ///   ::= unary LETTER (id)
1514 static PrototypeAST *ParsePrototype() {
1515   std::string FnName;
1516
1517   int Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
1518   unsigned BinaryPrecedence = 30;
1519
1520   switch (CurTok) {
1521   default:
1522     return ErrorP("Expected function name in prototype");
1523   case tok_identifier:
1524     FnName = IdentifierStr;
1525     Kind = 0;
1526     getNextToken();
1527     break;
1528   case tok_unary:
1529     getNextToken();
1530     if (!isascii(CurTok))
1531       return ErrorP("Expected unary operator");
1532     FnName = "unary";
1533     FnName += (char)CurTok;
1534     Kind = 1;
1535     getNextToken();
1536     break;
1537   case tok_binary:
1538     getNextToken();
1539     if (!isascii(CurTok))
1540       return ErrorP("Expected binary operator");
1541     FnName = "binary";
1542     FnName += (char)CurTok;
1543     Kind = 2;
1544     getNextToken();
1545
1546     // Read the precedence if present.
1547     if (CurTok == tok_number) {
1548       if (NumVal &lt; 1 || NumVal &gt; 100)
1549         return ErrorP("Invalid precedecnce: must be 1..100");
1550       BinaryPrecedence = (unsigned)NumVal;
1551       getNextToken();
1552     }
1553     break;
1554   }
1555
1556   if (CurTok != '(')
1557     return ErrorP("Expected '(' in prototype");
1558
1559   std::vector&lt;std::string&gt; ArgNames;
1560   while (getNextToken() == tok_identifier)
1561     ArgNames.push_back(IdentifierStr);
1562   if (CurTok != ')')
1563     return ErrorP("Expected ')' in prototype");
1564
1565   // success.
1566   getNextToken();  // eat ')'.
1567
1568   // Verify right number of names for operator.
1569   if (Kind &amp;&amp; ArgNames.size() != Kind)
1570     return ErrorP("Invalid number of operands for operator");
1571
1572   return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1573 }
1574
1575 /// definition ::= 'def' prototype expression
1576 static FunctionAST *ParseDefinition() {
1577   getNextToken();  // eat def.
1578   PrototypeAST *Proto = ParsePrototype();
1579   if (Proto == 0) return 0;
1580
1581   if (ExprAST *E = ParseExpression())
1582     return new FunctionAST(Proto, E);
1583   return 0;
1584 }
1585
1586 /// toplevelexpr ::= expression
1587 static FunctionAST *ParseTopLevelExpr() {
1588   if (ExprAST *E = ParseExpression()) {
1589     // Make an anonymous proto.
1590     PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1591     return new FunctionAST(Proto, E);
1592   }
1593   return 0;
1594 }
1595
1596 /// external ::= 'extern' prototype
1597 static PrototypeAST *ParseExtern() {
1598   getNextToken();  // eat extern.
1599   return ParsePrototype();
1600 }
1601
1602 //===----------------------------------------------------------------------===//
1603 // Code Generation
1604 //===----------------------------------------------------------------------===//
1605
1606 static Module *TheModule;
1607 static LLVMFoldingBuilder Builder;
1608 static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
1609 static FunctionPassManager *TheFPM;
1610
1611 Value *ErrorV(const char *Str) { Error(Str); return 0; }
1612
1613 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
1614 /// the function.  This is used for mutable variables etc.
1615 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
1616                                           const std::string &amp;VarName) {
1617   LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
1618                    TheFunction-&gt;getEntryBlock().begin());
1619   return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
1620 }
1621
1622
1623 Value *NumberExprAST::Codegen() {
1624   return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1625 }
1626
1627 Value *VariableExprAST::Codegen() {
1628   // Look this variable up in the function.
1629   Value *V = NamedValues[Name];
1630   if (V == 0) return ErrorV("Unknown variable name");
1631
1632   // Load the value.
1633   return Builder.CreateLoad(V, Name.c_str());
1634 }
1635
1636 Value *UnaryExprAST::Codegen() {
1637   Value *OperandV = Operand-&gt;Codegen();
1638   if (OperandV == 0) return 0;
1639
1640   Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1641   if (F == 0)
1642     return ErrorV("Unknown unary operator");
1643
1644   return Builder.CreateCall(F, OperandV, "unop");
1645 }
1646
1647
1648 Value *BinaryExprAST::Codegen() {
1649   // Special case '=' because we don't want to emit the LHS as an expression.
1650   if (Op == '=') {
1651     // Assignment requires the LHS to be an identifier.
1652     VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
1653     if (!LHSE)
1654       return ErrorV("destination of '=' must be a variable");
1655     // Codegen the RHS.
1656     Value *Val = RHS-&gt;Codegen();
1657     if (Val == 0) return 0;
1658
1659     // Look up the name.
1660     Value *Variable = NamedValues[LHSE-&gt;getName()];
1661     if (Variable == 0) return ErrorV("Unknown variable name");
1662
1663     Builder.CreateStore(Val, Variable);
1664     return Val;
1665   }
1666
1667
1668   Value *L = LHS-&gt;Codegen();
1669   Value *R = RHS-&gt;Codegen();
1670   if (L == 0 || R == 0) return 0;
1671
1672   switch (Op) {
1673   case '+': return Builder.CreateAdd(L, R, "addtmp");
1674   case '-': return Builder.CreateSub(L, R, "subtmp");
1675   case '*': return Builder.CreateMul(L, R, "multmp");
1676   case '&lt;':
1677     L = Builder.CreateFCmpULT(L, R, "multmp");
1678     // Convert bool 0/1 to double 0.0 or 1.0
1679     return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1680   default: break;
1681   }
1682
1683   // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1684   // a call to it.
1685   Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1686   assert(F &amp;&amp; "binary operator not found!");
1687
1688   Value *Ops[] = { L, R };
1689   return Builder.CreateCall(F, Ops, Ops+2, "binop");
1690 }
1691
1692 Value *CallExprAST::Codegen() {
1693   // Look up the name in the global module table.
1694   Function *CalleeF = TheModule-&gt;getFunction(Callee);
1695   if (CalleeF == 0)
1696     return ErrorV("Unknown function referenced");
1697
1698   // If argument mismatch error.
1699   if (CalleeF-&gt;arg_size() != Args.size())
1700     return ErrorV("Incorrect # arguments passed");
1701
1702   std::vector&lt;Value*&gt; ArgsV;
1703   for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1704     ArgsV.push_back(Args[i]-&gt;Codegen());
1705     if (ArgsV.back() == 0) return 0;
1706   }
1707
1708   return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1709 }
1710
1711 Value *IfExprAST::Codegen() {
1712   Value *CondV = Cond-&gt;Codegen();
1713   if (CondV == 0) return 0;
1714
1715   // Convert condition to a bool by comparing equal to 0.0.
1716   CondV = Builder.CreateFCmpONE(CondV,
1717                                 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1718                                 "ifcond");
1719
1720   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1721
1722   // Create blocks for the then and else cases.  Insert the 'then' block at the
1723   // end of the function.
1724   BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1725   BasicBlock *ElseBB = new BasicBlock("else");
1726   BasicBlock *MergeBB = new BasicBlock("ifcont");
1727
1728   Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1729
1730   // Emit then value.
1731   Builder.SetInsertPoint(ThenBB);
1732
1733   Value *ThenV = Then-&gt;Codegen();
1734   if (ThenV == 0) return 0;
1735
1736   Builder.CreateBr(MergeBB);
1737   // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1738   ThenBB = Builder.GetInsertBlock();
1739
1740   // Emit else block.
1741   TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1742   Builder.SetInsertPoint(ElseBB);
1743
1744   Value *ElseV = Else-&gt;Codegen();
1745   if (ElseV == 0) return 0;
1746
1747   Builder.CreateBr(MergeBB);
1748   // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1749   ElseBB = Builder.GetInsertBlock();
1750
1751   // Emit merge block.
1752   TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1753   Builder.SetInsertPoint(MergeBB);
1754   PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1755
1756   PN-&gt;addIncoming(ThenV, ThenBB);
1757   PN-&gt;addIncoming(ElseV, ElseBB);
1758   return PN;
1759 }
1760
1761 Value *ForExprAST::Codegen() {
1762   // Output this as:
1763   //   var = alloca double
1764   //   ...
1765   //   start = startexpr
1766   //   store start -&gt; var
1767   //   goto loop
1768   // loop:
1769   //   ...
1770   //   bodyexpr
1771   //   ...
1772   // loopend:
1773   //   step = stepexpr
1774   //   endcond = endexpr
1775   //
1776   //   curvar = load var
1777   //   nextvar = curvar + step
1778   //   store nextvar -&gt; var
1779   //   br endcond, loop, endloop
1780   // outloop:
1781
1782   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1783
1784   // Create an alloca for the variable in the entry block.
1785   AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1786
1787   // Emit the start code first, without 'variable' in scope.
1788   Value *StartVal = Start-&gt;Codegen();
1789   if (StartVal == 0) return 0;
1790
1791   // Store the value into the alloca.
1792   Builder.CreateStore(StartVal, Alloca);
1793
1794   // Make the new basic block for the loop header, inserting after current
1795   // block.
1796   BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1797   BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1798
1799   // Insert an explicit fall through from the current block to the LoopBB.
1800   Builder.CreateBr(LoopBB);
1801
1802   // Start insertion in LoopBB.
1803   Builder.SetInsertPoint(LoopBB);
1804
1805   // Within the loop, the variable is defined equal to the PHI node.  If it
1806   // shadows an existing variable, we have to restore it, so save it now.
1807   AllocaInst *OldVal = NamedValues[VarName];
1808   NamedValues[VarName] = Alloca;
1809
1810   // Emit the body of the loop.  This, like any other expr, can change the
1811   // current BB.  Note that we ignore the value computed by the body, but don't
1812   // allow an error.
1813   if (Body-&gt;Codegen() == 0)
1814     return 0;
1815
1816   // Emit the step value.
1817   Value *StepVal;
1818   if (Step) {
1819     StepVal = Step-&gt;Codegen();
1820     if (StepVal == 0) return 0;
1821   } else {
1822     // If not specified, use 1.0.
1823     StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1824   }
1825
1826   // Compute the end condition.
1827   Value *EndCond = End-&gt;Codegen();
1828   if (EndCond == 0) return EndCond;
1829
1830   // Reload, increment, and restore the alloca.  This handles the case where
1831   // the body of the loop mutates the variable.
1832   Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
1833   Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
1834   Builder.CreateStore(NextVar, Alloca);
1835
1836   // Convert condition to a bool by comparing equal to 0.0.
1837   EndCond = Builder.CreateFCmpONE(EndCond,
1838                                   ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1839                                   "loopcond");
1840
1841   // Create the "after loop" block and insert it.
1842   BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1843   BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1844
1845   // Insert the conditional branch into the end of LoopEndBB.
1846   Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1847
1848   // Any new code will be inserted in AfterBB.
1849   Builder.SetInsertPoint(AfterBB);
1850
1851   // Restore the unshadowed variable.
1852   if (OldVal)
1853     NamedValues[VarName] = OldVal;
1854   else
1855     NamedValues.erase(VarName);
1856
1857
1858   // for expr always returns 0.0.
1859   return Constant::getNullValue(Type::DoubleTy);
1860 }
1861
1862 Value *VarExprAST::Codegen() {
1863   std::vector&lt;AllocaInst *&gt; OldBindings;
1864
1865   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1866
1867   // Register all variables and emit their initializer.
1868   for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
1869     const std::string &amp;VarName = VarNames[i].first;
1870     ExprAST *Init = VarNames[i].second;
1871
1872     // Emit the initializer before adding the variable to scope, this prevents
1873     // the initializer from referencing the variable itself, and permits stuff
1874     // like this:
1875     //  var a = 1 in
1876     //    var a = a in ...   # refers to outer 'a'.
1877     Value *InitVal;
1878     if (Init) {
1879       InitVal = Init-&gt;Codegen();
1880       if (InitVal == 0) return 0;
1881     } else { // If not specified, use 0.0.
1882       InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
1883     }
1884
1885     AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1886     Builder.CreateStore(InitVal, Alloca);
1887
1888     // Remember the old variable binding so that we can restore the binding when
1889     // we unrecurse.
1890     OldBindings.push_back(NamedValues[VarName]);
1891
1892     // Remember this binding.
1893     NamedValues[VarName] = Alloca;
1894   }
1895
1896   // Codegen the body, now that all vars are in scope.
1897   Value *BodyVal = Body-&gt;Codegen();
1898   if (BodyVal == 0) return 0;
1899
1900   // Pop all our variables from scope.
1901   for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
1902     NamedValues[VarNames[i].first] = OldBindings[i];
1903
1904   // Return the body computation.
1905   return BodyVal;
1906 }
1907
1908
1909 Function *PrototypeAST::Codegen() {
1910   // Make the function type:  double(double,double) etc.
1911   std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1912   FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1913
1914   Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1915
1916   // If F conflicted, there was already something named 'Name'.  If it has a
1917   // body, don't allow redefinition or reextern.
1918   if (F-&gt;getName() != Name) {
1919     // Delete the one we just made and get the existing one.
1920     F-&gt;eraseFromParent();
1921     F = TheModule-&gt;getFunction(Name);
1922
1923     // If F already has a body, reject this.
1924     if (!F-&gt;empty()) {
1925       ErrorF("redefinition of function");
1926       return 0;
1927     }
1928
1929     // If F took a different number of args, reject.
1930     if (F-&gt;arg_size() != Args.size()) {
1931       ErrorF("redefinition of function with different # args");
1932       return 0;
1933     }
1934   }
1935
1936   // Set names for all arguments.
1937   unsigned Idx = 0;
1938   for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1939        ++AI, ++Idx)
1940     AI-&gt;setName(Args[Idx]);
1941
1942   return F;
1943 }
1944
1945 /// CreateArgumentAllocas - Create an alloca for each argument and register the
1946 /// argument in the symbol table so that references to it will succeed.
1947 void PrototypeAST::CreateArgumentAllocas(Function *F) {
1948   Function::arg_iterator AI = F-&gt;arg_begin();
1949   for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
1950     // Create an alloca for this variable.
1951     AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
1952
1953     // Store the initial value into the alloca.
1954     Builder.CreateStore(AI, Alloca);
1955
1956     // Add arguments to variable symbol table.
1957     NamedValues[Args[Idx]] = Alloca;
1958   }
1959 }
1960
1961
1962 Function *FunctionAST::Codegen() {
1963   NamedValues.clear();
1964
1965   Function *TheFunction = Proto-&gt;Codegen();
1966   if (TheFunction == 0)
1967     return 0;
1968
1969   // If this is an operator, install it.
1970   if (Proto-&gt;isBinaryOp())
1971     BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1972
1973   // Create a new basic block to start insertion into.
1974   BasicBlock *BB = new BasicBlock("entry", TheFunction);
1975   Builder.SetInsertPoint(BB);
1976
1977   // Add all arguments to the symbol table and create their allocas.
1978   Proto-&gt;CreateArgumentAllocas(TheFunction);
1979
1980   if (Value *RetVal = Body-&gt;Codegen()) {
1981     // Finish off the function.
1982     Builder.CreateRet(RetVal);
1983
1984     // Validate the generated code, checking for consistency.
1985     verifyFunction(*TheFunction);
1986
1987     // Optimize the function.
1988     TheFPM-&gt;run(*TheFunction);
1989
1990     return TheFunction;
1991   }
1992
1993   // Error reading body, remove function.
1994   TheFunction-&gt;eraseFromParent();
1995
1996   if (Proto-&gt;isBinaryOp())
1997     BinopPrecedence.erase(Proto-&gt;getOperatorName());
1998   return 0;
1999 }
2000
2001 //===----------------------------------------------------------------------===//
2002 // Top-Level parsing and JIT Driver
2003 //===----------------------------------------------------------------------===//
2004
2005 static ExecutionEngine *TheExecutionEngine;
2006
2007 static void HandleDefinition() {
2008   if (FunctionAST *F = ParseDefinition()) {
2009     if (Function *LF = F-&gt;Codegen()) {
2010       fprintf(stderr, "Read function definition:");
2011       LF-&gt;dump();
2012     }
2013   } else {
2014     // Skip token for error recovery.
2015     getNextToken();
2016   }
2017 }
2018
2019 static void HandleExtern() {
2020   if (PrototypeAST *P = ParseExtern()) {
2021     if (Function *F = P-&gt;Codegen()) {
2022       fprintf(stderr, "Read extern: ");
2023       F-&gt;dump();
2024     }
2025   } else {
2026     // Skip token for error recovery.
2027     getNextToken();
2028   }
2029 }
2030
2031 static void HandleTopLevelExpression() {
2032   // Evaluate a top level expression into an anonymous function.
2033   if (FunctionAST *F = ParseTopLevelExpr()) {
2034     if (Function *LF = F-&gt;Codegen()) {
2035       // JIT the function, returning a function pointer.
2036       void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
2037
2038       // Cast it to the right type (takes no arguments, returns a double) so we
2039       // can call it as a native function.
2040       double (*FP)() = (double (*)())FPtr;
2041       fprintf(stderr, "Evaluated to %f\n", FP());
2042     }
2043   } else {
2044     // Skip token for error recovery.
2045     getNextToken();
2046   }
2047 }
2048
2049 /// top ::= definition | external | expression | ';'
2050 static void MainLoop() {
2051   while (1) {
2052     fprintf(stderr, "ready&gt; ");
2053     switch (CurTok) {
2054     case tok_eof:    return;
2055     case ';':        getNextToken(); break;  // ignore top level semicolons.
2056     case tok_def:    HandleDefinition(); break;
2057     case tok_extern: HandleExtern(); break;
2058     default:         HandleTopLevelExpression(); break;
2059     }
2060   }
2061 }
2062
2063
2064
2065 //===----------------------------------------------------------------------===//
2066 // "Library" functions that can be "extern'd" from user code.
2067 //===----------------------------------------------------------------------===//
2068
2069 /// putchard - putchar that takes a double and returns 0.
2070 extern "C"
2071 double putchard(double X) {
2072   putchar((char)X);
2073   return 0;
2074 }
2075
2076 /// printd - printf that takes a double prints it as "%f\n", returning 0.
2077 extern "C"
2078 double printd(double X) {
2079   printf("%f\n", X);
2080   return 0;
2081 }
2082
2083 //===----------------------------------------------------------------------===//
2084 // Main driver code.
2085 //===----------------------------------------------------------------------===//
2086
2087 int main() {
2088   // Install standard binary operators.
2089   // 1 is lowest precedence.
2090   BinopPrecedence['='] = 2;
2091   BinopPrecedence['&lt;'] = 10;
2092   BinopPrecedence['+'] = 20;
2093   BinopPrecedence['-'] = 20;
2094   BinopPrecedence['*'] = 40;  // highest.
2095
2096   // Prime the first token.
2097   fprintf(stderr, "ready&gt; ");
2098   getNextToken();
2099
2100   // Make the module, which holds all the code.
2101   TheModule = new Module("my cool jit");
2102
2103   // Create the JIT.
2104   TheExecutionEngine = ExecutionEngine::create(TheModule);
2105
2106   {
2107     ExistingModuleProvider OurModuleProvider(TheModule);
2108     FunctionPassManager OurFPM(&amp;OurModuleProvider);
2109
2110     // Set up the optimizer pipeline.  Start with registering info about how the
2111     // target lays out data structures.
2112     OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
2113     // Promote allocas to registers.
2114     OurFPM.add(createPromoteMemoryToRegisterPass());
2115     // Do simple "peephole" optimizations and bit-twiddling optzns.
2116     OurFPM.add(createInstructionCombiningPass());
2117     // Reassociate expressions.
2118     OurFPM.add(createReassociatePass());
2119     // Eliminate Common SubExpressions.
2120     OurFPM.add(createGVNPass());
2121     // Simplify the control flow graph (deleting unreachable blocks, etc).
2122     OurFPM.add(createCFGSimplificationPass());
2123
2124     // Set the global so the code gen can use this.
2125     TheFPM = &amp;OurFPM;
2126
2127     // Run the main "interpreter loop" now.
2128     MainLoop();
2129
2130     TheFPM = 0;
2131   }  // Free module provider and pass manager.
2132
2133
2134   // Print out all of the generated code.
2135   TheModule-&gt;dump();
2136   return 0;
2137 }
2138 </pre>
2139 </div>
2140
2141 </div>
2142
2143 <!-- *********************************************************************** -->
2144 <hr>
2145 <address>
2146   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
2147   src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
2148   <a href="http://validator.w3.org/check/referer"><img
2149   src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
2150
2151   <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
2152   <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
2153   Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
2154 </address>
2155 </body>
2156 </html>