docs/tutorial/LangImpl7.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3
   4 <html>
   5 <head>
   6   <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
   7          construction</title>
   8   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   9   <meta name="author" content="Chris Lattner">
  10   <link rel="stylesheet" href="../llvm.css" type="text/css">
  11 </head>
  12
  13 <body>
  14
  15 <div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>
  16
  17 <ul>
  18 <li>Chapter 7
  19   <ol>
  20     <li><a href="#intro">Chapter 7 Introduction</a></li>
  21     <li><a href="#why">Why is this a hard problem?</a></li>
  22     <li><a href="#memory">Memory in LLVM</a></li>
  23     <li><a href="#kalvars">Mutable Variables in Kaleidoscope</a></li>
  24     <li><a href="#adjustments">Adjusting Existing Variables for
  25      Mutation</a></li>
  26     <li><a href="#assignment">New Assignment Operator</a></li>
  27     <li><a href="#localvars">User-defined Local Variables</a></li>
  28     <li><a href="#code">Full Code Listing</a></li>
  29   </ol>
  30 </li>
  31 </ul>
  32
  33 <div class="doc_author">
  34   <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
  35 </div>
  36
  37 <!-- *********************************************************************** -->
  38 <div class="doc_section"><a name="intro">Chapter 7 Introduction</a></div>
  39 <!-- *********************************************************************** -->
  40
  41 <div class="doc_text">
  42
  43 <p>Welcome to Chapter 7 of the "<a href="index.html">Implementing a language
  44 with LLVM</a>" tutorial.  In chapters 1 through 6, we've built a very
  45 respectable, albeit simple, <a
  46 href="http://en.wikipedia.org/wiki/Functional_programming">functional
  47 programming language</a>.  In our journey, we learned some parsing techniques,
  48 how to build and represent an AST, how to build LLVM IR, and how to optimize
  49 the resultant code and JIT compile it.</p>
  50
  51 <p>While Kaleidoscope is interesting as a functional language, the fact that it
  52 is functional makes it "too easy" to generate LLVM IR for it.  In particular, a functional language
  53 makes it very easy to build LLVM IR directly in <a
  54 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
  55 Since LLVM requires that the input code be in SSA form, this is a very nice
  56 property and it is often unclear to newcomers how to generate code for an
  57 imperative language with mutable variables.</p>
  58
  59 <p>The short (and happy) summary of this chapter is that there is no need for
  60 your front-end to build SSA form: LLVM provides highly tuned and well tested
  61 support for this, though the way it works is a bit unexpected for some.</p>
  62
  63 </div>
  64
  65 <!-- *********************************************************************** -->
  66 <div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
  67 <!-- *********************************************************************** -->
  68
  69 <div class="doc_text">
  70
  71 <p>
  72 To understand why mutable variables cause complexities in SSA construction,
  73 consider this extremely simple C example:
  74 </p>
  75
  76 <div class="doc_code">
  77 <pre>
  78 int G, H;
  79 int test(_Bool Condition) {
  80   int X;
  81   if (Condition)
  82     X = G;
  83   else
  84     X = H;
  85   return X;
  86 }
  87 </pre>
  88 </div>
  89
  90 <p>In this case, we have the variable "X", whose value depends on the path
  91 executed in the program.  Because there are two different possible values for X
  92 before the return instruction, a PHI node is inserted to merge the two values.
  93 The LLVM IR that we want for this example looks like this:</p>
  94
  95 <div class="doc_code">
  96 <pre>
  97 @G = weak global i32 0   ; type of @G is i32*
  98 @H = weak global i32 0   ; type of @H is i32*
  99
 100 define i32 @test(i1 %Condition) {
 101 entry:
 102         br i1 %Condition, label %cond_true, label %cond_false
 103
 104 cond_true:
 105         %X.0 = load i32* @G
 106         br label %cond_next
 107
 108 cond_false:
 109         %X.1 = load i32* @H
 110         br label %cond_next
 111
 112 cond_next:
 113         %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
 114         ret i32 %X.2
 115 }
 116 </pre>
 117 </div>
 118
 119 <p>In this example, the loads from the G and H global variables are explicit in
 120 the LLVM IR, and they live in the then/else branches of the if statement
 121 (cond_true/cond_false).  In order to merge the incoming values, the X.2 phi node
 122 in the cond_next block selects the right value to use based on where control
 123 flow is coming from: if control flow comes from the cond_false block, X.2 gets
 124 the value of X.1.  Alternatively, if control flow comes from cond_true, it gets
 125 the value of X.0.  The intent of this chapter is not to explain the details of
 126 SSA form.  For more information, see one of the many <a
 127 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
 128 references</a>.</p>
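<p>To make the merge concrete, the following C++ sketch models what the X.2 phi
node computes.  It is an illustration only, not LLVM code: the <tt>PhiNode</tt>
struct and the <tt>selectIncoming</tt> and <tt>test</tt> functions are
hypothetical names invented here, and the two global loads are modeled as plain
integer parameters.</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;
#include &lt;string&gt;
#include &lt;utility&gt;
#include &lt;vector&gt;

// Hypothetical model of a phi node: a list of (value, predecessor block) pairs.
struct PhiNode {
  std::vector&lt;std::pair&lt;int, std::string&gt; &gt; Incoming;
};

// Return the incoming value paired with the block that control flow came from.
int selectIncoming(const PhiNode &amp;Phi, const std::string &amp;PredBlock) {
  for (unsigned i = 0, e = Phi.Incoming.size(); i != e; ++i)
    if (Phi.Incoming[i].second == PredBlock)
      return Phi.Incoming[i].first;
  assert(false &amp;&amp; "phi node has no entry for this predecessor");
  return 0;
}

// Model of test(): G and H stand in for the values loaded from @G and @H.
int test(bool Condition, int G, int H) {
  PhiNode X2;
  X2.Incoming.push_back(std::make_pair(H, "cond_false"));  // %X.1
  X2.Incoming.push_back(std::make_pair(G, "cond_true"));   // %X.0
  return selectIncoming(X2, Condition ? "cond_true" : "cond_false");
}
</pre>
</div>

<p>Under this model, <tt>test(true, 10, 20)</tt> selects the cond_true entry and
yields 10, while <tt>test(false, 10, 20)</tt> yields 20.</p>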
 129
 130 <p>The question for this article is "who places phi nodes when lowering
 131 assignments to mutable variables?".  The issue here is that LLVM
 132 <em>requires</em> that its IR be in SSA form: there is no "non-SSA" mode for it.
 133 However, SSA construction requires non-trivial algorithms and data structures,
 134 so it is inconvenient and wasteful for every front-end to have to reproduce this
 135 logic.</p>
 136
 137 </div>
 138
 139 <!-- *********************************************************************** -->
 140 <div class="doc_section"><a name="memory">Memory in LLVM</a></div>
 141 <!-- *********************************************************************** -->
 142
 143 <div class="doc_text">
 144
 145 <p>The 'trick' here is that while LLVM does require all register values to be
 146 in SSA form, it does not require (or permit) memory objects to be in SSA form.
 147 In the example above, note that the loads from G and H are direct accesses to
 148 G and H: they are not renamed or versioned.  This differs from some other
 149 compiler systems, which do try to version memory objects.  In LLVM, instead of
 150 encoding dataflow analysis of memory into the LLVM IR, it is handled with <a
 151 href="../WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
 152 demand.</p>
 153
 154 <p>
 155 With this in mind, the high-level idea is that we want to make a stack variable
 156 (which lives in memory, because it is on the stack) for each mutable object in
 157 a function.  To take advantage of this trick, we need to talk about how LLVM
 158 represents stack variables.
 159 </p>
 160
 161 <p>In LLVM, all memory accesses are explicit with load/store instructions, and
 162 it is carefully designed to not have (or need) an "address-of" operator.  Notice
 163 how the type of the @G/@H global variables is actually "i32*" even though the
 164 variable is defined as "i32".  What this means is that @G defines <em>space</em>
 165 for an i32 in the global data area, but its <em>name</em> actually refers to the
 166 address for that space.  Stack variables work the same way, but instead of being
 167 declared with global variable definitions, they are declared with the
 168 <a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
 169
 170 <div class="doc_code">
 171 <pre>
 172 define i32 @test(i1 %Condition) {
 173 entry:
 174         %X = alloca i32           ; type of %X is i32*.
 175         ...
 176         %tmp = load i32* %X       ; load the stack value %X from the stack.
 177         %tmp2 = add i32 %tmp, 1   ; increment it
 178         store i32 %tmp2, i32* %X  ; store it back
 179         ...
 180 </pre>
 181 </div>
 182
 183 <p>This code shows an example of how you can declare and manipulate a stack
 184 variable in the LLVM IR.  Stack memory allocated with the alloca instruction is
 185 fully general: you can pass the address of the stack slot to functions, you can
 186 store it in other variables, etc.  In our example above, we could rewrite the
 187 example to use the alloca technique to avoid using a PHI node:</p>
 188
 189 <div class="doc_code">
 190 <pre>
 191 @G = weak global i32 0   ; type of @G is i32*
 192 @H = weak global i32 0   ; type of @H is i32*
 193
 194 define i32 @test(i1 %Condition) {
 195 entry:
 196         %X = alloca i32           ; type of %X is i32*.
 197         br i1 %Condition, label %cond_true, label %cond_false
 198
 199 cond_true:
 200         %X.0 = load i32* @G
 201         store i32 %X.0, i32* %X   ; Update X
 202         br label %cond_next
 203
 204 cond_false:
 205         %X.1 = load i32* @H
 206         store i32 %X.1, i32* %X   ; Update X
 207         br label %cond_next
 208
 209 cond_next:
 210         %X.2 = load i32* %X       ; Read X
 211         ret i32 %X.2
 212 }
 213 </pre>
 214 </div>
 215
 216 <p>With this, we have discovered a way to handle arbitrary mutable variables
 217 without the need to create Phi nodes at all:</p>
 218
 219 <ol>
 220 <li>Each mutable variable becomes a stack allocation.</li>
 221 <li>Each read of the variable becomes a load from the stack.</li>
 222 <li>Each update of the variable becomes a store to the stack.</li>
 223 <li>Taking the address of a variable just uses the stack address directly.</li>
 224 </ol>
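<p>The four rules above can be sketched as a handful of tiny helpers that print
the corresponding IR text.  This is only an illustration under stated
assumptions: the <tt>emit*</tt> function names are hypothetical, every variable
is assumed to be an i32, and the output uses the same textual IR syntax as the
examples in this chapter.  A real front-end would build instructions through
the LLVM API rather than concatenate strings.</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;
#include &lt;string&gt;

// Rule 1: each mutable variable becomes a stack allocation.
std::string emitDeclare(const std::string &amp;Var) {
  assert(!Var.empty() &amp;&amp; "variable name required");
  return "%" + Var + " = alloca i32";
}

// Rule 2: each read of the variable becomes a load from the stack.
std::string emitRead(const std::string &amp;Tmp, const std::string &amp;Var) {
  return "%" + Tmp + " = load i32* %" + Var;
}

// Rule 3: each update of the variable becomes a store to the stack.
std::string emitUpdate(const std::string &amp;Val, const std::string &amp;Var) {
  return "store i32 %" + Val + ", i32* %" + Var;
}

// Rule 4: taking the address just uses the alloca's own name.
std::string emitAddressOf(const std::string &amp;Var) {
  return "%" + Var;
}
</pre>
</div>

<p>For instance, <tt>emitRead("tmp", "X")</tt> produces
<tt>%tmp = load i32* %X</tt>, the same load that appears in the stack variable
example above.</p>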
 225
 226 <p>While this solution has solved our immediate problem, it introduced another
 227 one: we have now apparently introduced a lot of stack traffic for very simple
 228 and common operations, a major performance problem.  Fortunately for us, the
 229 LLVM optimizer has a highly tuned optimization pass named "mem2reg" that handles
 230 this case, promoting allocas like this into SSA registers, inserting Phi nodes
 231 as appropriate.  If you run this example through the pass, for example, you'll
 232 get:</p>
 233
 234 <div class="doc_code">
 235 <pre>
 236 $ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
 237 @G = weak global i32 0
 238 @H = weak global i32 0
 239
 240 define i32 @test(i1 %Condition) {
 241 entry:
 242         br i1 %Condition, label %cond_true, label %cond_false
 243
 244 cond_true:
 245         %X.0 = load i32* @G
 246         br label %cond_next
 247
 248 cond_false:
 249         %X.1 = load i32* @H
 250         br label %cond_next
 251
 252 cond_next:
 253         %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
 254         ret i32 %X.01
 255 }
 256 </pre>
 257 </div>
 258
 259 <p>The mem2reg pass implements the standard "iterated dominance frontier"
 260 algorithm for constructing SSA form and has a number of optimizations that speed
 261 up very common degenerate cases.  mem2reg really is the answer for dealing with
 262 mutable variables, and we highly recommend that you depend on it.  Note that
 263 mem2reg only works on variables in certain circumstances:</p>
 264
 265 <ol>
 266 <li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
 267 promotes them.  It does not apply to global variables or heap allocations.</li>
 268
 269 <li>mem2reg only looks for alloca instructions in the entry block of the
 270 function.  Being in the entry block guarantees that the alloca is only executed
 271 once, which makes analysis simpler.</li>
 272
 273 <li>mem2reg only promotes allocas whose uses are direct loads and stores.  If
 274 the address of the stack object is passed to a function, or if any funny pointer
 275 arithmetic is involved, the alloca will not be promoted.</li>
 276
 277 <li>mem2reg only works on allocas of <a
 278 href="../LangRef.html#t_classifications">first class</a>
 279 values (such as pointers, scalars and vectors), and only if the array size
 280 of the allocation is 1 (or missing in the .ll file).  mem2reg is not capable of
 281 promoting structs or arrays to registers.  Note that the "scalarrepl" pass is
 282 more powerful and can promote structs, "unions", and arrays in many cases.</li>
 283
 284 </ol>
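<p>As a rough summary, the conditions in this list can be written as a single
predicate.  The sketch below is not how mem2reg is implemented: the
<tt>AllocaFacts</tt> struct, its fields, and <tt>isPromotable</tt> are
hypothetical names that simply restate the list.  The first condition is
implicit, since the struct only ever describes an alloca.</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;

// Hypothetical summary of what would need to be known about one alloca.
struct AllocaFacts {
  bool InEntryBlock;        // Is the alloca in the function's entry block?
  bool OnlyLoadsAndStores;  // Are all of its uses direct loads and stores?
  bool IsFirstClassType;    // Pointer, scalar or vector (not a struct or array)?
  unsigned ArraySize;       // Number of elements allocated.
};

// True only if every condition from the list holds.
bool isPromotable(const AllocaFacts &amp;A) {
  return A.InEntryBlock &amp;&amp; A.OnlyLoadsAndStores &amp;&amp;
         A.IsFirstClassType &amp;&amp; A.ArraySize == 1;
}

// Usage: a scalar local such as %X qualifies until its address escapes.
void checkExamples() {
  AllocaFacts X = { true, true, true, 1 };
  assert(isPromotable(X));
  X.OnlyLoadsAndStores = false;  // e.g. the address is passed to a function
  assert(!isPromotable(X));
}
</pre>
</div>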
 285
 286 <p>
 287 All of these properties are easy to satisfy for most imperative languages, and
 288 we'll illustrate this below with Kaleidoscope.  The final question you may be
 289 asking is: should I bother with this nonsense for my front-end?  Wouldn't it be
 290 better if I just did SSA construction directly, avoiding use of the mem2reg
 291 optimization pass?  In short, we strongly recommend that you use this technique
 292 for building SSA form, unless there is an extremely good reason not to.  Using
 293 this technique is:</p>
 294
 295 <ul>
 296 <li>Proven and well tested: llvm-gcc and clang both use this technique for local
 297 mutable variables.  As such, the most common clients of LLVM are using this to
 298 handle the bulk of their variables.  You can be sure that bugs are found fast and
 299 fixed early.</li>
 300
 301 <li>Extremely Fast: mem2reg has a number of special cases that make it fast in
 302 common cases as well as fully general.  For example, it has fast-paths for
 303 variables that are only used in a single block, variables that only have one
 304 assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
 305 </li>
 306
 307 <li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
 308 Debug information in LLVM</a> relies on having the address of the variable
 309 exposed to attach debug info to it.  This technique dovetails very naturally
 310 with this style of debug info.</li>
 311 </ul>
 312
 313 <p>If nothing else, this makes it much easier to get your front-end up and
 314 running, and is very simple to implement.  Let's extend Kaleidoscope with mutable
 315 variables now!
 316 </p>
 317
 318 </div>
 319
 320 <!-- *********************************************************************** -->
 321 <div class="doc_section"><a name="kalvars">Mutable Variables in
 322 Kaleidoscope</a></div>
 323 <!-- *********************************************************************** -->
 324
 325 <div class="doc_text">
 326
 327 <p>Now that we know the sort of problem we want to tackle, let's see what this
 328 looks like in the context of our little Kaleidoscope language.  We're going to
 329 add two features:</p>
 330
 331 <ol>
 332 <li>The ability to mutate variables with the '=' operator.</li>
 333 <li>The ability to define new variables.</li>
 334 </ol>
 335
 336 <p>While the first item is really what this is about, we only have variables
 337 for incoming arguments and for induction variables, and redefining them only
 338 goes so far :).  Also, the ability to define new variables is a
 339 useful thing regardless of whether you will be mutating them.  Here's a
 340 motivating example that shows how we could use these:</p>
 341
 342 <div class="doc_code">
 343 <pre>
 344 # Define ':' for sequencing: as a low-precedence operator that ignores operands
 345 # and just returns the RHS.
 346 def binary : 1 (x y) y;
 347
 348 # Recursive fib, we could do this before.
 349 def fib(x)
 350   if (x &lt; 3) then
 351     1
 352   else
 353     fib(x-1)+fib(x-2);
 354
 355 # Iterative fib.
 356 def fibi(x)
 357   <b>var a = 1, b = 1, c in</b>
 358   (for i = 3, i &lt; x in
 359      <b>c = a + b</b> :
 360      <b>a = b</b> :
 361      <b>b = c</b>) :
 362   b;
 363
 364 # Call it.
 365 fibi(10);
 366 </pre>
 367 </div>
 368
 369 <p>
 370 In order to mutate variables, we have to change our existing variables to use
 371 the "alloca trick".  Once we have that, we'll add our new operator, then extend
 372 Kaleidoscope to support new variable definitions.
 373 </p>
 374
 375 </div>
 376
 377 <!-- *********************************************************************** -->
 378 <div class="doc_section"><a name="adjustments">Adjusting Existing Variables for
 379 Mutation</a></div>
 380 <!-- *********************************************************************** -->
 381
 382 <div class="doc_text">
 383
 384 <p>
 385 The symbol table in Kaleidoscope is managed at code generation time by the
 386 '<tt>NamedValues</tt>' map.  This map currently keeps track of the LLVM "Value*"
 387 that holds the double value for the named variable.  In order to support
 388 mutation, we need to change this slightly, so that <tt>NamedValues</tt> holds
 389 the <em>memory location</em> of the variable in question.  Note that this
 390 change is a refactoring: it changes the structure of the code, but does not
 391 (by itself) change the behavior of the compiler.  All of these changes are
 392 isolated in the Kaleidoscope code generator.</p>
 393
 394 <p>
 395 At this point in Kaleidoscope's development, it only supports variables for two
 396 things: incoming arguments to functions and the induction variable of 'for'
 397 loops.  For consistency, we'll allow mutation of these variables in addition to
 398 other user-defined variables.  This means that these will both need memory
 399 locations.
 400 </p>
 401
 402 <p>To start our transformation of Kaleidoscope, we'll change the NamedValues
 403 map to map to AllocaInst* instead of Value*.  Once we do this, the C++ compiler
 404 will tell us what parts of the code we need to update:</p>
 405
 406 <div class="doc_code">
 407 <pre>
 408 static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
 409 </pre>
 410 </div>
 411
 412 <p>Also, since we will need to create these allocas, we'll use a helper
 413 function that ensures that the allocas are created in the entry block of the
 414 function:</p>
 415
 416 <div class="doc_code">
 417 <pre>
 418 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
 419 /// the function.  This is used for mutable variables etc.
 420 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
 421                                           const std::string &amp;VarName) {
 422   LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
 423                    TheFunction-&gt;getEntryBlock().begin());
 424   return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
 425 }
 426 </pre>
 427 </div>
 428
 429 <p>This funny-looking code creates an LLVMBuilder object that is pointing at
 430 the first instruction (.begin()) of the entry block.  It then creates an alloca
 431 with the expected name and returns it.  Because all values in Kaleidoscope are
 432 doubles, there is no need to pass in a type to use.</p>
 433
 434 <p>With this in place, the first functionality change we want to make is to
 435 variable references.  In our new scheme, variables live on the stack, so code
 436 generating a reference to them actually needs to produce a load from the stack
 437 slot:</p>
 438
 439 <div class="doc_code">
 440 <pre>
 441 Value *VariableExprAST::Codegen() {
 442   // Look this variable up in the function.
 443   Value *V = NamedValues[Name];
 444   if (V == 0) return ErrorV("Unknown variable name");
 445
 446   // Load the value.
 447   return Builder.CreateLoad(V, Name.c_str());
 448 }
 449 </pre>
 450 </div>
 451
 452 <p>As you can see, this is pretty straightforward.  Next we need to update the
 453 things that define the variables to set up the alloca.  We'll start with
 454 <tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
 455 the unabridged code):</p>
 456
 457 <div class="doc_code">
 458 <pre>
 459   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
 460
 461   <b>// Create an alloca for the variable in the entry block.
 462   AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>
 463
 464   // Emit the start code first, without 'variable' in scope.
 465   Value *StartVal = Start-&gt;Codegen();
 466   if (StartVal == 0) return 0;
 467
 468   <b>// Store the value into the alloca.
 469   Builder.CreateStore(StartVal, Alloca);</b>
 470   ...
 471
 472   // Compute the end condition.
 473   Value *EndCond = End-&gt;Codegen();
 474   if (EndCond == 0) return EndCond;
 475
 476   <b>// Reload, increment, and restore the alloca.  This handles the case where
 477   // the body of the loop mutates the variable.
 478   Value *CurVar = Builder.CreateLoad(Alloca);
 479   Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
 480   Builder.CreateStore(NextVar, Alloca);</b>
 481   ...
 482 </pre>
 483 </div>
 484
 485 <p>This code is virtually identical to the code <a
 486 href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>.  The
 487 big difference is that we no longer have to construct a PHI node, and we use
 488 load/store to access the variable as needed.</p>
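<p>The effect of keeping the induction variable in memory can be modeled in
plain C++.  The sketch below is an assumption-laden illustration, not the
tutorial's code: <tt>runLoop</tt>, <tt>emptyBody</tt> and <tt>bumpBody</tt> are
hypothetical names, the end condition is simplified to "variable &lt; End", and
the ordering (body, then end condition, then increment) is assumed from the
excerpt above, including its elided parts.</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;

// Hypothetical loop bodies: each receives the address of the induction slot.
void emptyBody(double *) {}
void bumpBody(double *I) { *I = *I + 1.0; }  // the body mutates the variable

// Returns how many times the body ran.
int runLoop(double Start, double End, double Step, void (*Body)(double *)) {
  assert(Body &amp;&amp; "loop needs a body");
  double Slot = Start;        // store the start value into the slot
  int Iterations = 0;
  bool Continue;
  do {
    Body(&amp;Slot);              // the body may load from and store to the slot
    ++Iterations;
    Continue = Slot &lt; End;    // end condition, computed before the increment
    Slot = Slot + Step;       // reload, increment, and store back
  } while (Continue);
  return Iterations;
}
</pre>
</div>

<p>With these definitions, <tt>runLoop(1, 3, 1, emptyBody)</tt> runs the body
three times, while <tt>runLoop(1, 3, 1, bumpBody)</tt> runs it only twice,
because the increment reloads the value that the body stored.</p>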
 489
 490 <p>To support mutable argument variables, we need to also make allocas for them.
 491 The code for this is also pretty simple:</p>
 492
 493 <div class="doc_code">
 494 <pre>
 495 /// CreateArgumentAllocas - Create an alloca for each argument and register the
 496 /// argument in the symbol table so that references to it will succeed.
 497 void PrototypeAST::CreateArgumentAllocas(Function *F) {
 498   Function::arg_iterator AI = F-&gt;arg_begin();
 499   for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
 500     // Create an alloca for this variable.
 501     AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
 502
 503     // Store the initial value into the alloca.
 504     Builder.CreateStore(AI, Alloca);
 505
 506     // Add arguments to variable symbol table.
 507     NamedValues[Args[Idx]] = Alloca;
 508   }
 509 }
 510 </pre>
 511 </div>
 512
 513 <p>For each argument, we make an alloca, store the input value to the function
 514 into the alloca, and register the alloca as the memory location for the
 515 argument.  This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
 516 it sets up the entry block for the function.</p>
 517
 518 <p>The final missing piece is adding the 'mem2reg' pass, which allows us to get
 519 good codegen once again:</p>
 520
 521 <div class="doc_code">
 522 <pre>
 523     // Set up the optimizer pipeline.  Start with registering info about how the
 524     // target lays out data structures.
 525     OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
 526     <b>// Promote allocas to registers.
 527     OurFPM.add(createPromoteMemoryToRegisterPass());</b>
 528     // Do simple "peephole" optimizations and bit-twiddling optzns.
 529     OurFPM.add(createInstructionCombiningPass());
 530     // Reassociate expressions.
 531     OurFPM.add(createReassociatePass());
 532 </pre>
 533 </div>
 534
 535 <p>It is interesting to see what the code looks like before and after the
 536 mem2reg optimization runs.  For example, this is the before/after code for our
 537 recursive fib.  Before the optimization:</p>
 538
 539 <div class="doc_code">
 540 <pre>
 541 define double @fib(double %x) {
 542 entry:
 543         <b>%x1 = alloca double
 544         store double %x, double* %x1
 545         %x2 = load double* %x1</b>
 546         %multmp = fcmp ult double %x2, 3.000000e+00
 547         %booltmp = uitofp i1 %multmp to double
 548         %ifcond = fcmp one double %booltmp, 0.000000e+00
 549         br i1 %ifcond, label %then, label %else
 550
 551 then:           ; preds = %entry
 552         br label %ifcont
 553
 554 else:           ; preds = %entry
 555         <b>%x3 = load double* %x1</b>
 556         %subtmp = sub double %x3, 1.000000e+00
 557         %calltmp = call double @fib( double %subtmp )
 558         <b>%x4 = load double* %x1</b>
 559         %subtmp5 = sub double %x4, 2.000000e+00
 560         %calltmp6 = call double @fib( double %subtmp5 )
 561         %addtmp = add double %calltmp, %calltmp6
 562         br label %ifcont
 563
 564 ifcont:         ; preds = %else, %then
 565         %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 566         ret double %iftmp
 567 }
 568 </pre>
 569 </div>
 570
 571 <p>Here there is only one variable (x, the input argument) but you can still
 572 see the extremely simple-minded code generation strategy we are using.  In the
 573 entry block, an alloca is created, and the initial input value is stored into
 574 it.  Each reference to the variable does a reload from the stack.  Also, note
 575 that we didn't modify the if/then/else expression, so it still inserts a PHI
 576 node.  While we could make an alloca for it, it is actually easier to create a
 577 PHI node for it, so we still just make the PHI.</p>
 578
 579 <p>Here is the code after the mem2reg pass runs:</p>
 580
 581 <div class="doc_code">
 582 <pre>
 583 define double @fib(double %x) {
 584 entry:
 585         %multmp = fcmp ult double <b>%x</b>, 3.000000e+00
 586         %booltmp = uitofp i1 %multmp to double
 587         %ifcond = fcmp one double %booltmp, 0.000000e+00
 588         br i1 %ifcond, label %then, label %else
 589
 590 then:
 591         br label %ifcont
 592
 593 else:
 594         %subtmp = sub double <b>%x</b>, 1.000000e+00
 595         %calltmp = call double @fib( double %subtmp )
 596         %subtmp5 = sub double <b>%x</b>, 2.000000e+00
 597         %calltmp6 = call double @fib( double %subtmp5 )
 598         %addtmp = add double %calltmp, %calltmp6
 599         br label %ifcont
 600
 601 ifcont:         ; preds = %else, %then
 602         %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 603         ret double %iftmp
 604 }
 605 </pre>
 606 </div>
 607
 608 <p>This is a trivial case for mem2reg, since there are no redefinitions of the
 609 variable.  The point of showing this is to calm your tension about inserting
 610 such blatant inefficiencies :).</p>
 611
 612 <p>After the rest of the optimizers run, we get:</p>
 613
 614 <div class="doc_code">
 615 <pre>
 616 define double @fib(double %x) {
 617 entry:
 618         %multmp = fcmp ult double %x, 3.000000e+00
 619         %booltmp = uitofp i1 %multmp to double
 620         %ifcond = fcmp ueq double %booltmp, 0.000000e+00
 621         br i1 %ifcond, label %else, label %ifcont
 622
 623 else:
 624         %subtmp = sub double %x, 1.000000e+00
 625         %calltmp = call double @fib( double %subtmp )
 626         %subtmp5 = sub double %x, 2.000000e+00
 627         %calltmp6 = call double @fib( double %subtmp5 )
 628         %addtmp = add double %calltmp, %calltmp6
 629         ret double %addtmp
 630
 631 ifcont:
 632         ret double 1.000000e+00
 633 }
 634 </pre>
 635 </div>
 636
 637 <p>Here we see that the simplifycfg pass decided to clone the return instruction
 638 into the end of the 'else' block.  This allowed it to eliminate some branches
 639 and the PHI node.</p>
 640
 641 <p>Now that all symbol table references are updated to use stack variables,
 642 we'll add the assignment operator.</p>
 643
 644 </div>
 645
 646 <!-- *********************************************************************** -->
 647 <div class="doc_section"><a name="assignment">New Assignment Operator</a></div>
 648 <!-- *********************************************************************** -->
 649
 650 <div class="doc_text">
 651
 652 <p>With our current framework, adding a new assignment operator is really
 653 simple.  We will parse it just like any other binary operator, but handle it
 654 internally (instead of allowing the user to define it).  The first step is to
 655 set a precedence:</p>
 656
 657 <div class="doc_code">
 658 <pre>
 659  int main() {
 660    // Install standard binary operators.
 661    // 1 is lowest precedence.
 662    <b>BinopPrecedence['='] = 2;</b>
 663    BinopPrecedence['&lt;'] = 10;
 664    BinopPrecedence['+'] = 20;
 665    BinopPrecedence['-'] = 20;
 666 </pre>
 667 </div>
 668
 669 <p>Now that the parser knows the precedence of the binary operator, it takes
 670 care of all the parsing and AST generation.  We just need to implement codegen
 671 for the assignment operator.  This looks like:</p>
 672
 673 <div class="doc_code">
 674 <pre>
 675 Value *BinaryExprAST::Codegen() {
 676   // Special case '=' because we don't want to emit the LHS as an expression.
 677   if (Op == '=') {
 678     // Assignment requires the LHS to be an identifier.
 679     VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
 680     if (!LHSE)
 681       return ErrorV("destination of '=' must be a variable");
 682 </pre>
 683 </div>
 684
 685 <p>Unlike the rest of the binary operators, our assignment operator doesn't
 686 follow the "emit LHS, emit RHS, do computation" model.  As such, it is handled
 687 as a special case before the other binary operators are handled.  The other
 688 strange thing about it is that it requires the LHS to be a variable directly.
 689 </p>
 690
 691 <div class="doc_code">
 692 <pre>
 693     // Codegen the RHS.
 694     Value *Val = RHS-&gt;Codegen();
 695     if (Val == 0) return 0;
 696
 697     // Look up the name.
 698     Value *Variable = NamedValues[LHSE-&gt;getName()];
 699     if (Variable == 0) return ErrorV("Unknown variable name");
 700
 701     Builder.CreateStore(Val, Variable);
 702     return Val;
 703   }
 704   ...
 705 </pre>
 706 </div>
 707
 708 <p>Once it has the variable, codegen'ing the assignment is straightforward:
 709 we emit the RHS of the assignment, create a store, and return the computed
 710 value.  Returning a value allows for chained assignments like "X = (Y = Z)".</p>
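<p>The reason chaining works can be seen in a small stand-alone model.  This is
a sketch only: <tt>Slots</tt>, <tt>declare</tt>, <tt>assign</tt> and
<tt>chainedExample</tt> are hypothetical names, and a <tt>std::map</tt> of
doubles stands in for the stack slots that <tt>NamedValues</tt> refers to.</p>

<div class="doc_code">
<pre>
#include &lt;cassert&gt;
#include &lt;map&gt;
#include &lt;string&gt;

// Hypothetical stand-in for the stack slots: one double per variable name.
static std::map&lt;std::string, double&gt; Slots;

// Create a slot for a variable, initialized to 0.0.
void declare(const std::string &amp;Name) { Slots[Name] = 0.0; }

// Model of '=': store the RHS value into the slot, then return that value.
double assign(const std::string &amp;Name, double Val) {
  std::map&lt;std::string, double&gt;::iterator It = Slots.find(Name);
  assert(It != Slots.end() &amp;&amp; "Unknown variable name");
  It-&gt;second = Val;  // the store
  return Val;        // the value of the assignment expression
}

// Models "X = (Y = Z)": the inner assignment's result feeds the outer one.
double chainedExample(double Z) {
  declare("X");
  declare("Y");
  return assign("X", assign("Y", Z));
}
</pre>
</div>

<p>After <tt>chainedExample(4.0)</tt>, both slots hold 4.0 and the whole
expression evaluates to 4.0.</p>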
 711
 712 <p>Now that we have an assignment operator, we can mutate loop variables and
 713 arguments.  For example, we can now run code like this:</p>
 714
 715 <div class="doc_code">
 716 <pre>
 717 # Function to print a double.
 718 extern printd(x);
 719
 720 # Define ':' for sequencing: as a low-precedence operator that ignores operands
 721 # and just returns the RHS.
 722 def binary : 1 (x y) y;
 723
 724 def test(x)
 725   printd(x) :
 726   x = 4 :
 727   printd(x);
 728
 729 test(123);
 730 </pre>
 731 </div>
 732
 733 <p>When run, this example prints "123" and then "4", showing that we did
 734 actually mutate the value!  Okay, we have now officially implemented our goal:
 735 getting this to work requires SSA construction in the general case.  However,
 736 to be really useful, we want the ability to define our own local variables.  Let's
 737 add this next!
 738 </p>
 739
 740 </div>
 741
 742 <!-- *********************************************************************** -->
 743 <div class="doc_section"><a name="localvars">User-defined Local
 744 Variables</a></div>
 745 <!-- *********************************************************************** -->
 746
 747 <div class="doc_text">
 748
 749 <p>Adding var/in is just like any other extension we made to
 750 Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
 751 The first step for adding our new 'var/in' construct is to extend the lexer.
 752 As before, this is pretty trivial; the code looks like this:</p>
 753
 754 <div class="doc_code">
 755 <pre>
 756 enum Token {
 757   ...
 758   <b>// var definition
 759   tok_var = -13</b>
 760 ...
 761 }
 762 ...
 763 static int gettok() {
 764 ...
 765     if (IdentifierStr == "in") return tok_in;
 766     if (IdentifierStr == "binary") return tok_binary;
 767     if (IdentifierStr == "unary") return tok_unary;
 768     <b>if (IdentifierStr == "var") return tok_var;</b>
 769     return tok_identifier;
 770 ...
 771 </pre>
 772 </div>
 773
 774 <p>The next step is to define the AST node that we will construct.  For var/in,
 775 it will look like this:</p>
 776
 777 <div class="doc_code">
 778 <pre>
 779 /// VarExprAST - Expression class for var/in
 780 class VarExprAST : public ExprAST {
 781   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
 782   ExprAST *Body;
 783 public:
 784   VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
 785              ExprAST *body)
 786   : VarNames(varnames), Body(body) {}
 787
 788   virtual Value *Codegen();
 789 };
 790 </pre>
 791 </div>
 792
<p>var/in allows a list of names to be defined all at once, and each name can
optionally have an initializer value.  As such, we capture this information in
the VarNames vector.  Also, var/in has a body; this body is allowed to access
the variables defined by the var/in.</p>
 797
 798 <p>With this ready, we can define the parser pieces.  First thing we do is add
 799 it as a primary expression:</p>
 800
 801 <div class="doc_code">
 802 <pre>
 803 /// primary
 804 ///   ::= identifierexpr
 805 ///   ::= numberexpr
 806 ///   ::= parenexpr
 807 ///   ::= ifexpr
 808 ///   ::= forexpr
 809 <b>///   ::= varexpr</b>
 810 static ExprAST *ParsePrimary() {
 811   switch (CurTok) {
 812   default: return Error("unknown token when expecting an expression");
 813   case tok_identifier: return ParseIdentifierExpr();
 814   case tok_number:     return ParseNumberExpr();
 815   case '(':            return ParseParenExpr();
 816   case tok_if:         return ParseIfExpr();
 817   case tok_for:        return ParseForExpr();
 818   <b>case tok_var:        return ParseVarExpr();</b>
 819   }
 820 }
 821 </pre>
 822 </div>
 823
 824 <p>Next we define ParseVarExpr:</p>
 825
 826 <div class="doc_code">
 827 <pre>
 828 /// varexpr ::= 'var' identifier ('=' expression)?
 829 //                    (',' identifier ('=' expression)?)* 'in' expression
 830 static ExprAST *ParseVarExpr() {
 831   getNextToken();  // eat the var.
 832
 833   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
 834
 835   // At least one variable name is required.
 836   if (CurTok != tok_identifier)
 837     return Error("expected identifier after var");
 838 </pre>
 839 </div>
 840
<p>The first part of this code parses the list of identifier/expr pairs into the
local <tt>VarNames</tt> vector.</p>
 843
 844 <div class="doc_code">
 845 <pre>
 846   while (1) {
 847     std::string Name = IdentifierStr;
 848     getNextToken();  // eat identifier.
 849
 850     // Read the optional initializer.
 851     ExprAST *Init = 0;
 852     if (CurTok == '=') {
 853       getNextToken(); // eat the '='.
 854
 855       Init = ParseExpression();
 856       if (Init == 0) return 0;
 857     }
 858
 859     VarNames.push_back(std::make_pair(Name, Init));
 860
 861     // End of var list, exit loop.
 862     if (CurTok != ',') break;
 863     getNextToken(); // eat the ','.
 864
 865     if (CurTok != tok_identifier)
 866       return Error("expected identifier list after var");
 867   }
 868 </pre>
 869 </div>
 870
 871 <p>Once all the variables are parsed, we then parse the body and create the
 872 AST node:</p>
 873
 874 <div class="doc_code">
 875 <pre>
 876   // At this point, we have to have 'in'.
 877   if (CurTok != tok_in)
 878     return Error("expected 'in' keyword after 'var'");
 879   getNextToken();  // eat 'in'.
 880
 881   ExprAST *Body = ParseExpression();
 882   if (Body == 0) return 0;
 883
 884   return new VarExprAST(VarNames, Body);
 885 }
 886 </pre>
 887 </div>
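
<p>The control flow of <tt>ParseVarExpr</tt> can be tried out without the rest
of the parser.  The sketch below is a simplified stand-in, not the tutorial's
code: <tt>VarList</tt>, <tt>IsIdent</tt> and <tt>ParseVarList</tt> are
hypothetical names, the input is a vector of pre-split token strings, and an
initializer is assumed to be a single token (the real parser calls
<tt>ParseExpression</tt> there).  It follows the same steps: require one
identifier, loop over comma-separated names with optional initializers, then
require <tt>in</tt>.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Result of parsing the 'var ... in' header (hypothetical type).
struct VarList {
  bool Ok;                                                // false on a parse error
  std::vector<std::pair<std::string, std::string> > Vars; // name, initializer ("" if none)
  size_t BodyPos;                                         // index of the first body token
};

// An identifier here is any token starting with a letter that is not a keyword.
static bool IsIdent(const std::string &T) {
  return !T.empty() && std::isalpha((unsigned char)T[0]) &&
         T != "var" && T != "in";
}

VarList ParseVarList(const std::vector<std::string> &Toks) {
  VarList R;
  R.Ok = false;
  R.BodyPos = 0;

  size_t i = 0;
  if (i >= Toks.size() || Toks[i] != "var") return R;
  ++i;  // eat the var.

  // At least one variable name is required.
  if (i >= Toks.size() || !IsIdent(Toks[i])) return R;

  while (true) {
    std::string Name = Toks[i++];  // eat identifier.

    // Read the optional initializer (simplification: exactly one token).
    std::string Init;
    if (i < Toks.size() && Toks[i] == "=") {
      ++i;  // eat the '='.
      if (i >= Toks.size()) return R;
      Init = Toks[i++];
    }
    R.Vars.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (i >= Toks.size() || Toks[i] != ",") break;
    ++i;  // eat the ','.

    if (i >= Toks.size() || !IsIdent(Toks[i])) return R;
  }

  // At this point, we have to have 'in'.
  if (i >= Toks.size() || Toks[i] != "in") return R;
  R.BodyPos = i + 1;  // the body starts right after 'in'.
  assert(R.BodyPos <= Toks.size());
  R.Ok = true;
  return R;
}
```

<p>For the token sequence <tt>var a = 1 , b in a</tt> this sketch records two
entries, <tt>a</tt> with initializer <tt>1</tt> and <tt>b</tt> with none, while
<tt>var a , in a</tt> is rejected because an identifier must follow the
comma.</p>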
 888
 889 <p>Now that we can parse and represent the code, we need to support emission of
 890 LLVM IR for it.  This code starts out with:</p>
 891
 892 <div class="doc_code">
 893 <pre>
 894 Value *VarExprAST::Codegen() {
 895   std::vector&lt;AllocaInst *&gt; OldBindings;
 896
 897   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
 898
 899   // Register all variables and emit their initializer.
 900   for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
 901     const std::string &amp;VarName = VarNames[i].first;
 902     ExprAST *Init = VarNames[i].second;
 903 </pre>
 904 </div>
 905
<p>This loops over all the variables, installing them one at a time.
For each variable we put into the symbol table, we save the binding it
replaces in OldBindings so that it can be restored later.</p>
 909
 910 <div class="doc_code">
 911 <pre>
 912     // Emit the initializer before adding the variable to scope, this prevents
 913     // the initializer from referencing the variable itself, and permits stuff
 914     // like this:
 915     //  var a = 1 in
 916     //    var a = a in ...   # refers to outer 'a'.
 917     Value *InitVal;
 918     if (Init) {
 919       InitVal = Init-&gt;Codegen();
 920       if (InitVal == 0) return 0;
 921     } else { // If not specified, use 0.0.
 922       InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
 923     }
 924
 925     AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
 926     Builder.CreateStore(InitVal, Alloca);
 927
 928     // Remember the old variable binding so that we can restore the binding when
 929     // we unrecurse.
 930     OldBindings.push_back(NamedValues[VarName]);
 931
 932     // Remember this binding.
 933     NamedValues[VarName] = Alloca;
 934   }
 935 </pre>
 936 </div>
 937
 938 <p>There are more comments here than code.  The basic idea is that we emit the
 939 initializer, create the alloca, then update the symbol table to point to it.
 940 Once all the variables are installed in the symbol table, we evaluate the body
 941 of the var/in expression:</p>
 942
 943 <div class="doc_code">
 944 <pre>
 945   // Codegen the body, now that all vars are in scope.
 946   Value *BodyVal = Body-&gt;Codegen();
 947   if (BodyVal == 0) return 0;
 948 </pre>
 949 </div>
 950
 951 <p>Finally, before returning, we restore the previous variable bindings:</p>
 952
 953 <div class="doc_code">
 954 <pre>
 955   // Pop all our variables from scope.
 956   for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
 957     NamedValues[VarNames[i].first] = OldBindings[i];
 958
 959   // Return the body computation.
 960   return BodyVal;
 961 }
 962 </pre>
 963 </div>
 964
 965 <p>The end result of all of this is that we get properly scoped variable
 966 definitions, and we even (trivially) allow mutation of them :).</p>
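
<p>The save/install/restore pattern is independent of LLVM, so it can be
sketched on its own.  In the following example everything is a stand-in chosen
for illustration: <tt>NamedValues</tt> maps names to plain <tt>double*</tt>
slots rather than <tt>AllocaInst*</tt>, and <tt>Load</tt>, <tt>VarInit</tt> and
<tt>WithVars</tt> are hypothetical helpers.  As in
<tt>VarExprAST::Codegen</tt>, each initializer runs before its name is
installed, a missing initializer means 0.0, and the old bindings are put back
after the body has been evaluated.</p>

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the tutorial's symbol table.  The real map holds AllocaInst*;
// a double* is used here so the sketch runs without LLVM.
static std::map<std::string, double*> NamedValues;

// Read a variable through the symbol table (the analogue of emitting a load).
double Load(const std::string &Name) {
  double *Slot = NamedValues[Name];
  assert(Slot && "Unknown variable name");
  return *Slot;
}

// A variable name plus an optional initializer (an empty function means "none").
typedef std::pair<std::string, std::function<double()> > VarInit;

// Hypothetical helper: bind Vars, evaluate Body, then restore shadowed bindings.
double WithVars(const std::vector<VarInit> &Vars,
                const std::function<double()> &Body) {
  std::vector<double> Slots(Vars.size());  // sized once, so element addresses stay valid
  std::vector<double*> OldBindings;

  for (size_t i = 0; i != Vars.size(); ++i) {
    // Evaluate the initializer before the new name is in scope, so that it
    // still sees any outer variable of the same name.  Default is 0.0.
    Slots[i] = Vars[i].second ? Vars[i].second() : 0.0;

    // Remember the old binding (null if the name was unbound), then shadow it.
    OldBindings.push_back(NamedValues[Vars[i].first]);
    NamedValues[Vars[i].first] = &Slots[i];
  }

  // Evaluate the body with all variables in scope.
  double Result = Body();

  // Pop all our variables from scope.
  for (size_t i = 0; i != Vars.size(); ++i)
    NamedValues[Vars[i].first] = OldBindings[i];

  return Result;
}
```

<p>With this sketch, nesting a binding <tt>a = a + 1</tt> inside an outer
<tt>a = 1</tt> makes the inner initializer read the outer <tt>a</tt>, and once
the inner scope ends, <tt>Load("a")</tt> sees the outer value again.</p>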
 967
<p>With this, we have completed what we set out to do.  Our nice iterative fib
 969 example from the intro compiles and runs just fine.  The mem2reg pass optimizes
 970 all of our stack variables into SSA registers, inserting PHI nodes where needed,
 971 and our front-end remains simple: no iterated dominator frontier computation
 972 anywhere in sight.</p>
 973
 974 </div>
 975
 976 <!-- *********************************************************************** -->
 977 <div class="doc_section"><a name="code">Full Code Listing</a></div>
 978 <!-- *********************************************************************** -->
 979
 980 <div class="doc_text">
 981
 982 <p>
 983 Here is the complete code listing for our running example, enhanced with mutable
 984 variables and var/in support.  To build this example, use:
 985 </p>
 986
 987 <div class="doc_code">
 988 <pre>
 989    # Compile
 990    g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
 991    # Run
 992    ./toy
 993 </pre>
 994 </div>
 995
 996 <p>Here is the code:</p>
 997
 998 <div class="doc_code">
 999 <pre>
1000 #include "llvm/DerivedTypes.h"
1001 #include "llvm/ExecutionEngine/ExecutionEngine.h"
1002 #include "llvm/Module.h"
1003 #include "llvm/ModuleProvider.h"
1004 #include "llvm/PassManager.h"
1005 #include "llvm/Analysis/Verifier.h"
1006 #include "llvm/Target/TargetData.h"
1007 #include "llvm/Transforms/Scalar.h"
1008 #include "llvm/Support/LLVMBuilder.h"
#include &lt;cassert&gt;
#include &lt;cctype&gt;
#include &lt;cstdio&gt;
#include &lt;cstdlib&gt;
1010 #include &lt;string&gt;
1011 #include &lt;map&gt;
1012 #include &lt;vector&gt;
1013 using namespace llvm;
1014
1015 //===----------------------------------------------------------------------===//
1016 // Lexer
1017 //===----------------------------------------------------------------------===//
1018
1019 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
1020 // of these for known things.
1021 enum Token {
1022   tok_eof = -1,
1023
1024   // commands
1025   tok_def = -2, tok_extern = -3,
1026
1027   // primary
1028   tok_identifier = -4, tok_number = -5,
1029
1030   // control
1031   tok_if = -6, tok_then = -7, tok_else = -8,
1032   tok_for = -9, tok_in = -10,
1033
1034   // operators
1035   tok_binary = -11, tok_unary = -12,
1036
1037   // var definition
1038   tok_var = -13
1039 };
1040
1041 static std::string IdentifierStr;  // Filled in if tok_identifier
1042 static double NumVal;              // Filled in if tok_number
1043
1044 /// gettok - Return the next token from standard input.
1045 static int gettok() {
1046   static int LastChar = ' ';
1047
1048   // Skip any whitespace.
1049   while (isspace(LastChar))
1050     LastChar = getchar();
1051
1052   if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
1053     IdentifierStr = LastChar;
1054     while (isalnum((LastChar = getchar())))
1055       IdentifierStr += LastChar;
1056
1057     if (IdentifierStr == "def") return tok_def;
1058     if (IdentifierStr == "extern") return tok_extern;
1059     if (IdentifierStr == "if") return tok_if;
1060     if (IdentifierStr == "then") return tok_then;
1061     if (IdentifierStr == "else") return tok_else;
1062     if (IdentifierStr == "for") return tok_for;
1063     if (IdentifierStr == "in") return tok_in;
1064     if (IdentifierStr == "binary") return tok_binary;
1065     if (IdentifierStr == "unary") return tok_unary;
1066     if (IdentifierStr == "var") return tok_var;
1067     return tok_identifier;
1068   }
1069
1070   if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
1071     std::string NumStr;
1072     do {
1073       NumStr += LastChar;
1074       LastChar = getchar();
1075     } while (isdigit(LastChar) || LastChar == '.');
1076
1077     NumVal = strtod(NumStr.c_str(), 0);
1078     return tok_number;
1079   }
1080
1081   if (LastChar == '#') {
1082     // Comment until end of line.
1083     do LastChar = getchar();
    while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
1085
1086     if (LastChar != EOF)
1087       return gettok();
1088   }
1089
1090   // Check for end of file.  Don't eat the EOF.
1091   if (LastChar == EOF)
1092     return tok_eof;
1093
1094   // Otherwise, just return the character as its ascii value.
1095   int ThisChar = LastChar;
1096   LastChar = getchar();
1097   return ThisChar;
1098 }
1099
1100 //===----------------------------------------------------------------------===//
1101 // Abstract Syntax Tree (aka Parse Tree)
1102 //===----------------------------------------------------------------------===//
1103
1104 /// ExprAST - Base class for all expression nodes.
1105 class ExprAST {
1106 public:
1107   virtual ~ExprAST() {}
1108   virtual Value *Codegen() = 0;
1109 };
1110
1111 /// NumberExprAST - Expression class for numeric literals like "1.0".
1112 class NumberExprAST : public ExprAST {
1113   double Val;
1114 public:
1115   NumberExprAST(double val) : Val(val) {}
1116   virtual Value *Codegen();
1117 };
1118
1119 /// VariableExprAST - Expression class for referencing a variable, like "a".
1120 class VariableExprAST : public ExprAST {
1121   std::string Name;
1122 public:
1123   VariableExprAST(const std::string &amp;name) : Name(name) {}
1124   const std::string &amp;getName() const { return Name; }
1125   virtual Value *Codegen();
1126 };
1127
1128 /// UnaryExprAST - Expression class for a unary operator.
1129 class UnaryExprAST : public ExprAST {
1130   char Opcode;
1131   ExprAST *Operand;
1132 public:
1133   UnaryExprAST(char opcode, ExprAST *operand)
1134     : Opcode(opcode), Operand(operand) {}
1135   virtual Value *Codegen();
1136 };
1137
1138 /// BinaryExprAST - Expression class for a binary operator.
1139 class BinaryExprAST : public ExprAST {
1140   char Op;
1141   ExprAST *LHS, *RHS;
1142 public:
1143   BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
1144     : Op(op), LHS(lhs), RHS(rhs) {}
1145   virtual Value *Codegen();
1146 };
1147
1148 /// CallExprAST - Expression class for function calls.
1149 class CallExprAST : public ExprAST {
1150   std::string Callee;
1151   std::vector&lt;ExprAST*&gt; Args;
1152 public:
1153   CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1154     : Callee(callee), Args(args) {}
1155   virtual Value *Codegen();
1156 };
1157
1158 /// IfExprAST - Expression class for if/then/else.
1159 class IfExprAST : public ExprAST {
1160   ExprAST *Cond, *Then, *Else;
1161 public:
1162   IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1163   : Cond(cond), Then(then), Else(_else) {}
1164   virtual Value *Codegen();
1165 };
1166
1167 /// ForExprAST - Expression class for for/in.
1168 class ForExprAST : public ExprAST {
1169   std::string VarName;
1170   ExprAST *Start, *End, *Step, *Body;
1171 public:
1172   ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1173              ExprAST *step, ExprAST *body)
1174     : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1175   virtual Value *Codegen();
1176 };
1177
1178 /// VarExprAST - Expression class for var/in
1179 class VarExprAST : public ExprAST {
1180   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1181   ExprAST *Body;
1182 public:
1183   VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
1184              ExprAST *body)
1185   : VarNames(varnames), Body(body) {}
1186
1187   virtual Value *Codegen();
1188 };
1189
1190 /// PrototypeAST - This class represents the "prototype" for a function,
1191 /// which captures its argument names as well as if it is an operator.
1192 class PrototypeAST {
1193   std::string Name;
1194   std::vector&lt;std::string&gt; Args;
1195   bool isOperator;
1196   unsigned Precedence;  // Precedence if a binary op.
1197 public:
1198   PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1199                bool isoperator = false, unsigned prec = 0)
1200   : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1201
1202   bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1203   bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1204
1205   char getOperatorName() const {
1206     assert(isUnaryOp() || isBinaryOp());
1207     return Name[Name.size()-1];
1208   }
1209
1210   unsigned getBinaryPrecedence() const { return Precedence; }
1211
1212   Function *Codegen();
1213
1214   void CreateArgumentAllocas(Function *F);
1215 };
1216
1217 /// FunctionAST - This class represents a function definition itself.
1218 class FunctionAST {
1219   PrototypeAST *Proto;
1220   ExprAST *Body;
1221 public:
1222   FunctionAST(PrototypeAST *proto, ExprAST *body)
1223     : Proto(proto), Body(body) {}
1224
1225   Function *Codegen();
1226 };
1227
1228 //===----------------------------------------------------------------------===//
1229 // Parser
1230 //===----------------------------------------------------------------------===//
1231
1232 /// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
/// token the parser is looking at.  getNextToken reads another token from the
1234 /// lexer and updates CurTok with its results.
1235 static int CurTok;
1236 static int getNextToken() {
1237   return CurTok = gettok();
1238 }
1239
1240 /// BinopPrecedence - This holds the precedence for each binary operator that is
1241 /// defined.
1242 static std::map&lt;char, int&gt; BinopPrecedence;
1243
1244 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
1245 static int GetTokPrecedence() {
1246   if (!isascii(CurTok))
1247     return -1;
1248
1249   // Make sure it's a declared binop.
1250   int TokPrec = BinopPrecedence[CurTok];
1251   if (TokPrec &lt;= 0) return -1;
1252   return TokPrec;
1253 }
1254
1255 /// Error* - These are little helper functions for error handling.
1256 ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1257 PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1258 FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1259
1260 static ExprAST *ParseExpression();
1261
1262 /// identifierexpr
1263 ///   ::= identifier
1264 ///   ::= identifier '(' expression* ')'
1265 static ExprAST *ParseIdentifierExpr() {
1266   std::string IdName = IdentifierStr;
1267
1268   getNextToken();  // eat identifier.
1269
1270   if (CurTok != '(') // Simple variable ref.
1271     return new VariableExprAST(IdName);
1272
1273   // Call.
1274   getNextToken();  // eat (
1275   std::vector&lt;ExprAST*&gt; Args;
1276   if (CurTok != ')') {
1277     while (1) {
1278       ExprAST *Arg = ParseExpression();
1279       if (!Arg) return 0;
1280       Args.push_back(Arg);
1281
1282       if (CurTok == ')') break;
1283
1284       if (CurTok != ',')
1285         return Error("Expected ')'");
1286       getNextToken();
1287     }
1288   }
1289
1290   // Eat the ')'.
1291   getNextToken();
1292
1293   return new CallExprAST(IdName, Args);
1294 }
1295
1296 /// numberexpr ::= number
1297 static ExprAST *ParseNumberExpr() {
1298   ExprAST *Result = new NumberExprAST(NumVal);
1299   getNextToken(); // consume the number
1300   return Result;
1301 }
1302
1303 /// parenexpr ::= '(' expression ')'
1304 static ExprAST *ParseParenExpr() {
1305   getNextToken();  // eat (.
1306   ExprAST *V = ParseExpression();
1307   if (!V) return 0;
1308
1309   if (CurTok != ')')
1310     return Error("expected ')'");
1311   getNextToken();  // eat ).
1312   return V;
1313 }
1314
1315 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
1316 static ExprAST *ParseIfExpr() {
1317   getNextToken();  // eat the if.
1318
1319   // condition.
1320   ExprAST *Cond = ParseExpression();
1321   if (!Cond) return 0;
1322
1323   if (CurTok != tok_then)
1324     return Error("expected then");
1325   getNextToken();  // eat the then
1326
1327   ExprAST *Then = ParseExpression();
1328   if (Then == 0) return 0;
1329
1330   if (CurTok != tok_else)
1331     return Error("expected else");
1332
1333   getNextToken();
1334
1335   ExprAST *Else = ParseExpression();
1336   if (!Else) return 0;
1337
1338   return new IfExprAST(Cond, Then, Else);
1339 }
1340
1341 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1342 static ExprAST *ParseForExpr() {
1343   getNextToken();  // eat the for.
1344
1345   if (CurTok != tok_identifier)
1346     return Error("expected identifier after for");
1347
1348   std::string IdName = IdentifierStr;
1349   getNextToken();  // eat identifier.
1350
1351   if (CurTok != '=')
1352     return Error("expected '=' after for");
1353   getNextToken();  // eat '='.
1354
1355
1356   ExprAST *Start = ParseExpression();
1357   if (Start == 0) return 0;
1358   if (CurTok != ',')
1359     return Error("expected ',' after for start value");
1360   getNextToken();
1361
1362   ExprAST *End = ParseExpression();
1363   if (End == 0) return 0;
1364
1365   // The step value is optional.
1366   ExprAST *Step = 0;
1367   if (CurTok == ',') {
1368     getNextToken();
1369     Step = ParseExpression();
1370     if (Step == 0) return 0;
1371   }
1372
1373   if (CurTok != tok_in)
1374     return Error("expected 'in' after for");
1375   getNextToken();  // eat 'in'.
1376
1377   ExprAST *Body = ParseExpression();
1378   if (Body == 0) return 0;
1379
1380   return new ForExprAST(IdName, Start, End, Step, Body);
1381 }
1382
1383 /// varexpr ::= 'var' identifier ('=' expression)?
1384 //                    (',' identifier ('=' expression)?)* 'in' expression
1385 static ExprAST *ParseVarExpr() {
1386   getNextToken();  // eat the var.
1387
1388   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1389
1390   // At least one variable name is required.
1391   if (CurTok != tok_identifier)
1392     return Error("expected identifier after var");
1393
1394   while (1) {
1395     std::string Name = IdentifierStr;
1396     getNextToken();  // eat identifier.
1397
1398     // Read the optional initializer.
1399     ExprAST *Init = 0;
1400     if (CurTok == '=') {
1401       getNextToken(); // eat the '='.
1402
1403       Init = ParseExpression();
1404       if (Init == 0) return 0;
1405     }
1406
1407     VarNames.push_back(std::make_pair(Name, Init));
1408
1409     // End of var list, exit loop.
1410     if (CurTok != ',') break;
1411     getNextToken(); // eat the ','.
1412
1413     if (CurTok != tok_identifier)
1414       return Error("expected identifier list after var");
1415   }
1416
1417   // At this point, we have to have 'in'.
1418   if (CurTok != tok_in)
1419     return Error("expected 'in' keyword after 'var'");
1420   getNextToken();  // eat 'in'.
1421
1422   ExprAST *Body = ParseExpression();
1423   if (Body == 0) return 0;
1424
1425   return new VarExprAST(VarNames, Body);
1426 }
1427
1428
1429 /// primary
1430 ///   ::= identifierexpr
1431 ///   ::= numberexpr
1432 ///   ::= parenexpr
1433 ///   ::= ifexpr
1434 ///   ::= forexpr
1435 ///   ::= varexpr
1436 static ExprAST *ParsePrimary() {
1437   switch (CurTok) {
1438   default: return Error("unknown token when expecting an expression");
1439   case tok_identifier: return ParseIdentifierExpr();
1440   case tok_number:     return ParseNumberExpr();
1441   case '(':            return ParseParenExpr();
1442   case tok_if:         return ParseIfExpr();
1443   case tok_for:        return ParseForExpr();
1444   case tok_var:        return ParseVarExpr();
1445   }
1446 }
1447
1448 /// unary
1449 ///   ::= primary
1450 ///   ::= '!' unary
1451 static ExprAST *ParseUnary() {
1452   // If the current token is not an operator, it must be a primary expr.
1453   if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1454     return ParsePrimary();
1455
1456   // If this is a unary operator, read it.
1457   int Opc = CurTok;
1458   getNextToken();
1459   if (ExprAST *Operand = ParseUnary())
1460     return new UnaryExprAST(Opc, Operand);
1461   return 0;
1462 }
1463
1464 /// binoprhs
1465 ///   ::= ('+' unary)*
1466 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1467   // If this is a binop, find its precedence.
1468   while (1) {
1469     int TokPrec = GetTokPrecedence();
1470
1471     // If this is a binop that binds at least as tightly as the current binop,
1472     // consume it, otherwise we are done.
1473     if (TokPrec &lt; ExprPrec)
1474       return LHS;
1475
1476     // Okay, we know this is a binop.
1477     int BinOp = CurTok;
1478     getNextToken();  // eat binop
1479
1480     // Parse the unary expression after the binary operator.
1481     ExprAST *RHS = ParseUnary();
1482     if (!RHS) return 0;
1483
1484     // If BinOp binds less tightly with RHS than the operator after RHS, let
1485     // the pending operator take RHS as its LHS.
1486     int NextPrec = GetTokPrecedence();
1487     if (TokPrec &lt; NextPrec) {
1488       RHS = ParseBinOpRHS(TokPrec+1, RHS);
1489       if (RHS == 0) return 0;
1490     }
1491
1492     // Merge LHS/RHS.
1493     LHS = new BinaryExprAST(BinOp, LHS, RHS);
1494   }
1495 }
1496
1497 /// expression
1498 ///   ::= unary binoprhs
1499 ///
1500 static ExprAST *ParseExpression() {
1501   ExprAST *LHS = ParseUnary();
1502   if (!LHS) return 0;
1503
1504   return ParseBinOpRHS(0, LHS);
1505 }
1506
1507 /// prototype
1508 ///   ::= id '(' id* ')'
1509 ///   ::= binary LETTER number? (id, id)
1510 ///   ::= unary LETTER (id)
1511 static PrototypeAST *ParsePrototype() {
1512   std::string FnName;
1513
1514   int Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
1515   unsigned BinaryPrecedence = 30;
1516
1517   switch (CurTok) {
1518   default:
1519     return ErrorP("Expected function name in prototype");
1520   case tok_identifier:
1521     FnName = IdentifierStr;
1522     Kind = 0;
1523     getNextToken();
1524     break;
1525   case tok_unary:
1526     getNextToken();
1527     if (!isascii(CurTok))
1528       return ErrorP("Expected unary operator");
1529     FnName = "unary";
1530     FnName += (char)CurTok;
1531     Kind = 1;
1532     getNextToken();
1533     break;
1534   case tok_binary:
1535     getNextToken();
1536     if (!isascii(CurTok))
1537       return ErrorP("Expected binary operator");
1538     FnName = "binary";
1539     FnName += (char)CurTok;
1540     Kind = 2;
1541     getNextToken();
1542
1543     // Read the precedence if present.
1544     if (CurTok == tok_number) {
1545       if (NumVal &lt; 1 || NumVal &gt; 100)
        return ErrorP("Invalid precedence: must be 1..100");
1547       BinaryPrecedence = (unsigned)NumVal;
1548       getNextToken();
1549     }
1550     break;
1551   }
1552
1553   if (CurTok != '(')
1554     return ErrorP("Expected '(' in prototype");
1555
1556   std::vector&lt;std::string&gt; ArgNames;
1557   while (getNextToken() == tok_identifier)
1558     ArgNames.push_back(IdentifierStr);
1559   if (CurTok != ')')
1560     return ErrorP("Expected ')' in prototype");
1561
1562   // success.
1563   getNextToken();  // eat ')'.
1564
1565   // Verify right number of names for operator.
1566   if (Kind &amp;&amp; ArgNames.size() != Kind)
1567     return ErrorP("Invalid number of operands for operator");
1568
1569   return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1570 }
1571
1572 /// definition ::= 'def' prototype expression
1573 static FunctionAST *ParseDefinition() {
1574   getNextToken();  // eat def.
1575   PrototypeAST *Proto = ParsePrototype();
1576   if (Proto == 0) return 0;
1577
1578   if (ExprAST *E = ParseExpression())
1579     return new FunctionAST(Proto, E);
1580   return 0;
1581 }
1582
1583 /// toplevelexpr ::= expression
1584 static FunctionAST *ParseTopLevelExpr() {
1585   if (ExprAST *E = ParseExpression()) {
1586     // Make an anonymous proto.
1587     PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1588     return new FunctionAST(Proto, E);
1589   }
1590   return 0;
1591 }
1592
1593 /// external ::= 'extern' prototype
1594 static PrototypeAST *ParseExtern() {
1595   getNextToken();  // eat extern.
1596   return ParsePrototype();
1597 }
1598
1599 //===----------------------------------------------------------------------===//
1600 // Code Generation
1601 //===----------------------------------------------------------------------===//
1602
1603 static Module *TheModule;
1604 static LLVMFoldingBuilder Builder;
1605 static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
1606 static FunctionPassManager *TheFPM;
1607
1608 Value *ErrorV(const char *Str) { Error(Str); return 0; }
1609
1610 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
1611 /// the function.  This is used for mutable variables etc.
1612 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
1613                                           const std::string &amp;VarName) {
1614   LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
1615                    TheFunction-&gt;getEntryBlock().begin());
1616   return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
1617 }
1618
1619
1620 Value *NumberExprAST::Codegen() {
1621   return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1622 }
1623
1624 Value *VariableExprAST::Codegen() {
1625   // Look this variable up in the function.
1626   Value *V = NamedValues[Name];
1627   if (V == 0) return ErrorV("Unknown variable name");
1628
1629   // Load the value.
1630   return Builder.CreateLoad(V, Name.c_str());
1631 }
1632
1633 Value *UnaryExprAST::Codegen() {
1634   Value *OperandV = Operand-&gt;Codegen();
1635   if (OperandV == 0) return 0;
1636
1637   Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1638   if (F == 0)
1639     return ErrorV("Unknown unary operator");
1640
1641   return Builder.CreateCall(F, OperandV, "unop");
1642 }
1643
1644
1645 Value *BinaryExprAST::Codegen() {
1646   // Special case '=' because we don't want to emit the LHS as an expression.
1647   if (Op == '=') {
1648     // Assignment requires the LHS to be an identifier.
1649     VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
1650     if (!LHSE)
1651       return ErrorV("destination of '=' must be a variable");
1652     // Codegen the RHS.
1653     Value *Val = RHS-&gt;Codegen();
1654     if (Val == 0) return 0;
1655
1656     // Look up the name.
1657     Value *Variable = NamedValues[LHSE-&gt;getName()];
1658     if (Variable == 0) return ErrorV("Unknown variable name");
1659
1660     Builder.CreateStore(Val, Variable);
1661     return Val;
1662   }
1663
1664
1665   Value *L = LHS-&gt;Codegen();
1666   Value *R = RHS-&gt;Codegen();
1667   if (L == 0 || R == 0) return 0;
1668
1669   switch (Op) {
1670   case '+': return Builder.CreateAdd(L, R, "addtmp");
1671   case '-': return Builder.CreateSub(L, R, "subtmp");
1672   case '*': return Builder.CreateMul(L, R, "multmp");
1673   case '&lt;':
    L = Builder.CreateFCmpULT(L, R, "cmptmp");
1675     // Convert bool 0/1 to double 0.0 or 1.0
1676     return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1677   default: break;
1678   }
1679
1680   // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1681   // a call to it.
1682   Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1683   assert(F &amp;&amp; "binary operator not found!");
1684
1685   Value *Ops[] = { L, R };
1686   return Builder.CreateCall(F, Ops, Ops+2, "binop");
1687 }
1688
1689 Value *CallExprAST::Codegen() {
1690   // Look up the name in the global module table.
1691   Function *CalleeF = TheModule-&gt;getFunction(Callee);
1692   if (CalleeF == 0)
1693     return ErrorV("Unknown function referenced");
1694
1695   // If argument mismatch error.
1696   if (CalleeF-&gt;arg_size() != Args.size())
1697     return ErrorV("Incorrect # arguments passed");
1698
1699   std::vector&lt;Value*&gt; ArgsV;
1700   for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1701     ArgsV.push_back(Args[i]-&gt;Codegen());
1702     if (ArgsV.back() == 0) return 0;
1703   }
1704
1705   return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1706 }
1707
1708 Value *IfExprAST::Codegen() {
1709   Value *CondV = Cond-&gt;Codegen();
1710   if (CondV == 0) return 0;
1711
  // Convert condition to a bool by comparing not-equal to 0.0.
1713   CondV = Builder.CreateFCmpONE(CondV,
1714                                 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1715                                 "ifcond");
1716
1717   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1718
1719   // Create blocks for the then and else cases.  Insert the 'then' block at the
1720   // end of the function.
1721   BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1722   BasicBlock *ElseBB = new BasicBlock("else");
1723   BasicBlock *MergeBB = new BasicBlock("ifcont");
1724
1725   Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1726
1727   // Emit then value.
1728   Builder.SetInsertPoint(ThenBB);
1729
1730   Value *ThenV = Then-&gt;Codegen();
1731   if (ThenV == 0) return 0;
1732
1733   Builder.CreateBr(MergeBB);
1734   // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1735   ThenBB = Builder.GetInsertBlock();
1736
1737   // Emit else block.
1738   TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1739   Builder.SetInsertPoint(ElseBB);
1740
1741   Value *ElseV = Else-&gt;Codegen();
1742   if (ElseV == 0) return 0;
1743
1744   Builder.CreateBr(MergeBB);
1745   // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1746   ElseBB = Builder.GetInsertBlock();
1747
1748   // Emit merge block.
1749   TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1750   Builder.SetInsertPoint(MergeBB);
1751   PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1752
1753   PN-&gt;addIncoming(ThenV, ThenBB);
1754   PN-&gt;addIncoming(ElseV, ElseBB);
1755   return PN;
1756 }
1757
1758 Value *ForExprAST::Codegen() {
  // Output this as:
  //   var = alloca double
  //   ...
  //   start = startexpr
  //   store start -&gt; var
  //   goto loop
  // loop:
  //   ...
  //   bodyexpr
  //   ...
  // loopend:
  //   step = stepexpr
  //   endcond = endexpr
  //
  //   curvar = load var
  //   nextvar = curvar + step
  //   store nextvar -&gt; var
  //   br endcond, loop, afterloop
  // afterloop:

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Create an alloca for the variable in the entry block.
  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);

  // Emit the start code first, without 'variable' in scope.
  Value *StartVal = Start-&gt;Codegen();
  if (StartVal == 0) return 0;

  // Store the value into the alloca.
  Builder.CreateStore(StartVal, Alloca);

  // Make the new basic block for the loop header, inserting after current
  // block.
  BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);

  // Insert an explicit fall through from the current block to the LoopBB.
  Builder.CreateBr(LoopBB);

  // Start insertion in LoopBB.
  Builder.SetInsertPoint(LoopBB);

  // Within the loop, the variable refers to its alloca.  If it shadows an
  // existing variable, we have to restore it, so save it now.
  AllocaInst *OldVal = NamedValues[VarName];
  NamedValues[VarName] = Alloca;

  // Emit the body of the loop.  This, like any other expr, can change the
  // current BB.  Note that we ignore the value computed by the body, but don't
  // allow an error.
  if (Body-&gt;Codegen() == 0)
    return 0;

  // Emit the step value.
  Value *StepVal;
  if (Step) {
    StepVal = Step-&gt;Codegen();
    if (StepVal == 0) return 0;
  } else {
    // If not specified, use 1.0.
    StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
  }

  // Compute the end condition.
  Value *EndCond = End-&gt;Codegen();
  if (EndCond == 0) return 0;

  // Reload, increment, and store back to the alloca.  This handles the case
  // where the body of the loop mutates the variable.
  Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
  Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
  Builder.CreateStore(NextVar, Alloca);

  // Convert condition to a bool by comparing not-equal to 0.0.
  EndCond = Builder.CreateFCmpONE(EndCond,
                                  ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
                                  "loopcond");

  // Create the "after loop" block and insert it.
  BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);

  // Insert the conditional branch at the end of the current block.
  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);

  // Any new code will be inserted in AfterBB.
  Builder.SetInsertPoint(AfterBB);

  // Restore the unshadowed variable.
  if (OldVal)
    NamedValues[VarName] = OldVal;
  else
    NamedValues.erase(VarName);

  // for expr always returns 0.0.
  return Constant::getNullValue(Type::DoubleTy);
}

Value *VarExprAST::Codegen() {
  std::vector&lt;AllocaInst *&gt; OldBindings;

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Register all variables and emit their initializer.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    const std::string &amp;VarName = VarNames[i].first;
    ExprAST *Init = VarNames[i].second;

    // Emit the initializer before adding the variable to scope, this prevents
    // the initializer from referencing the variable itself, and permits stuff
    // like this:
    //  var a = 1 in
    //    var a = a in ...   # refers to outer 'a'.
    Value *InitVal;
    if (Init) {
      InitVal = Init-&gt;Codegen();
      if (InitVal == 0) return 0;
    } else { // If not specified, use 0.0.
      InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
    }

    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    Builder.CreateStore(InitVal, Alloca);

    // Remember the old variable binding so that we can restore the binding when
    // we unrecurse.
    OldBindings.push_back(NamedValues[VarName]);

    // Remember this binding.
    NamedValues[VarName] = Alloca;
  }

  // Codegen the body, now that all vars are in scope.
  Value *BodyVal = Body-&gt;Codegen();
  if (BodyVal == 0) return 0;

  // Pop all our variables from scope.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    NamedValues[VarNames[i].first] = OldBindings[i];

  // Return the body computation.
  return BodyVal;
}


Function *PrototypeAST::Codegen() {
  // Make the function type:  double(double,double) etc.
  std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
  FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);

  Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);

  // If F conflicted, there was already something named 'Name'.  If it has a
  // body, don't allow redefinition or reextern.
  if (F-&gt;getName() != Name) {
    // Delete the one we just made and get the existing one.
    F-&gt;eraseFromParent();
    F = TheModule-&gt;getFunction(Name);

    // If F already has a body, reject this.
    if (!F-&gt;empty()) {
      ErrorF("redefinition of function");
      return 0;
    }

    // If F took a different number of args, reject.
    if (F-&gt;arg_size() != Args.size()) {
      ErrorF("redefinition of function with different # args");
      return 0;
    }
  }

  // Set names for all arguments.
  unsigned Idx = 0;
  for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
       ++AI, ++Idx)
    AI-&gt;setName(Args[Idx]);

  return F;
}

/// CreateArgumentAllocas - Create an alloca for each argument and register the
/// argument in the symbol table so that references to it will succeed.
void PrototypeAST::CreateArgumentAllocas(Function *F) {
  Function::arg_iterator AI = F-&gt;arg_begin();
  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    // Create an alloca for this variable.
    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);

    // Store the initial value into the alloca.
    Builder.CreateStore(AI, Alloca);

    // Add arguments to variable symbol table.
    NamedValues[Args[Idx]] = Alloca;
  }
}


Function *FunctionAST::Codegen() {
  NamedValues.clear();

  Function *TheFunction = Proto-&gt;Codegen();
  if (TheFunction == 0)
    return 0;

  // If this is an operator, install it.
  if (Proto-&gt;isBinaryOp())
    BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();

  // Create a new basic block to start insertion into.
  BasicBlock *BB = new BasicBlock("entry", TheFunction);
  Builder.SetInsertPoint(BB);

  // Add all arguments to the symbol table and create their allocas.
  Proto-&gt;CreateArgumentAllocas(TheFunction);

  if (Value *RetVal = Body-&gt;Codegen()) {
    // Finish off the function.
    Builder.CreateRet(RetVal);

    // Validate the generated code, checking for consistency.
    verifyFunction(*TheFunction);

    // Optimize the function.
    TheFPM-&gt;run(*TheFunction);

    return TheFunction;
  }

  // Error reading body, remove function.
  TheFunction-&gt;eraseFromParent();

  if (Proto-&gt;isBinaryOp())
    BinopPrecedence.erase(Proto-&gt;getOperatorName());
  return 0;
}

//===----------------------------------------------------------------------===//
// Top-Level parsing and JIT Driver
//===----------------------------------------------------------------------===//

static ExecutionEngine *TheExecutionEngine;

static void HandleDefinition() {
  if (FunctionAST *F = ParseDefinition()) {
    if (Function *LF = F-&gt;Codegen()) {
      fprintf(stderr, "Read function definition:");
      LF-&gt;dump();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleExtern() {
  if (PrototypeAST *P = ParseExtern()) {
    if (Function *F = P-&gt;Codegen()) {
      fprintf(stderr, "Read extern: ");
      F-&gt;dump();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleTopLevelExpression() {
  // Evaluate a top level expression into an anonymous function.
  if (FunctionAST *F = ParseTopLevelExpr()) {
    if (Function *LF = F-&gt;Codegen()) {
      // JIT the function, returning a function pointer.
      void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);

      // Cast it to the right type (takes no arguments, returns a double) so we
      // can call it as a native function.
      double (*FP)() = (double (*)())FPtr;
      fprintf(stderr, "Evaluated to %f\n", FP());
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

/// top ::= definition | external | expression | ';'
static void MainLoop() {
  while (1) {
    fprintf(stderr, "ready&gt; ");
    switch (CurTok) {
    case tok_eof:    return;
    case ';':        getNextToken(); break;  // ignore top level semicolons.
    case tok_def:    HandleDefinition(); break;
    case tok_extern: HandleExtern(); break;
    default:         HandleTopLevelExpression(); break;
    }
  }
}



//===----------------------------------------------------------------------===//
// "Library" functions that can be "extern'd" from user code.
//===----------------------------------------------------------------------===//

/// putchard - putchar that takes a double and returns 0.
extern "C"
double putchard(double X) {
  putchar((char)X);
  return 0;
}

/// printd - printf that takes a double and prints it as "%f\n", returning 0.
extern "C"
double printd(double X) {
  printf("%f\n", X);
  return 0;
}

//===----------------------------------------------------------------------===//
// Main driver code.
//===----------------------------------------------------------------------===//

int main() {
  // Install standard binary operators.
  // 1 is lowest precedence.
  BinopPrecedence['='] = 2;
  BinopPrecedence['&lt;'] = 10;
  BinopPrecedence['+'] = 20;
  BinopPrecedence['-'] = 20;
  BinopPrecedence['*'] = 40;  // highest.

  // Prime the first token.
  fprintf(stderr, "ready&gt; ");
  getNextToken();

  // Make the module, which holds all the code.
  TheModule = new Module("my cool jit");

  // Create the JIT.
  TheExecutionEngine = ExecutionEngine::create(TheModule);

  {
    ExistingModuleProvider OurModuleProvider(TheModule);
    FunctionPassManager OurFPM(&amp;OurModuleProvider);

    // Set up the optimizer pipeline.  Start with registering info about how the
    // target lays out data structures.
    OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
    // Promote allocas to registers.
    OurFPM.add(createPromoteMemoryToRegisterPass());
    // Do simple "peephole" optimizations and bit-twiddling optzns.
    OurFPM.add(createInstructionCombiningPass());
    // Reassociate expressions.
    OurFPM.add(createReassociatePass());
    // Eliminate Common SubExpressions.
    OurFPM.add(createGVNPass());
    // Simplify the control flow graph (deleting unreachable blocks, etc).
    OurFPM.add(createCFGSimplificationPass());

    // Set the global so the code gen can use this.
    TheFPM = &amp;OurFPM;

    // Run the main "interpreter loop" now.
    MainLoop();

    TheFPM = 0;

    // Print out all of the generated code while the module is still alive.
    TheModule-&gt;dump();
  }  // Free module provider (and thus the module) and pass manager.

  return 0;
}
</pre>
</div>

</div>

<!-- *********************************************************************** -->
<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>

  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
  Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
</address>
</body>
</html>