   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3
   4 <html>
   5 <head>
   6   <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
   7          construction</title>
   8   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   9   <meta name="author" content="Chris Lattner">
  10   <link rel="stylesheet" href="../llvm.css" type="text/css">
  11 </head>
  12
  13 <body>
  14
  15 <div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>
  16
  17 <div class="doc_author">
  18   <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
  19 </div>
  20
  21 <!-- *********************************************************************** -->
  22 <div class="doc_section"><a name="intro">Part 7 Introduction</a></div>
  23 <!-- *********************************************************************** -->
  24
  25 <div class="doc_text">
  26
  27 <p>Welcome to Part 7 of the "<a href="index.html">Implementing a language with
  28 LLVM</a>" tutorial.  In parts 1 through 6, we've built a very respectable,
  29 albeit simple, <a
  30 href="http://en.wikipedia.org/wiki/Functional_programming">functional
  31 programming language</a>.  In our journey, we learned some parsing techniques,
  32 how to build and represent an AST, how to build LLVM IR, and how to optimize
  33 the resultant code and JIT compile it.</p>
  34
  35 <p>While Kaleidoscope is interesting as a functional language, the fact that it
  36 is functional makes it "too easy" to generate LLVM IR for it.  In particular, a
  37 functional language makes it very easy to build LLVM IR directly in <a
  38 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
  39 Since LLVM requires that the input code be in SSA form, this is a very nice
  40 property.  However, it is often unclear to newcomers how to generate code for
  41 an imperative language with mutable variables.</p>
  42
  43 <p>The short (and happy) summary of this chapter is that there is no need for
  44 your front-end to build SSA form: LLVM provides highly tuned and well tested
  45 support for this, though the way it works is a bit unexpected for some.</p>
  46
  47 </div>
  48
  49 <!-- *********************************************************************** -->
  50 <div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
  51 <!-- *********************************************************************** -->
  52
  53 <div class="doc_text">
  54
  55 <p>
  56 To understand why mutable variables cause complexities in SSA construction,
  57 consider this extremely simple C example:
  58 </p>
  59
  60 <div class="doc_code">
  61 <pre>
  62 int G, H;
  63 int test(_Bool Condition) {
  64   int X;
  65   if (Condition)
  66     X = G;
  67   else
  68     X = H;
  69   return X;
  70 }
  71 </pre>
  72 </div>
  73
  74 <p>In this case, we have the variable "X", whose value depends on the path
  75 executed in the program.  Because there are two different possible values for X
  76 before the return instruction, a PHI node is inserted to merge the two values.
  77 The LLVM IR that we want for this example looks like this:</p>
  78
  79 <div class="doc_code">
  80 <pre>
  81 @G = weak global i32 0   ; type of @G is i32*
  82 @H = weak global i32 0   ; type of @H is i32*
  83
  84 define i32 @test(i1 %Condition) {
  85 entry:
  86         br i1 %Condition, label %cond_true, label %cond_false
  87
  88 cond_true:
  89         %X.0 = load i32* @G
  90         br label %cond_next
  91
  92 cond_false:
  93         %X.1 = load i32* @H
  94         br label %cond_next
  95
  96 cond_next:
  97         %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  98         ret i32 %X.2
  99 }
 100 </pre>
 101 </div>
 102
 103 <p>In this example, the loads from the G and H global variables are explicit in
 104 the LLVM IR, and they live in the then/else branches of the if statement
 105 (cond_true/cond_false).  In order to merge the incoming values, the X.2 phi node
 106 in the cond_next block selects the right value to use based on where control
 107 flow is coming from: if control flow comes from the cond_false block, X.2 gets
 108 the value of X.1.  Alternatively, if control flow comes from cond_true, it gets
 109 the value of X.0.  The intent of this chapter is not to explain the details of
 110 SSA form.  For more information, see one of the many <a
 111 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
 112 references</a>.</p>
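<p>As a rough illustration of what the phi node does, here is a minimal C++
sketch that models it as a lookup keyed by the predecessor block.  The
<tt>PhiNode</tt> struct and <tt>runTest</tt> function are hypothetical names
invented for this sketch; they are not part of LLVM or of the Kaleidoscope
code.</p>

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a phi node: a list of (predecessor block, value)
// pairs.  This is a teaching aid, not an LLVM data structure.
struct PhiNode {
  std::vector<std::pair<std::string, int>> Incoming;

  // Return the value paired with the block that control flow came from.
  int select(const std::string &PredBlock) const {
    for (const auto &Entry : Incoming)
      if (Entry.first == PredBlock)
        return Entry.second;
    assert(false && "phi has no entry for this predecessor");
    return 0;
  }
};

// Mimic the 'test' function from the IR above.  G and H stand in for the
// values that would be loaded from the two globals.
int runTest(bool Condition, int G, int H) {
  // The branch in 'entry' decides which block reaches cond_next.
  const std::string Pred = Condition ? "cond_true" : "cond_false";

  // %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  PhiNode X2;
  X2.Incoming.emplace_back("cond_false", H); // %X.1
  X2.Incoming.emplace_back("cond_true", G);  // %X.0
  return X2.select(Pred);
}
```

<p>Calling <tt>runTest(true, 7, 9)</tt> yields the value that came from G, and
<tt>runTest(false, 7, 9)</tt> yields the value that came from H.</p>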
 113
 114 <p>The question for this article is "who places phi nodes when lowering
 115 assignments to mutable variables?"  The issue here is that LLVM
 116 <em>requires</em> that its IR be in SSA form: there is no "non-SSA" mode for it.
 117 However, SSA construction requires non-trivial algorithms and data structures,
 118 so it is inconvenient and wasteful for every front-end to have to reproduce this
 119 logic.</p>
 120
 121 </div>
 122
 123 <!-- *********************************************************************** -->
 124 <div class="doc_section"><a name="memory">Memory in LLVM</a></div>
 125 <!-- *********************************************************************** -->
 126
 127 <div class="doc_text">
 128
 129 <p>The 'trick' here is that while LLVM does require all register values to be
 130 in SSA form, it does not require (or permit) memory objects to be in SSA form.
 131 In the example above, note that the loads from G and H are direct accesses to
 132 G and H: they are not renamed or versioned.  This differs from some other
 133 compiler systems, which do try to version memory objects.  In LLVM, instead of
 134 encoding dataflow analysis of memory into the LLVM IR, it is handled with <a
 135 href="../WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
 136 demand.</p>
 137
 138 <p>
 139 With this in mind, the high-level idea is that we want to make a stack variable
 140 (which lives in memory, because it is on the stack) for each mutable object in
 141 a function.  To take advantage of this trick, we need to talk about how LLVM
 142 represents stack variables.
 143 </p>
 144
 145 <p>In LLVM, all memory accesses are explicit with load/store instructions, and
 146 it is carefully designed to not have (or need) an "address-of" operator.  Notice
 147 how the type of the @G/@H global variables is actually "i32*" even though the
 148 variable is defined as "i32".  What this means is that @G defines <em>space</em>
 149 for an i32 in the global data area, but its <em>name</em> actually refers to the
 150 address for that space.  Stack variables work the same way, but instead of being
 151 declared with global variable definitions, they are declared with the
 152 <a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
 153
 154 <div class="doc_code">
 155 <pre>
 156 define i32 @test(i1 %Condition) {
 157 entry:
 158         %X = alloca i32           ; type of %X is i32*.
 159         ...
 160         %tmp = load i32* %X       ; load the stack value %X from the stack.
 161         %tmp2 = add i32 %tmp, 1   ; increment it
 162         store i32 %tmp2, i32* %X  ; store it back
 163         ...
 164 </pre>
 165 </div>
 166
 167 <p>This code shows an example of how you can declare and manipulate a stack
 168 variable in the LLVM IR.  Stack memory allocated with the alloca instruction is
 169 fully general: you can pass the address of the stack slot to functions, you can
 170 store it in other variables, etc.  We could rewrite the example above to use
 171 the alloca technique and avoid the PHI node entirely:</p>
 172
 173 <div class="doc_code">
 174 <pre>
 175 @G = weak global i32 0   ; type of @G is i32*
 176 @H = weak global i32 0   ; type of @H is i32*
 177
 178 define i32 @test(i1 %Condition) {
 179 entry:
 180         %X = alloca i32           ; type of %X is i32*.
 181         br i1 %Condition, label %cond_true, label %cond_false
 182
 183 cond_true:
 184         %X.0 = load i32* @G
 185         store i32 %X.0, i32* %X   ; Update X
 186         br label %cond_next
 187
 188 cond_false:
 189         %X.1 = load i32* @H
 190         store i32 %X.1, i32* %X   ; Update X
 191         br label %cond_next
 192
 193 cond_next:
 194         %X.2 = load i32* %X       ; Read X
 195         ret i32 %X.2
 196 }
 197 </pre>
 198 </div>
 199
 200 <p>With this, we have discovered a way to handle arbitrary mutable variables
 201 without the need to create Phi nodes at all:</p>
 202
 203 <ol>
 204 <li>Each mutable variable becomes a stack allocation.</li>
 205 <li>Each read of the variable becomes a load from the stack.</li>
 206 <li>Each update of the variable becomes a store to the stack.</li>
 207 <li>Taking the address of a variable just uses the stack address directly.</li>
 208 </ol>
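<p>The first three rules can be sketched with a toy emitter that records
IR-like text.  Everything here (<tt>ToyEmitter</tt>, its methods, and
<tt>lowerIncrement</tt>) is a hypothetical illustration written for this
sketch; a real front-end would call the LLVM builder instead of building
strings.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy emitter that records IR-like text, one line per
// instruction.  It only illustrates the lowering rules listed above.
struct ToyEmitter {
  std::vector<std::string> Lines;
  unsigned NextTmp = 0;

  std::string freshName() { return "%tmp" + std::to_string(NextTmp++); }

  // Rule 1: each mutable variable becomes a stack allocation.
  void declare(const std::string &Var) {
    assert(!Var.empty() && "variable needs a name");
    Lines.push_back("%" + Var + " = alloca i32");
  }

  // Rule 2: each read of the variable becomes a load from the stack.
  std::string read(const std::string &Var) {
    std::string Tmp = freshName();
    Lines.push_back(Tmp + " = load i32* %" + Var);
    return Tmp;
  }

  // Rule 3: each update of the variable becomes a store to the stack.
  void write(const std::string &Var, const std::string &Value) {
    Lines.push_back("store i32 " + Value + ", i32* %" + Var);
  }

  // Ordinary SSA computation; no memory is involved.
  std::string addConstant(const std::string &LHS, int C) {
    std::string Tmp = freshName();
    Lines.push_back(Tmp + " = add i32 " + LHS + ", " + std::to_string(C));
    return Tmp;
  }
};

// Lower the source statement "X = X + 1" for a fresh local X.
std::vector<std::string> lowerIncrement() {
  ToyEmitter E;
  E.declare("X");
  std::string Old = E.read("X");
  std::string New = E.addConstant(Old, 1);
  E.write("X", New);
  return E.Lines;
}
```

<p>Note that no phi node is ever requested: the emitter only needs to know the
name of the stack slot.</p>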
 209
 210 <p>While this solution has solved our immediate problem, it introduced another
 211 one: we have now apparently introduced a lot of stack traffic for very simple
 212 and common operations, a major performance problem.  Fortunately for us, the
 213 LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
 214 this case, promoting allocas like this into SSA registers, inserting Phi nodes
 215 as appropriate.  If you run the example above through the pass, you'll
 216 get:</p>
 217
 218 <div class="doc_code">
 219 <pre>
 220 $ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
 221 @G = weak global i32 0
 222 @H = weak global i32 0
 223
 224 define i32 @test(i1 %Condition) {
 225 entry:
 226         br i1 %Condition, label %cond_true, label %cond_false
 227
 228 cond_true:
 229         %X.0 = load i32* @G
 230         br label %cond_next
 231
 232 cond_false:
 233         %X.1 = load i32* @H
 234         br label %cond_next
 235
 236 cond_next:
 237         %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
 238         ret i32 %X.01
 239 }
 240 </pre>
 241 </div>
 242
 243 <p>The mem2reg pass implements the standard "iterated dominator frontier"
 244 algorithm for constructing SSA form and has a number of optimizations that speed
 245 up very common degenerate cases.  mem2reg really is the answer for dealing with
 246 mutable variables, and we highly recommend that you depend on it.  Note that
 247 mem2reg only works on variables in certain circumstances:</p>
 248
 249 <ol>
 250 <li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
 251 promotes them.  It does not apply to global variables or heap allocations.</li>
 252
 253 <li>mem2reg only looks for alloca instructions in the entry block of the
 254 function.  Being in the entry block guarantees that the alloca is only executed
 255 once, which makes analysis simpler.</li>
 256
 257 <li>mem2reg only promotes allocas whose uses are direct loads and stores.  If
 258 the address of the stack object is passed to a function, or if any funny pointer
 259 arithmetic is involved, the alloca will not be promoted.</li>
 260
 261 <li>mem2reg only works on allocas of scalar values, and only if the array size
 262 of the allocation is 1 (or missing in the .ll file).  mem2reg is not capable of
 263 promoting structs or arrays to registers.  Note that the "scalarrepl" pass is
 264 more powerful and can promote structs, "unions", and arrays in many cases.</li>
 265
 266 </ol>
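<p>The conditions in this list can be restated as a simple predicate.  The
sketch below is only a summary of the list: <tt>SlotInfo</tt> and
<tt>looksPromotable</tt> are hypothetical names, and the real pass works on
LLVM instructions rather than on a struct of flags.</p>

```cpp
// Hypothetical summary of how one memory object is used in a function.
// This is a teaching model, not a data structure from LLVM.
struct SlotInfo {
  bool IsAlloca;           // false for globals and heap allocations
  bool InEntryBlock;       // the alloca appears in the entry block
  bool IsScalar;           // not a struct or an array type
  unsigned ArraySize;      // number of elements allocated
  bool OnlyLoadsAndStores; // address never escapes, no pointer arithmetic
};

// Returns true when every condition in the list above holds.
bool looksPromotable(const SlotInfo &S) {
  return S.IsAlloca && S.InEntryBlock && S.IsScalar && S.ArraySize == 1 &&
         S.OnlyLoadsAndStores;
}
```

<p>Under this model, a local whose address is passed to another function fails
the last check and stays in memory.</p>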
 267
 268 <p>
 269 All of these properties are easy to satisfy for most imperative languages, and
 270 we'll illustrate this below with Kaleidoscope.  The final question you may be
 271 asking is: should I bother with this nonsense for my front-end?  Wouldn't it be
 272 better if I just did SSA construction directly, avoiding use of the mem2reg
 273 optimization pass?  In short, we strongly recommend that you use this technique
 274 for building SSA form, unless there is an extremely good reason not to.  Using
 275 this technique is:</p>
 276
 277 <ul>
 278 <li>Proven and well tested: llvm-gcc and clang both use this technique for local
 279 mutable variables.  As such, the most common clients of LLVM are using this to
 280 handle the bulk of their variables.  You can be sure that bugs are found fast and
 281 fixed early.</li>
 282
 283 <li>Extremely Fast: mem2reg has a number of special cases that make it fast in
 284 common cases as well as fully general.  For example, it has fast-paths for
 285 variables that are only used in a single block, variables that only have one
 286 assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
 287 </li>
 288
 289 <li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
 290 Debug information in LLVM</a> relies on having the address of the variable
 291 exposed to attach debug info to it.  This technique dovetails very naturally
 292 with this style of debug info.</li>
 293 </ul>
 294
 295 <p>If nothing else, this makes it much easier to get your front-end up and
 296 running, and is very simple to implement.  Let's extend Kaleidoscope with mutable
 297 variables now!
 298 </p>
 299
 300 </div>
 301
 302 <!-- *********************************************************************** -->
 303 <div class="doc_section"><a name="kalvars">Mutable Variables in
 304 Kaleidoscope</a></div>
 305 <!-- *********************************************************************** -->
 306
 307 <div class="doc_text">
 308
 309 <p>Now that we know the sort of problem we want to tackle, let's see what this
 310 looks like in the context of our little Kaleidoscope language.  We're going to
 311 add two features:</p>
 312
 313 <ol>
 314 <li>The ability to mutate variables with the '=' operator.</li>
 315 <li>The ability to define new variables.</li>
 316 </ol>
 317
 318 <p>While the first item is really what this is about, we only have variables
 319 for incoming arguments and for induction variables, and redefining them only
 320 goes so far :).  Also, the ability to define new variables is a
 321 useful thing regardless of whether you will be mutating them.  Here's a
 322 motivating example that shows how we could use these:</p>
 323
 324 <div class="doc_code">
 325 <pre>
 326 # Define ':' for sequencing: as a low-precedence operator that ignores operands
 327 # and just returns the RHS.
 328 def binary : 1 (x y) y;
 329
 330 # Recursive fib, we could do this before.
 331 def fib(x)
 332   if (x &lt; 3) then
 333     1
 334   else
 335     fib(x-1)+fib(x-2);
 336
 337 # Iterative fib.
 338 def fibi(x)
 339   <b>var a = 1, b = 1, c in</b>
 340   (for i = 3, i &lt; x in
 341      <b>c = a + b</b> :
 342      <b>a = b</b> :
 343      <b>b = c</b>) :
 344   b;
 345
 346 # Call it.
 347 fibi(10);
 348 </pre>
 349 </div>
 350
 351 <p>
 352 In order to mutate variables, we have to change our existing variables to use
 353 the "alloca trick".  Once we have that, we'll add our new operator, then extend
 354 Kaleidoscope to support new variable definitions.
 355 </p>
 356
 357 </div>
 358
 359 <!-- *********************************************************************** -->
 360 <div class="doc_section"><a name="adjustments">Adjusting Existing Variables for
 361 Mutation</a></div>
 362 <!-- *********************************************************************** -->
 363
 364 <div class="doc_text">
 365
 366 <p>
 367 The symbol table in Kaleidoscope is managed at code generation time by the
 368 '<tt>NamedValues</tt>' map.  This map currently keeps track of the LLVM "Value*"
 369 that holds the double value for the named variable.  In order to support
 370 mutation, we need to change this slightly, so that <tt>NamedValues</tt> holds
 371 the <em>memory location</em> of the variable in question.  Note that this
 372 change is a refactoring: it changes the structure of the code, but does not
 373 (by itself) change the behavior of the compiler.  All of these changes are
 374 isolated in the Kaleidoscope code generator.</p>
 375
 376 <p>
 377 At this point in Kaleidoscope's development, it only supports variables for two
 378 things: incoming arguments to functions and the induction variable of 'for'
 379 loops.  For consistency, we'll allow mutation of these variables in addition to
 380 other user-defined variables.  This means that these will both need memory
 381 locations.
 382 </p>
 383
 384 <p>To start our transformation of Kaleidoscope, we'll change the NamedValues
 385 map to map to AllocaInst* instead of Value*.  Once we do this, the C++ compiler
 386 will tell us what parts of the code we need to update:</p>
 387
 388 <div class="doc_code">
 389 <pre>
 390 static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
 391 </pre>
 392 </div>
 393
 394 <p>Also, since we will need to create these allocas, we'll use a helper
 395 function that ensures that the allocas are created in the entry block of the
 396 function:</p>
 397
 398 <div class="doc_code">
 399 <pre>
 400 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
 401 /// the function.  This is used for mutable variables etc.
 402 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
 403                                           const std::string &amp;VarName) {
 404   LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
 405                    TheFunction-&gt;getEntryBlock().begin());
 406   return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
 407 }
 408 </pre>
 409 </div>
 410
 411 <p>This funny looking code creates an LLVMBuilder object that is pointing at
 412 the first instruction (.begin()) of the entry block.  It then creates an alloca
 413 with the expected name and returns it.  Because all values in Kaleidoscope are
 414 doubles, there is no need to pass in a type to use.</p>
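<p>The key idea is that the alloca always goes at the front of the entry block,
no matter where code is currently being emitted.  A minimal sketch of that
idea, using a <tt>std::list</tt> of strings as a stand-in for a basic block
(the <tt>Block</tt> alias and both functions are hypothetical and exist only
for this illustration):</p>

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: an ordered list of instructions,
// each written as text.
using Block = std::list<std::string>;

// Insert an alloca at the very beginning of the entry block, regardless of
// how many instructions the block already contains.
void createEntryBlockAlloca(Block &Entry, const std::string &VarName) {
  assert(!VarName.empty() && "variable needs a name");
  Entry.insert(Entry.begin(), "%" + VarName + " = alloca double");
}

// Build an entry block that already has code in it, then add a variable.
Block demoEntryBlock() {
  Block Entry = {"%calltmp = call double @foo()", "br label %loop"};
  createEntryBlockAlloca(Entry, "i");
  return Entry;
}
```

<p>After <tt>demoEntryBlock()</tt> runs, the alloca is the first instruction
and the existing instructions keep their relative order.</p>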
 415
 416 <p>With this in place, the first functionality change we want to make is to
 417 variable references.  In our new scheme, variables live on the stack, so code
 418 generating a reference to them actually needs to produce a load from the stack
 419 slot:</p>
 420
 421 <div class="doc_code">
 422 <pre>
 423 Value *VariableExprAST::Codegen() {
 424   // Look this variable up in the function.
 425   Value *V = NamedValues[Name];
 426   if (V == 0) return ErrorV("Unknown variable name");
 427
 428   // Load the value.
 429   return Builder.CreateLoad(V, Name.c_str());
 430 }
 431 </pre>
 432 </div>
 433
 434 <p>As you can see, this is pretty straight-forward.  Next we need to update the
 435 things that define the variables to set up the alloca.  We'll start with
 436 <tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
 437 the unabridged code):</p>
 438
 439 <div class="doc_code">
 440 <pre>
 441   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
 442
 443   <b>// Create an alloca for the variable in the entry block.
 444   AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>
 445
 446   // Emit the start code first, without 'variable' in scope.
 447   Value *StartVal = Start-&gt;Codegen();
 448   if (StartVal == 0) return 0;
 449
 450   <b>// Store the value into the alloca.
 451   Builder.CreateStore(StartVal, Alloca);</b>
 452   ...
 453
 454   // Compute the end condition.
 455   Value *EndCond = End-&gt;Codegen();
 456   if (EndCond == 0) return EndCond;
 457
 458   <b>// Reload, increment, and restore the alloca.  This handles the case where
 459   // the body of the loop mutates the variable.
 460   Value *CurVar = Builder.CreateLoad(Alloca);
 461   Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
 462   Builder.CreateStore(NextVar, Alloca);</b>
 463   ...
 464 </pre>
 465 </div>
 466
 467 <p>This code is virtually identical to the code <a
 468 href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>.  The
 469 big difference is that we no longer have to construct a PHI node, and we use
 470 load/store to access the variable as needed.</p>
 471
 472 <p>To support mutable argument variables, we need to also make allocas for them.
 473 The code for this is also pretty simple:</p>
 474
 475 <div class="doc_code">
 476 <pre>
 477 /// CreateArgumentAllocas - Create an alloca for each argument and register the
 478 /// argument in the symbol table so that references to it will succeed.
 479 void PrototypeAST::CreateArgumentAllocas(Function *F) {
 480   Function::arg_iterator AI = F-&gt;arg_begin();
 481   for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
 482     // Create an alloca for this variable.
 483     AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
 484
 485     // Store the initial value into the alloca.
 486     Builder.CreateStore(AI, Alloca);
 487
 488     // Add arguments to variable symbol table.
 489     NamedValues[Args[Idx]] = Alloca;
 490   }
 491 }
 492 </pre>
 493 </div>
 494
 495 <p>For each argument, we make an alloca, store the input value to the function
 496 into the alloca, and register the alloca as the memory location for the
 497 argument.  This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
 498 it sets up the entry block for the function.</p>
 499
 500 <p>The final missing piece is adding the 'mem2reg' pass, which allows us to get
 501 good codegen once again:</p>
 502
 503 <div class="doc_code">
 504 <pre>
 505     // Set up the optimizer pipeline.  Start with registering info about how the
 506     // target lays out data structures.
 507     OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
 508     <b>// Promote allocas to registers.
 509     OurFPM.add(createPromoteMemoryToRegisterPass());</b>
 510     // Do simple "peephole" optimizations and bit-twiddling optzns.
 511     OurFPM.add(createInstructionCombiningPass());
 512     // Reassociate expressions.
 513     OurFPM.add(createReassociatePass());
 514 </pre>
 515 </div>
 516
 517 <p>It is interesting to see what the code looks like before and after the
 518 mem2reg optimization runs.  For example, this is the before/after code for our
 519 recursive fib.  Before the optimization:</p>
 520
 521 <div class="doc_code">
 522 <pre>
 523 define double @fib(double %x) {
 524 entry:
 525         <b>%x1 = alloca double
 526         store double %x, double* %x1
 527         %x2 = load double* %x1</b>
 528         %multmp = fcmp ult double %x2, 3.000000e+00
 529         %booltmp = uitofp i1 %multmp to double
 530         %ifcond = fcmp one double %booltmp, 0.000000e+00
 531         br i1 %ifcond, label %then, label %else
 532
 533 then:           ; preds = %entry
 534         br label %ifcont
 535
 536 else:           ; preds = %entry
 537         <b>%x3 = load double* %x1</b>
 538         %subtmp = sub double %x3, 1.000000e+00
 539         %calltmp = call double @fib( double %subtmp )
 540         <b>%x4 = load double* %x1</b>
 541         %subtmp5 = sub double %x4, 2.000000e+00
 542         %calltmp6 = call double @fib( double %subtmp5 )
 543         %addtmp = add double %calltmp, %calltmp6
 544         br label %ifcont
 545
 546 ifcont:         ; preds = %else, %then
 547         %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 548         ret double %iftmp
 549 }
 550 </pre>
 551 </div>
 552
 553 <p>Here there is only one variable (x, the input argument) but you can still
 554 see the extremely simple-minded code generation strategy we are using.  In the
 555 entry block, an alloca is created, and the initial input value is stored into
 556 it.  Each reference to the variable does a reload from the stack.  Also, note
 557 that we didn't modify the if/then/else expression, so it still inserts a PHI
 558 node.  While we could make an alloca for it, it is actually easier to create a
 559 PHI node for it, so we still just make the PHI.</p>
 560
 561 <p>Here is the code after the mem2reg pass runs:</p>
 562
 563 <div class="doc_code">
 564 <pre>
 565 define double @fib(double %x) {
 566 entry:
 567         %multmp = fcmp ult double <b>%x</b>, 3.000000e+00
 568         %booltmp = uitofp i1 %multmp to double
 569         %ifcond = fcmp one double %booltmp, 0.000000e+00
 570         br i1 %ifcond, label %then, label %else
 571
 572 then:
 573         br label %ifcont
 574
 575 else:
 576         %subtmp = sub double <b>%x</b>, 1.000000e+00
 577         %calltmp = call double @fib( double %subtmp )
 578         %subtmp5 = sub double <b>%x</b>, 2.000000e+00
 579         %calltmp6 = call double @fib( double %subtmp5 )
 580         %addtmp = add double %calltmp, %calltmp6
 581         br label %ifcont
 582
 583 ifcont:         ; preds = %else, %then
 584         %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 585         ret double %iftmp
 586 }
 587 </pre>
 588 </div>
 589
 590 <p>This is a trivial case for mem2reg, since there are no redefinitions of the
 591 variable.  The point of showing this is to calm your tension about inserting
 592 such blatant inefficiencies :).</p>
 593
 594 <p>After the rest of the optimizers run, we get:</p>
 595
 596 <div class="doc_code">
 597 <pre>
 598 define double @fib(double %x) {
 599 entry:
 600         %multmp = fcmp ult double %x, 3.000000e+00
 601         %booltmp = uitofp i1 %multmp to double
 602         %ifcond = fcmp ueq double %booltmp, 0.000000e+00
 603         br i1 %ifcond, label %else, label %ifcont
 604
 605 else:
 606         %subtmp = sub double %x, 1.000000e+00
 607         %calltmp = call double @fib( double %subtmp )
 608         %subtmp5 = sub double %x, 2.000000e+00
 609         %calltmp6 = call double @fib( double %subtmp5 )
 610         %addtmp = add double %calltmp, %calltmp6
 611         ret double %addtmp
 612
 613 ifcont:
 614         ret double 1.000000e+00
 615 }
 616 </pre>
 617 </div>
 618
 619 <p>Here we see that the simplifycfg pass decided to clone the return instruction
 620 into the end of the 'else' block.  This allowed it to eliminate some branches
 621 and the PHI node.</p>
 622
 623 <p>Now that all symbol table references are updated to use stack variables,
 624 we'll add the assignment operator.</p>
 625
 626 </div>
 627
 628 <!-- *********************************************************************** -->
 629 <div class="doc_section"><a name="assignment">New Assignment Operator</a></div>
 630 <!-- *********************************************************************** -->
 631
 632 <div class="doc_text">
 633
 634 <p>With our current framework, adding a new assignment operator is really
 635 simple.  We will parse it just like any other binary operator, but handle it
 636 internally (instead of allowing the user to define it).  The first step is to
 637 set a precedence:</p>
 638
 639 <div class="doc_code">
 640 <pre>
 641  int main() {
 642    // Install standard binary operators.
 643    // 1 is lowest precedence.
 644    <b>BinopPrecedence['='] = 2;</b>
 645    BinopPrecedence['&lt;'] = 10;
 646    BinopPrecedence['+'] = 20;
 647    BinopPrecedence['-'] = 20;
 648 </pre>
 649 </div>
 650
 651 <p>Now that the parser knows the precedence of the binary operator, it takes
 652 care of all the parsing and AST generation.  We just need to implement codegen
 653 for the assignment operator.  This looks like:</p>
 654
 655 <div class="doc_code">
 656 <pre>
 657 Value *BinaryExprAST::Codegen() {
 658   // Special case '=' because we don't want to emit the LHS as an expression.
 659   if (Op == '=') {
 660     // Assignment requires the LHS to be an identifier.
 661     VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
 662     if (!LHSE)
 663       return ErrorV("destination of '=' must be a variable");
 664 </pre>
 665 </div>
 666
 667 <p>Unlike the rest of the binary operators, our assignment operator doesn't
 668 follow the "emit LHS, emit RHS, do computation" model.  As such, it is handled
 669 as a special case before the other binary operators are handled.  The other
 670 strange thing about it is that it requires the LHS to be a variable directly.
 671 </p>
 672
 673 <div class="doc_code">
 674 <pre>
 675     // Codegen the RHS.
 676     Value *Val = RHS-&gt;Codegen();
 677     if (Val == 0) return 0;
 678
 679     // Look up the name.
 680     Value *Variable = NamedValues[LHSE-&gt;getName()];
 681     if (Variable == 0) return ErrorV("Unknown variable name");
 682
 683     Builder.CreateStore(Val, Variable);
 684     return Val;
 685   }
 686   ...
 687 </pre>
 688 </div>
 689
 690 <p>Once it has the variable, codegen'ing the assignment is straight-forward:
 691 we emit the RHS of the assignment, create a store, and return the computed
 692 value.  Returning a value allows for chained assignments like "X = (Y = Z)".</p>
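<p>To see why returning the stored value makes chaining work, consider this
small sketch.  It uses a <tt>std::map</tt> as a stand-in for the stack slots;
<tt>assign</tt> and <tt>chainedAssign</tt> are hypothetical helpers written for
this illustration and are not part of the Kaleidoscope code.</p>

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the stack slots of one function.
using Slots = std::map<std::string, double>;

// Store Val into the named slot and hand the same value back, mirroring
// "create a store, and return the computed value".
double assign(Slots &S, const std::string &Name, double Val) {
  S[Name] = Val;
  return Val;
}

// Evaluate "X = (Y = Z)": the inner assignment's result feeds the outer one.
double chainedAssign(Slots &S, double Z) {
  return assign(S, "X", assign(S, "Y", Z));
}
```

<p>If <tt>assign</tt> returned nothing, the outer assignment would have no
value to store, and the chained form could not be expressed.</p>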
 693
 694 <p>Now that we have an assignment operator, we can mutate loop variables and
 695 arguments.  For example, we can now run code like this:</p>
 696
 697 <div class="doc_code">
 698 <pre>
 699 # Function to print a double.
 700 extern printd(x);
 701
 702 # Define ':' for sequencing: as a low-precedence operator that ignores operands
 703 # and just returns the RHS.
 704 def binary : 1 (x y) y;
 705
 706 def test(x)
 707   printd(x) :
 708   x = 4 :
 709   printd(x);
 710
 711 test(123);
 712 </pre>
 713 </div>
 714
 715 <p>When run, this example prints "123" and then "4", showing that we did
 716 actually mutate the value!  Okay, we have now officially implemented our goal:
 717 getting this to work requires SSA construction in the general case.  However,
 718 to be really useful, we want the ability to define our own local variables.
 719 Let's add this next!
 720 </p>
 721
 722 </div>
 723
 724 <!-- *********************************************************************** -->
 725 <div class="doc_section"><a name="localvars">User-defined Local
 726 Variables</a></div>
 727 <!-- *********************************************************************** -->
 728
 729 <div class="doc_text">
 730
 731 <p>Adding var/in is just like the other extensions we made to
 732 Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
 733 The first step for adding our new 'var/in' construct is to extend the lexer.
 734 As before, this is pretty trivial.  The code looks like this:</p>
 735
 736 <div class="doc_code">
 737 <pre>
 738 enum Token {
 739   ...
 740   <b>// var definition
 741   tok_var = -13</b>
 742 ...
 743 };
 744 ...
 745 static int gettok() {
 746 ...
 747     if (IdentifierStr == "in") return tok_in;
 748     if (IdentifierStr == "binary") return tok_binary;
 749     if (IdentifierStr == "unary") return tok_unary;
 750     <b>if (IdentifierStr == "var") return tok_var;</b>
 751     return tok_identifier;
 752 ...
 753 </pre>
 754 </div>
 755
 756 <p>The next step is to define the AST node that we will construct.  For var/in,
 757 it will look like this:</p>
 758
 759 <div class="doc_code">
 760 <pre>
 761 /// VarExprAST - Expression class for var/in
 762 class VarExprAST : public ExprAST {
 763   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
 764   ExprAST *Body;
 765 public:
 766   VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
 767              ExprAST *body)
 768   : VarNames(varnames), Body(body) {}
 769
 770   virtual Value *Codegen();
 771 };
 772 </pre>
 773 </div>
 774
 775 <p>var/in allows a list of names to be defined all at once, and each name can
 776 optionally have an initializer value.  As such, we capture this information in
 777 the VarNames vector.  Also, var/in has a body, which is allowed to access
 778 the variables defined by the var/in.</p>
 779
 780 <p>With this ready, we can define the parser pieces.  First thing we do is add
 781 it as a primary expression:</p>
 782
 783 <div class="doc_code">
 784 <pre>
 785 /// primary
 786 ///   ::= identifierexpr
 787 ///   ::= numberexpr
 788 ///   ::= parenexpr
 789 ///   ::= ifexpr
 790 ///   ::= forexpr
 791 <b>///   ::= varexpr</b>
 792 static ExprAST *ParsePrimary() {
 793   switch (CurTok) {
 794   default: return Error("unknown token when expecting an expression");
 795   case tok_identifier: return ParseIdentifierExpr();
 796   case tok_number:     return ParseNumberExpr();
 797   case '(':            return ParseParenExpr();
 798   case tok_if:         return ParseIfExpr();
 799   case tok_for:        return ParseForExpr();
 800   <b>case tok_var:        return ParseVarExpr();</b>
 801   }
 802 }
 803 </pre>
 804 </div>
 805
 806 <p>Next we define ParseVarExpr:</p>
 807
 808 <div class="doc_code">
 809 <pre>
 810 /// varexpr ::= 'var' identifier ('=' expression)?
 811 ///                   (',' identifier ('=' expression)?)* 'in' expression
 812 static ExprAST *ParseVarExpr() {
 813   getNextToken();  // eat the var.
 814
 815   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
 816
 817   // At least one variable name is required.
 818   if (CurTok != tok_identifier)
 819     return Error("expected identifier after var");
 820 </pre>
 821 </div>
 822
 823 <p>The first part of this code parses the list of identifier/expr pairs into the
 824 local <tt>VarNames</tt> vector:</p>
 825
 826 <div class="doc_code">
 827 <pre>
 828   while (1) {
 829     std::string Name = IdentifierStr;
 830     getNextToken();  // eat identifier.
 831
 832     // Read the optional initializer.
 833     ExprAST *Init = 0;
 834     if (CurTok == '=') {
 835       getNextToken(); // eat the '='.
 836
 837       Init = ParseExpression();
 838       if (Init == 0) return 0;
 839     }
 840
 841     VarNames.push_back(std::make_pair(Name, Init));
 842
 843     // End of var list, exit loop.
 844     if (CurTok != ',') break;
 845     getNextToken(); // eat the ','.
 846
 847     if (CurTok != tok_identifier)
 848       return Error("expected identifier list after var");
 849   }
 850 </pre>
 851 </div>
 852
 853 <p>Once all the variables are parsed, we then parse the body and create the
 854 AST node:</p>
 855
 856 <div class="doc_code">
 857 <pre>
 858   // At this point, we have to have 'in'.
 859   if (CurTok != tok_in)
 860     return Error("expected 'in' keyword after 'var'");
 861   getNextToken();  // eat 'in'.
 862
 863   ExprAST *Body = ParseExpression();
 864   if (Body == 0) return 0;
 865
 866   return new VarExprAST(VarNames, Body);
 867 }
 868 </pre>
 869 </div>
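<p>The control flow of <tt>ParseVarExpr</tt> can be tried out without the rest
of the compiler.  The following is a simplified sketch under stated assumptions:
it reads from a string rather than standard input, accepts only numeric literals
as initializers, and stops after the <tt>in</tt> keyword instead of parsing the
body.  <tt>Binding</tt>, <tt>Cursor</tt> and <tt>parseVarList</tt> are
hypothetical names, not part of the tutorial.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// One parsed variable: its name plus an optional numeric initializer.
struct Binding {
  std::string Name;
  bool HasInit;
  double Init;
};

// Minimal cursor over an input string (assumption: the tutorial uses stdin).
struct Cursor {
  std::string Text;
  std::size_t Pos;
  explicit Cursor(const std::string &T) : Text(T), Pos(0) {}

  void skipSpace() {
    while (Pos < Text.size() && std::isspace((unsigned char)Text[Pos])) ++Pos;
  }
  // Read [a-zA-Z][a-zA-Z0-9]*; returns "" if no word starts here.
  std::string readWord() {
    skipSpace();
    std::size_t Start = Pos;
    if (Pos < Text.size() && std::isalpha((unsigned char)Text[Pos]))
      while (Pos < Text.size() && std::isalnum((unsigned char)Text[Pos])) ++Pos;
    return Text.substr(Start, Pos - Start);
  }
  // Consume C if it is the next non-space character.
  bool eat(char C) {
    skipSpace();
    if (Pos < Text.size() && Text[Pos] == C) { ++Pos; return true; }
    return false;
  }
  // Read a run of digits and dots; returns false if there is none.
  bool readNumber(double &Out) {
    skipSpace();
    std::size_t Start = Pos;
    while (Pos < Text.size() &&
           (std::isdigit((unsigned char)Text[Pos]) || Text[Pos] == '.')) ++Pos;
    if (Start == Pos) return false;
    Out = std::strtod(Text.substr(Start, Pos - Start).c_str(), 0);
    return true;
  }
};

// Parses  'var' id ('=' number)? (',' id ('=' number)?)* 'in'
// Returns false on a syntax error; Out may then hold a partial list.
static bool parseVarList(const std::string &Src, std::vector<Binding> &Out) {
  assert(Out.empty() && "output vector should start empty");
  Cursor C(Src);
  if (C.readWord() != "var") return false;
  while (true) {
    Binding B;
    B.Name = C.readWord();
    B.HasInit = false;
    B.Init = 0.0;
    // At least one identifier is required, and 'in' is a keyword.
    if (B.Name.empty() || B.Name == "in") return false;
    if (C.eat('=')) {                 // optional initializer
      if (!C.readNumber(B.Init)) return false;
      B.HasInit = true;
    }
    Out.push_back(B);
    if (!C.eat(',')) break;           // end of the variable list
  }
  return C.readWord() == "in";        // the list must be followed by 'in'
}
```

<p>For the input <tt>var a = 1, b in a</tt> this yields two bindings, the
second without an initializer.  Inputs such as <tt>var = 1 in x</tt> or
<tt>var a b in x</tt> are rejected, mirroring the error paths in the real
parser.</p>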
 870
 871 <p>Now that we can parse and represent the code, we need to support emission of
 872 LLVM IR for it.  This code starts out with:</p>
 873
 874 <div class="doc_code">
 875 <pre>
 876 Value *VarExprAST::Codegen() {
 877   std::vector&lt;AllocaInst *&gt; OldBindings;
 878
 879   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
 880
 881   // Register all variables and emit their initializer.
 882   for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
 883     const std::string &amp;VarName = VarNames[i].first;
 884     ExprAST *Init = VarNames[i].second;
 885 </pre>
 886 </div>
 887
 888 <p>Basically it loops over all the variables, installing them one at a time.
 889 For each variable we put into the symbol table, we remember the previous value
 890 that we replace in OldBindings.</p>
 891
 892 <div class="doc_code">
 893 <pre>
 894     // Emit the initializer before adding the variable to scope; this prevents
 895     // the initializer from referencing the variable itself, and permits stuff
 896     // like this:
 897     //  var a = 1 in
 898     //    var a = a in ...   # refers to outer 'a'.
 899     Value *InitVal;
 900     if (Init) {
 901       InitVal = Init-&gt;Codegen();
 902       if (InitVal == 0) return 0;
 903     } else { // If not specified, use 0.0.
 904       InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
 905     }
 906
 907     AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
 908     Builder.CreateStore(InitVal, Alloca);
 909
 910     // Remember the old variable binding so that we can restore the binding when
 911     // we unrecurse.
 912     OldBindings.push_back(NamedValues[VarName]);
 913
 914     // Remember this binding.
 915     NamedValues[VarName] = Alloca;
 916   }
 917 </pre>
 918 </div>
 919
 920 <p>There are more comments here than code.  The basic idea is that we emit the
 921 initializer, create the alloca, then update the symbol table to point to it.
 922 Once all the variables are installed in the symbol table, we evaluate the body
 923 of the var/in expression:</p>
 924
 925 <div class="doc_code">
 926 <pre>
 927   // Codegen the body, now that all vars are in scope.
 928   Value *BodyVal = Body-&gt;Codegen();
 929   if (BodyVal == 0) return 0;
 930 </pre>
 931 </div>
 932
 933 <p>Finally, before returning, we restore the previous variable bindings:</p>
 934
 935 <div class="doc_code">
 936 <pre>
 937   // Pop all our variables from scope.
 938   for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
 939     NamedValues[VarNames[i].first] = OldBindings[i];
 940
 941   // Return the body computation.
 942   return BodyVal;
 943 }
 944 </pre>
 945 </div>
 946
 947 <p>The end result of all of this is that we get properly scoped variable
 948 definitions, and we even (trivially) allow mutation of them :).</p>
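<p>The save-and-restore pattern used for scoping can be sketched independently
of LLVM.  In the version below the symbol table maps names to plain
<tt>double*</tt> slots instead of <tt>AllocaInst*</tt>, and
<tt>pushBindings</tt> and <tt>popBindings</tt> are hypothetical helper names.
One deliberate difference from the loop above: this sketch restores the saved
bindings in reverse order, which only matters if the same name is listed twice
in a single var/in.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for NamedValues (assumption: double* replaces AllocaInst*).
typedef std::map<std::string, double*> SymbolTable;
typedef std::pair<std::string, double*> Binding;

// Install each new binding, remembering what it replaced.  A name that was
// previously unbound is remembered as a null pointer.
static std::vector<double*> pushBindings(SymbolTable &Table,
                                         const std::vector<Binding> &Vars) {
  std::vector<double*> Old;
  for (std::size_t i = 0, e = Vars.size(); i != e; ++i) {
    Old.push_back(Table[Vars[i].first]);
    Table[Vars[i].first] = Vars[i].second;
  }
  return Old;
}

// Put the saved bindings back, last one first, so that a name bound twice
// in the same list unwinds to its original binding.
static void popBindings(SymbolTable &Table, const std::vector<Binding> &Vars,
                        const std::vector<double*> &Old) {
  assert(Vars.size() == Old.size() && "one saved binding per variable");
  for (std::size_t i = Vars.size(); i != 0; --i)
    Table[Vars[i - 1].first] = Old[i - 1];
}
```

<p>After <tt>popBindings</tt> runs, a shadowed outer variable is visible again
and a name that did not exist before maps back to null, which the variable
lookup code treats as "unknown variable".</p>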
 949
 950 <p>With this, we have completed what we set out to do.  Our nice iterative fib
 951 example from the intro compiles and runs just fine.  The mem2reg pass optimizes
 952 all of our stack variables into SSA registers, inserting PHI nodes where needed,
 953 and our front-end remains simple: no iterated dominator frontier computation
 954 anywhere in sight.</p>
 955
 956 </div>
 957
 958 <!-- *********************************************************************** -->
 959 <div class="doc_section"><a name="code">Full Code Listing</a></div>
 960 <!-- *********************************************************************** -->
 961
 962 <div class="doc_text">
 963
 964 <p>
 965 Here is the complete code listing for our running example, enhanced with mutable
 966 variables and var/in support.  To build this example, use:
 967 </p>
 968
 969 <div class="doc_code">
 970 <pre>
 971    # Compile
 972    g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
 973    # Run
 974    ./toy
 975 </pre>
 976 </div>
 977
 978 <p>Here is the code:</p>
 979
 980 <div class="doc_code">
 981 <pre>
 982 #include "llvm/DerivedTypes.h"
 983 #include "llvm/ExecutionEngine/ExecutionEngine.h"
 984 #include "llvm/Module.h"
 985 #include "llvm/ModuleProvider.h"
 986 #include "llvm/PassManager.h"
 987 #include "llvm/Analysis/Verifier.h"
 988 #include "llvm/Target/TargetData.h"
 989 #include "llvm/Transforms/Scalar.h"
 990 #include "llvm/Support/LLVMBuilder.h"
 991 #include &lt;cstdio&gt;
 992 #include &lt;string&gt;
 993 #include &lt;map&gt;
 994 #include &lt;vector&gt;
 995 using namespace llvm;
 996
 997 //===----------------------------------------------------------------------===//
 998 // Lexer
 999 //===----------------------------------------------------------------------===//
1000
1001 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
1002 // of these for known things.
1003 enum Token {
1004   tok_eof = -1,
1005
1006   // commands
1007   tok_def = -2, tok_extern = -3,
1008
1009   // primary
1010   tok_identifier = -4, tok_number = -5,
1011
1012   // control
1013   tok_if = -6, tok_then = -7, tok_else = -8,
1014   tok_for = -9, tok_in = -10,
1015
1016   // operators
1017   tok_binary = -11, tok_unary = -12,
1018
1019   // var definition
1020   tok_var = -13
1021 };
1022
1023 static std::string IdentifierStr;  // Filled in if tok_identifier
1024 static double NumVal;              // Filled in if tok_number
1025
1026 /// gettok - Return the next token from standard input.
1027 static int gettok() {
1028   static int LastChar = ' ';
1029
1030   // Skip any whitespace.
1031   while (isspace(LastChar))
1032     LastChar = getchar();
1033
1034   if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
1035     IdentifierStr = LastChar;
1036     while (isalnum((LastChar = getchar())))
1037       IdentifierStr += LastChar;
1038
1039     if (IdentifierStr == "def") return tok_def;
1040     if (IdentifierStr == "extern") return tok_extern;
1041     if (IdentifierStr == "if") return tok_if;
1042     if (IdentifierStr == "then") return tok_then;
1043     if (IdentifierStr == "else") return tok_else;
1044     if (IdentifierStr == "for") return tok_for;
1045     if (IdentifierStr == "in") return tok_in;
1046     if (IdentifierStr == "binary") return tok_binary;
1047     if (IdentifierStr == "unary") return tok_unary;
1048     if (IdentifierStr == "var") return tok_var;
1049     return tok_identifier;
1050   }
1051
1052   if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
1053     std::string NumStr;
1054     do {
1055       NumStr += LastChar;
1056       LastChar = getchar();
1057     } while (isdigit(LastChar) || LastChar == '.');
1058
1059     NumVal = strtod(NumStr.c_str(), 0);
1060     return tok_number;
1061   }
1062
1063   if (LastChar == '#') {
1064     // Comment until end of line.
1065     do LastChar = getchar();
1066     while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
1067
1068     if (LastChar != EOF)
1069       return gettok();
1070   }
1071
1072   // Check for end of file.  Don't eat the EOF.
1073   if (LastChar == EOF)
1074     return tok_eof;
1075
1076   // Otherwise, just return the character as its ascii value.
1077   int ThisChar = LastChar;
1078   LastChar = getchar();
1079   return ThisChar;
1080 }
1081
1082 //===----------------------------------------------------------------------===//
1083 // Abstract Syntax Tree (aka Parse Tree)
1084 //===----------------------------------------------------------------------===//
1085
1086 /// ExprAST - Base class for all expression nodes.
1087 class ExprAST {
1088 public:
1089   virtual ~ExprAST() {}
1090   virtual Value *Codegen() = 0;
1091 };
1092
1093 /// NumberExprAST - Expression class for numeric literals like "1.0".
1094 class NumberExprAST : public ExprAST {
1095   double Val;
1096 public:
1097   NumberExprAST(double val) : Val(val) {}
1098   virtual Value *Codegen();
1099 };
1100
1101 /// VariableExprAST - Expression class for referencing a variable, like "a".
1102 class VariableExprAST : public ExprAST {
1103   std::string Name;
1104 public:
1105   VariableExprAST(const std::string &amp;name) : Name(name) {}
1106   const std::string &amp;getName() const { return Name; }
1107   virtual Value *Codegen();
1108 };
1109
1110 /// UnaryExprAST - Expression class for a unary operator.
1111 class UnaryExprAST : public ExprAST {
1112   char Opcode;
1113   ExprAST *Operand;
1114 public:
1115   UnaryExprAST(char opcode, ExprAST *operand)
1116     : Opcode(opcode), Operand(operand) {}
1117   virtual Value *Codegen();
1118 };
1119
1120 /// BinaryExprAST - Expression class for a binary operator.
1121 class BinaryExprAST : public ExprAST {
1122   char Op;
1123   ExprAST *LHS, *RHS;
1124 public:
1125   BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
1126     : Op(op), LHS(lhs), RHS(rhs) {}
1127   virtual Value *Codegen();
1128 };
1129
1130 /// CallExprAST - Expression class for function calls.
1131 class CallExprAST : public ExprAST {
1132   std::string Callee;
1133   std::vector&lt;ExprAST*&gt; Args;
1134 public:
1135   CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1136     : Callee(callee), Args(args) {}
1137   virtual Value *Codegen();
1138 };
1139
1140 /// IfExprAST - Expression class for if/then/else.
1141 class IfExprAST : public ExprAST {
1142   ExprAST *Cond, *Then, *Else;
1143 public:
1144   IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1145   : Cond(cond), Then(then), Else(_else) {}
1146   virtual Value *Codegen();
1147 };
1148
1149 /// ForExprAST - Expression class for for/in.
1150 class ForExprAST : public ExprAST {
1151   std::string VarName;
1152   ExprAST *Start, *End, *Step, *Body;
1153 public:
1154   ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1155              ExprAST *step, ExprAST *body)
1156     : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1157   virtual Value *Codegen();
1158 };
1159
1160 /// VarExprAST - Expression class for var/in
1161 class VarExprAST : public ExprAST {
1162   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1163   ExprAST *Body;
1164 public:
1165   VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
1166              ExprAST *body)
1167   : VarNames(varnames), Body(body) {}
1168
1169   virtual Value *Codegen();
1170 };
1171
1172 /// PrototypeAST - This class represents the "prototype" for a function,
1173 /// which captures its argument names as well as if it is an operator.
1174 class PrototypeAST {
1175   std::string Name;
1176   std::vector&lt;std::string&gt; Args;
1177   bool isOperator;
1178   unsigned Precedence;  // Precedence if a binary op.
1179 public:
1180   PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1181                bool isoperator = false, unsigned prec = 0)
1182   : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1183
1184   bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1185   bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1186
1187   char getOperatorName() const {
1188     assert(isUnaryOp() || isBinaryOp());
1189     return Name[Name.size()-1];
1190   }
1191
1192   unsigned getBinaryPrecedence() const { return Precedence; }
1193
1194   Function *Codegen();
1195
1196   void CreateArgumentAllocas(Function *F);
1197 };
1198
1199 /// FunctionAST - This class represents a function definition itself.
1200 class FunctionAST {
1201   PrototypeAST *Proto;
1202   ExprAST *Body;
1203 public:
1204   FunctionAST(PrototypeAST *proto, ExprAST *body)
1205     : Proto(proto), Body(body) {}
1206
1207   Function *Codegen();
1208 };
1209
1210 //===----------------------------------------------------------------------===//
1211 // Parser
1212 //===----------------------------------------------------------------------===//
1213
1214 /// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
1215 /// token the parser is looking at.  getNextToken reads another token from the
1216 /// lexer and updates CurTok with its results.
1217 static int CurTok;
1218 static int getNextToken() {
1219   return CurTok = gettok();
1220 }
1221
1222 /// BinopPrecedence - This holds the precedence for each binary operator that is
1223 /// defined.
1224 static std::map&lt;char, int&gt; BinopPrecedence;
1225
1226 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
1227 static int GetTokPrecedence() {
1228   if (!isascii(CurTok))
1229     return -1;
1230
1231   // Make sure it's a declared binop.
1232   int TokPrec = BinopPrecedence[CurTok];
1233   if (TokPrec &lt;= 0) return -1;
1234   return TokPrec;
1235 }
1236
1237 /// Error* - These are little helper functions for error handling.
1238 ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1239 PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1240 FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1241
1242 static ExprAST *ParseExpression();
1243
1244 /// identifierexpr
1245 ///   ::= identifier
1246 ///   ::= identifier '(' expression* ')'
1247 static ExprAST *ParseIdentifierExpr() {
1248   std::string IdName = IdentifierStr;
1249
1250   getNextToken();  // eat identifier.
1251
1252   if (CurTok != '(') // Simple variable ref.
1253     return new VariableExprAST(IdName);
1254
1255   // Call.
1256   getNextToken();  // eat (
1257   std::vector&lt;ExprAST*&gt; Args;
1258   if (CurTok != ')') {
1259     while (1) {
1260       ExprAST *Arg = ParseExpression();
1261       if (!Arg) return 0;
1262       Args.push_back(Arg);
1263
1264       if (CurTok == ')') break;
1265
1266       if (CurTok != ',')
1267         return Error("Expected ')'");
1268       getNextToken();
1269     }
1270   }
1271
1272   // Eat the ')'.
1273   getNextToken();
1274
1275   return new CallExprAST(IdName, Args);
1276 }
1277
1278 /// numberexpr ::= number
1279 static ExprAST *ParseNumberExpr() {
1280   ExprAST *Result = new NumberExprAST(NumVal);
1281   getNextToken(); // consume the number
1282   return Result;
1283 }
1284
1285 /// parenexpr ::= '(' expression ')'
1286 static ExprAST *ParseParenExpr() {
1287   getNextToken();  // eat (.
1288   ExprAST *V = ParseExpression();
1289   if (!V) return 0;
1290
1291   if (CurTok != ')')
1292     return Error("expected ')'");
1293   getNextToken();  // eat ).
1294   return V;
1295 }
1296
1297 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
1298 static ExprAST *ParseIfExpr() {
1299   getNextToken();  // eat the if.
1300
1301   // condition.
1302   ExprAST *Cond = ParseExpression();
1303   if (!Cond) return 0;
1304
1305   if (CurTok != tok_then)
1306     return Error("expected then");
1307   getNextToken();  // eat the then
1308
1309   ExprAST *Then = ParseExpression();
1310   if (Then == 0) return 0;
1311
1312   if (CurTok != tok_else)
1313     return Error("expected else");
1314
1315   getNextToken();
1316
1317   ExprAST *Else = ParseExpression();
1318   if (!Else) return 0;
1319
1320   return new IfExprAST(Cond, Then, Else);
1321 }
1322
1323 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1324 static ExprAST *ParseForExpr() {
1325   getNextToken();  // eat the for.
1326
1327   if (CurTok != tok_identifier)
1328     return Error("expected identifier after for");
1329
1330   std::string IdName = IdentifierStr;
1331   getNextToken();  // eat identifier.
1332
1333   if (CurTok != '=')
1334     return Error("expected '=' after for");
1335   getNextToken();  // eat '='.
1336
1337
1338   ExprAST *Start = ParseExpression();
1339   if (Start == 0) return 0;
1340   if (CurTok != ',')
1341     return Error("expected ',' after for start value");
1342   getNextToken();
1343
1344   ExprAST *End = ParseExpression();
1345   if (End == 0) return 0;
1346
1347   // The step value is optional.
1348   ExprAST *Step = 0;
1349   if (CurTok == ',') {
1350     getNextToken();
1351     Step = ParseExpression();
1352     if (Step == 0) return 0;
1353   }
1354
1355   if (CurTok != tok_in)
1356     return Error("expected 'in' after for");
1357   getNextToken();  // eat 'in'.
1358
1359   ExprAST *Body = ParseExpression();
1360   if (Body == 0) return 0;
1361
1362   return new ForExprAST(IdName, Start, End, Step, Body);
1363 }
1364
1365 /// varexpr ::= 'var' identifier ('=' expression)?
1366 ///                   (',' identifier ('=' expression)?)* 'in' expression
1367 static ExprAST *ParseVarExpr() {
1368   getNextToken();  // eat the var.
1369
1370   std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
1371
1372   // At least one variable name is required.
1373   if (CurTok != tok_identifier)
1374     return Error("expected identifier after var");
1375
1376   while (1) {
1377     std::string Name = IdentifierStr;
1378     getNextToken();  // eat identifier.
1379
1380     // Read the optional initializer.
1381     ExprAST *Init = 0;
1382     if (CurTok == '=') {
1383       getNextToken(); // eat the '='.
1384
1385       Init = ParseExpression();
1386       if (Init == 0) return 0;
1387     }
1388
1389     VarNames.push_back(std::make_pair(Name, Init));
1390
1391     // End of var list, exit loop.
1392     if (CurTok != ',') break;
1393     getNextToken(); // eat the ','.
1394
1395     if (CurTok != tok_identifier)
1396       return Error("expected identifier list after var");
1397   }
1398
1399   // At this point, we have to have 'in'.
1400   if (CurTok != tok_in)
1401     return Error("expected 'in' keyword after 'var'");
1402   getNextToken();  // eat 'in'.
1403
1404   ExprAST *Body = ParseExpression();
1405   if (Body == 0) return 0;
1406
1407   return new VarExprAST(VarNames, Body);
1408 }
1409
1410
1411 /// primary
1412 ///   ::= identifierexpr
1413 ///   ::= numberexpr
1414 ///   ::= parenexpr
1415 ///   ::= ifexpr
1416 ///   ::= forexpr
1417 ///   ::= varexpr
1418 static ExprAST *ParsePrimary() {
1419   switch (CurTok) {
1420   default: return Error("unknown token when expecting an expression");
1421   case tok_identifier: return ParseIdentifierExpr();
1422   case tok_number:     return ParseNumberExpr();
1423   case '(':            return ParseParenExpr();
1424   case tok_if:         return ParseIfExpr();
1425   case tok_for:        return ParseForExpr();
1426   case tok_var:        return ParseVarExpr();
1427   }
1428 }
1429
1430 /// unary
1431 ///   ::= primary
1432 ///   ::= '!' unary
1433 static ExprAST *ParseUnary() {
1434   // If the current token is not an operator, it must be a primary expr.
1435   if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1436     return ParsePrimary();
1437
1438   // If this is a unary operator, read it.
1439   int Opc = CurTok;
1440   getNextToken();
1441   if (ExprAST *Operand = ParseUnary())
1442     return new UnaryExprAST(Opc, Operand);
1443   return 0;
1444 }
1445
1446 /// binoprhs
1447 ///   ::= ('+' unary)*
1448 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1449   // If this is a binop, find its precedence.
1450   while (1) {
1451     int TokPrec = GetTokPrecedence();
1452
1453     // If this is a binop that binds at least as tightly as the current binop,
1454     // consume it, otherwise we are done.
1455     if (TokPrec &lt; ExprPrec)
1456       return LHS;
1457
1458     // Okay, we know this is a binop.
1459     int BinOp = CurTok;
1460     getNextToken();  // eat binop
1461
1462     // Parse the unary expression after the binary operator.
1463     ExprAST *RHS = ParseUnary();
1464     if (!RHS) return 0;
1465
1466     // If BinOp binds less tightly with RHS than the operator after RHS, let
1467     // the pending operator take RHS as its LHS.
1468     int NextPrec = GetTokPrecedence();
1469     if (TokPrec &lt; NextPrec) {
1470       RHS = ParseBinOpRHS(TokPrec+1, RHS);
1471       if (RHS == 0) return 0;
1472     }
1473
1474     // Merge LHS/RHS.
1475     LHS = new BinaryExprAST(BinOp, LHS, RHS);
1476   }
1477 }
1478
1479 /// expression
1480 ///   ::= unary binoprhs
1481 ///
1482 static ExprAST *ParseExpression() {
1483   ExprAST *LHS = ParseUnary();
1484   if (!LHS) return 0;
1485
1486   return ParseBinOpRHS(0, LHS);
1487 }
1488
1489 /// prototype
1490 ///   ::= id '(' id* ')'
1491 ///   ::= binary LETTER number? (id, id)
1492 ///   ::= unary LETTER (id)
1493 static PrototypeAST *ParsePrototype() {
1494   std::string FnName;
1495
1496   int Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
1497   unsigned BinaryPrecedence = 30;
1498
1499   switch (CurTok) {
1500   default:
1501     return ErrorP("Expected function name in prototype");
1502   case tok_identifier:
1503     FnName = IdentifierStr;
1504     Kind = 0;
1505     getNextToken();
1506     break;
1507   case tok_unary:
1508     getNextToken();
1509     if (!isascii(CurTok))
1510       return ErrorP("Expected unary operator");
1511     FnName = "unary";
1512     FnName += (char)CurTok;
1513     Kind = 1;
1514     getNextToken();
1515     break;
1516   case tok_binary:
1517     getNextToken();
1518     if (!isascii(CurTok))
1519       return ErrorP("Expected binary operator");
1520     FnName = "binary";
1521     FnName += (char)CurTok;
1522     Kind = 2;
1523     getNextToken();
1524
1525     // Read the precedence if present.
1526     if (CurTok == tok_number) {
1527       if (NumVal &lt; 1 || NumVal &gt; 100)
1528         return ErrorP("Invalid precedence: must be 1..100");
1529       BinaryPrecedence = (unsigned)NumVal;
1530       getNextToken();
1531     }
1532     break;
1533   }
1534
1535   if (CurTok != '(')
1536     return ErrorP("Expected '(' in prototype");
1537
1538   std::vector&lt;std::string&gt; ArgNames;
1539   while (getNextToken() == tok_identifier)
1540     ArgNames.push_back(IdentifierStr);
1541   if (CurTok != ')')
1542     return ErrorP("Expected ')' in prototype");
1543
1544   // success.
1545   getNextToken();  // eat ')'.
1546
1547   // Verify right number of names for operator.
1548   if (Kind &amp;&amp; ArgNames.size() != Kind)
1549     return ErrorP("Invalid number of operands for operator");
1550
1551   return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1552 }
1553
1554 /// definition ::= 'def' prototype expression
1555 static FunctionAST *ParseDefinition() {
1556   getNextToken();  // eat def.
1557   PrototypeAST *Proto = ParsePrototype();
1558   if (Proto == 0) return 0;
1559
1560   if (ExprAST *E = ParseExpression())
1561     return new FunctionAST(Proto, E);
1562   return 0;
1563 }
1564
1565 /// toplevelexpr ::= expression
1566 static FunctionAST *ParseTopLevelExpr() {
1567   if (ExprAST *E = ParseExpression()) {
1568     // Make an anonymous proto.
1569     PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1570     return new FunctionAST(Proto, E);
1571   }
1572   return 0;
1573 }
1574
1575 /// external ::= 'extern' prototype
1576 static PrototypeAST *ParseExtern() {
1577   getNextToken();  // eat extern.
1578   return ParsePrototype();
1579 }
1580
1581 //===----------------------------------------------------------------------===//
1582 // Code Generation
1583 //===----------------------------------------------------------------------===//
1584
1585 static Module *TheModule;
1586 static LLVMFoldingBuilder Builder;
1587 static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
1588 static FunctionPassManager *TheFPM;
1589
1590 Value *ErrorV(const char *Str) { Error(Str); return 0; }
1591
1592 /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
1593 /// the function.  This is used for mutable variables etc.
1594 static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
1595                                           const std::string &amp;VarName) {
1596   LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
1597                    TheFunction-&gt;getEntryBlock().begin());
1598   return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
1599 }
1600
1601
1602 Value *NumberExprAST::Codegen() {
1603   return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1604 }
1605
1606 Value *VariableExprAST::Codegen() {
1607   // Look this variable up in the function.
1608   Value *V = NamedValues[Name];
1609   if (V == 0) return ErrorV("Unknown variable name");
1610
1611   // Load the value.
1612   return Builder.CreateLoad(V, Name.c_str());
1613 }
1614
1615 Value *UnaryExprAST::Codegen() {
1616   Value *OperandV = Operand-&gt;Codegen();
1617   if (OperandV == 0) return 0;
1618
1619   Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1620   if (F == 0)
1621     return ErrorV("Unknown unary operator");
1622
1623   return Builder.CreateCall(F, OperandV, "unop");
1624 }
1625
1626
1627 Value *BinaryExprAST::Codegen() {
1628   // Special case '=' because we don't want to emit the LHS as an expression.
1629   if (Op == '=') {
1630     // Assignment requires the LHS to be an identifier.
1631     VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
1632     if (!LHSE)
1633       return ErrorV("destination of '=' must be a variable");
1634     // Codegen the RHS.
1635     Value *Val = RHS-&gt;Codegen();
1636     if (Val == 0) return 0;
1637
1638     // Look up the name.
1639     Value *Variable = NamedValues[LHSE-&gt;getName()];
1640     if (Variable == 0) return ErrorV("Unknown variable name");
1641
1642     Builder.CreateStore(Val, Variable);
1643     return Val;
1644   }
1645
1646
1647   Value *L = LHS-&gt;Codegen();
1648   Value *R = RHS-&gt;Codegen();
1649   if (L == 0 || R == 0) return 0;
1650
1651   switch (Op) {
1652   case '+': return Builder.CreateAdd(L, R, "addtmp");
1653   case '-': return Builder.CreateSub(L, R, "subtmp");
1654   case '*': return Builder.CreateMul(L, R, "multmp");
1655   case '&lt;':
1656     L = Builder.CreateFCmpULT(L, R, "cmptmp");
1657     // Convert bool 0/1 to double 0.0 or 1.0
1658     return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1659   default: break;
1660   }
1661
1662   // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1663   // a call to it.
1664   Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1665   assert(F &amp;&amp; "binary operator not found!");
1666
1667   Value *Ops[] = { L, R };
1668   return Builder.CreateCall(F, Ops, Ops+2, "binop");
1669 }
1670
1671 Value *CallExprAST::Codegen() {
1672   // Look up the name in the global module table.
1673   Function *CalleeF = TheModule-&gt;getFunction(Callee);
1674   if (CalleeF == 0)
1675     return ErrorV("Unknown function referenced");
1676
1677   // Report an error if the argument count does not match.
1678   if (CalleeF-&gt;arg_size() != Args.size())
1679     return ErrorV("Incorrect # arguments passed");
1680
1681   std::vector&lt;Value*&gt; ArgsV;
1682   for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1683     ArgsV.push_back(Args[i]-&gt;Codegen());
1684     if (ArgsV.back() == 0) return 0;
1685   }
1686
1687   return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1688 }
1689
1690 Value *IfExprAST::Codegen() {
1691   Value *CondV = Cond-&gt;Codegen();
1692   if (CondV == 0) return 0;
1693
1694   // Convert condition to a bool by comparing non-equal to 0.0.
1695   CondV = Builder.CreateFCmpONE(CondV,
1696                                 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1697                                 "ifcond");
1698
1699   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1700
1701   // Create blocks for the then and else cases.  Insert the 'then' block at the
1702   // end of the function.
1703   BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1704   BasicBlock *ElseBB = new BasicBlock("else");
1705   BasicBlock *MergeBB = new BasicBlock("ifcont");
1706
1707   Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1708
1709   // Emit then value.
1710   Builder.SetInsertPoint(ThenBB);
1711
1712   Value *ThenV = Then-&gt;Codegen();
1713   if (ThenV == 0) return 0;
1714
1715   Builder.CreateBr(MergeBB);
1716   // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1717   ThenBB = Builder.GetInsertBlock();
1718
1719   // Emit else block.
1720   TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1721   Builder.SetInsertPoint(ElseBB);
1722
1723   Value *ElseV = Else-&gt;Codegen();
1724   if (ElseV == 0) return 0;
1725
1726   Builder.CreateBr(MergeBB);
1727   // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1728   ElseBB = Builder.GetInsertBlock();
1729
1730   // Emit merge block.
1731   TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1732   Builder.SetInsertPoint(MergeBB);
1733   PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1734
1735   PN-&gt;addIncoming(ThenV, ThenBB);
1736   PN-&gt;addIncoming(ElseV, ElseBB);
1737   return PN;
1738 }
1739
1740 Value *ForExprAST::Codegen() {
1741   // Output this as:
1742   //   var = alloca double
1743   //   ...
1744   //   start = startexpr
1745   //   store start -&gt; var
1746   //   goto loop
1747   // loop:
1748   //   ...
1749   //   bodyexpr
1750   //   ...
1751   // loopend:
1752   //   step = stepexpr
1753   //   endcond = endexpr
1754   //
1755   //   curvar = load var
1756   //   nextvar = curvar + step
1757   //   store nextvar -&gt; var
1758   //   br endcond, loop, outloop
1759   // outloop:
1760
1761   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1762
1763   // Create an alloca for the variable in the entry block.
1764   AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1765
1766   // Emit the start code first, without 'variable' in scope.
1767   Value *StartVal = Start-&gt;Codegen();
1768   if (StartVal == 0) return 0;
1769
1770   // Store the value into the alloca.
1771   Builder.CreateStore(StartVal, Alloca);
1772
1773   // Make the new basic block for the loop header, inserting after current
1774   // block.
1775   BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1776   BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1777
1778   // Insert an explicit fall through from the current block to the LoopBB.
1779   Builder.CreateBr(LoopBB);
1780
1781   // Start insertion in LoopBB.
1782   Builder.SetInsertPoint(LoopBB);
1783
1784   // Within the loop, the variable is bound to its alloca.  If it
1785   // shadows an existing variable, we have to restore it, so save it now.
1786   AllocaInst *OldVal = NamedValues[VarName];
1787   NamedValues[VarName] = Alloca;
1788
1789   // Emit the body of the loop.  This, like any other expr, can change the
1790   // current BB.  Note that we ignore the value computed by the body, but don't
1791   // allow an error.
1792   if (Body-&gt;Codegen() == 0)
1793     return 0;
1794
1795   // Emit the step value.
1796   Value *StepVal;
1797   if (Step) {
1798     StepVal = Step-&gt;Codegen();
1799     if (StepVal == 0) return 0;
1800   } else {
1801     // If not specified, use 1.0.
1802     StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1803   }
1804
1805   // Compute the end condition.
1806   Value *EndCond = End-&gt;Codegen();
1807   if (EndCond == 0) return 0;
1808
1809   // Reload, increment, and store back to the alloca.  This handles the case
1810   // where the body of the loop mutates the variable.
1811   Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
1812   Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
1813   Builder.CreateStore(NextVar, Alloca);
1814
1815   // Convert condition to a bool by comparing not-equal to 0.0.
1816   EndCond = Builder.CreateFCmpONE(EndCond,
1817                                   ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1818                                   "loopcond");
1819
1820   // Create the "after loop" block and insert it.
1821   BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1822   BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1823
1824   // Insert the conditional branch into the end of LoopEndBB.
1825   Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1826
1827   // Any new code will be inserted in AfterBB.
1828   Builder.SetInsertPoint(AfterBB);
1829
1830   // Restore the unshadowed variable.
1831   if (OldVal)
1832     NamedValues[VarName] = OldVal;
1833   else
1834     NamedValues.erase(VarName);
1835
1836
1837   // for expr always returns 0.0.
1838   return Constant::getNullValue(Type::DoubleTy);
1839 }
1840
1841 Value *VarExprAST::Codegen() {
1842   std::vector&lt;AllocaInst *&gt; OldBindings;
1843
1844   Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1845
1846   // Register all variables and emit their initializer.
1847   for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
1848     const std::string &amp;VarName = VarNames[i].first;
1849     ExprAST *Init = VarNames[i].second;
1850
1851     // Emit the initializer before adding the variable to scope; this prevents
1852     // the initializer from referencing the variable itself, and permits stuff
1853     // like this:
1854     //  var a = 1 in
1855     //    var a = a in ...   # refers to outer 'a'.
1856     Value *InitVal;
1857     if (Init) {
1858       InitVal = Init-&gt;Codegen();
1859       if (InitVal == 0) return 0;
1860     } else { // If not specified, use 0.0.
1861       InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
1862     }
1863
1864     AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1865     Builder.CreateStore(InitVal, Alloca);
1866
1867     // Remember the old variable binding so that we can restore the binding when
1868     // we unrecurse.
1869     OldBindings.push_back(NamedValues[VarName]);
1870
1871     // Remember this binding.
1872     NamedValues[VarName] = Alloca;
1873   }
1874
1875   // Codegen the body, now that all vars are in scope.
1876   Value *BodyVal = Body-&gt;Codegen();
1877   if (BodyVal == 0) return 0;
1878
1879   // Pop all our variables from scope.
1880   for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
1881     NamedValues[VarNames[i].first] = OldBindings[i];
1882
1883   // Return the body computation.
1884   return BodyVal;
1885 }
1886
1887
1888 Function *PrototypeAST::Codegen() {
1889   // Make the function type:  double(double,double) etc.
1890   std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1891   FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1892
1893   Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1894
1895   // If F conflicted, there was already something named 'Name'.  If it has a
1896   // body, don't allow redefinition or reextern.
1897   if (F-&gt;getName() != Name) {
1898     // Delete the one we just made and get the existing one.
1899     F-&gt;eraseFromParent();
1900     F = TheModule-&gt;getFunction(Name);
1901
1902     // If F already has a body, reject this.
1903     if (!F-&gt;empty()) {
1904       ErrorF("redefinition of function");
1905       return 0;
1906     }
1907
1908     // If F took a different number of args, reject.
1909     if (F-&gt;arg_size() != Args.size()) {
1910       ErrorF("redefinition of function with different # args");
1911       return 0;
1912     }
1913   }
1914
1915   // Set names for all arguments.
1916   unsigned Idx = 0;
1917   for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1918        ++AI, ++Idx)
1919     AI-&gt;setName(Args[Idx]);
1920
1921   return F;
1922 }
1923
1924 /// CreateArgumentAllocas - Create an alloca for each argument and register the
1925 /// argument in the symbol table so that references to it will succeed.
1926 void PrototypeAST::CreateArgumentAllocas(Function *F) {
1927   Function::arg_iterator AI = F-&gt;arg_begin();
1928   for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
1929     // Create an alloca for this variable.
1930     AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
1931
1932     // Store the initial value into the alloca.
1933     Builder.CreateStore(AI, Alloca);
1934
1935     // Add arguments to variable symbol table.
1936     NamedValues[Args[Idx]] = Alloca;
1937   }
1938 }
1939
1940
1941 Function *FunctionAST::Codegen() {
1942   NamedValues.clear();
1943
1944   Function *TheFunction = Proto-&gt;Codegen();
1945   if (TheFunction == 0)
1946     return 0;
1947
1948   // If this is an operator, install it.
1949   if (Proto-&gt;isBinaryOp())
1950     BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1951
1952   // Create a new basic block to start insertion into.
1953   BasicBlock *BB = new BasicBlock("entry", TheFunction);
1954   Builder.SetInsertPoint(BB);
1955
1956   // Add all arguments to the symbol table and create their allocas.
1957   Proto-&gt;CreateArgumentAllocas(TheFunction);
1958
1959   if (Value *RetVal = Body-&gt;Codegen()) {
1960     // Finish off the function.
1961     Builder.CreateRet(RetVal);
1962
1963     // Validate the generated code, checking for consistency.
1964     verifyFunction(*TheFunction);
1965
1966     // Optimize the function.
1967     TheFPM-&gt;run(*TheFunction);
1968
1969     return TheFunction;
1970   }
1971
1972   // Error reading body, remove function.
1973   TheFunction-&gt;eraseFromParent();
1974
1975   if (Proto-&gt;isBinaryOp())
1976     BinopPrecedence.erase(Proto-&gt;getOperatorName());
1977   return 0;
1978 }
1979
1980 //===----------------------------------------------------------------------===//
1981 // Top-Level parsing and JIT Driver
1982 //===----------------------------------------------------------------------===//
1983
1984 static ExecutionEngine *TheExecutionEngine;
1985
1986 static void HandleDefinition() {
1987   if (FunctionAST *F = ParseDefinition()) {
1988     if (Function *LF = F-&gt;Codegen()) {
1989       fprintf(stderr, "Read function definition:");
1990       LF-&gt;dump();
1991     }
1992   } else {
1993     // Skip token for error recovery.
1994     getNextToken();
1995   }
1996 }
1997
1998 static void HandleExtern() {
1999   if (PrototypeAST *P = ParseExtern()) {
2000     if (Function *F = P-&gt;Codegen()) {
2001       fprintf(stderr, "Read extern: ");
2002       F-&gt;dump();
2003     }
2004   } else {
2005     // Skip token for error recovery.
2006     getNextToken();
2007   }
2008 }
2009
2010 static void HandleTopLevelExpression() {
2011   // Evaluate a top level expression into an anonymous function.
2012   if (FunctionAST *F = ParseTopLevelExpr()) {
2013     if (Function *LF = F-&gt;Codegen()) {
2014       // JIT the function, returning a function pointer.
2015       void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
2016
2017       // Cast it to the right type (takes no arguments, returns a double) so we
2018       // can call it as a native function.
2019       double (*FP)() = (double (*)())FPtr;
2020       fprintf(stderr, "Evaluated to %f\n", FP());
2021     }
2022   } else {
2023     // Skip token for error recovery.
2024     getNextToken();
2025   }
2026 }
2027
2028 /// top ::= definition | external | expression | ';'
2029 static void MainLoop() {
2030   while (1) {
2031     fprintf(stderr, "ready&gt; ");
2032     switch (CurTok) {
2033     case tok_eof:    return;
2034     case ';':        getNextToken(); break;  // ignore top level semicolons.
2035     case tok_def:    HandleDefinition(); break;
2036     case tok_extern: HandleExtern(); break;
2037     default:         HandleTopLevelExpression(); break;
2038     }
2039   }
2040 }
2041
2042
2043
2044 //===----------------------------------------------------------------------===//
2045 // "Library" functions that can be "extern'd" from user code.
2046 //===----------------------------------------------------------------------===//
2047
2048 /// putchard - putchar that takes a double and returns 0.
2049 extern "C"
2050 double putchard(double X) {
2051   putchar((char)X);
2052   return 0;
2053 }
2054
2055 /// printd - printf that takes a double and prints it as "%f\n", returning 0.
2056 extern "C"
2057 double printd(double X) {
2058   printf("%f\n", X);
2059   return 0;
2060 }
2061
2062 //===----------------------------------------------------------------------===//
2063 // Main driver code.
2064 //===----------------------------------------------------------------------===//
2065
2066 int main() {
2067   // Install standard binary operators.
2068   // 1 is lowest precedence.
2069   BinopPrecedence['='] = 2;
2070   BinopPrecedence['&lt;'] = 10;
2071   BinopPrecedence['+'] = 20;
2072   BinopPrecedence['-'] = 20;
2073   BinopPrecedence['*'] = 40;  // highest.
2074
2075   // Prime the first token.
2076   fprintf(stderr, "ready&gt; ");
2077   getNextToken();
2078
2079   // Make the module, which holds all the code.
2080   TheModule = new Module("my cool jit");
2081
2082   // Create the JIT.
2083   TheExecutionEngine = ExecutionEngine::create(TheModule);
2084
2085   {
2086     ExistingModuleProvider OurModuleProvider(TheModule);
2087     FunctionPassManager OurFPM(&amp;OurModuleProvider);
2088
2089     // Set up the optimizer pipeline.  Start with registering info about how the
2090     // target lays out data structures.
2091     OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
2092     // Promote allocas to registers.
2093     OurFPM.add(createPromoteMemoryToRegisterPass());
2094     // Do simple "peephole" and bit-twiddling optimizations.
2095     OurFPM.add(createInstructionCombiningPass());
2096     // Reassociate expressions.
2097     OurFPM.add(createReassociatePass());
2098     // Eliminate Common SubExpressions.
2099     OurFPM.add(createGVNPass());
2100     // Simplify the control flow graph (deleting unreachable blocks, etc).
2101     OurFPM.add(createCFGSimplificationPass());
2102
2103     // Set the global so the code gen can use this.
2104     TheFPM = &amp;OurFPM;
2105
2106     // Run the main "interpreter loop" now.
2107     MainLoop();
2108
2109     TheFPM = 0;
2110   }  // Free module provider and pass manager.
2111
2112
2113   // Print out all of the generated code.
2114   TheModule-&gt;dump();
2115   return 0;
2116 }
2117 </pre>
2118 </div>
2119
2120 </div>
2121
2122 <!-- *********************************************************************** -->
2123 <hr>
2124 <address>
2125   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
2126   src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
2127   <a href="http://validator.w3.org/check/referer"><img
2128   src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
2129
2130   <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
2131   <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
2132   Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
2133 </address>
2134 </body>
2135 </html>