docs/tutorial/OCamlLangImpl7.rst

   1 =======================================================
   2 Kaleidoscope: Extending the Language: Mutable Variables
   3 =======================================================
   4
   5 .. contents::
   6    :local:
   7
   8 Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick
   9 Tryzelaar <mailto:idadesub@users.sourceforge.net>`_
  10
  11 Chapter 7 Introduction
  12 ======================
  13
  14 Welcome to Chapter 7 of the "`Implementing a language with
  15 LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
  16 very respectable, albeit simple, `functional programming
  17 language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
  18 journey, we learned some parsing techniques, how to build and represent
  19 an AST, how to build LLVM IR, and how to optimize the resultant code as
  20 well as JIT compile it.
  21
  22 While Kaleidoscope is interesting as a functional language, the fact
  23 that it is functional makes it "too easy" to generate LLVM IR for it. In
  24 particular, a functional language makes it very easy to build LLVM IR
  25 directly in `SSA
  26 form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
  27 Since LLVM requires that the input code be in SSA form, this is a very
  28 nice property and it is often unclear to newcomers how to generate code
  29 for an imperative language with mutable variables.
  30
  31 The short (and happy) summary of this chapter is that there is no need
  32 for your front-end to build SSA form: LLVM provides highly tuned and
  33 well tested support for this, though the way it works is a bit
  34 unexpected for some.
  35
  36 Why is this a hard problem?
  37 ===========================
  38
  39 To understand why mutable variables cause complexities in SSA
  40 construction, consider this extremely simple C example:
  41
  42 .. code-block:: c
  43
  44     int G, H;
  45     int test(_Bool Condition) {
  46       int X;
  47       if (Condition)
  48         X = G;
  49       else
  50         X = H;
  51       return X;
  52     }
  53
  54 In this case, we have the variable "X", whose value depends on the path
  55 executed in the program. Because there are two different possible values
  56 for X before the return instruction, a PHI node is inserted to merge the
  57 two values. The LLVM IR that we want for this example looks like this:
  58
  59 .. code-block:: llvm
  60
  61     @G = weak global i32 0   ; type of @G is i32*
  62     @H = weak global i32 0   ; type of @H is i32*
  63
  64     define i32 @test(i1 %Condition) {
  65     entry:
  66       br i1 %Condition, label %cond_true, label %cond_false
  67
  68     cond_true:
  69       %X.0 = load i32* @G
  70       br label %cond_next
  71
  72     cond_false:
  73       %X.1 = load i32* @H
  74       br label %cond_next
  75
  76     cond_next:
  77       %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  78       ret i32 %X.2
  79     }
  80
  81 In this example, the loads from the G and H global variables are
  82 explicit in the LLVM IR, and they live in the then/else branches of the
  83 if statement (cond\_true/cond\_false). In order to merge the incoming
  84 values, the X.2 phi node in the cond\_next block selects the right value
  85 to use based on where control flow is coming from: if control flow comes
  86 from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
  87 control flow comes from cond\_true, it gets the value of X.0. The intent
  88 of this chapter is not to explain the details of SSA form. For more
  89 information, see one of the many `online
  90 references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
  91
  92 The question for this article is "who places the phi nodes when lowering
  93 assignments to mutable variables?". The issue here is that LLVM
  94 *requires* that its IR be in SSA form: there is no "non-ssa" mode for
  95 it. However, SSA construction requires non-trivial algorithms and data
  96 structures, so it is inconvenient and wasteful for every front-end to
  97 have to reproduce this logic.
  98
  99 Memory in LLVM
 100 ==============
 101
 102 The 'trick' here is that while LLVM does require all register values to
 103 be in SSA form, it does not require (or permit) memory objects to be in
 104 SSA form. In the example above, note that the loads from G and H are
 105 direct accesses to G and H: they are not renamed or versioned. This
 106 differs from some other compiler systems, which do try to version memory
 107 objects. In LLVM, instead of encoding dataflow analysis of memory into
 108 the LLVM IR, it is handled with `Analysis
 109 Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
 110
 111 With this in mind, the high-level idea is that we want to make a stack
 112 variable (which lives in memory, because it is on the stack) for each
 113 mutable object in a function. To take advantage of this trick, we need
 114 to talk about how LLVM represents stack variables.
 115
 116 In LLVM, all memory accesses are explicit with load/store instructions,
 117 and it is carefully designed not to have (or need) an "address-of"
 118 operator. Notice how the type of the @G/@H global variables is actually
 119 "i32\*" even though the variable is defined as "i32". What this means is
 120 that @G defines *space* for an i32 in the global data area, but its
 121 *name* actually refers to the address for that space. Stack variables
 122 work the same way, except that instead of being declared with global
 123 variable definitions, they are declared with the `LLVM alloca
 124 instruction <../LangRef.html#i_alloca>`_:
 125
 126 .. code-block:: llvm
 127
 128     define i32 @example() {
 129     entry:
 130       %X = alloca i32           ; type of %X is i32*.
 131       ...
 132       %tmp = load i32* %X       ; load the stack value %X from the stack.
 133       %tmp2 = add i32 %tmp, 1   ; increment it
 134       store i32 %tmp2, i32* %X  ; store it back
 135       ...
 136
 137 This code shows an example of how you can declare and manipulate a stack
 138 variable in the LLVM IR. Stack memory allocated with the alloca
 139 instruction is fully general: you can pass the address of the stack slot
 140 to functions, you can store it in other variables, etc. In our example
 141 above, we could rewrite the example to use the alloca technique to avoid
 142 using a PHI node:
 143
 144 .. code-block:: llvm
 145
 146     @G = weak global i32 0   ; type of @G is i32*
 147     @H = weak global i32 0   ; type of @H is i32*
 148
 149     define i32 @test(i1 %Condition) {
 150     entry:
 151       %X = alloca i32           ; type of %X is i32*.
 152       br i1 %Condition, label %cond_true, label %cond_false
 153
 154     cond_true:
 155       %X.0 = load i32* @G
 156             store i32 %X.0, i32* %X   ; Update X
 157       br label %cond_next
 158
 159     cond_false:
 160       %X.1 = load i32* @H
 161             store i32 %X.1, i32* %X   ; Update X
 162       br label %cond_next
 163
 164     cond_next:
 165       %X.2 = load i32* %X       ; Read X
 166       ret i32 %X.2
 167     }
 168
 169 With this, we have discovered a way to handle arbitrary mutable
 170 variables without the need to create Phi nodes at all:
 171
 172 #. Each mutable variable becomes a stack allocation.
 173 #. Each read of the variable becomes a load from the stack.
 174 #. Each update of the variable becomes a store to the stack.
 175 #. Taking the address of a variable just uses the stack address
 176    directly.
 177
 178 While this solution has solved our immediate problem, it introduced
 179 another one: we have now apparently introduced a lot of stack traffic
 180 for very simple and common operations, a major performance problem.
 181 Fortunately for us, the LLVM optimizer has a highly-tuned optimization
 182 pass named "mem2reg" that handles this case, promoting allocas like this
 183 into SSA registers, inserting Phi nodes as appropriate. If you run this
 184 example through the pass, for example, you'll get:
 185
 186 .. code-block:: bash
 187
 188     $ llvm-as < example.ll | opt -mem2reg | llvm-dis
 189     @G = weak global i32 0
 190     @H = weak global i32 0
 191
 192     define i32 @test(i1 %Condition) {
 193     entry:
 194       br i1 %Condition, label %cond_true, label %cond_false
 195
 196     cond_true:
 197       %X.0 = load i32* @G
 198       br label %cond_next
 199
 200     cond_false:
 201       %X.1 = load i32* @H
 202       br label %cond_next
 203
 204     cond_next:
 205       %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
 206       ret i32 %X.01
 207     }
 208
 209 The mem2reg pass implements the standard "iterated dominance frontier"
 210 algorithm for constructing SSA form and has a number of optimizations
 211 that speed up (very common) degenerate cases. The mem2reg optimization
 212 pass is the answer to dealing with mutable variables, and we highly
 213 recommend that you depend on it. Note that mem2reg only works on
 214 variables in certain circumstances:
 215
 216 #. mem2reg is alloca-driven: it looks for allocas and if it can handle
 217    them, it promotes them. It does not apply to global variables or heap
 218    allocations.
 219 #. mem2reg only looks for alloca instructions in the entry block of the
 220    function. Being in the entry block guarantees that the alloca is only
 221    executed once, which makes analysis simpler.
 222 #. mem2reg only promotes allocas whose uses are direct loads and stores.
 223    If the address of the stack object is passed to a function, or if any
 224    funny pointer arithmetic is involved, the alloca will not be
 225    promoted.
 226 #. mem2reg only works on allocas of `first
 227    class <../LangRef.html#t_classifications>`_ values (such as pointers,
 228    scalars and vectors), and only if the array size of the allocation is
 229    1 (or missing in the .ll file). mem2reg is not capable of promoting
 230    structs or arrays to registers. Note that the "scalarrepl" pass is
 231    more powerful and can promote structs, "unions", and arrays in many
 232    cases.
 233
 234 All of these properties are easy to satisfy for most imperative
 235 languages, and we'll illustrate it below with Kaleidoscope. The final
 236 question you may be asking is: should I bother with this nonsense for my
 237 front-end? Wouldn't it be better if I just did SSA construction
 238 directly, avoiding use of the mem2reg optimization pass? In short, we
 239 strongly recommend that you use this technique for building SSA form,
 240 unless there is an extremely good reason not to. Using this technique
 241 is:
 242
 243 -  Proven and well tested: llvm-gcc and clang both use this technique
 244    for local mutable variables. As such, the most common clients of LLVM
 245    are using this to handle a bulk of their variables. You can be sure
 246    that bugs are found fast and fixed early.
 247 -  Extremely Fast: mem2reg has a number of special cases that make it
 248    fast in common cases as well as fully general. For example, it has
 249    fast-paths for variables that are only used in a single block,
 250    variables that only have one assignment point, good heuristics to
 251    avoid insertion of unneeded phi nodes, etc.
 252 -  Needed for debug info generation: `Debug information in
 253    LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
 254    the variable exposed so that debug info can be attached to it. This
 255    technique dovetails very naturally with this style of debug info.
 256
 257 If nothing else, this makes it much easier to get your front-end up and
 258 running, and is very simple to implement. Lets extend Kaleidoscope with
 259 mutable variables now!
 260
 261 Mutable Variables in Kaleidoscope
 262 =================================
 263
 264 Now that we know the sort of problem we want to tackle, lets see what
 265 this looks like in the context of our little Kaleidoscope language.
 266 We're going to add two features:
 267
 268 #. The ability to mutate variables with the '=' operator.
 269 #. The ability to define new variables.
 270
 271 While the first item is really what this is about, we only have
 272 variables for incoming arguments as well as for induction variables, and
 273 redefining those only goes so far :). Also, the ability to define new
 274 variables is a useful thing regardless of whether you will be mutating
 275 them. Here's a motivating example that shows how we could use these:
 276
 277 ::
 278
 279     # Define ':' for sequencing: as a low-precedence operator that ignores operands
 280     # and just returns the RHS.
 281     def binary : 1 (x y) y;
 282
 283     # Recursive fib, we could do this before.
 284     def fib(x)
 285       if (x < 3) then
 286         1
 287       else
 288         fib(x-1)+fib(x-2);
 289
 290     # Iterative fib.
 291     def fibi(x)
 292       var a = 1, b = 1, c in
 293       (for i = 3, i < x in
 294          c = a + b :
 295          a = b :
 296          b = c) :
 297       b;
 298
 299     # Call it.
 300     fibi(10);
 301
 302 In order to mutate variables, we have to change our existing variables
 303 to use the "alloca trick". Once we have that, we'll add our new
 304 operator, then extend Kaleidoscope to support new variable definitions.
 305
 306 Adjusting Existing Variables for Mutation
 307 =========================================
 308
 309 The symbol table in Kaleidoscope is managed at code generation time by
 310 the '``named_values``' map. This map currently keeps track of the LLVM
 311 "Value\*" that holds the double value for the named variable. In order
 312 to support mutation, we need to change this slightly, so that it
 313 ``named_values`` holds the *memory location* of the variable in
 314 question. Note that this change is a refactoring: it changes the
 315 structure of the code, but does not (by itself) change the behavior of
 316 the compiler. All of these changes are isolated in the Kaleidoscope code
 317 generator.
 318
 319 At this point in Kaleidoscope's development, it only supports variables
 320 for two things: incoming arguments to functions and the induction
 321 variable of 'for' loops. For consistency, we'll allow mutation of these
 322 variables in addition to other user-defined variables. This means that
 323 these will both need memory locations.
 324
 325 To start our transformation of Kaleidoscope, we'll change the
 326 ``named_values`` map so that it maps to AllocaInst\* instead of Value\*.
 327 Once we do this, the C++ compiler will tell us what parts of the code we
 328 need to update:
 329
 330 **Note:** the ocaml bindings currently model both ``Value*``'s and
 331 ``AllocInst*``'s as ``Llvm.llvalue``'s, but this may change in the future
 332 to be more type safe.
 333
 334 .. code-block:: ocaml
 335
 336     let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
 337
 338 Also, since we will need to create these alloca's, we'll use a helper
 339 function that ensures that the allocas are created in the entry block of
 340 the function:
 341
 342 .. code-block:: ocaml
 343
 344     (* Create an alloca instruction in the entry block of the function. This
 345      * is used for mutable variables etc. *)
 346     let create_entry_block_alloca the_function var_name =
 347       let builder = builder_at (instr_begin (entry_block the_function)) in
 348       build_alloca double_type var_name builder
 349
 350 This funny looking code creates an ``Llvm.llbuilder`` object that is
 351 pointing at the first instruction of the entry block. It then creates an
 352 alloca with the expected name and returns it. Because all values in
 353 Kaleidoscope are doubles, there is no need to pass in a type to use.
 354
 355 With this in place, the first functionality change we want to make is to
 356 variable references. In our new scheme, variables live on the stack, so
 357 code generating a reference to them actually needs to produce a load
 358 from the stack slot:
 359
 360 .. code-block:: ocaml
 361
 362     let rec codegen_expr = function
 363       ...
 364       | Ast.Variable name ->
 365           let v = try Hashtbl.find named_values name with
 366             | Not_found -> raise (Error "unknown variable name")
 367           in
 368           (* Load the value. *)
 369           build_load v name builder
 370
 371 As you can see, this is pretty straightforward. Now we need to update
 372 the things that define the variables to set up the alloca. We'll start
 373 with ``codegen_expr Ast.For ...`` (see the `full code listing <#code>`_
 374 for the unabridged code):
 375
 376 .. code-block:: ocaml
 377
 378       | Ast.For (var_name, start, end_, step, body) ->
 379           let the_function = block_parent (insertion_block builder) in
 380
 381           (* Create an alloca for the variable in the entry block. *)
 382           let alloca = create_entry_block_alloca the_function var_name in
 383
 384           (* Emit the start code first, without 'variable' in scope. *)
 385           let start_val = codegen_expr start in
 386
 387           (* Store the value into the alloca. *)
 388           ignore(build_store start_val alloca builder);
 389
 390           ...
 391
 392           (* Within the loop, the variable is defined equal to the PHI node. If it
 393            * shadows an existing variable, we have to restore it, so save it
 394            * now. *)
 395           let old_val =
 396             try Some (Hashtbl.find named_values var_name) with Not_found -> None
 397           in
 398           Hashtbl.add named_values var_name alloca;
 399
 400           ...
 401
 402           (* Compute the end condition. *)
 403           let end_cond = codegen_expr end_ in
 404
 405           (* Reload, increment, and restore the alloca. This handles the case where
 406            * the body of the loop mutates the variable. *)
 407           let cur_var = build_load alloca var_name builder in
 408           let next_var = build_add cur_var step_val "nextvar" builder in
 409           ignore(build_store next_var alloca builder);
 410           ...
 411
 412 This code is virtually identical to the code `before we allowed mutable
 413 variables <OCamlLangImpl5.html#forcodegen>`_. The big difference is that
 414 we no longer have to construct a PHI node, and we use load/store to
 415 access the variable as needed.
 416
 417 To support mutable argument variables, we need to also make allocas for
 418 them. The code for this is also pretty simple:
 419
 420 .. code-block:: ocaml
 421
 422     (* Create an alloca for each argument and register the argument in the symbol
 423      * table so that references to it will succeed. *)
 424     let create_argument_allocas the_function proto =
 425       let args = match proto with
 426         | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
 427       in
 428       Array.iteri (fun i ai ->
 429         let var_name = args.(i) in
 430         (* Create an alloca for this variable. *)
 431         let alloca = create_entry_block_alloca the_function var_name in
 432
 433         (* Store the initial value into the alloca. *)
 434         ignore(build_store ai alloca builder);
 435
 436         (* Add arguments to variable symbol table. *)
 437         Hashtbl.add named_values var_name alloca;
 438       ) (params the_function)
 439
 440 For each argument, we make an alloca, store the input value to the
 441 function into the alloca, and register the alloca as the memory location
 442 for the argument. This method gets invoked by ``Codegen.codegen_func``
 443 right after it sets up the entry block for the function.
 444
 445 The final missing piece is adding the mem2reg pass, which allows us to
 446 get good codegen once again:
 447
 448 .. code-block:: ocaml
 449
 450     let main () =
 451       ...
 452       let the_fpm = PassManager.create_function Codegen.the_module in
 453
 454       (* Set up the optimizer pipeline.  Start with registering info about how the
 455        * target lays out data structures. *)
 456       DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
 457
 458       (* Promote allocas to registers. *)
 459       add_memory_to_register_promotion the_fpm;
 460
 461       (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
 462       add_instruction_combining the_fpm;
 463
 464       (* reassociate expressions. *)
 465       add_reassociation the_fpm;
 466
 467 It is interesting to see what the code looks like before and after the
 468 mem2reg optimization runs. For example, this is the before/after code
 469 for our recursive fib function. Before the optimization:
 470
 471 .. code-block:: llvm
 472
 473     define double @fib(double %x) {
 474     entry:
 475       %x1 = alloca double
 476       store double %x, double* %x1
 477       %x2 = load double* %x1
 478       %cmptmp = fcmp ult double %x2, 3.000000e+00
 479       %booltmp = uitofp i1 %cmptmp to double
 480       %ifcond = fcmp one double %booltmp, 0.000000e+00
 481       br i1 %ifcond, label %then, label %else
 482
 483     then:    ; preds = %entry
 484       br label %ifcont
 485
 486     else:    ; preds = %entry
 487       %x3 = load double* %x1
 488       %subtmp = fsub double %x3, 1.000000e+00
 489       %calltmp = call double @fib(double %subtmp)
 490       %x4 = load double* %x1
 491       %subtmp5 = fsub double %x4, 2.000000e+00
 492       %calltmp6 = call double @fib(double %subtmp5)
 493       %addtmp = fadd double %calltmp, %calltmp6
 494       br label %ifcont
 495
 496     ifcont:    ; preds = %else, %then
 497       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 498       ret double %iftmp
 499     }
 500
 501 Here there is only one variable (x, the input argument) but you can
 502 still see the extremely simple-minded code generation strategy we are
 503 using. In the entry block, an alloca is created, and the initial input
 504 value is stored into it. Each reference to the variable does a reload
 505 from the stack. Also, note that we didn't modify the if/then/else
 506 expression, so it still inserts a PHI node. While we could make an
 507 alloca for it, it is actually easier to create a PHI node for it, so we
 508 still just make the PHI.
 509
 510 Here is the code after the mem2reg pass runs:
 511
 512 .. code-block:: llvm
 513
 514     define double @fib(double %x) {
 515     entry:
 516       %cmptmp = fcmp ult double %x, 3.000000e+00
 517       %booltmp = uitofp i1 %cmptmp to double
 518       %ifcond = fcmp one double %booltmp, 0.000000e+00
 519       br i1 %ifcond, label %then, label %else
 520
 521     then:
 522       br label %ifcont
 523
 524     else:
 525       %subtmp = fsub double %x, 1.000000e+00
 526       %calltmp = call double @fib(double %subtmp)
 527       %subtmp5 = fsub double %x, 2.000000e+00
 528       %calltmp6 = call double @fib(double %subtmp5)
 529       %addtmp = fadd double %calltmp, %calltmp6
 530       br label %ifcont
 531
 532     ifcont:    ; preds = %else, %then
 533       %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
 534       ret double %iftmp
 535     }
 536
 537 This is a trivial case for mem2reg, since there are no redefinitions of
 538 the variable. The point of showing this is to calm your tension about
 539 inserting such blatent inefficiencies :).
 540
 541 After the rest of the optimizers run, we get:
 542
 543 .. code-block:: llvm
 544
 545     define double @fib(double %x) {
 546     entry:
 547       %cmptmp = fcmp ult double %x, 3.000000e+00
 548       %booltmp = uitofp i1 %cmptmp to double
 549       %ifcond = fcmp ueq double %booltmp, 0.000000e+00
 550       br i1 %ifcond, label %else, label %ifcont
 551
 552     else:
 553       %subtmp = fsub double %x, 1.000000e+00
 554       %calltmp = call double @fib(double %subtmp)
 555       %subtmp5 = fsub double %x, 2.000000e+00
 556       %calltmp6 = call double @fib(double %subtmp5)
 557       %addtmp = fadd double %calltmp, %calltmp6
 558       ret double %addtmp
 559
 560     ifcont:
 561       ret double 1.000000e+00
 562     }
 563
 564 Here we see that the simplifycfg pass decided to clone the return
 565 instruction into the end of the 'else' block. This allowed it to
 566 eliminate some branches and the PHI node.
 567
 568 Now that all symbol table references are updated to use stack variables,
 569 we'll add the assignment operator.
 570
 571 New Assignment Operator
 572 =======================
 573
 574 With our current framework, adding a new assignment operator is really
 575 simple. We will parse it just like any other binary operator, but handle
 576 it internally (instead of allowing the user to define it). The first
 577 step is to set a precedence:
 578
 579 .. code-block:: ocaml
 580
 581     let main () =
 582       (* Install standard binary operators.
 583        * 1 is the lowest precedence. *)
 584       Hashtbl.add Parser.binop_precedence '=' 2;
 585       Hashtbl.add Parser.binop_precedence '<' 10;
 586       Hashtbl.add Parser.binop_precedence '+' 20;
 587       Hashtbl.add Parser.binop_precedence '-' 20;
 588       ...
 589
 590 Now that the parser knows the precedence of the binary operator, it
 591 takes care of all the parsing and AST generation. We just need to
 592 implement codegen for the assignment operator. This looks like:
 593
 594 .. code-block:: ocaml
 595
 596     let rec codegen_expr = function
 597           begin match op with
 598           | '=' ->
 599               (* Special case '=' because we don't want to emit the LHS as an
 600                * expression. *)
 601               let name =
 602                 match lhs with
 603                 | Ast.Variable name -> name
 604                 | _ -> raise (Error "destination of '=' must be a variable")
 605               in
 606
 607 Unlike the rest of the binary operators, our assignment operator doesn't
 608 follow the "emit LHS, emit RHS, do computation" model. As such, it is
 609 handled as a special case before the other binary operators are handled.
 610 The other strange thing is that it requires the LHS to be a variable. It
 611 is invalid to have "(x+1) = expr" - only things like "x = expr" are
 612 allowed.
 613
 614 .. code-block:: ocaml
 615
 616               (* Codegen the rhs. *)
 617               let val_ = codegen_expr rhs in
 618
 619               (* Lookup the name. *)
 620               let variable = try Hashtbl.find named_values name with
 621               | Not_found -> raise (Error "unknown variable name")
 622               in
 623               ignore(build_store val_ variable builder);
 624               val_
 625           | _ ->
 626                 ...
 627
 628 Once we have the variable, codegen'ing the assignment is
 629 straightforward: we emit the RHS of the assignment, create a store, and
 630 return the computed value. Returning a value allows for chained
 631 assignments like "X = (Y = Z)".
 632
 633 Now that we have an assignment operator, we can mutate loop variables
 634 and arguments. For example, we can now run code like this:
 635
 636 ::
 637
 638     # Function to print a double.
 639     extern printd(x);
 640
 641     # Define ':' for sequencing: as a low-precedence operator that ignores operands
 642     # and just returns the RHS.
 643     def binary : 1 (x y) y;
 644
 645     def test(x)
 646       printd(x) :
 647       x = 4 :
 648       printd(x);
 649
 650     test(123);
 651
 652 When run, this example prints "123" and then "4", showing that we did
 653 actually mutate the value! Okay, we have now officially implemented our
 654 goal: getting this to work requires SSA construction in the general
 655 case. However, to be really useful, we want the ability to define our
 656 own local variables, lets add this next!
 657
 658 User-defined Local Variables
 659 ============================
 660
 661 Adding var/in is just like any other other extensions we made to
 662 Kaleidoscope: we extend the lexer, the parser, the AST and the code
 663 generator. The first step for adding our new 'var/in' construct is to
 664 extend the lexer. As before, this is pretty trivial, the code looks like
 665 this:
 666
 667 .. code-block:: ocaml
 668
 669     type token =
 670       ...
 671       (* var definition *)
 672       | Var
 673
 674     ...
 675
 676     and lex_ident buffer = parser
 677           ...
 678           | "in" -> [< 'Token.In; stream >]
 679           | "binary" -> [< 'Token.Binary; stream >]
 680           | "unary" -> [< 'Token.Unary; stream >]
 681           | "var" -> [< 'Token.Var; stream >]
 682           ...
 683
 684 The next step is to define the AST node that we will construct. For
 685 var/in, it looks like this:
 686
 687 .. code-block:: ocaml
 688
 689     type expr =
 690       ...
 691       (* variant for var/in. *)
 692       | Var of (string * expr option) array * expr
 693       ...
 694
 695 var/in allows a list of names to be defined all at once, and each name
 696 can optionally have an initializer value. As such, we capture this
 697 information in the VarNames vector. Also, var/in has a body, this body
 698 is allowed to access the variables defined by the var/in.
 699
 700 With this in place, we can define the parser pieces. The first thing we
 701 do is add it as a primary expression:
 702
 703 .. code-block:: ocaml
 704
 705     (* primary
 706      *   ::= identifier
 707      *   ::= numberexpr
 708      *   ::= parenexpr
 709      *   ::= ifexpr
 710      *   ::= forexpr
 711      *   ::= varexpr *)
 712     let rec parse_primary = parser
 713       ...
 714       (* varexpr
 715        *   ::= 'var' identifier ('=' expression?
 716        *             (',' identifier ('=' expression)?)* 'in' expression *)
 717       | [< 'Token.Var;
 718            (* At least one variable name is required. *)
 719            'Token.Ident id ?? "expected identifier after var";
 720            init=parse_var_init;
 721            var_names=parse_var_names [(id, init)];
 722            (* At this point, we have to have 'in'. *)
 723            'Token.In ?? "expected 'in' keyword after 'var'";
 724            body=parse_expr >] ->
 725           Ast.Var (Array.of_list (List.rev var_names), body)
 726
 727     ...
 728
 729     and parse_var_init = parser
 730       (* read in the optional initializer. *)
 731       | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
 732       | [< >] -> None
 733
 734     and parse_var_names accumulator = parser
 735       | [< 'Token.Kwd ',';
 736            'Token.Ident id ?? "expected identifier list after var";
 737            init=parse_var_init;
 738            e=parse_var_names ((id, init) :: accumulator) >] -> e
 739       | [< >] -> accumulator
 740
 741 Now that we can parse and represent the code, we need to support
 742 emission of LLVM IR for it. This code starts out with:
 743
 744 .. code-block:: ocaml
 745
 746     let rec codegen_expr = function
 747       ...
 748       | Ast.Var (var_names, body)
 749           let old_bindings = ref [] in
 750
 751           let the_function = block_parent (insertion_block builder) in
 752
 753           (* Register all variables and emit their initializer. *)
 754           Array.iter (fun (var_name, init) ->
 755
 756 Basically it loops over all the variables, installing them one at a
 757 time. For each variable we put into the symbol table, we remember the
 758 previous value that we replace in OldBindings.
 759
 760 .. code-block:: ocaml
 761
 762             (* Emit the initializer before adding the variable to scope, this
 763              * prevents the initializer from referencing the variable itself, and
 764              * permits stuff like this:
 765              *   var a = 1 in
 766              *     var a = a in ...   # refers to outer 'a'. *)
 767             let init_val =
 768               match init with
 769               | Some init -> codegen_expr init
 770               (* If not specified, use 0.0. *)
 771               | None -> const_float double_type 0.0
 772             in
 773
 774             let alloca = create_entry_block_alloca the_function var_name in
 775             ignore(build_store init_val alloca builder);
 776
 777             (* Remember the old variable binding so that we can restore the binding
 778              * when we unrecurse. *)
 779
 780             begin
 781               try
 782                 let old_value = Hashtbl.find named_values var_name in
 783                 old_bindings := (var_name, old_value) :: !old_bindings;
 784               with Not_found > ()
 785             end;
 786
 787             (* Remember this binding. *)
 788             Hashtbl.add named_values var_name alloca;
 789           ) var_names;
 790
 791 There are more comments here than code. The basic idea is that we emit
 792 the initializer, create the alloca, then update the symbol table to
 793 point to it. Once all the variables are installed in the symbol table,
 794 we evaluate the body of the var/in expression:
 795
 796 .. code-block:: ocaml
 797
 798           (* Codegen the body, now that all vars are in scope. *)
 799           let body_val = codegen_expr body in
 800
 801 Finally, before returning, we restore the previous variable bindings:
 802
 803 .. code-block:: ocaml
 804
 805           (* Pop all our variables from scope. *)
 806           List.iter (fun (var_name, old_value) ->
 807             Hashtbl.add named_values var_name old_value
 808           ) !old_bindings;
 809
 810           (* Return the body computation. *)
 811           body_val
 812
 813 The end result of all of this is that we get properly scoped variable
 814 definitions, and we even (trivially) allow mutation of them :).
 815
 816 With this, we completed what we set out to do. Our nice iterative fib
 817 example from the intro compiles and runs just fine. The mem2reg pass
 818 optimizes all of our stack variables into SSA registers, inserting PHI
 819 nodes where needed, and our front-end remains simple: no "iterated
 820 dominance frontier" computation anywhere in sight.
 821
 822 Full Code Listing
 823 =================
 824
 825 Here is the complete code listing for our running example, enhanced with
 826 mutable variables and var/in support. To build this example, use:
 827
 828 .. code-block:: bash
 829
 830     # Compile
 831     ocamlbuild toy.byte
 832     # Run
 833     ./toy.byte
 834
 835 Here is the code:
 836
 837 \_tags:
 838     ::
 839
 840         <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
 841         <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
 842         <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
 843         <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
 844
 845 myocamlbuild.ml:
 846     .. code-block:: ocaml
 847
 848         open Ocamlbuild_plugin;;
 849
 850         ocaml_lib ~extern:true "llvm";;
 851         ocaml_lib ~extern:true "llvm_analysis";;
 852         ocaml_lib ~extern:true "llvm_executionengine";;
 853         ocaml_lib ~extern:true "llvm_target";;
 854         ocaml_lib ~extern:true "llvm_scalar_opts";;
 855
 856         flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
 857         dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
 858
 859 token.ml:
 860     .. code-block:: ocaml
 861
 862         (*===----------------------------------------------------------------------===
 863          * Lexer Tokens
 864          *===----------------------------------------------------------------------===*)
 865
 866         (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
 867          * these others for known things. *)
 868         type token =
 869           (* commands *)
 870           | Def | Extern
 871
 872           (* primary *)
 873           | Ident of string | Number of float
 874
 875           (* unknown *)
 876           | Kwd of char
 877
 878           (* control *)
 879           | If | Then | Else
 880           | For | In
 881
 882           (* operators *)
 883           | Binary | Unary
 884
 885           (* var definition *)
 886           | Var
 887
 888 lexer.ml:
 889     .. code-block:: ocaml
 890
 891         (*===----------------------------------------------------------------------===
 892          * Lexer
 893          *===----------------------------------------------------------------------===*)
 894
 895         let rec lex = parser
 896           (* Skip any whitespace. *)
 897           | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
 898
 899           (* identifier: [a-zA-Z][a-zA-Z0-9] *)
 900           | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
 901               let buffer = Buffer.create 1 in
 902               Buffer.add_char buffer c;
 903               lex_ident buffer stream
 904
 905           (* number: [0-9.]+ *)
 906           | [< ' ('0' .. '9' as c); stream >] ->
 907               let buffer = Buffer.create 1 in
 908               Buffer.add_char buffer c;
 909               lex_number buffer stream
 910
 911           (* Comment until end of line. *)
 912           | [< ' ('#'); stream >] ->
 913               lex_comment stream
 914
 915           (* Otherwise, just return the character as its ascii value. *)
 916           | [< 'c; stream >] ->
 917               [< 'Token.Kwd c; lex stream >]
 918
 919           (* end of stream. *)
 920           | [< >] -> [< >]
 921
 922         and lex_number buffer = parser
 923           | [< ' ('0' .. '9' | '.' as c); stream >] ->
 924               Buffer.add_char buffer c;
 925               lex_number buffer stream
 926           | [< stream=lex >] ->
 927               [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
 928
 929         and lex_ident buffer = parser
 930           | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
 931               Buffer.add_char buffer c;
 932               lex_ident buffer stream
 933           | [< stream=lex >] ->
 934               match Buffer.contents buffer with
 935               | "def" -> [< 'Token.Def; stream >]
 936               | "extern" -> [< 'Token.Extern; stream >]
 937               | "if" -> [< 'Token.If; stream >]
 938               | "then" -> [< 'Token.Then; stream >]
 939               | "else" -> [< 'Token.Else; stream >]
 940               | "for" -> [< 'Token.For; stream >]
 941               | "in" -> [< 'Token.In; stream >]
 942               | "binary" -> [< 'Token.Binary; stream >]
 943               | "unary" -> [< 'Token.Unary; stream >]
 944               | "var" -> [< 'Token.Var; stream >]
 945               | id -> [< 'Token.Ident id; stream >]
 946
 947         and lex_comment = parser
 948           | [< ' ('\n'); stream=lex >] -> stream
 949           | [< 'c; e=lex_comment >] -> e
 950           | [< >] -> [< >]
 951
 952 ast.ml:
 953     .. code-block:: ocaml
 954
 955         (*===----------------------------------------------------------------------===
 956          * Abstract Syntax Tree (aka Parse Tree)
 957          *===----------------------------------------------------------------------===*)
 958
 959         (* expr - Base type for all expression nodes. *)
 960         type expr =
 961           (* variant for numeric literals like "1.0". *)
 962           | Number of float
 963
 964           (* variant for referencing a variable, like "a". *)
 965           | Variable of string
 966
 967           (* variant for a unary operator. *)
 968           | Unary of char * expr
 969
 970           (* variant for a binary operator. *)
 971           | Binary of char * expr * expr
 972
 973           (* variant for function calls. *)
 974           | Call of string * expr array
 975
 976           (* variant for if/then/else. *)
 977           | If of expr * expr * expr
 978
 979           (* variant for for/in. *)
 980           | For of string * expr * expr * expr option * expr
 981
 982           (* variant for var/in. *)
 983           | Var of (string * expr option) array * expr
 984
 985         (* proto - This type represents the "prototype" for a function, which captures
 986          * its name, and its argument names (thus implicitly the number of arguments the
 987          * function takes). *)
 988         type proto =
 989           | Prototype of string * string array
 990           | BinOpPrototype of string * string array * int
 991
 992         (* func - This type represents a function definition itself. *)
 993         type func = Function of proto * expr
 994
 995 parser.ml:
 996     .. code-block:: ocaml
 997
 998         (*===---------------------------------------------------------------------===
 999          * Parser
1000          *===---------------------------------------------------------------------===*)
1001
1002         (* binop_precedence - This holds the precedence for each binary operator that is
1003          * defined *)
1004         let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
1005
1006         (* precedence - Get the precedence of the pending binary operator token. *)
1007         let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
1008
1009         (* primary
1010          *   ::= identifier
1011          *   ::= numberexpr
1012          *   ::= parenexpr
1013          *   ::= ifexpr
1014          *   ::= forexpr
1015          *   ::= varexpr *)
1016         let rec parse_primary = parser
1017           (* numberexpr ::= number *)
1018           | [< 'Token.Number n >] -> Ast.Number n
1019
1020           (* parenexpr ::= '(' expression ')' *)
1021           | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
1022
1023           (* identifierexpr
1024            *   ::= identifier
1025            *   ::= identifier '(' argumentexpr ')' *)
1026           | [< 'Token.Ident id; stream >] ->
1027               let rec parse_args accumulator = parser
1028                 | [< e=parse_expr; stream >] ->
1029                     begin parser
1030                       | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
1031                       | [< >] -> e :: accumulator
1032                     end stream
1033                 | [< >] -> accumulator
1034               in
1035               let rec parse_ident id = parser
1036                 (* Call. *)
1037                 | [< 'Token.Kwd '(';
1038                      args=parse_args [];
1039                      'Token.Kwd ')' ?? "expected ')'">] ->
1040                     Ast.Call (id, Array.of_list (List.rev args))
1041
1042                 (* Simple variable ref. *)
1043                 | [< >] -> Ast.Variable id
1044               in
1045               parse_ident id stream
1046
1047           (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
1048           | [< 'Token.If; c=parse_expr;
1049                'Token.Then ?? "expected 'then'"; t=parse_expr;
1050                'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
1051               Ast.If (c, t, e)
1052
1053           (* forexpr
1054                 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
1055           | [< 'Token.For;
1056                'Token.Ident id ?? "expected identifier after for";
1057                'Token.Kwd '=' ?? "expected '=' after for";
1058                stream >] ->
1059               begin parser
1060                 | [<
1061                      start=parse_expr;
1062                      'Token.Kwd ',' ?? "expected ',' after for";
1063                      end_=parse_expr;
1064                      stream >] ->
1065                     let step =
1066                       begin parser
1067                       | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
1068                       | [< >] -> None
1069                       end stream
1070                     in
1071                     begin parser
1072                     | [< 'Token.In; body=parse_expr >] ->
1073                         Ast.For (id, start, end_, step, body)
1074                     | [< >] ->
1075                         raise (Stream.Error "expected 'in' after for")
1076                     end stream
1077                 | [< >] ->
1078                     raise (Stream.Error "expected '=' after for")
1079               end stream
1080
1081           (* varexpr
1082            *   ::= 'var' identifier ('=' expression?
1083            *             (',' identifier ('=' expression)?)* 'in' expression *)
1084           | [< 'Token.Var;
1085                (* At least one variable name is required. *)
1086                'Token.Ident id ?? "expected identifier after var";
1087                init=parse_var_init;
1088                var_names=parse_var_names [(id, init)];
1089                (* At this point, we have to have 'in'. *)
1090                'Token.In ?? "expected 'in' keyword after 'var'";
1091                body=parse_expr >] ->
1092               Ast.Var (Array.of_list (List.rev var_names), body)
1093
1094           | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
1095
1096         (* unary
1097          *   ::= primary
1098          *   ::= '!' unary *)
1099         and parse_unary = parser
1100           (* If this is a unary operator, read it. *)
1101           | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
1102               Ast.Unary (op, operand)
1103
1104           (* If the current token is not an operator, it must be a primary expr. *)
1105           | [< stream >] -> parse_primary stream
1106
1107         (* binoprhs
1108          *   ::= ('+' primary)* *)
1109         and parse_bin_rhs expr_prec lhs stream =
1110           match Stream.peek stream with
1111           (* If this is a binop, find its precedence. *)
1112           | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
1113               let token_prec = precedence c in
1114
1115               (* If this is a binop that binds at least as tightly as the current binop,
1116                * consume it, otherwise we are done. *)
1117               if token_prec < expr_prec then lhs else begin
1118                 (* Eat the binop. *)
1119                 Stream.junk stream;
1120
1121                 (* Parse the primary expression after the binary operator. *)
1122                 let rhs = parse_unary stream in
1123
1124                 (* Okay, we know this is a binop. *)
1125                 let rhs =
1126                   match Stream.peek stream with
1127                   | Some (Token.Kwd c2) ->
1128                       (* If BinOp binds less tightly with rhs than the operator after
1129                        * rhs, let the pending operator take rhs as its lhs. *)
1130                       let next_prec = precedence c2 in
1131                       if token_prec < next_prec
1132                       then parse_bin_rhs (token_prec + 1) rhs stream
1133                       else rhs
1134                   | _ -> rhs
1135                 in
1136
1137                 (* Merge lhs/rhs. *)
1138                 let lhs = Ast.Binary (c, lhs, rhs) in
1139                 parse_bin_rhs expr_prec lhs stream
1140               end
1141           | _ -> lhs
1142
1143         and parse_var_init = parser
1144           (* read in the optional initializer. *)
1145           | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
1146           | [< >] -> None
1147
1148         and parse_var_names accumulator = parser
1149           | [< 'Token.Kwd ',';
1150                'Token.Ident id ?? "expected identifier list after var";
1151                init=parse_var_init;
1152                e=parse_var_names ((id, init) :: accumulator) >] -> e
1153           | [< >] -> accumulator
1154
1155         (* expression
1156          *   ::= primary binoprhs *)
1157         and parse_expr = parser
1158           | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
1159
1160         (* prototype
1161          *   ::= id '(' id* ')'
1162          *   ::= binary LETTER number? (id, id)
1163          *   ::= unary LETTER number? (id) *)
1164         let parse_prototype =
1165           let rec parse_args accumulator = parser
1166             | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
1167             | [< >] -> accumulator
1168           in
1169           let parse_operator = parser
1170             | [< 'Token.Unary >] -> "unary", 1
1171             | [< 'Token.Binary >] -> "binary", 2
1172           in
1173           let parse_binary_precedence = parser
1174             | [< 'Token.Number n >] -> int_of_float n
1175             | [< >] -> 30
1176           in
1177           parser
1178           | [< 'Token.Ident id;
1179                'Token.Kwd '(' ?? "expected '(' in prototype";
1180                args=parse_args [];
1181                'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
1182               (* success. *)
1183               Ast.Prototype (id, Array.of_list (List.rev args))
1184           | [< (prefix, kind)=parse_operator;
1185                'Token.Kwd op ?? "expected an operator";
1186                (* Read the precedence if present. *)
1187                binary_precedence=parse_binary_precedence;
1188                'Token.Kwd '(' ?? "expected '(' in prototype";
1189                 args=parse_args [];
1190                'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
1191               let name = prefix ^ (String.make 1 op) in
1192               let args = Array.of_list (List.rev args) in
1193
1194               (* Verify right number of arguments for operator. *)
1195               if Array.length args != kind
1196               then raise (Stream.Error "invalid number of operands for operator")
1197               else
1198                 if kind == 1 then
1199                   Ast.Prototype (name, args)
1200                 else
1201                   Ast.BinOpPrototype (name, args, binary_precedence)
1202           | [< >] ->
1203               raise (Stream.Error "expected function name in prototype")
1204
1205         (* definition ::= 'def' prototype expression *)
1206         let parse_definition = parser
1207           | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
1208               Ast.Function (p, e)
1209
1210         (* toplevelexpr ::= expression *)
1211         let parse_toplevel = parser
1212           | [< e=parse_expr >] ->
1213               (* Make an anonymous proto. *)
1214               Ast.Function (Ast.Prototype ("", [||]), e)
1215
1216         (*  external ::= 'extern' prototype *)
1217         let parse_extern = parser
1218           | [< 'Token.Extern; e=parse_prototype >] -> e
1219
1220 codegen.ml:
1221     .. code-block:: ocaml
1222
1223         (*===----------------------------------------------------------------------===
1224          * Code Generation
1225          *===----------------------------------------------------------------------===*)
1226
1227         open Llvm
1228
1229         exception Error of string
1230
1231         let context = global_context ()
1232         let the_module = create_module context "my cool jit"
1233         let builder = builder context
1234         let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
1235         let double_type = double_type context
1236
1237         (* Create an alloca instruction in the entry block of the function. This
1238          * is used for mutable variables etc. *)
1239         let create_entry_block_alloca the_function var_name =
1240           let builder = builder_at context (instr_begin (entry_block the_function)) in
1241           build_alloca double_type var_name builder
1242
1243         let rec codegen_expr = function
1244           | Ast.Number n -> const_float double_type n
1245           | Ast.Variable name ->
1246               let v = try Hashtbl.find named_values name with
1247                 | Not_found -> raise (Error "unknown variable name")
1248               in
1249               (* Load the value. *)
1250               build_load v name builder
1251           | Ast.Unary (op, operand) ->
1252               let operand = codegen_expr operand in
1253               let callee = "unary" ^ (String.make 1 op) in
1254               let callee =
1255                 match lookup_function callee the_module with
1256                 | Some callee -> callee
1257                 | None -> raise (Error "unknown unary operator")
1258               in
1259               build_call callee [|operand|] "unop" builder
1260           | Ast.Binary (op, lhs, rhs) ->
1261               begin match op with
1262               | '=' ->
1263                   (* Special case '=' because we don't want to emit the LHS as an
1264                    * expression. *)
1265                   let name =
1266                     match lhs with
1267                     | Ast.Variable name -> name
1268                     | _ -> raise (Error "destination of '=' must be a variable")
1269                   in
1270
1271                   (* Codegen the rhs. *)
1272                   let val_ = codegen_expr rhs in
1273
1274                   (* Lookup the name. *)
1275                   let variable = try Hashtbl.find named_values name with
1276                   | Not_found -> raise (Error "unknown variable name")
1277                   in
1278                   ignore(build_store val_ variable builder);
1279                   val_
1280               | _ ->
1281                   let lhs_val = codegen_expr lhs in
1282                   let rhs_val = codegen_expr rhs in
1283                   begin
1284                     match op with
1285                     | '+' -> build_add lhs_val rhs_val "addtmp" builder
1286                     | '-' -> build_sub lhs_val rhs_val "subtmp" builder
1287                     | '*' -> build_mul lhs_val rhs_val "multmp" builder
1288                     | '<' ->
1289                         (* Convert bool 0/1 to double 0.0 or 1.0 *)
1290                         let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
1291                         build_uitofp i double_type "booltmp" builder
1292                     | _ ->
1293                         (* If it wasn't a builtin binary operator, it must be a user defined
1294                          * one. Emit a call to it. *)
1295                         let callee = "binary" ^ (String.make 1 op) in
1296                         let callee =
1297                           match lookup_function callee the_module with
1298                           | Some callee -> callee
1299                           | None -> raise (Error "binary operator not found!")
1300                         in
1301                         build_call callee [|lhs_val; rhs_val|] "binop" builder
1302                   end
1303               end
1304           | Ast.Call (callee, args) ->
1305               (* Look up the name in the module table. *)
1306               let callee =
1307                 match lookup_function callee the_module with
1308                 | Some callee -> callee
1309                 | None -> raise (Error "unknown function referenced")
1310               in
1311               let params = params callee in
1312
1313               (* If argument mismatch error. *)
1314               if Array.length params == Array.length args then () else
1315                 raise (Error "incorrect # arguments passed");
1316               let args = Array.map codegen_expr args in
1317               build_call callee args "calltmp" builder
1318           | Ast.If (cond, then_, else_) ->
1319               let cond = codegen_expr cond in
1320
1321               (* Convert condition to a bool by comparing equal to 0.0 *)
1322               let zero = const_float double_type 0.0 in
1323               let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
1324
1325               (* Grab the first block so that we might later add the conditional branch
1326                * to it at the end of the function. *)
1327               let start_bb = insertion_block builder in
1328               let the_function = block_parent start_bb in
1329
1330               let then_bb = append_block context "then" the_function in
1331
1332               (* Emit 'then' value. *)
1333               position_at_end then_bb builder;
1334               let then_val = codegen_expr then_ in
1335
1336               (* Codegen of 'then' can change the current block, update then_bb for the
1337                * phi. We create a new name because one is used for the phi node, and the
1338                * other is used for the conditional branch. *)
1339               let new_then_bb = insertion_block builder in
1340
1341               (* Emit 'else' value. *)
1342               let else_bb = append_block context "else" the_function in
1343               position_at_end else_bb builder;
1344               let else_val = codegen_expr else_ in
1345
1346               (* Codegen of 'else' can change the current block, update else_bb for the
1347                * phi. *)
1348               let new_else_bb = insertion_block builder in
1349
1350               (* Emit merge block. *)
1351               let merge_bb = append_block context "ifcont" the_function in
1352               position_at_end merge_bb builder;
1353               let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
1354               let phi = build_phi incoming "iftmp" builder in
1355
1356               (* Return to the start block to add the conditional branch. *)
1357               position_at_end start_bb builder;
1358               ignore (build_cond_br cond_val then_bb else_bb builder);
1359
1360               (* Set a unconditional branch at the end of the 'then' block and the
1361                * 'else' block to the 'merge' block. *)
1362               position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
1363               position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
1364
1365               (* Finally, set the builder to the end of the merge block. *)
1366               position_at_end merge_bb builder;
1367
1368               phi
1369           | Ast.For (var_name, start, end_, step, body) ->
1370               (* Output this as:
1371                *   var = alloca double
1372                *   ...
1373                *   start = startexpr
1374                *   store start -> var
1375                *   goto loop
1376                * loop:
1377                *   ...
1378                *   bodyexpr
1379                *   ...
1380                * loopend:
1381                *   step = stepexpr
1382                *   endcond = endexpr
1383                *
1384                *   curvar = load var
1385                *   nextvar = curvar + step
1386                *   store nextvar -> var
1387                *   br endcond, loop, endloop
1388                * outloop: *)
1389
1390               let the_function = block_parent (insertion_block builder) in
1391
1392               (* Create an alloca for the variable in the entry block. *)
1393               let alloca = create_entry_block_alloca the_function var_name in
1394
1395               (* Emit the start code first, without 'variable' in scope. *)
1396               let start_val = codegen_expr start in
1397
1398               (* Store the value into the alloca. *)
1399               ignore(build_store start_val alloca builder);
1400
1401               (* Make the new basic block for the loop header, inserting after current
1402                * block. *)
1403               let loop_bb = append_block context "loop" the_function in
1404
1405               (* Insert an explicit fall through from the current block to the
1406                * loop_bb. *)
1407               ignore (build_br loop_bb builder);
1408
1409               (* Start insertion in loop_bb. *)
1410               position_at_end loop_bb builder;
1411
1412               (* Within the loop, the variable is defined equal to the PHI node. If it
1413                * shadows an existing variable, we have to restore it, so save it
1414                * now. *)
1415               let old_val =
1416                 try Some (Hashtbl.find named_values var_name) with Not_found -> None
1417               in
1418               Hashtbl.add named_values var_name alloca;
1419
1420               (* Emit the body of the loop.  This, like any other expr, can change the
1421                * current BB.  Note that we ignore the value computed by the body, but
1422                * don't allow an error *)
1423               ignore (codegen_expr body);
1424
1425               (* Emit the step value. *)
1426               let step_val =
1427                 match step with
1428                 | Some step -> codegen_expr step
1429                 (* If not specified, use 1.0. *)
1430                 | None -> const_float double_type 1.0
1431               in
1432
1433               (* Compute the end condition. *)
1434               let end_cond = codegen_expr end_ in
1435
1436               (* Reload, increment, and restore the alloca. This handles the case where
1437                * the body of the loop mutates the variable. *)
1438               let cur_var = build_load alloca var_name builder in
1439               let next_var = build_add cur_var step_val "nextvar" builder in
1440               ignore(build_store next_var alloca builder);
1441
1442               (* Convert condition to a bool by comparing equal to 0.0. *)
1443               let zero = const_float double_type 0.0 in
1444               let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
1445
1446               (* Create the "after loop" block and insert it. *)
1447               let after_bb = append_block context "afterloop" the_function in
1448
1449               (* Insert the conditional branch into the end of loop_end_bb. *)
1450               ignore (build_cond_br end_cond loop_bb after_bb builder);
1451
1452               (* Any new code will be inserted in after_bb. *)
1453               position_at_end after_bb builder;
1454
1455               (* Restore the unshadowed variable. *)
1456               begin match old_val with
1457               | Some old_val -> Hashtbl.add named_values var_name old_val
1458               | None -> ()
1459               end;
1460
1461               (* for expr always returns 0.0. *)
1462               const_null double_type
1463           | Ast.Var (var_names, body) ->
1464               let old_bindings = ref [] in
1465
1466               let the_function = block_parent (insertion_block builder) in
1467
1468               (* Register all variables and emit their initializer. *)
1469               Array.iter (fun (var_name, init) ->
1470                 (* Emit the initializer before adding the variable to scope, this
1471                  * prevents the initializer from referencing the variable itself, and
1472                  * permits stuff like this:
1473                  *   var a = 1 in
1474                  *     var a = a in ...   # refers to outer 'a'. *)
1475                 let init_val =
1476                   match init with
1477                   | Some init -> codegen_expr init
1478                   (* If not specified, use 0.0. *)
1479                   | None -> const_float double_type 0.0
1480                 in
1481
1482                 let alloca = create_entry_block_alloca the_function var_name in
1483                 ignore(build_store init_val alloca builder);
1484
1485                 (* Remember the old variable binding so that we can restore the binding
1486                  * when we unrecurse. *)
1487                 begin
1488                   try
1489                     let old_value = Hashtbl.find named_values var_name in
1490                     old_bindings := (var_name, old_value) :: !old_bindings;
1491                   with Not_found -> ()
1492                 end;
1493
1494                 (* Remember this binding. *)
1495                 Hashtbl.add named_values var_name alloca;
1496               ) var_names;
1497
1498               (* Codegen the body, now that all vars are in scope. *)
1499               let body_val = codegen_expr body in
1500
1501               (* Pop all our variables from scope. *)
1502               List.iter (fun (var_name, old_value) ->
1503                 Hashtbl.add named_values var_name old_value
1504               ) !old_bindings;
1505
1506               (* Return the body computation. *)
1507               body_val
1508
1509         let codegen_proto = function
1510           | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) ->
1511               (* Make the function type: double(double,double) etc. *)
1512               let doubles = Array.make (Array.length args) double_type in
1513               let ft = function_type double_type doubles in
1514               let f =
1515                 match lookup_function name the_module with
1516                 | None -> declare_function name ft the_module
1517
1518                 (* If 'f' conflicted, there was already something named 'name'. If it
1519                  * has a body, don't allow redefinition or reextern. *)
1520                 | Some f ->
1521                     (* If 'f' already has a body, reject this. *)
1522                     if block_begin f <> At_end f then
1523                       raise (Error "redefinition of function");
1524
1525                     (* If 'f' took a different number of arguments, reject. *)
1526                     if element_type (type_of f) <> ft then
1527                       raise (Error "redefinition of function with different # args");
1528                     f
1529               in
1530
1531               (* Set names for all arguments. *)
1532               Array.iteri (fun i a ->
1533                 let n = args.(i) in
1534                 set_value_name n a;
1535                 Hashtbl.add named_values n a;
1536               ) (params f);
1537               f
1538
1539         (* Create an alloca for each argument and register the argument in the symbol
1540          * table so that references to it will succeed. *)
1541         let create_argument_allocas the_function proto =
1542           let args = match proto with
1543             | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
1544           in
1545           Array.iteri (fun i ai ->
1546             let var_name = args.(i) in
1547             (* Create an alloca for this variable. *)
1548             let alloca = create_entry_block_alloca the_function var_name in
1549
1550             (* Store the initial value into the alloca. *)
1551             ignore(build_store ai alloca builder);
1552
1553             (* Add arguments to variable symbol table. *)
1554             Hashtbl.add named_values var_name alloca;
1555           ) (params the_function)
1556
1557         let codegen_func the_fpm = function
1558           | Ast.Function (proto, body) ->
1559               Hashtbl.clear named_values;
1560               let the_function = codegen_proto proto in
1561
1562               (* If this is an operator, install it. *)
1563               begin match proto with
1564               | Ast.BinOpPrototype (name, args, prec) ->
1565                   let op = name.[String.length name - 1] in
1566                   Hashtbl.add Parser.binop_precedence op prec;
1567               | _ -> ()
1568               end;
1569
1570               (* Create a new basic block to start insertion into. *)
1571               let bb = append_block context "entry" the_function in
1572               position_at_end bb builder;
1573
1574               try
1575                 (* Add all arguments to the symbol table and create their allocas. *)
1576                 create_argument_allocas the_function proto;
1577
1578                 let ret_val = codegen_expr body in
1579
1580                 (* Finish off the function. *)
1581                 let _ = build_ret ret_val builder in
1582
1583                 (* Validate the generated code, checking for consistency. *)
1584                 Llvm_analysis.assert_valid_function the_function;
1585
1586                 (* Optimize the function. *)
1587                 let _ = PassManager.run_function the_function the_fpm in
1588
1589                 the_function
1590               with e ->
1591                 delete_function the_function;
1592                 raise e
1593
1594 toplevel.ml:
1595     .. code-block:: ocaml
1596
1597         (*===----------------------------------------------------------------------===
1598          * Top-Level parsing and JIT Driver
1599          *===----------------------------------------------------------------------===*)
1600
1601         open Llvm
1602         open Llvm_executionengine
1603
1604         (* top ::= definition | external | expression | ';' *)
1605         let rec main_loop the_fpm the_execution_engine stream =
1606           match Stream.peek stream with
1607           | None -> ()
1608
1609           (* ignore top-level semicolons. *)
1610           | Some (Token.Kwd ';') ->
1611               Stream.junk stream;
1612               main_loop the_fpm the_execution_engine stream
1613
1614           | Some token ->
1615               begin
1616                 try match token with
1617                 | Token.Def ->
1618                     let e = Parser.parse_definition stream in
1619                     print_endline "parsed a function definition.";
1620                     dump_value (Codegen.codegen_func the_fpm e);
1621                 | Token.Extern ->
1622                     let e = Parser.parse_extern stream in
1623                     print_endline "parsed an extern.";
1624                     dump_value (Codegen.codegen_proto e);
1625                 | _ ->
1626                     (* Evaluate a top-level expression into an anonymous function. *)
1627                     let e = Parser.parse_toplevel stream in
1628                     print_endline "parsed a top-level expr";
1629                     let the_function = Codegen.codegen_func the_fpm e in
1630                     dump_value the_function;
1631
1632                     (* JIT the function, returning a function pointer. *)
1633                     let result = ExecutionEngine.run_function the_function [||]
1634                       the_execution_engine in
1635
1636                     print_string "Evaluated to ";
1637                     print_float (GenericValue.as_float Codegen.double_type result);
1638                     print_newline ();
1639                 with Stream.Error s | Codegen.Error s ->
1640                   (* Skip token for error recovery. *)
1641                   Stream.junk stream;
1642                   print_endline s;
1643               end;
1644               print_string "ready> "; flush stdout;
1645               main_loop the_fpm the_execution_engine stream
1646
1647 toy.ml:
1648     .. code-block:: ocaml
1649
1650         (*===----------------------------------------------------------------------===
1651          * Main driver code.
1652          *===----------------------------------------------------------------------===*)
1653
1654         open Llvm
1655         open Llvm_executionengine
1656         open Llvm_target
1657         open Llvm_scalar_opts
1658
1659         let main () =
1660           ignore (initialize_native_target ());
1661
1662           (* Install standard binary operators.
1663            * 1 is the lowest precedence. *)
1664           Hashtbl.add Parser.binop_precedence '=' 2;
1665           Hashtbl.add Parser.binop_precedence '<' 10;
1666           Hashtbl.add Parser.binop_precedence '+' 20;
1667           Hashtbl.add Parser.binop_precedence '-' 20;
1668           Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
1669
1670           (* Prime the first token. *)
1671           print_string "ready> "; flush stdout;
1672           let stream = Lexer.lex (Stream.of_channel stdin) in
1673
1674           (* Create the JIT. *)
1675           let the_execution_engine = ExecutionEngine.create Codegen.the_module in
1676           let the_fpm = PassManager.create_function Codegen.the_module in
1677
1678           (* Set up the optimizer pipeline.  Start with registering info about how the
1679            * target lays out data structures. *)
1680           DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
1681
1682           (* Promote allocas to registers. *)
1683           add_memory_to_register_promotion the_fpm;
1684
1685           (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
1686           add_instruction_combination the_fpm;
1687
1688           (* reassociate expressions. *)
1689           add_reassociation the_fpm;
1690
1691           (* Eliminate Common SubExpressions. *)
1692           add_gvn the_fpm;
1693
1694           (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
1695           add_cfg_simplification the_fpm;
1696
1697           ignore (PassManager.initialize the_fpm);
1698
1699           (* Run the main "interpreter loop" now. *)
1700           Toplevel.main_loop the_fpm the_execution_engine stream;
1701
1702           (* Print out all the generated code. *)
1703           dump_module Codegen.the_module
1704         ;;
1705
1706         main ()
1707
1708 bindings.c
1709     .. code-block:: c
1710
1711         #include <stdio.h>
1712
1713         /* putchard - putchar that takes a double and returns 0. */
1714         extern double putchard(double X) {
1715           putchar((char)X);
1716           return 0;
1717         }
1718
1719         /* printd - printf that takes a double prints it as "%f\n", returning 0. */
1720         extern double printd(double X) {
1721           printf("%f\n", X);
1722           return 0;
1723         }
1724
1725 `Next: Conclusion and other useful LLVM tidbits <OCamlLangImpl8.html>`_
1726