<title>Kaleidoscope: Implementing code generation to LLVM IR</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="author" content="Chris Lattner">
- <link rel="stylesheet" href="../llvm.css" type="text/css">
+ <link rel="stylesheet" href="../_static/llvm.css" type="text/css">
</head>
<body>
-<div class="doc_title">Kaleidoscope: Code generation to LLVM IR</div>
+<h1>Kaleidoscope: Code generation to LLVM IR</h1>
<ul>
<li><a href="index.html">Up to Tutorial Index</a></li>
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="intro">Chapter 3 Introduction</a></div>
+<h2><a name="intro">Chapter 3 Introduction</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>Welcome to Chapter 3 of the "<a href="index.html">Implementing a language
with LLVM</a>" tutorial. This chapter shows you how to transform the <a
</p>
<p><b>Please note</b>: the code in this chapter and later require LLVM 2.2 or
-LLVM SVN to work. LLVM 2.1 and before will not work with it.</p>
+later. LLVM 2.1 and before will not work with it. Also note that you need
+to use a version of this tutorial that matches your LLVM release: If you are
+using an official LLVM release, use the version of the documentation included
+with your release or on the <a href="http://llvm.org/releases/">llvm.org
+releases page</a>.</p>
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="basics">Code Generation Setup</a></div>
+<h2><a name="basics">Code Generation Setup</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>
In order to generate LLVM IR, we want some simple setup to get started. First
class NumberExprAST : public ExprAST {
double Val;
public:
- explicit NumberExprAST(double val) : Val(val) {}
+ NumberExprAST(double val) : Val(val) {}
<b>virtual Value *Codegen();</b>
};
...
Value *ErrorV(const char *Str) { Error(Str); return 0; }
static Module *TheModule;
-static IRBuilder Builder;
+static IRBuilder<> Builder(getGlobalContext());
static std::map<std::string, Value*> NamedValues;
</pre>
</div>
<p>The <tt>Builder</tt> object is a helper object that makes it easy to generate
LLVM instructions. Instances of the <a
href="http://llvm.org/doxygen/IRBuilder_8h-source.html"><tt>IRBuilder</tt></a>
-class keep track of the current place to insert instructions and has methods to
-create new instructions.</p>
+class template keep track of the current place to insert instructions and has
+methods to create new instructions.</p>
<p>The <tt>NamedValues</tt> map keeps track of which values are defined in the
current scope and what their LLVM representation is. (In other words, it is a
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="exprs">Expression Code Generation</a></div>
+<h2><a name="exprs">Expression Code Generation</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>Generating LLVM code for expression nodes is very straightforward: less
than 45 lines of commented code for all four of our expression nodes. First
<div class="doc_code">
<pre>
Value *NumberExprAST::Codegen() {
- return ConstantFP::get(Type::DoubleTy, APFloat(Val));
+ return ConstantFP::get(getGlobalContext(), APFloat(Val));
}
</pre>
</div>
constants of <em>A</em>rbitrary <em>P</em>recision). This code basically just
creates and returns a <tt>ConstantFP</tt>. Note that in the LLVM IR
that constants are all uniqued together and shared. For this reason, the API
-uses "the foo::get(..)" idiom instead of "new foo(..)" or "foo::create(..)".</p>
+uses the "foo::get(...)" idiom instead of "new foo(..)" or "foo::Create(..)".</p>
<div class="doc_code">
<pre>
</div>
<p>References to variables are also quite simple using LLVM. In the simple version
-of Kaleidoscope, we assume that the variable has already been emited somewhere
+of Kaleidoscope, we assume that the variable has already been emitted somewhere
and its value is available. In practice, the only values that can be in the
<tt>NamedValues</tt> map are function arguments. This
code simply checks to see that the specified name is in the map (if not, an
if (L == 0 || R == 0) return 0;
switch (Op) {
- case '+': return Builder.CreateAdd(L, R, "addtmp");
- case '-': return Builder.CreateSub(L, R, "subtmp");
- case '*': return Builder.CreateMul(L, R, "multmp");
+ case '+': return Builder.CreateFAdd(L, R, "addtmp");
+ case '-': return Builder.CreateFSub(L, R, "subtmp");
+ case '*': return Builder.CreateFMul(L, R, "multmp");
case '<':
L = Builder.CreateFCmpULT(L, R, "cmptmp");
// Convert bool 0/1 to double 0.0 or 1.0
- return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
+ "booltmp");
default: return ErrorV("invalid binary operator");
}
}
<p>In the example above, the LLVM builder class is starting to show its value.
IRBuilder knows where to insert the newly created instruction, all you have to
-do is specify what instruction to create (e.g. with <tt>CreateAdd</tt>), which
+do is specify what instruction to create (e.g. with <tt>CreateFAdd</tt>), which
operands to use (<tt>L</tt> and <tt>R</tt> here) and optionally provide a name
for the generated instruction.</p>
if (ArgsV.back() == 0) return 0;
}
- return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
+ return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
}
</pre>
</div>
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="funcs">Function Code Generation</a></div>
+<h2><a name="funcs">Function Code Generation</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>Code generation for prototypes and functions must handle a number of
details, which make their code less beautiful than expression code
<pre>
Function *PrototypeAST::Codegen() {
// Make the function type: double(double,double) etc.
- std::vector<const Type*> Doubles(Args.size(), Type::DoubleTy);
- FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
-
+ std::vector<Type*> Doubles(Args.size(),
+ Type::getDoubleTy(getGlobalContext()));
+ FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
+ Doubles, false);
+
Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
</pre>
</div>
<p>The call to <tt>FunctionType::get</tt> creates
the <tt>FunctionType</tt> that should be used for a given Prototype. Since all
function arguments in Kaleidoscope are of type double, the first line creates
-a vector of "N" LLVM double types. It then uses the <tt>FunctionType::get</tt>
+a vector of "N" LLVM double types. It then uses the <tt>Functiontype::get</tt>
method to create a function type that takes "N" doubles as arguments, returns
one double as a result, and that is not vararg (the false parameter indicates
this). Note that Types in LLVM are uniqued just like Constants are, so you
</div>
<p>The Module symbol table works just like the Function symbol table when it
-comes to name conflicts: if a new function is created with a name was previously
-added to the symbol table, it will get implicitly renamed when added to the
+comes to name conflicts: if a new function is created with a name that was previously
+added to the symbol table, the new function will get implicitly renamed when added to the
Module. The code above exploits this fact to determine if there was a previous
definition of this function.</p>
first, we want to allow 'extern'ing a function more than once, as long as the
prototypes for the externs match (since all arguments have the same type, we
just have to check that the number of arguments match). Second, we want to
-allow 'extern'ing a function and then definining a body for it. This is useful
+allow 'extern'ing a function and then defining a body for it. This is useful
when defining mutually recursive functions.</p>
<p>In order to implement this, the code above first checks to see if there is
<div class="doc_code">
<pre>
// Create a new basic block to start insertion into.
- BasicBlock *BB = BasicBlock::Create("entry", TheFunction);
+ BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
Builder.SetInsertPoint(BB);
if (Value *RetVal = Body->Codegen()) {
if (Value *RetVal = Body->Codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
-
+
// Validate the generated code, checking for consistency.
verifyFunction(*TheFunction);
+
return TheFunction;
}
</pre>
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="driver">Driver Changes and
-Closing Thoughts</a></div>
+<h2><a name="driver">Driver Changes and Closing Thoughts</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>
For now, code generation to LLVM doesn't really get us much, except that we can
<pre>
ready> <b>4+5</b>;
Read top-level expression:
-define double @""() {
+define double @0() {
entry:
- %addtmp = add double 4.000000e+00, 5.000000e+00
- ret double %addtmp
+ ret double 9.000000e+00
}
</pre>
</div>
<p>Note how the parser turns the top-level expression into anonymous functions
for us. This will be handy when we add <a href="LangImpl4.html#jit">JIT
support</a> in the next chapter. Also note that the code is very literally
-transcribed, no optimizations are being performed. We will
+transcribed, no optimizations are being performed except simple constant
+folding done by IRBuilder. We will
<a href="LangImpl4.html#trivialconstfold">add optimizations</a> explicitly in
the next chapter.</p>
Read function definition:
define double @foo(double %a, double %b) {
entry:
- %multmp = mul double %a, %a
- %multmp1 = mul double 2.000000e+00, %a
- %multmp2 = mul double %multmp1, %b
- %addtmp = add double %multmp, %multmp2
- %multmp3 = mul double %b, %b
- %addtmp4 = add double %addtmp, %multmp3
- ret double %addtmp4
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
}
</pre>
</div>
Read function definition:
define double @bar(double %a) {
entry:
- %calltmp = call double @foo( double %a, double 4.000000e+00 )
- %calltmp1 = call double @bar( double 3.133700e+04 )
- %addtmp = add double %calltmp, %calltmp1
- ret double %addtmp
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
}
</pre>
</div>
ready> <b>cos(1.234);</b>
Read top-level expression:
-define double @""() {
+define double @1() {
entry:
- %calltmp = call double @cos( double 1.234000e+00 )
- ret double %calltmp
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
}
</pre>
</div>
ready> <b>^D</b>
; ModuleID = 'my cool jit'
-define double @""() {
+define double @0() {
entry:
- %addtmp = add double 4.000000e+00, 5.000000e+00
- ret double %addtmp
+ %addtmp = fadd double 4.000000e+00, 5.000000e+00
+ ret double %addtmp
}
define double @foo(double %a, double %b) {
entry:
- %multmp = mul double %a, %a
- %multmp1 = mul double 2.000000e+00, %a
- %multmp2 = mul double %multmp1, %b
- %addtmp = add double %multmp, %multmp2
- %multmp3 = mul double %b, %b
- %addtmp4 = add double %addtmp, %multmp3
- ret double %addtmp4
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
}
define double @bar(double %a) {
entry:
- %calltmp = call double @foo( double %a, double 4.000000e+00 )
- %calltmp1 = call double @bar( double 3.133700e+04 )
- %addtmp = add double %calltmp, %calltmp1
- ret double %addtmp
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
}
declare double @cos(double)
-define double @""() {
+define double @1() {
entry:
- %calltmp = call double @cos( double 1.234000e+00 )
- ret double %calltmp
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
}
</pre>
</div>
<!-- *********************************************************************** -->
-<div class="doc_section"><a name="code">Full Code Listing</a></div>
+<h2><a name="code">Full Code Listing</a></h2>
<!-- *********************************************************************** -->
-<div class="doc_text">
+<div>
<p>
Here is the complete code listing for our running example, enhanced with the
<div class="doc_code">
<pre>
- # Compile
- g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy
- # Run
- ./toy
+# Compile
+clang++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy
+# Run
+./toy
</pre>
</div>
// See example below.
#include "llvm/DerivedTypes.h"
+#include "llvm/IRBuilder.h"
+#include "llvm/LLVMContext.h"
#include "llvm/Module.h"
#include "llvm/Analysis/Verifier.h"
-#include "llvm/Support/IRBuilder.h"
#include <cstdio>
#include <string>
#include <map>
tok_def = -2, tok_extern = -3,
// primary
- tok_identifier = -4, tok_number = -5,
+ tok_identifier = -4, tok_number = -5
};
static std::string IdentifierStr; // Filled in if tok_identifier
class NumberExprAST : public ExprAST {
double Val;
public:
- explicit NumberExprAST(double val) : Val(val) {}
+ NumberExprAST(double val) : Val(val) {}
virtual Value *Codegen();
};
class VariableExprAST : public ExprAST {
std::string Name;
public:
- explicit VariableExprAST(const std::string &name) : Name(name) {}
+ VariableExprAST(const std::string &name) : Name(name) {}
virtual Value *Codegen();
};
};
/// PrototypeAST - This class represents the "prototype" for a function,
-/// which captures its argument names as well as if it is an operator.
+/// which captures its name, and its argument names (thus implicitly the number
+/// of arguments the function takes).
class PrototypeAST {
std::string Name;
std::vector<std::string> Args;
//===----------------------------------------------------------------------===//
/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
-/// token the parser it looking at. getNextToken reads another token from the
+/// token the parser is looking at. getNextToken reads another token from the
/// lexer and updates CurTok with its results.
static int CurTok;
static int getNextToken() {
ExprAST *Arg = ParseExpression();
if (!Arg) return 0;
Args.push_back(Arg);
-
+
if (CurTok == ')') break;
-
+
if (CurTok != ',')
return Error("Expected ')' or ',' in argument list");
getNextToken();
//===----------------------------------------------------------------------===//
static Module *TheModule;
-static IRBuilder Builder;
+static IRBuilder<> Builder(getGlobalContext());
static std::map<std::string, Value*> NamedValues;
Value *ErrorV(const char *Str) { Error(Str); return 0; }
Value *NumberExprAST::Codegen() {
- return ConstantFP::get(Type::DoubleTy, APFloat(Val));
+ return ConstantFP::get(getGlobalContext(), APFloat(Val));
}
Value *VariableExprAST::Codegen() {
if (L == 0 || R == 0) return 0;
switch (Op) {
- case '+': return Builder.CreateAdd(L, R, "addtmp");
- case '-': return Builder.CreateSub(L, R, "subtmp");
- case '*': return Builder.CreateMul(L, R, "multmp");
+ case '+': return Builder.CreateFAdd(L, R, "addtmp");
+ case '-': return Builder.CreateFSub(L, R, "subtmp");
+ case '*': return Builder.CreateFMul(L, R, "multmp");
case '<':
L = Builder.CreateFCmpULT(L, R, "cmptmp");
// Convert bool 0/1 to double 0.0 or 1.0
- return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
+ "booltmp");
default: return ErrorV("invalid binary operator");
}
}
if (ArgsV.back() == 0) return 0;
}
- return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
+ return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
}
Function *PrototypeAST::Codegen() {
// Make the function type: double(double,double) etc.
- std::vector<const Type*> Doubles(Args.size(), Type::DoubleTy);
- FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
+ std::vector<Type*> Doubles(Args.size(),
+ Type::getDoubleTy(getGlobalContext()));
+ FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
+ Doubles, false);
Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
return 0;
// Create a new basic block to start insertion into.
- BasicBlock *BB = BasicBlock::Create("entry", TheFunction);
+ BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
Builder.SetInsertPoint(BB);
if (Value *RetVal = Body->Codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
-
+
// Validate the generated code, checking for consistency.
verifyFunction(*TheFunction);
+
return TheFunction;
}
}
static void HandleTopLevelExpression() {
- // Evaluate a top level expression into an anonymous function.
+ // Evaluate a top-level expression into an anonymous function.
if (FunctionAST *F = ParseTopLevelExpr()) {
if (Function *LF = F->Codegen()) {
fprintf(stderr, "Read top-level expression:");
fprintf(stderr, "ready> ");
switch (CurTok) {
case tok_eof: return;
- case ';': getNextToken(); break; // ignore top level semicolons.
+ case ';': getNextToken(); break; // ignore top-level semicolons.
case tok_def: HandleDefinition(); break;
case tok_extern: HandleExtern(); break;
default: HandleTopLevelExpression(); break;
}
}
-
-
//===----------------------------------------------------------------------===//
// "Library" functions that can be "extern'd" from user code.
//===----------------------------------------------------------------------===//
//===----------------------------------------------------------------------===//
int main() {
- TheModule = new Module("my cool jit");
+ LLVMContext &Context = getGlobalContext();
// Install standard binary operators.
// 1 is lowest precedence.
fprintf(stderr, "ready> ");
getNextToken();
+ // Make the module, which holds all the code.
+ TheModule = new Module("my cool jit", Context);
+
+ // Run the main "interpreter loop" now.
MainLoop();
+
+ // Print out all of the generated code.
TheModule->dump();
+
return 0;
}
</pre>
src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
<a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
- <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
- Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
+ <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
+ Last modified: $Date$
</address>
</body>
</html>