X-Git-Url: http://plrg.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FProgrammersManual.html;h=a9daba3ba93db94047a96da34c07a75d764dbb06;hb=28622d621cd2b8dd879899598a2dc824831ddbbd;hp=56751e2bd4ea83471e57c17bd3d9a6040f63efb6;hpb=d8db4eb5d7441f6b78ca4024927734c858071109;p=oota-llvm.git diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index 56751e2bd4e..a9daba3ba93 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -1,47 +1,84 @@ - + LLVM Programmer's Manual + - - - - - - - -
  LLVM Programmer's Manual
+ + +
+ LLVM Programmer's Manual +
+
    -
  1. Introduction
  2. +
  3. Introduction
  4. General Information
  5. Important and useful LLVM APIs
  6. +
  7. Picking the Right Data Structure for a Task + +
  8. Helpful Hints for Common Operations +
  9. +--> + +
  10. Advanced Topics +
  11. +
  12. The Core LLVM Class Hierarchy Reference -

    Written by Chris Lattner,Dinakar Dhurjati, and Joel Stanley

    -

+ +
+

Written by Chris Lattner, + Dinakar Dhurjati, + Joel Stanley, and + Reid Spencer

+
+ - - - - - - -
Introduction
- - - - - - - -
General Information
- - - - - - - - -
   The C++ Standard Template -Library
- - - - - - - - -
   Other useful references
- - - - - - - -
Important and useful LLVM -APIs
- - - - - - - - -
   The isa<>, -cast<> and dyn_cast<> templates
- - - - - - - - -
   The DEBUG() macro -& -debug option
- -

-
Fine grained debug info with DEBUG_TYPE() and the -debug-only -option

- - - - - - - - -
   The Statistic -template & -stats option
- - - - - - - -
Helpful Hints for Common -Operations
- - - - - - - - -
   Basic Inspection and -Traversal Routines
+ +
+
+   7646 bitcodewriter   - Number of normal instructions
+    725 bitcodewriter   - Number of oversized instructions
+ 129996 bitcodewriter   - Number of bitcode bytes written
+   2817 raise           - Number of insts DCEd or constprop'd
+   3213 raise           - Number of cast-of-self removed
+   5046 raise           - Number of expression trees converted
+     75 raise           - Number of other getelementptr's formed
+    138 raise           - Number of load/store peepholes
+     42 deadtypeelim    - Number of unused typenames removed from symtab
+    392 funcresolve     - Number of varargs functions resolved
+     27 globaldce       - Number of global variables removed
+      2 adce            - Number of basic blocks removed
+    134 cee             - Number of branches revectored
+     49 cee             - Number of setcc instruction eliminated
+    532 gcse            - Number of loads removed
+   2919 gcse            - Number of instructions removed
+     86 indvars         - Number of canonical indvars added
+     87 indvars         - Number of aux indvars removed
+     25 instcombine     - Number of dead inst eliminate
+    434 instcombine     - Number of insts combined
+    248 licm            - Number of load insts hoisted
+   1298 licm            - Number of insts hoisted to a loop pre-header
+      3 licm            - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
+     75 mem2reg         - Number of alloca's promoted
+   1444 cfgsimplify     - Number of blocks simplified
+
+
+ +

Obviously, with so many optimizations, having a unified framework for this +stuff is very nice. Making your pass fit well into the framework makes it more +maintainable and useful.

+ + + + +
+ Viewing graphs while debugging code +
+ +
+ +

Several of the important data structures in LLVM are graphs: for example +CFGs made out of LLVM BasicBlocks, CFGs made out of +LLVM MachineBasicBlocks, and +Instruction Selection +DAGs. In many cases, while debugging various parts of the compiler, it is +nice to instantly visualize these graphs.

+ +

LLVM provides several callbacks that are available in a debug build to do +exactly that. If you call the Function::viewCFG() method, for example, +the current LLVM tool will pop up a window containing the CFG for the function +where each basic block is a node in the graph, and each node contains the +instructions in the block. Similarly, there also exists +Function::viewCFGOnly() (does not include the instructions), the +MachineFunction::viewCFG() and MachineFunction::viewCFGOnly(), +and the SelectionDAG::viewGraph() methods. Within GDB, for example, +you can usually use something like call DAG.viewGraph() to pop +up a window. Alternatively, you can sprinkle calls to these functions in your +code in places you want to debug.

+ +

Getting this to work requires a small amount of configuration. On Unix +systems with X11, install the graphviz +toolkit, and make sure 'dot' and 'gv' are in your path. If you are running on +Mac OS/X, download and install the Mac OS/X Graphviz program, and add +/Applications/Graphviz.app/Contents/MacOS/ (or wherever you install +it) to your path. Once in your system and path are set up, rerun the LLVM +configure script and rebuild LLVM to enable this functionality.

+ +

SelectionDAG has been extended to make it easier to locate +interesting nodes in large complex graphs. From gdb, if you +call DAG.setGraphColor(node, "color"), then the +next call DAG.viewGraph() would highlight the node in the +specified color (choices of colors can be found at colors.) More +complex node attributes can be provided with call +DAG.setGraphAttrs(node, "attributes") (choices can be +found at Graph +Attributes.) If you want to restart and clear all the current graph +attributes, then you can call DAG.clearGraphAttrs().

+ +
+ + +
+ Picking the Right Data Structure for a Task +
+ + +
+ +

LLVM has a plethora of data structures in the llvm/ADT/ directory, + and we commonly use STL data structures. This section describes the trade-offs + you should consider when you pick one.

+ +

+The first step is a choose your own adventure: do you want a sequential +container, a set-like container, or a map-like container? The most important +thing when choosing a container is the algorithmic properties of how you plan to +access the container. Based on that, you should use:

+ -

-
Iterating over the BasicBlocks in a Function

-
+ + +
+ Sequential Containers (std::vector, std::list, etc) +
+ +
+There are a variety of sequential containers available for you, based on your +needs. Pick the first in this section that will do what you want. +
+ + +
+ Fixed Size Arrays +
+ +
+

Fixed size arrays are very simple and very fast. They are good if you know +exactly how many elements you have, or you have a (low) upper bound on how many +you have.

+
+ + +
+ Heap Allocated Arrays +
+ +
+

Heap allocated arrays (new[] + delete[]) are also simple. They are good if +the number of elements is variable, if you know how many elements you will need +before the array is allocated, and if the array is usually large (if not, +consider a SmallVector). The cost of a heap +allocated array is the cost of the new/delete (aka malloc/free). Also note that +if you are allocating an array of a type with a constructor, the constructor and +destructors will be run for every element in the array (re-sizable vectors only +construct those elements actually used).

+
+ + +
+ "llvm/ADT/SmallVector.h" +
+ +
+

SmallVector<Type, N> is a simple class that looks and smells +just like vector<Type>: +it supports efficient iteration, lays out elements in memory order (so you can +do pointer arithmetic between elements), supports efficient push_back/pop_back +operations, supports efficient random access to its elements, etc.

+ +

The advantage of SmallVector is that it allocates space for +some number of elements (N) in the object itself. Because of this, if +the SmallVector is dynamically smaller than N, no malloc is performed. This can +be a big win in cases where the malloc/free call is far more expensive than the +code that fiddles around with the elements.

+ +

This is good for vectors that are "usually small" (e.g. the number of +predecessors/successors of a block is usually less than 8). On the other hand, +this makes the size of the SmallVector itself large, so you don't want to +allocate lots of them (doing so will waste a lot of space). As such, +SmallVectors are most useful when on the stack.

+ +

SmallVector also provides a nice portable and efficient replacement for +alloca.

+ +
+ + +
+ <vector> +
+ +
+

+std::vector is well loved and respected. It is useful when SmallVector isn't: +when the size of the vector is often large (thus the small optimization will +rarely be a benefit) or if you will be allocating many instances of the vector +itself (which would waste space for elements that aren't in the container). +vector is also useful when interfacing with code that expects vectors :). +

+ +

One worthwhile note about std::vector: avoid code like this:

+ +
+
+for ( ... ) {
+   std::vector<foo> V;
+   use V;
+}
+
+
+ +

Instead, write this as:

+ +
+
+std::vector<foo> V;
+for ( ... ) {
+   use V;
+   V.clear();
+}
+
+
+ +

Doing so will save (at least) one heap allocation and free per iteration of +the loop.

+ +
+ + +
+ <deque> +
+ +
+

std::deque is, in some senses, a generalized version of std::vector. Like +std::vector, it provides constant time random access and other similar +properties, but it also provides efficient access to the front of the list. It +does not guarantee continuity of elements within memory.

+ +

In exchange for this extra flexibility, std::deque has significantly higher +constant factor costs than std::vector. If possible, use std::vector or +something cheaper.

+
+ + +
+ <list> +
+ +
+

std::list is an extremely inefficient class that is rarely useful. +It performs a heap allocation for every element inserted into it, thus having an +extremely high constant factor, particularly for small data types. std::list +also only supports bidirectional iteration, not random access iteration.

+ +

In exchange for this high cost, std::list supports efficient access to both +ends of the list (like std::deque, but unlike std::vector or SmallVector). In +addition, the iterator invalidation characteristics of std::list are stronger +than that of a vector class: inserting or removing an element into the list does +not invalidate iterator or pointers to other elements in the list.

+
+ + +
+ llvm/ADT/ilist +
+ +
+

ilist<T> implements an 'intrusive' doubly-linked list. It is +intrusive, because it requires the element to store and provide access to the +prev/next pointers for the list.

+ +

ilist has the same drawbacks as std::list, and additionally requires an +ilist_traits implementation for the element type, but it provides some novel +characteristics. In particular, it can efficiently store polymorphic objects, +the traits class is informed when an element is inserted or removed from the +list, and ilists are guaranteed to support a constant-time splice operation. +

+ +

These properties are exactly what we want for things like Instructions and +basic blocks, which is why these are implemented with ilists.

+
+ + +
+ Other Sequential Container options +
+ +
+

Other STL containers are available, such as std::string.

+ +

There are also various STL adapter classes such as std::queue, +std::priority_queue, std::stack, etc. These provide simplified access to an +underlying container but don't affect the cost of the container itself.

+ +
+ + + +
+ Set-Like Containers (std::set, SmallSet, SetVector, etc) +
+ +
+ +

Set-like containers are useful when you need to canonicalize multiple values +into a single representation. There are several different choices for how to do +this, providing various trade-offs.

+ +
+ + + +
+ A sorted 'vector' +
+ +
+ +

If you intend to insert a lot of elements, then do a lot of queries, a +great approach is to use a vector (or other sequential container) with +std::sort+std::unique to remove duplicates. This approach works really well if +your usage pattern has these two distinct phases (insert then query), and can be +coupled with a good choice of sequential container. +

+ +

+This combination provides the several nice properties: the result data is +contiguous in memory (good for cache locality), has few allocations, is easy to +address (iterators in the final vector are just indices or pointers), and can be +efficiently queried with a standard binary or radix search.

+ +
+ + +
+ "llvm/ADT/SmallSet.h" +
+ +
+ +

If you have a set-like data structure that is usually small and whose elements +are reasonably small, a SmallSet<Type, N> is a good choice. This set +has space for N elements in place (thus, if the set is dynamically smaller than +N, no malloc traffic is required) and accesses them with a simple linear search. +When the set grows beyond 'N' elements, it allocates a more expensive representation that +guarantees efficient access (for most types, it falls back to std::set, but for +pointers it uses something far better, SmallPtrSet).

+ +

The magic of this class is that it handles small sets extremely efficiently, +but gracefully handles extremely large sets without loss of efficiency. The +drawback is that the interface is quite small: it supports insertion, queries +and erasing, but does not support iteration.

+ +
+ + +
+ "llvm/ADT/SmallPtrSet.h" +
+ +
+ +

SmallPtrSet has all the advantages of SmallSet (and a SmallSet of pointers is +transparently implemented with a SmallPtrSet), but also supports iterators. If +more than 'N' insertions are performed, a single quadratically +probed hash table is allocated and grows as needed, providing extremely +efficient access (constant time insertion/deleting/queries with low constant +factors) and is very stingy with malloc traffic.

+ +

Note that, unlike std::set, the iterators of SmallPtrSet are invalidated +whenever an insertion occurs. Also, the values visited by the iterators are not +visited in sorted order.

+ +
+ + +
+ "llvm/ADT/FoldingSet.h" +
+ +
+ +

+FoldingSet is an aggregate class that is really good at uniquing +expensive-to-create or polymorphic objects. It is a combination of a chained +hash table with intrusive links (uniqued objects are required to inherit from +FoldingSetNode) that uses SmallVector as part of +its ID process.

+ +

Consider a case where you want to implement a "getOrCreateFoo" method for +a complex object (for example, a node in the code generator). The client has a +description of *what* it wants to generate (it knows the opcode and all the +operands), but we don't want to 'new' a node, then try inserting it into a set +only to find out it already exists, at which point we would have to delete it +and return the node that already exists. +

+ +

To support this style of client, FoldingSet perform a query with a +FoldingSetNodeID (which wraps SmallVector) that can be used to describe the +element that we want to query for. The query either returns the element +matching the ID or it returns an opaque ID that indicates where insertion should +take place. Construction of the ID usually does not require heap traffic.

+ +

Because FoldingSet uses intrusive links, it can support polymorphic objects +in the set (for example, you can have SDNode instances mixed with LoadSDNodes). +Because the elements are individually allocated, pointers to the elements are +stable: inserting or removing elements does not invalidate any pointers to other +elements. +

+ +
+ + +
+ <set> +
+ +
+ +

std::set is a reasonable all-around set class, which is decent at +many things but great at nothing. std::set allocates memory for each element +inserted (thus it is very malloc intensive) and typically stores three pointers +per element in the set (thus adding a large amount of per-element space +overhead). It offers guaranteed log(n) performance, which is not particularly +fast from a complexity standpoint (particularly if the elements of the set are +expensive to compare, like strings), and has extremely high constant factors for +lookup, insertion and removal.

+ +

The advantages of std::set are that its iterators are stable (deleting or +inserting an element from the set does not affect iterators or pointers to other +elements) and that iteration over the set is guaranteed to be in sorted order. +If the elements in the set are large, then the relative overhead of the pointers +and malloc traffic is not a big deal, but if the elements of the set are small, +std::set is almost never a good choice.

+ +
+ + +
+ "llvm/ADT/SetVector.h" +
+ +
+

LLVM's SetVector<Type> is an adapter class that combines your choice of +a set-like container along with a Sequential +Container. The important property +that this provides is efficient insertion with uniquing (duplicate elements are +ignored) with iteration support. It implements this by inserting elements into +both a set-like container and the sequential container, using the set-like +container for uniquing and the sequential container for iteration. +

+ +

The difference between SetVector and other sets is that the order of +iteration is guaranteed to match the order of insertion into the SetVector. +This property is really important for things like sets of pointers. Because +pointer values are non-deterministic (e.g. vary across runs of the program on +different machines), iterating over the pointers in the set will +not be in a well-defined order.

+ +

+The drawback of SetVector is that it requires twice as much space as a normal +set and has the sum of constant factors from the set-like container and the +sequential container that it uses. Use it *only* if you need to iterate over +the elements in a deterministic order. SetVector is also expensive to delete +elements out of (linear time), unless you use it's "pop_back" method, which is +faster. +

+ +

SetVector is an adapter class that defaults to using std::vector and std::set +for the underlying containers, so it is quite expensive. However, +"llvm/ADT/SetVector.h" also provides a SmallSetVector class, which +defaults to using a SmallVector and SmallSet of a specified size. If you use +this, and if your sets are dynamically smaller than N, you will save a lot of +heap traffic.

+ +
+ + +
+ "llvm/ADT/UniqueVector.h" +
+ +
+ +

+UniqueVector is similar to SetVector, but it +retains a unique ID for each element inserted into the set. It internally +contains a map and a vector, and it assigns a unique ID for each value inserted +into the set.

+ +

UniqueVector is very expensive: its cost is the sum of the cost of +maintaining both the map and vector, it has high complexity, high constant +factors, and produces a lot of malloc traffic. It should be avoided.

+ +
+ + + +
+ Other Set-Like Container Options +
+ +
+ +

+The STL provides several other options, such as std::multiset and the various +"hash_set" like containers (whether from C++ TR1 or from the SGI library).

+ +

std::multiset is useful if you're not interested in elimination of +duplicates, but has all the drawbacks of std::set. A sorted vector (where you +don't delete duplicate entries) or some other approach is almost always +better.

+ +

The various hash_set implementations (exposed portably by +"llvm/ADT/hash_set") is a simple chained hashtable. This algorithm is as malloc +intensive as std::set (performing an allocation for each element inserted, +thus having really high constant factors) but (usually) provides O(1) +insertion/deletion of elements. This can be useful if your elements are large +(thus making the constant-factor cost relatively low) or if comparisons are +expensive. Element iteration does not visit elements in a useful order.

+ +
+ + +
+ Map-Like Containers (std::map, DenseMap, etc) +
+ +
+Map-like containers are useful when you want to associate data to a key. As +usual, there are a lot of different ways to do this. :) +
+ + +
+ A sorted 'vector' +
+ +
+ +

+If your usage pattern follows a strict insert-then-query approach, you can +trivially use the same approach as sorted vectors +for set-like containers. The only difference is that your query function +(which uses std::lower_bound to get efficient log(n) lookup) should only compare +the key, not both the key and value. This yields the same advantages as sorted +vectors for sets. +

+
+ + +
+ "llvm/ADT/StringMap.h" +
+ +
+ +

+Strings are commonly used as keys in maps, and they are difficult to support +efficiently: they are variable length, inefficient to hash and compare when +long, expensive to copy, etc. StringMap is a specialized container designed to +cope with these issues. It supports mapping an arbitrary range of bytes to an +arbitrary other object.

+ +

The StringMap implementation uses a quadratically-probed hash table, where +the buckets store a pointer to the heap allocated entries (and some other +stuff). The entries in the map must be heap allocated because the strings are +variable length. The string data (key) and the element object (value) are +stored in the same allocation with the string data immediately after the element +object. This container guarantees the "(char*)(&Value+1)" points +to the key string for a value.

+ +

The StringMap is very fast for several reasons: quadratic probing is very +cache efficient for lookups, the hash value of strings in buckets is not +recomputed when lookup up an element, StringMap rarely has to touch the +memory for unrelated objects when looking up a value (even when hash collisions +happen), hash table growth does not recompute the hash values for strings +already in the table, and each pair in the map is store in a single allocation +(the string data is stored in the same allocation as the Value of a pair).

+ +

StringMap also provides query methods that take byte ranges, so it only ever +copies a string if a value is inserted into the table.

+
+ + +
+ "llvm/ADT/IndexedMap.h" +
+ +
+

+IndexedMap is a specialized container for mapping small dense integers (or +values that can be mapped to small dense integers) to some other type. It is +internally implemented as a vector with a mapping function that maps the keys to +the dense integer range. +

+ +

+This is useful for cases like virtual registers in the LLVM code generator: they +have a dense mapping that is offset by a compile-time constant (the first +virtual register ID).

+ +
+ + +
+ "llvm/ADT/DenseMap.h" +
+ +
+ +

+DenseMap is a simple quadratically probed hash table. It excels at supporting +small keys and values: it uses a single allocation to hold all of the pairs that +are currently inserted in the map. DenseMap is a great way to map pointers to +pointers, or map other small types to each other. +

+ +

+There are several aspects of DenseMap that you should be aware of, however. The +iterators in a densemap are invalidated whenever an insertion occurs, unlike +map. Also, because DenseMap allocates space for a large number of key/value +pairs (it starts with 64 by default), it will waste a lot of space if your keys +or values are large. Finally, you must implement a partial specialization of +DenseMapInfo for the key that you want, if it isn't already supported. This +is required to tell DenseMap about two special marker values (which can never be +inserted into the map) that it needs internally.

+ +
+ + +
+ <map> +
+ +
+ +

+std::map has similar characteristics to std::set: it uses +a single allocation per pair inserted into the map, it offers log(n) lookup with +an extremely large constant factor, imposes a space penalty of 3 pointers per +pair in the map, etc.

+ +

std::map is most useful when your keys or values are very large, if you need +to iterate over the collection in sorted order, or if you need stable iterators +into the map (i.e. they don't get invalidated if an insertion or deletion of +another element takes place).

+ +
+ + +
+ Other Map-Like Container Options +
+ +
+ +

+The STL provides several other options, such as std::multimap and the various +"hash_map" like containers (whether from C++ TR1 or from the SGI library).

+ +

std::multimap is useful if you want to map a key to multiple values, but has +all the drawbacks of std::map. A sorted vector or some other approach is almost +always better.

+ +

The various hash_map implementations (exposed portably by +"llvm/ADT/hash_map") are simple chained hash tables. This algorithm is as +malloc intensive as std::map (performing an allocation for each element +inserted, thus having really high constant factors) but (usually) provides O(1) +insertion/deletion of elements. This can be useful if your elements are large +(thus making the constant-factor cost relatively low) or if comparisons are +expensive. Element iteration does not visit elements in a useful order.

+ +
+ + + +
+ Helpful Hints for Common Operations +
+ + +
+ +

This section describes how to perform some very simple transformations of +LLVM code. This is meant to give examples of common idioms used, showing the +practical side of LLVM transformations.

Because this is a "how-to" section, +you should also read about the main classes that you will be working with. The +Core LLVM Class Hierarchy Reference contains details +and descriptions of the main classes that you should know about.

+ +
+ + + +
+ Basic Inspection and Traversal Routines +
+ +
+ +

The LLVM compiler infrastructure have many different data structures that may +be traversed. Following the example of the C++ standard template library, the +techniques used to traverse these various data structures are all basically the +same. For a enumerable sequence of values, the XXXbegin() function (or +method) returns an iterator to the start of the sequence, the XXXend() +function returns an iterator pointing to one past the last valid element of the +sequence, and there is some XXXiterator data type that is common +between the two operations.

+ +

Because the pattern for iteration is common across many different aspects of +the program representation, the standard template library algorithms may be used +on them, and it is easier to remember how to iterate. First we show a few common +examples of the data structures that need to be traversed. Other data +structures are traversed in very similar ways.

+ +
+ + +
+ Iterating over the BasicBlocks in a Function +
+ +
+ +

It's quite common to have a Function instance that you'd like to +transform in some way; in particular, you'd like to manipulate its +BasicBlocks. To facilitate this, you'll need to iterate over all of +the BasicBlocks that constitute the Function. The following is +an example that prints the name of a BasicBlock and the number of +Instructions it contains:

+ +
+
+// func is a pointer to a Function instance
+for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
+  // Print out the name of the basic block if it has one, and then the
+  // number of instructions that it contains
+  llvm::cerr << "Basic block (name=" << i->getName() << ") has "
+             << i->size() << " instructions.\n";
+
+
+ +

Note that i can be used as if it were a pointer for the purposes of invoking member functions of the Instruction class. This is because the indirection operator is overloaded for the iterator classes. In the above code, the expression i->size() is -exactly equivalent to (*i).size() just like you'd expect. - -

-
Iterating over the Instructions in a BasicBlock

- -

-
Iterating over the Instructions in a Function

- -

-
Turning an iterator into a class -pointer (and vice-versa)

-
+ + +
+ Iterating over the Instructions in a BasicBlock +
+ +
+ +

Just like when dealing with BasicBlocks in Functions, it's +easy to iterate over the individual instructions that make up +BasicBlocks. Here's a code snippet that prints out each instruction in +a BasicBlock:

+ +
+
+// blk is a pointer to a BasicBlock instance
+for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
+   // The next statement works since operator<<(ostream&,...)
+   // is overloaded for Instruction&
+   llvm::cerr << *i << "\n";
+
+
+ +

However, this isn't really the best way to print out the contents of a +BasicBlock! Since the ostream operators are overloaded for virtually +anything you'll care about, you could have just invoked the print routine on the +basic block itself: llvm::cerr << *blk << "\n";.

+ +
+ + +
+ Iterating over the Instructions in a Function +
+ +
+ +

If you're finding that you commonly iterate over a Function's +BasicBlocks and then that BasicBlock's Instructions, +InstIterator should be used instead. You'll need to include llvm/Support/InstIterator.h, +and then instantiate InstIterators explicitly in your code. Here's a +small example that shows how to dump all instructions in a function to the standard error stream:

+ +

+
+#include "llvm/Support/InstIterator.h"
+
+// F is a pointer to a Function instance
+for (inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i)
+  llvm::cerr << *i << "\n";
+
+
+ +

Easy, isn't it? You can also use InstIterators to fill a +work list with its initial contents. For example, if you wanted to +initialize a work list to contain all instructions in a Function +F, all you would need to do is something like:

+ +
+
+std::set<Instruction*> worklist;
+worklist.insert(inst_begin(F), inst_end(F));
+
+
+ +

The STL set worklist would now contain all instructions in the +Function pointed to by F.

+ +
+ + +
+ Turning an iterator into a class pointer (and + vice-versa) +
+ +
+ +

Sometimes, it'll be useful to grab a reference (or pointer) to a class instance when all you've got at hand is an iterator. Well, extracting -a reference or a pointer from an iterator is very straightforward. +a reference or a pointer from an iterator is very straight-forward. Assuming that i is a BasicBlock::iterator and j -is a BasicBlock::const_iterator: -

    Instruction& inst = *i;   // grab reference to instruction reference
Instruction* pinst = &*i; // grab pointer to instruction reference
const Instruction& inst = *j;
-However, the iterators you'll be working with in the LLVM framework are -special: they will automatically convert to a ptr-to-instance type -whenever they need to. Instead of dereferencing the iterator and then -taking the address of the result, you can simply assign the iterator to -the proper pointer type and you get the dereference and address-of -operation as a result of the assignment (behind the scenes, this is a -result of overloading casting mechanisms). Thus the last line of the -last example, -
Instruction* pinst = &*i;
-is semantically equivalent to -
Instruction* pinst = i;
-It's also possible to turn a class pointer into the corresponding -iterator. Usually, this conversion is quite inexpensive. The -following code snippet illustrates use of the conversion constructors -provided by LLVM iterators. By using these, you can explicitly grab -the iterator of something without actually obtaining it via iteration -over some structure: -
void printNextInstruction(Instruction* inst) {
BasicBlock::iterator it(inst);
++it; // after this line, it refers to the instruction after *inst.
if (it != inst->getParent()->end()) cerr << *it << "\n";
}
-Of course, this example is strictly pedagogical, because it'd be much -better to explicitly grab the next instruction directly from inst. - -

-
Finding call sites: a slightly -more complex example

-
+ - -

-
Treating calls and invokes the -same way

- -

-
Iterating over def-use & -use-def chains

- - - - - - - - -
   Making simple -changes
- -

-
Important Public Members of the GlobalValue -class

+Superclasses: Constant, +User, Value

+ +

Global values (GlobalVariables or Functions) are the only LLVM values that are +visible in the bodies of all Functions. +Because they are visible at global scope, they are also subject to linking with +other globals defined in different translation units. To control the linking +process, GlobalValues know their linkage rules. Specifically, +GlobalValues know whether they have internal or external linkage, as +defined by the LinkageTypes enumeration.

+ +

If a GlobalValue has internal linkage (equivalent to being +static in C), it is not visible to code outside the current translation +unit, and does not participate in linking. If it has external linkage, it is +visible to external code, and does participate in linking. In addition to +linkage information, GlobalValues keep track of which Module they are currently part of.

+ +

Because GlobalValues are memory objects, they are always referred to +by their address. As such, the Type of a +global is always a pointer to its contents. It is important to remember this +when using the GetElementPtrInst instruction because this pointer must +be dereferenced first. For example, if you have a GlobalVariable (a +subclass of GlobalValue) that is an array of 24 ints, type [24 x +i32], then the GlobalVariable is a pointer to that array. Although +the address of the first element of this array and the value of the +GlobalVariable are the same, they have different types. The +GlobalVariable's type is [24 x i32]. The first element's type +is i32. Because of this, accessing a global value requires you to +dereference the pointer with GetElementPtrInst first, then its elements +can be accessed. This is explained in the LLVM +Language Reference Manual.

+ + + + +
+ Important Public Members of the GlobalValue + class +
+ +
+ - - - - - - - -
   The Function -class
- -

-
Important Public Members of the Function -class

+ +
+ + +
+ The Function class +
+ +
+ +

#include "llvm/Function.h"
doxygen +info: Function Class
+Superclasses: GlobalValue, +Constant, +User, +Value

+ +

The Function class represents a single procedure in LLVM. It is +actually one of the more complex classes in the LLVM heirarchy because it must +keep track of a large amount of data. The Function class keeps track +of a list of BasicBlocks, a list of formal +Arguments, and a +SymbolTable.

+ +

The list of BasicBlocks is the most +commonly used part of Function objects. The list imposes an implicit +ordering of the blocks in the function, which indicate how the code will be +layed out by the backend. Additionally, the first BasicBlock is the implicit entry node for the +Function. It is not legal in LLVM to explicitly branch to this initial +block. There are no implicit exit nodes, and in fact there may be multiple exit +nodes from a single Function. If the BasicBlock list is empty, this indicates that +the Function is actually a function declaration: the actual body of the +function hasn't been linked in yet.

+ +

In addition to a list of BasicBlocks, the +Function class also keeps track of the list of formal Arguments that the function receives. This +container manages the lifetime of the Argument +nodes, just like the BasicBlock list does for +the BasicBlocks.

+ +

The SymbolTable is a very rarely used +LLVM feature that is only used when you have to look up a value by name. Aside +from that, the SymbolTable is used +internally to make sure that there are not conflicts between the names of Instructions, BasicBlocks, or Arguments in the function body.

+ +

Note that Function is a GlobalValue +and therefore also a Constant. The value of the function +is its address (after linking) which is guaranteed to be constant.

+
+ + +
+ Important Public Members of the Function + class +
+ +
+ - - - - - - - -
   The GlobalVariable -class
- -

-
Important Public Members of the GlobalVariable -class

+ +
+ + +
+ The GlobalVariable class +
+ +
+ +

#include "llvm/GlobalVariable.h" +
+doxygen info: GlobalVariable + Class
+Superclasses: GlobalValue, +Constant, +User, +Value

+ +

Global variables are represented with the (suprise suprise) +GlobalVariable class. Like functions, GlobalVariables are also +subclasses of GlobalValue, and as such are +always referenced by their address (global values must live in memory, so their +"name" refers to their constant address). See +GlobalValue for more on this. Global +variables may have an initial value (which must be a +Constant), and if they have an initializer, +they may be marked as "constant" themselves (indicating that their contents +never change at runtime).

+
+ + +
+ Important Public Members of the + GlobalVariable class +
+ +
+ - - - - - - - -
   The Module class
- -

-
Important Public Members of the Module -class

- -

Constructing a Module -is easy. You can optionally provide a name for it (probably based on the -name of the translation unit).

- - - - - - - - -
   The Constant -class and subclasses
- -

-
Important Public Methods

- - - - - - - - -
   The Type class and -Derived Types
- -

-
Important Public Methods

- - - - - - - - -
   The Argument -class
+ +
+ + + +
+ The BasicBlock class +
+ +
+ +

#include "llvm/BasicBlock.h"
+doxygen info: BasicBlock +Class
+Superclass: Value

+ +

This class represents a single entry multiple exit section of the code, +commonly known as a basic block by the compiler community. The +BasicBlock class maintains a list of Instructions, which form the body of the block. +Matching the language definition, the last element of this list of instructions +is always a terminator instruction (a subclass of the TerminatorInst class).

+ +

In addition to tracking the list of instructions that make up the block, the +BasicBlock class also keeps track of the Function that it is embedded into.

+ +

Note that BasicBlocks themselves are Values, because they are referenced by instructions +like branches and can go in the switch tables. BasicBlocks have type +label.

+ +
+ + +
+ Important Public Members of the BasicBlock + class +
+ +
+ +
+ + + +
+ The Argument class +
+ +
+ +

This subclass of Value defines the interface for incoming formal +arguments to a function. A Function maintains a list of its formal +arguments. An argument has a pointer to the parent Function.

+ +
+ -
-
By: Dinakar Dhurjati -and Chris Lattner
-
The LLVM -Compiler Infrastructure
- Last -modified: Fri Nov 7 13:24:22 CST 2003
+
+
+ Valid CSS! + Valid HTML 4.01! + + Dinakar Dhurjati and + Chris Lattner
+ The LLVM Compiler Infrastructure
+ Last modified: $Date$ +
+