+
+<p>
+Once the proper category of container is determined, you can fine tune the
+memory use, constant factors, and cache behaviors of access by intelligently
+picking a member of the category. Note that constant factors and cache behavior
+can be a big deal. If you have a vector that usually only contains a few
+elements (but could contain many), for example, it's much better to use
+<a href="#dss_smallvector">SmallVector</a> than <a href="#dss_vector">vector</a>
+. Doing so avoids (relatively) expensive malloc/free calls, which dwarf the
+cost of adding the elements to the container. </p>
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="ds_sequential">Sequential Containers (std::vector, std::list, etc)</a>
+</div>
+
+<div class="doc_text">
+There are a variety of sequential containers available for you, based on your
+needs. Pick the first in this section that will do what you want.
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_fixedarrays">Fixed Size Arrays</a>
+</div>
+
+<div class="doc_text">
+<p>Fixed size arrays are very simple and very fast. They are good if you know
+exactly how many elements you have, or you have a (low) upper bound on how many
+you have.</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_heaparrays">Heap Allocated Arrays</a>
+</div>
+
+<div class="doc_text">
+<p>Heap allocated arrays (new[] + delete[]) are also simple. They are good if
+the number of elements is variable, if you know how many elements you will need
+before the array is allocated, and if the array is usually large (if not,
+consider a <a href="#dss_smallvector">SmallVector</a>). The cost of a heap
+allocated array is the cost of the new/delete (aka malloc/free). Also note that
+if you are allocating an array of a type with a constructor, the constructor and
+destructors will be run for every element in the array (re-sizable vectors only
+construct those elements actually used).</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_smallvector">"llvm/ADT/SmallVector.h"</a>
+</div>
+
+<div class="doc_text">
+<p><tt>SmallVector<Type, N></tt> is a simple class that looks and smells
+just like <tt>vector<Type></tt>:
+it supports efficient iteration, lays out elements in memory order (so you can
+do pointer arithmetic between elements), supports efficient push_back/pop_back
+operations, supports efficient random access to its elements, etc.</p>
+
+<p>The advantage of SmallVector is that it allocates space for
+some number of elements (N) <b>in the object itself</b>. Because of this, if
+the SmallVector is dynamically smaller than N, no malloc is performed. This can
+be a big win in cases where the malloc/free call is far more expensive than the
+code that fiddles around with the elements.</p>
+
+<p>This is good for vectors that are "usually small" (e.g. the number of
+predecessors/successors of a block is usually less than 8). On the other hand,
+this makes the size of the SmallVector itself large, so you don't want to
+allocate lots of them (doing so will waste a lot of space). As such,
+SmallVectors are most useful when on the stack.</p>
+
+<p>SmallVector also provides a nice portable and efficient replacement for
+<tt>alloca</tt>.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_vector"><vector></a>
+</div>
+
+<div class="doc_text">
+<p>
+std::vector is well loved and respected. It is useful when SmallVector isn't:
+when the size of the vector is often large (thus the small optimization will
+rarely be a benefit) or if you will be allocating many instances of the vector
+itself (which would waste space for elements that aren't in the container).
+vector is also useful when interfacing with code that expects vectors :).
+</p>
+
+<p>One worthwhile note about std::vector: avoid code like this:</p>
+
+<div class="doc_code">
+<pre>
+for ( ... ) {
+ std::vector<foo> V;
+ use V;
+}
+</pre>
+</div>
+
+<p>Instead, write this as:</p>
+
+<div class="doc_code">
+<pre>
+std::vector<foo> V;
+for ( ... ) {
+ use V;
+ V.clear();
+}
+</pre>
+</div>
+
+<p>Doing so will save (at least) one heap allocation and free per iteration of
+the loop.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_deque"><deque></a>
+</div>
+
+<div class="doc_text">
+<p>std::deque is, in some senses, a generalized version of std::vector. Like
+std::vector, it provides constant time random access and other similar
+properties, but it also provides efficient access to the front of the list. It
+does not guarantee continuity of elements within memory.</p>
+
+<p>In exchange for this extra flexibility, std::deque has significantly higher
+constant factor costs than std::vector. If possible, use std::vector or
+something cheaper.</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_list"><list></a>
+</div>
+
+<div class="doc_text">
+<p>std::list is an extremely inefficient class that is rarely useful.
+It performs a heap allocation for every element inserted into it, thus having an
+extremely high constant factor, particularly for small data types. std::list
+also only supports bidirectional iteration, not random access iteration.</p>
+
+<p>In exchange for this high cost, std::list supports efficient access to both
+ends of the list (like std::deque, but unlike std::vector or SmallVector). In
+addition, the iterator invalidation characteristics of std::list are stronger
+than that of a vector class: inserting or removing an element into the list does
+not invalidate iterator or pointers to other elements in the list.</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_ilist">llvm/ADT/ilist</a>
+</div>
+
+<div class="doc_text">
+<p><tt>ilist<T></tt> implements an 'intrusive' doubly-linked list. It is
+intrusive, because it requires the element to store and provide access to the
+prev/next pointers for the list.</p>
+
+<p>ilist has the same drawbacks as std::list, and additionally requires an
+ilist_traits implementation for the element type, but it provides some novel
+characteristics. In particular, it can efficiently store polymorphic objects,
+the traits class is informed when an element is inserted or removed from the
+list, and ilists are guaranteed to support a constant-time splice operation.
+</p>
+
+<p>These properties are exactly what we want for things like Instructions and
+basic blocks, which is why these are implemented with ilists.</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_other">Other Sequential Container options</a>
+</div>
+
+<div class="doc_text">
+<p>Other STL containers are available, such as std::string.</p>
+
+<p>There are also various STL adapter classes such as std::queue,
+std::priority_queue, std::stack, etc. These provide simplified access to an
+underlying container but don't affect the cost of the container itself.</p>
+
+</div>
+
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="ds_set">Set-Like Containers (std::set, SmallSet, SetVector, etc)</a>
+</div>
+
+<div class="doc_text">
+
+<p>Set-like containers are useful when you need to canonicalize multiple values
+into a single representation. There are several different choices for how to do
+this, providing various trade-offs.</p>
+
+</div>
+
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_sortedvectorset">A sorted 'vector'</a>
+</div>
+
+<div class="doc_text">
+
+<p>If you intend to insert a lot of elements, then do a lot of queries, a
+great approach is to use a vector (or other sequential container) with
+std::sort+std::unique to remove duplicates. This approach works really well if
+your usage pattern has these two distinct phases (insert then query), and can be
+coupled with a good choice of <a href="#ds_sequential">sequential container</a>.
+</p>
+
+<p>
+This combination provides the several nice properties: the result data is
+contiguous in memory (good for cache locality), has few allocations, is easy to
+address (iterators in the final vector are just indices or pointers), and can be
+efficiently queried with a standard binary or radix search.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_smallset">"llvm/ADT/SmallSet.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>If you have a set-like data structure that is usually small and whose elements
+are reasonably small, a <tt>SmallSet<Type, N></tt> is a good choice. This set
+has space for N elements in place (thus, if the set is dynamically smaller than
+N, no malloc traffic is required) and accesses them with a simple linear search.
+When the set grows beyond 'N' elements, it allocates a more expensive representation that
+guarantees efficient access (for most types, it falls back to std::set, but for
+pointers it uses something far better, <a
+href="#dss_smallptrset">SmallPtrSet</a>).</p>
+
+<p>The magic of this class is that it handles small sets extremely efficiently,
+but gracefully handles extremely large sets without loss of efficiency. The
+drawback is that the interface is quite small: it supports insertion, queries
+and erasing, but does not support iteration.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_smallptrset">"llvm/ADT/SmallPtrSet.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>SmallPtrSet has all the advantages of SmallSet (and a SmallSet of pointers is
+transparently implemented with a SmallPtrSet), but also supports iterators. If
+more than 'N' insertions are performed, a single quadratically
+probed hash table is allocated and grows as needed, providing extremely
+efficient access (constant time insertion/deleting/queries with low constant
+factors) and is very stingy with malloc traffic.</p>
+
+<p>Note that, unlike std::set, the iterators of SmallPtrSet are invalidated
+whenever an insertion occurs. Also, the values visited by the iterators are not
+visited in sorted order.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_FoldingSet">"llvm/ADT/FoldingSet.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+FoldingSet is an aggregate class that is really good at uniquing
+expensive-to-create or polymorphic objects. It is a combination of a chained
+hash table with intrusive links (uniqued objects are required to inherit from
+FoldingSetNode) that uses <a href="#dss_smallvector">SmallVector</a> as part of
+its ID process.</p>
+
+<p>Consider a case where you want to implement a "getOrCreateFoo" method for
+a complex object (for example, a node in the code generator). The client has a
+description of *what* it wants to generate (it knows the opcode and all the
+operands), but we don't want to 'new' a node, then try inserting it into a set
+only to find out it already exists, at which point we would have to delete it
+and return the node that already exists.
+</p>
+
+<p>To support this style of client, FoldingSet perform a query with a
+FoldingSetNodeID (which wraps SmallVector) that can be used to describe the
+element that we want to query for. The query either returns the element
+matching the ID or it returns an opaque ID that indicates where insertion should
+take place. Construction of the ID usually does not require heap traffic.</p>
+
+<p>Because FoldingSet uses intrusive links, it can support polymorphic objects
+in the set (for example, you can have SDNode instances mixed with LoadSDNodes).
+Because the elements are individually allocated, pointers to the elements are
+stable: inserting or removing elements does not invalidate any pointers to other
+elements.
+</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_set"><set></a>
+</div>
+
+<div class="doc_text">
+
+<p><tt>std::set</tt> is a reasonable all-around set class, which is decent at
+many things but great at nothing. std::set allocates memory for each element
+inserted (thus it is very malloc intensive) and typically stores three pointers
+per element in the set (thus adding a large amount of per-element space
+overhead). It offers guaranteed log(n) performance, which is not particularly
+fast from a complexity standpoint (particularly if the elements of the set are
+expensive to compare, like strings), and has extremely high constant factors for
+lookup, insertion and removal.</p>
+
+<p>The advantages of std::set are that its iterators are stable (deleting or
+inserting an element from the set does not affect iterators or pointers to other
+elements) and that iteration over the set is guaranteed to be in sorted order.
+If the elements in the set are large, then the relative overhead of the pointers
+and malloc traffic is not a big deal, but if the elements of the set are small,
+std::set is almost never a good choice.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_setvector">"llvm/ADT/SetVector.h"</a>
+</div>
+
+<div class="doc_text">
+<p>LLVM's SetVector<Type> is an adapter class that combines your choice of
+a set-like container along with a <a href="#ds_sequential">Sequential
+Container</a>. The important property
+that this provides is efficient insertion with uniquing (duplicate elements are
+ignored) with iteration support. It implements this by inserting elements into
+both a set-like container and the sequential container, using the set-like
+container for uniquing and the sequential container for iteration.
+</p>
+
+<p>The difference between SetVector and other sets is that the order of
+iteration is guaranteed to match the order of insertion into the SetVector.
+This property is really important for things like sets of pointers. Because
+pointer values are non-deterministic (e.g. vary across runs of the program on
+different machines), iterating over the pointers in the set will
+not be in a well-defined order.</p>
+
+<p>
+The drawback of SetVector is that it requires twice as much space as a normal
+set and has the sum of constant factors from the set-like container and the
+sequential container that it uses. Use it *only* if you need to iterate over
+the elements in a deterministic order. SetVector is also expensive to delete
+elements out of (linear time), unless you use it's "pop_back" method, which is
+faster.
+</p>
+
+<p>SetVector is an adapter class that defaults to using std::vector and std::set
+for the underlying containers, so it is quite expensive. However,
+<tt>"llvm/ADT/SetVector.h"</tt> also provides a SmallSetVector class, which
+defaults to using a SmallVector and SmallSet of a specified size. If you use
+this, and if your sets are dynamically smaller than N, you will save a lot of
+heap traffic.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_uniquevector">"llvm/ADT/UniqueVector.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+UniqueVector is similar to <a href="#dss_setvector">SetVector</a>, but it
+retains a unique ID for each element inserted into the set. It internally
+contains a map and a vector, and it assigns a unique ID for each value inserted
+into the set.</p>
+
+<p>UniqueVector is very expensive: its cost is the sum of the cost of
+maintaining both the map and vector, it has high complexity, high constant
+factors, and produces a lot of malloc traffic. It should be avoided.</p>
+
+</div>
+
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_otherset">Other Set-Like Container Options</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+The STL provides several other options, such as std::multiset and the various
+"hash_set" like containers (whether from C++ TR1 or from the SGI library).</p>
+
+<p>std::multiset is useful if you're not interested in elimination of
+duplicates, but has all the drawbacks of std::set. A sorted vector (where you
+don't delete duplicate entries) or some other approach is almost always
+better.</p>
+
+<p>The various hash_set implementations (exposed portably by
+"llvm/ADT/hash_set") is a simple chained hashtable. This algorithm is as malloc
+intensive as std::set (performing an allocation for each element inserted,
+thus having really high constant factors) but (usually) provides O(1)
+insertion/deletion of elements. This can be useful if your elements are large
+(thus making the constant-factor cost relatively low) or if comparisons are
+expensive. Element iteration does not visit elements in a useful order.</p>
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="ds_map">Map-Like Containers (std::map, DenseMap, etc)</a>
+</div>
+
+<div class="doc_text">
+Map-like containers are useful when you want to associate data to a key. As
+usual, there are a lot of different ways to do this. :)
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_sortedvectormap">A sorted 'vector'</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+If your usage pattern follows a strict insert-then-query approach, you can
+trivially use the same approach as <a href="#dss_sortedvectorset">sorted vectors
+for set-like containers</a>. The only difference is that your query function
+(which uses std::lower_bound to get efficient log(n) lookup) should only compare
+the key, not both the key and value. This yields the same advantages as sorted
+vectors for sets.
+</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_stringmap">"llvm/ADT/StringMap.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+Strings are commonly used as keys in maps, and they are difficult to support
+efficiently: they are variable length, inefficient to hash and compare when
+long, expensive to copy, etc. StringMap is a specialized container designed to
+cope with these issues. It supports mapping an arbitrary range of bytes to an
+arbitrary other object.</p>
+
+<p>The StringMap implementation uses a quadratically-probed hash table, where
+the buckets store a pointer to the heap allocated entries (and some other
+stuff). The entries in the map must be heap allocated because the strings are
+variable length. The string data (key) and the element object (value) are
+stored in the same allocation with the string data immediately after the element
+object. This container guarantees the "<tt>(char*)(&Value+1)</tt>" points
+to the key string for a value.</p>
+
+<p>The StringMap is very fast for several reasons: quadratic probing is very
+cache efficient for lookups, the hash value of strings in buckets is not
+recomputed when lookup up an element, StringMap rarely has to touch the
+memory for unrelated objects when looking up a value (even when hash collisions
+happen), hash table growth does not recompute the hash values for strings
+already in the table, and each pair in the map is store in a single allocation
+(the string data is stored in the same allocation as the Value of a pair).</p>
+
+<p>StringMap also provides query methods that take byte ranges, so it only ever
+copies a string if a value is inserted into the table.</p>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_indexedmap">"llvm/ADT/IndexedMap.h"</a>
+</div>
+
+<div class="doc_text">
+<p>
+IndexedMap is a specialized container for mapping small dense integers (or
+values that can be mapped to small dense integers) to some other type. It is
+internally implemented as a vector with a mapping function that maps the keys to
+the dense integer range.
+</p>
+
+<p>
+This is useful for cases like virtual registers in the LLVM code generator: they
+have a dense mapping that is offset by a compile-time constant (the first
+virtual register ID).</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_densemap">"llvm/ADT/DenseMap.h"</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+DenseMap is a simple quadratically probed hash table. It excels at supporting
+small keys and values: it uses a single allocation to hold all of the pairs that
+are currently inserted in the map. DenseMap is a great way to map pointers to
+pointers, or map other small types to each other.
+</p>
+
+<p>
+There are several aspects of DenseMap that you should be aware of, however. The
+iterators in a densemap are invalidated whenever an insertion occurs, unlike
+map. Also, because DenseMap allocates space for a large number of key/value
+pairs (it starts with 64 by default), it will waste a lot of space if your keys
+or values are large. Finally, you must implement a partial specialization of
+DenseMapKeyInfo for the key that you want, if it isn't already supported. This
+is required to tell DenseMap about two special marker values (which can never be
+inserted into the map) that it needs internally.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_map"><map></a>
+</div>
+
+<div class="doc_text">
+
+<p>
+std::map has similar characteristics to <a href="#dss_set">std::set</a>: it uses
+a single allocation per pair inserted into the map, it offers log(n) lookup with
+an extremely large constant factor, imposes a space penalty of 3 pointers per
+pair in the map, etc.</p>
+
+<p>std::map is most useful when your keys or values are very large, if you need
+to iterate over the collection in sorted order, or if you need stable iterators
+into the map (i.e. they don't get invalidated if an insertion or deletion of
+another element takes place).</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="dss_othermap">Other Map-Like Container Options</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+The STL provides several other options, such as std::multimap and the various
+"hash_map" like containers (whether from C++ TR1 or from the SGI library).</p>
+
+<p>std::multimap is useful if you want to map a key to multiple values, but has
+all the drawbacks of std::map. A sorted vector or some other approach is almost
+always better.</p>
+
+<p>The various hash_map implementations (exposed portably by
+"llvm/ADT/hash_map") are simple chained hash tables. This algorithm is as
+malloc intensive as std::map (performing an allocation for each element
+inserted, thus having really high constant factors) but (usually) provides O(1)
+insertion/deletion of elements. This can be useful if your elements are large
+(thus making the constant-factor cost relatively low) or if comparisons are
+expensive. Element iteration does not visit elements in a useful order.</p>
+
+</div>
+
+
+<!-- *********************************************************************** -->
+<div class="doc_section">
+ <a name="common">Helpful Hints for Common Operations</a>
+</div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+
+<p>This section describes how to perform some very simple transformations of
+LLVM code. This is meant to give examples of common idioms used, showing the
+practical side of LLVM transformations. <p> Because this is a "how-to" section,
+you should also read about the main classes that you will be working with. The
+<a href="#coreclasses">Core LLVM Class Hierarchy Reference</a> contains details
+and descriptions of the main classes that you should know about.</p>
+
+</div>
+
+<!-- NOTE: this section should be heavy on example code -->
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="inspection">Basic Inspection and Traversal Routines</a>
+</div>
+
+<div class="doc_text">
+
+<p>The LLVM compiler infrastructure have many different data structures that may
+be traversed. Following the example of the C++ standard template library, the
+techniques used to traverse these various data structures are all basically the
+same. For a enumerable sequence of values, the <tt>XXXbegin()</tt> function (or
+method) returns an iterator to the start of the sequence, the <tt>XXXend()</tt>
+function returns an iterator pointing to one past the last valid element of the
+sequence, and there is some <tt>XXXiterator</tt> data type that is common
+between the two operations.</p>
+
+<p>Because the pattern for iteration is common across many different aspects of
+the program representation, the standard template library algorithms may be used
+on them, and it is easier to remember how to iterate. First we show a few common
+examples of the data structures that need to be traversed. Other data
+structures are traversed in very similar ways.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection">
+ <a name="iterate_function">Iterating over the </a><a
+ href="#BasicBlock"><tt>BasicBlock</tt></a>s in a <a
+ href="#Function"><tt>Function</tt></a>
+</div>
+
+<div class="doc_text">
+
+<p>It's quite common to have a <tt>Function</tt> instance that you'd like to
+transform in some way; in particular, you'd like to manipulate its
+<tt>BasicBlock</tt>s. To facilitate this, you'll need to iterate over all of
+the <tt>BasicBlock</tt>s that constitute the <tt>Function</tt>. The following is
+an example that prints the name of a <tt>BasicBlock</tt> and the number of
+<tt>Instruction</tt>s it contains:</p>
+
+<div class="doc_code">
+<pre>
+// <i>func is a pointer to a Function instance</i>
+for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
+ // <i>Print out the name of the basic block if it has one, and then the</i>
+ // <i>number of instructions that it contains</i>
+ llvm::cerr << "Basic block (name=" << i->getName() << ") has "
+ << i->size() << " instructions.\n";
+</pre>
+</div>
+
+<p>Note that i can be used as if it were a pointer for the purposes of