<li><a href="#constants">Constants</a>
<ol>
<li><a href="#simpleconstants">Simple Constants</a></li>
- <li><a href="#aggregateconstants">Aggregate Constants</a></li>
+ <li><a href="#complexconstants">Complex Constants</a></li>
<li><a href="#globalconstants">Global Variable and Function Addresses</a></li>
<li><a href="#undefvalues">Undefined Values</a></li>
<li><a href="#constantexprs">Constant Expressions</a></li>
+ <li><a href="#metadata">Embedded Metadata</a></li>
</ol>
</li>
<li><a href="#othervalues">Other Values</a>
'<tt>static</tt>' keyword in C.
</dd>
+ <dt><tt><b><a name="available_externally">available_externally</a></b></tt>:
+ </dt>
+
+ <dd>Globals with "<tt>available_externally</tt>" linkage are never emitted
+ into the object file corresponding to the LLVM module. They exist to
+ allow inlining and other optimizations to take place given knowledge of the
+ definition of the global, which is known to be somewhere outside the module.
+ Globals with <tt>available_externally</tt> linkage are allowed to be discarded
+ at will, and are otherwise the same as <tt>linkonce_odr</tt>. This linkage
+ type is only allowed on definitions, not declarations.</dd>
+
<dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>
<dd>Globals with "<tt>linkonce</tt>" linkage are merged with other globals of
</dd>
<dt><tt><b><a name="linkage_externweak">extern_weak</a></b></tt>: </dt>
+
<dd>The semantics of this linkage follow the ELF object file model: the
symbol is weak until linked; if not linked, the symbol becomes null instead
of being an undefined reference.
</dd>
+ <dt><tt><b><a name="linkage_linkonce_odr">linkonce_odr</a></b></tt>: </dt>
+ <dt><tt><b><a name="linkage_weak_odr">weak_odr</a></b></tt>: </dt>
+ <dd>Some languages allow differing globals to be merged, such as two
+ functions with different semantics. Other languages, such as <tt>C++</tt>,
+ ensure that only equivalent globals are ever merged (the "one definition
+ rule" - "ODR"). Such languages can use the <tt>linkonce_odr</tt>
+ and <tt>weak_odr</tt> linkage types to indicate that the global will only
+ be merged with equivalent globals. These linkage types are otherwise the
+ same as their non-<tt>odr</tt> versions.
+ </dd>
+
<dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>
<dd>If none of the above identifiers are used, the global is externally
external (i.e., lacking any linkage declarations), they are accessible
outside of the current module.</p>
<p>It is illegal for a function <i>declaration</i>
-to have any linkage type other than "externally visible", <tt>dllimport</tt>,
+to have any linkage type other than "externally visible", <tt>dllimport</tt>
or <tt>extern_weak</tt>.</p>
-<p>Aliases can have only <tt>external</tt>, <tt>internal</tt> and <tt>weak</tt>
-linkages.</p>
+<p>Aliases can have only <tt>external</tt>, <tt>internal</tt>, <tt>weak</tt>
+or <tt>weak_odr</tt> linkages.</p>
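+
+<p>As an illustrative sketch, the linkage types described above might be
+written as follows. The global names are hypothetical and are not taken from
+any real module:</p>
+
+<div class="doc_code">
+<pre>
+<i>; May be discarded at will; never emitted into this module's object file.</i>
+define available_externally i32 @add_one(i32 %x) {
+  %r = add i32 %x, 1
+  ret i32 %r
+}
+
+<i>; Merged only with equivalent definitions (e.g. a C++ inline function).</i>
+define linkonce_odr i32 @answer() {
+  ret i32 42
+}
+
+<i>; Like weak, but merged only with equivalent globals.</i>
+@counter = weak_odr global i32 0
+
+<i>; An ordinary definition and a weak_odr alias to it.</i>
+define i32 @impl() {
+  ret i32 0
+}
+@impl_alias = alias weak_odr i32 ()* @impl
+</pre>
+</div>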
</div>
<!-- ======================================================================= -->
<div class="doc_code">
<pre>
-declare i32 @printf(i8* noalias , ...)
+declare i32 @printf(i8* noalias nocapture, ...)
declare i32 @atoi(i8 zeroext)
declare signext i8 @returns_signed_char()
</pre>
pointer arguments or otherwise accessing any mutable state (e.g. memory, control
registers, etc) visible to caller functions. It does not write through any
pointer arguments (including <tt><a href="#byval">byval</a></tt> arguments) and
-never changes any state visible to callers.</dd>
+never changes any state visible to callers. A <tt>readnone</tt> function may
+not throw an exception that escapes into the caller.</dd>
<dt><tt><a name="readonly">readonly</a></tt></dt>
<dd>This attribute indicates that the function does not write through any
pointer arguments (including <tt><a href="#byval">byval</a></tt> arguments)
or otherwise modify any state (e.g. memory, control registers, etc) visible to
caller functions. It may dereference pointer arguments and read state that may
-be set in the caller. A readonly function always returns the same value (or
-throws the same exception) when called with the same set of arguments and global
-state.</dd>
+be set in the caller. A <tt>readonly</tt> function always returns the same
+value when called with the same set of arguments and global state, and it may
+not throw an exception that escapes into the caller.</dd>
<dt><tt><a name="ssp">ssp</a></tt></dt>
<dd>This attribute indicates that the function should emit a stack smashing
references (with their equivalent as named type declarations) include:</p>
<pre>
- { \2 * } %x = type { %t* }
+ { \2 * } %x = type { %x* }
{ \2 }* %y = type { %y }*
\1* %z = type %z*
</pre>
</dl>
-<p>The one non-intuitive notation for constants is the optional hexadecimal form
+<p>The one non-intuitive notation for constants is the hexadecimal form
of floating point constants. For example, the form '<tt>double
0x432ff973cafa8000</tt>' is equivalent to (but harder to read than) '<tt>double
4.5e+15</tt>'. The only time hexadecimal floating point constants are required
(and the only time that they are generated by the disassembler) is when a
floating point constant must be emitted but it cannot be represented as a
-decimal floating point number. For example, NaN's, infinities, and other
+decimal floating point number in a reasonable number of digits. For example,
+NaNs, infinities, and other
special values are represented in their IEEE hexadecimal format so that
assembly and disassembly do not cause any bits to change in the constants.</p>
-
+<p>When using the hexadecimal form, constants of types float and double are
+represented using the 16-digit form shown above (which matches the IEEE754
+representation for double); float values must, however, be exactly representable
+as IEEE754 single precision.
+Hexadecimal format is always used for long
+double, and there are three forms of long double. The 80-bit
+format used by x86 is represented as <tt>0xK</tt>
+followed by 20 hexadecimal digits.
+The 128-bit format used by PowerPC (two adjacent doubles) is represented
+by <tt>0xM</tt> followed by 32 hexadecimal digits. The IEEE 128-bit
+format is represented
+by <tt>0xL</tt> followed by 32 hexadecimal digits; no currently supported
+target uses this format. Long doubles will only work if they match
+the long double format on your target. All hexadecimal formats are big-endian
+(sign bit at the left).</p>
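+
+<p>As a hedged illustration of these forms, the sketch below uses hypothetical
+global names. Apart from the first constant, the bit patterns are standard
+IEEE754 and x86 extended-precision encodings rather than values taken from
+this document:</p>
+
+<div class="doc_code">
+<pre>
+@d   = global double 0x432ff973cafa8000         <i>; same as double 4.5e+15</i>
+@inf = global double 0x7FF0000000000000         <i>; +infinity</i>
+@nan = global double 0x7FF8000000000000         <i>; a quiet NaN</i>
+@one = global float 0x3FF0000000000000          <i>; 1.0, exactly representable as float</i>
+@ld  = global x86_fp80 0xK3FFF8000000000000000  <i>; 1.0 in the x86 80-bit format</i>
+</pre>
+</div>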
</div>
<!-- ======================================================================= -->
-<div class="doc_subsection"><a name="aggregateconstants">Aggregate Constants</a>
+<div class="doc_subsection">
+<a name="aggregateconstants"> <!-- old anchor -->
+<a name="complexconstants">Complex Constants</a></a>
</div>
<div class="doc_text">
-<p>Aggregate constants arise from aggregation of simple constants
-and smaller aggregate constants.</p>
+<p>Complex constants are a (potentially recursive) combination of simple
+constants and smaller complex constants.</p>
<dl>
<dt><b>Structure constants</b></dt>
large arrays) and is always exactly equivalent to using explicit zero
initializers.
</dd>
+
+ <dt><b>Metadata node</b></dt>
+
+ <dd>A metadata node is a structure-like constant with the type of an empty
+ struct. For example: "<tt>{ } !{ i32 0, { } !"test" }</tt>". Unlike other
+ constants that are meant to be interpreted as part of the instruction stream,
+ metadata is a place to attach additional information such as debug info.
+ </dd>
</dl>
</div>
<i>really</i> dangerous!</dd>
<dt><b><tt>bitcast ( CST to TYPE )</tt></b></dt>
- <dd>Convert a constant, CST, to another TYPE. The size of CST and TYPE must be
- identical (same number of bits). The conversion is done as if the CST value
- was stored to memory and read back as TYPE. In other words, no bits change
- with this operator, just the type. This can be used for conversion of
- vector types to any other type, as long as they have the same bit width. For
- pointers it is only valid to cast to another pointer type. It is not valid
- to bitcast to or from an aggregate type.
- </dd>
+ <dd>Convert a constant, CST, to another TYPE. The constraints of the operands
+ are the same as those for the <a href="#i_bitcast">bitcast
+ instruction</a>.</dd>
<dt><b><tt>getelementptr ( CSTPTR, IDX0, IDX1, ... )</tt></b></dt>
</dl>
</div>
+<!-- ======================================================================= -->
+<div class="doc_subsection"><a name="metadata">Embedded Metadata</a>
+</div>
+
+<div class="doc_text">
+
+<p>Embedded metadata provides a way to attach arbitrary data to the
+instruction stream without affecting the behaviour of the program. There are
+two metadata primitives: strings and nodes. All metadata has the type of an
+empty struct and is identified in syntax by a preceding exclamation point
+('<tt>!</tt>').
+</p>
+
+<p>A metadata string is a string surrounded by double quotes. It can contain
+any character by escaping non-printable characters with "\xx" where "xx" is
+the two digit hex code. For example: "<tt>!"test\00"</tt>".
+</p>
+
+<p>Metadata nodes are represented with notation similar to structure constants
+(a comma separated list of elements, surrounded by braces and preceded by an
+exclamation point). For example: "<tt>!{ { } !"test\00", i32 10}</tt>".
+</p>
+
+<p>Optimizations may rely on metadata to provide additional information about
+the program that isn't available in the instructions, or that isn't easily
+computable. Similarly, the code generator may expect a certain metadata format
+to be used to express debugging information.</p>
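+
+<p>Putting these rules together, a node may mix strings, ordinary constants
+and other nodes. The sketch below only composes the syntax described above.
+Its contents are made up for illustration and do not follow any format that
+an optimization or the code generator expects:</p>
+
+<div class="doc_code">
+<pre>
+!"file.c\00"                                          <i>; a metadata string</i>
+!{ i32 7, i32 3 }                                     <i>; a node holding two integers</i>
+!{ { } !"file.c\00", i32 10, { } !{ i32 7, i32 3 } }  <i>; a node holding a string, an integer and a node</i>
+</pre>
+</div>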
+</div>
+
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="othervalues">Other Values</a> </div>
<!-- *********************************************************************** -->
<pre>
ret i32 5 <i>; Return an integer value of 5</i>
ret void <i>; Return from a void function</i>
- ret { i32, i8 } { i32 4, i8 2 } <i>; Return an aggregate of values 4 and 2</i>
+ ret { i32, i8 } { i32 4, i8 2 } <i>; Return a struct of values 4 and 2</i>
</pre>
<p>Note that the code generator does not yet fully support large
safe.
</p>
<h5>Semantics:</h5>
-<p>The location of memory pointed to is loaded.</p>
+<p>The location of memory pointed to is loaded. If the value being loaded
+is of scalar type then the number of bytes read does not exceed the minimum
+number of bytes needed to hold all bits of the type. For example, loading an
+<tt>i24</tt> reads at most three bytes. When loading a value of a type like
+<tt>i20</tt> with a size that is not an integral number of bytes, the result
+is undefined if the value was not originally written using a store of the
+same type.</p>
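+<p>A minimal sketch of these rules, using hypothetical value names:</p>
+<pre>
+  %p24 = alloca i24
+  store i24 1000, i24* %p24
+  %a = load i24* %p24                 <i>; reads at most three bytes</i>
+
+  %p20 = alloca i20
+  store i20 5, i20* %p20
+  %b = load i20* %p20                 <i>; defined: the value was written by a store of the same type</i>
+</pre>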
<h5>Examples:</h5>
<pre> %ptr = <a href="#i_alloca">alloca</a> i32 <i>; yields {i32*}:ptr</i>
<a
</p>
<h5>Semantics:</h5>
<p>The contents of memory are updated to contain '<tt>&lt;value&gt;</tt>'
-at the location specified by the '<tt>&lt;pointer&gt;</tt>' operand.</p>
+at the location specified by the '<tt>&lt;pointer&gt;</tt>' operand.
+If '<tt>&lt;value&gt;</tt>' is of scalar type then the number of bytes
+written does not exceed the minimum number of bytes needed to hold all
+bits of the type. For example, storing an <tt>i24</tt> writes at most
+three bytes. When writing a value of a type like <tt>i20</tt> with a
+size that is not an integral number of bytes, it is unspecified what
+happens to the extra bits that do not belong to the type, but they will
+typically be overwritten.</p>
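+<p>The byte limit matters when other data sits next to the stored value. The
+sketch below uses hypothetical value names and assumes that, in a packed
+structure, the <tt>i8</tt> field begins immediately after the three bytes of
+the <tt>i24</tt> field:</p>
+<pre>
+  %s  = alloca &lt;{ i24, i8 }&gt;
+  %f0 = getelementptr &lt;{ i24, i8 }&gt;* %s, i32 0, i32 0
+  %f1 = getelementptr &lt;{ i24, i8 }&gt;* %s, i32 0, i32 1
+  store i8 7, i8* %f1
+  store i24 1000, i24* %f0            <i>; writes at most three bytes</i>
+  %v = load i8* %f1                   <i>; under the layout assumption above, still 7</i>
+</pre>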
<h5>Example:</h5>
<pre> %ptr = <a href="#i_alloca">alloca</a> i32 <i>; yields {i32*}:ptr</i>
store i32 3, i32* %ptr <i>; yields {void}</i>
<p>The type of each index argument depends on the type it is indexing into.
When indexing into a (packed) structure, only <tt>i32</tt> integer
<b>constants</b> are allowed. When indexing into an array, pointer or vector,
-only integers of 32 or 64 bits are allowed (also non-constants). 32-bit values
-will be sign extended to 64-bits if required.</p>
+integers of any width are allowed (also non-constants).</p>
<p>For example, let's consider a C code fragment and how it gets
compiled to LLVM:</p>
}
</pre>
-<p>Note that it is undefined to access an array out of bounds: array and
-pointer indexes must always be within the defined bounds of the array type.
-The one exception for this rule is zero length arrays. These arrays are
-defined to be accessible as variable length arrays, which requires access
-beyond the zero'th element.</p>
+<p>Note that it is undefined to access an array out of bounds: array
+and pointer indexes must always be within the defined bounds of the
+array type when accessed with an instruction that dereferences the
+pointer (e.g. a load or store instruction). The one exception for
+this rule is zero length arrays. These arrays are defined to be
+accessible as variable length arrays, which requires access beyond the
+zero'th element.</p>
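+<p>As a sketch of how this rule reads, the functions below use hypothetical
+names. The first function dereferences only an in-bounds element. The second
+relies on the zero length array exception:</p>
+<pre>
+@arr = global [10 x i32] zeroinitializer
+
+define i32 @last() {
+  %end = getelementptr [10 x i32]* @arr, i64 0, i64 10  <i>; address only; loading from it would be undefined</i>
+  %p   = getelementptr [10 x i32]* @arr, i64 0, i64 9   <i>; last element within bounds</i>
+  %v   = load i32* %p
+  ret i32 %v
+}
+
+define i32 @tail({ i32, [0 x i32] }* %s) {
+  %p = getelementptr { i32, [0 x i32] }* %s, i64 0, i32 1, i64 5  <i>; zero length array used as a variable length array</i>
+  %v = load i32* %p
+  ret i32 %v
+}
+</pre>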
<p>The getelementptr instruction is often confusing. For some more insight
into how it works, see <a href="GetElementPtr.html">the getelementptr
%vptr = getelementptr {i32, &lt;2 x i8&gt;}* %svptr, i64 0, i32 1, i32 1
<i>; yields i8*:eptr</i>
%eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1
+ <i>; yields i32*:iptr</i>
+ %iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0
</pre>
</div>