X-Git-Url: http://plrg.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FLangRef.html;h=1b94ab5a9a705faa034ac9f76a3faac702937e03;hb=83821c8941b7e9e70de9d5e76556b07872ac371b;hp=935625dbfaadf4f6814a55a0ff1fa08c18be349d;hpb=ec38da42c8d563d639579d77b883e9ed1cbe2582;p=oota-llvm.git diff --git a/docs/LangRef.html b/docs/LangRef.html index 935625dbfaa..1b94ab5a9a7 100644 --- a/docs/LangRef.html +++ b/docs/LangRef.html @@ -5,7 +5,7 @@
...because the definition of %x does not dominate all of its - uses. The LLVM infrastructure provides a verification pass that may be used - to verify that an LLVM module is well formed. This pass is automatically run - by the parser after parsing input assembly and by the optimizer before it - outputs bitcode. The violations pointed out by the verifier pass indicate - bugs in transformation passes or input to the parser.
+because the definition of %x does not dominate all of its uses. The + LLVM infrastructure provides a verification pass that may be used to verify + that an LLVM module is well formed. This pass is automatically run by the + parser after parsing input assembly and by the optimizer before it outputs + bitcode. The violations pointed out by the verifier pass indicate bugs in + transformation passes or input to the parser.
@@ -430,8 +452,8 @@-add i32 %X, %X ; yields {i32}:%0 -add i32 %0, %0 ; yields {i32}:%1 +%0 = add i32 %X, %X ; yields {i32}:%0 +%1 = add i32 %0, %0 ; yields {i32}:%1 %result = add i32 %1, %1
...and it also shows a convention that we follow in this document. When +
It also shows a convention that we follow in this document. When demonstrating instructions, we will follow an instruction with a comment that defines the type and name of value produced. Comments are shown in italic text.
@@ -474,31 +496,33 @@ the "hello world" module:; Declare the string constant as a global constant... -@.LC0 = internal constant [13 x i8] c"hello world\0A\00" ; [13 x i8]* ++; Declare the string constant as a global constant. +@.LC0 = internal constant [13 x i8] c"hello world\0A\00" ; [13 x i8]* ; External declaration of the puts function -declare i32 @puts(i8 *) ; i32(i8 *)* +declare i32 @puts(i8 *) ; i32(i8 *)* ; Definition of main function -define i32 @main() { ; i32()* - ; Convert [13 x i8]* to i8 *... - %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 * +define i32 @main() { ; i32()* + ; Convert [13 x i8]* to i8 *... + %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 * - ; Call puts function to write out the string to stdout... - call i32 @puts(i8 * %cast210) ; i32 - ret i32 0
}
+ ; Call puts function to write out the string to stdout. + call i32 @puts(i8 * %cast210) ; i32 + ret i32 0
} + +; Named metadata +!1 = metadata !{i32 41} +!foo = !{!1, null}
This example is made up of a global variable named - ".LC0", an external declaration of the "puts" function, and + ".LC0", an external declaration of the "puts" function, a function definition for - "main".
+ "main" and named metadata + "foo".In general, a module is made up of a list of global values, where both functions and global variables are global values. Global values are @@ -519,7 +543,7 @@ define i32 @main() { ; i32()* linkage:
__imp_
and the function or variable
name.LLVM function definitions consist of the "define" keyord, an +
LLVM function definitions consist of the "define" keyword, an optional linkage type, an optional visibility style, an optional calling convention, a return type, an optional @@ -836,7 +887,7 @@ define i32 @main() { ; i32()*
LLVM function declarations consist of the "declare" keyword, an optional linkage type, an optional - visibility style, an optional + visibility style, an optional calling convention, a return type, an optional parameter attribute for the return type, a function name, a possibly empty list of arguments, an optional alignment, and an @@ -897,6 +948,27 @@ define [linkage] [visibility]
Named metadata is a collection of metadata. Metadata + nodes (but not metadata strings) and null are the only valid operands for + a named metadata.
+ ++!1 = metadata !{metadata !"one"} +!name = !{null, !1} ++
Currently, only the following parameter attributes are defined:
-define void @f() gc "name" { ... +define void @f() gc "name" { ... }
When constructing the data layout for a given target, LLVM starts with a @@ -1219,7 +1303,7 @@ target datalayout = "layout specification"
Note that the code generator does not yet support large integer types to be - used as function return types. The specific limit on how large a return type - the code generator can currently handle is target-dependent; currently it's - often 64 bits for 32-bit targets and 128 bits for 64-bit targets.
- @@ -1489,9 +1569,9 @@ ClassificationsThe metadata type represents embedded metadata. The only derived type that - may contain metadata is metadata* or a function type that returns or - takes metadata typed parameters, but not pointer to metadata types.
+The metadata type represents embedded metadata. No derived types may be + created from metadata except for function + arguments.
@@ -1513,6 +1593,21 @@ Classifications
Aggregate Types are a subset of derived types that can contain multiple + member types. Arrays, + structs, vectors and + unions are aggregate types.
+ +Note that 'variable sized arrays' can be implemented in LLVM with a zero - length array. Normally, accesses past the end of an array are undefined in - LLVM (e.g. it is illegal to access the 5th element of a 3 element array). As - a special case, however, zero length arrays are recognized to be variable - length. This allows implementation of 'pascal style arrays' with the LLVM - type "{ i32, [0 x float]}", for example.
- -Note that the code generator does not yet support large aggregate types to be - used as function return types. The specific limit on how large an aggregate - return type the code generator can currently handle is target-dependent, and - also dependent on the aggregate element types.
+There is no restriction on indexing beyond the end of the array implied by + a static type (though there are restrictions on indexing beyond the bounds + of an allocated object in some cases). This means that single-dimension + 'variable sized array' addressing can be implemented in LLVM with a zero + length array type. An implementation of 'pascal style arrays' in LLVM could + use the type "{ i32, [0 x float]}", for example.
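The 'pascal style array' idiom above can be illustrated outside LLVM. What follows is only a minimal sketch, assuming a little-endian layout with a 4-byte i32 length immediately followed by 4-byte floats. The helper names make_pascal_array and element_at are hypothetical and are not part of LLVM. It shows the two ideas from the paragraph: the static type only describes the header, and element addresses are computed past it but stay inside the allocated object.

```python
import struct

HEADER = struct.Struct("<i")   # models the i32 length field of { i32, [0 x float] }
ELEMENT = struct.Struct("<f")  # models one float of the trailing array (assumed 4 bytes)

def make_pascal_array(values):
    """Allocate one buffer holding the length followed by the elements."""
    buf = bytearray(HEADER.size + ELEMENT.size * len(values))
    HEADER.pack_into(buf, 0, len(values))
    for i, v in enumerate(values):
        ELEMENT.pack_into(buf, HEADER.size + i * ELEMENT.size, v)
    return buf

def element_at(buf, index):
    """Index past the zero-length static array type, but within the allocation."""
    (length,) = HEADER.unpack_from(buf, 0)
    if not 0 <= index < length:
        raise IndexError("index outside the allocated object")
    (value,) = ELEMENT.unpack_from(buf, HEADER.size + index * ELEMENT.size)
    return value

arr = make_pascal_array([1.5, 2.5, -0.25])
```

The bounds check in element_at is a choice made for this sketch: it models the restriction on indexing beyond the allocated object, not a check that LLVM itself performs.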
@@ -1586,13 +1676,13 @@ ClassificationsThe function type can be thought of as a function signature. It consists of a return type and a list of formal parameter types. The return type of a - function type is a scalar type, a void type, or a struct type. If the return - type is a struct type then all struct elements must be of first class types, - and the struct must have at least one element.
+ function type is a scalar type, a void type, a struct type, or a union + type. If the return type is a struct type then all struct elements must be + of first class types, and the struct must have at least one element.- <returntype list> (<parameter list>) + <returntype> (<parameter list>)
...where '<parameter list>' is a comma-separated list of type @@ -1600,8 +1690,8 @@ Classifications which indicates that the function takes a variable number of arguments. Variable argument functions can access their arguments with the variable argument handling intrinsic - functions. '<returntype list>' is a comma-separated list of - first class type specifiers.
+ functions. '<returntype>' is any type except + label.function taking an i32, returning an i32 | |||
float (i16 signext, i32 *) *
+ float (i16, i32 *) *
|
- Pointer to a function that takes
- an i16 that should be sign extended and a
- pointer to i32, returning
- float.
+ | Pointer to a function that takes
+ an i16 and a pointer to i32,
+ returning float.
|
|
i32 (i8*, ...) | -A vararg function that takes at least one - pointer to i8 (char in C), - which returns an integer. This is the signature for printf in + | A vararg function that takes at least one + pointer to i8 (char in C), + which returns an integer. This is the signature for printf in LLVM. | |
{i32, i32} (i32) | -A function taking an i32, returning two - i32 values as an aggregate of type { i32, i32 } + | A function taking an i32, returning a + structure containing two i32 values |
Structures are accessed using 'load and - 'store' by getting a pointer to a field with - the 'getelementptr' instruction.
- +Structures in memory are accessed using 'load' + and 'store' by getting a pointer to a field + with the 'getelementptr' instruction. + Structures in registers are accessed using the + 'extractvalue' and + 'insertvalue' instructions.
{ <type list> } @@ -1668,11 +1759,6 @@ Classifications -Note that the code generator does not yet support large aggregate types to be - used as function return types. The specific limit on how large an aggregate - return type the code generator can currently handle is target-dependent, and - also dependent on the aggregate element types.
- @@ -1713,16 +1799,66 @@ Classifications + + + ++ ++Overview:
+A union type describes an object with size and alignment suitable for + an object of any one of a given set of types (also known as an "untagged" + union). It is similar in concept and usage to a + struct, except that all members of the union + have an offset of zero. The elements of a union may be any type that has a + size. Unions must have at least one member - empty unions are not allowed. +
+ +The size of the union as a whole will be the size of its largest member, + and the alignment requirements of the union as a whole will be the largest + alignment requirement of any member.
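The size and alignment rule can be checked against a C-style union. The sketch below uses Python's ctypes module and the host C ABI as a stand-in, on the assumption that it agrees with the LLVM rule for these member types. The class name Example and the field names are hypothetical. It mirrors the type union { i32, i32*, float }.

```python
import ctypes

class Example(ctypes.Union):
    # Models union { i32, i32*, float }; field names are illustrative only.
    _fields_ = [
        ("as_i32", ctypes.c_int32),
        ("as_ptr", ctypes.POINTER(ctypes.c_int32)),
        ("as_float", ctypes.c_float),
    ]

MEMBER_TYPES = [t for _, t in Example._fields_]

# Size and alignment of the largest / most strictly aligned member.
largest_size = max(ctypes.sizeof(t) for t in MEMBER_TYPES)
largest_align = max(ctypes.alignment(t) for t in MEMBER_TYPES)

# Every member starts at offset zero.
offsets = [getattr(Example, name).offset for name, _ in Example._fields_]
```

On common hosts the pointer member is the largest, so the union takes the pointer's size and alignment.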
+ +Union members are accessed using 'load' and + 'store' by getting a pointer to a field with + the 'getelementptr' instruction. + Since all members are at offset zero, the getelementptr instruction does + not affect the address, only the type of the resulting pointer.
+ +Syntax:
++ union { <type list> } ++ +Examples:
++
+ ++ union { i32, i32*, float } +A union of three types: an i32, a pointer to + an i32, and a float. ++ ++ union { float, i32 (i32) * } +A union, where the first element is a float and the + second element is a pointer to a + function that takes an i32, returning + an i32. +Overview:
-As in many languages, the pointer type represents a pointer or reference to - another object, which must live in memory. Pointer types may have an optional - address space attribute defining the target-specific numbered address space - where the pointed-to object resides. The default address space is zero.
+The pointer type is used to specify memory locations. + Pointers are commonly used to reference objects in memory.
+ +Pointer types may have an optional address space attribute defining the + numbered address space where the pointed-to object resides. The default + address space is number zero. The semantics of non-zero address + spaces are target-specific.
Note that LLVM does not permit pointers to void (void*) nor does it permit pointers to labels (label*). Use i8* instead.
@@ -1763,8 +1899,7 @@ ClassificationsA vector type is a simple derived type that represents a vector of elements. Vector types are used when multiple primitive data are operated in parallel using a single instruction (SIMD). A vector type requires a size (number of - elements) and an underlying primitive data type. Vectors must have a power - of two length (1, 2, 4, 8, 16 ...). Vector types are considered + elements) and an underlying primitive data type. Vector types are considered first class.
Syntax:
@@ -1791,11 +1926,6 @@ Classifications -Note that the code generator does not yet support large vector types to be - used as function return types. The specific limit on how large a vector - return type codegen can currently handle is target-dependent; currently it's - often a few times longer than a hardware vector register.
- @@ -1957,6 +2087,14 @@ Classifications the number and types of elements must match those specified by the type. +
The string 'undef' can be used anywhere a constant is expected, and - indicates that the user of the value may recieve an unspecified bit-pattern. + indicates that the user of the value may receive an unspecified bit-pattern. Undefined values may be of any type (other than label or void) and be used anywhere a constant is permitted.
@@ -2061,9 +2200,9 @@ Unsafe: For example, if "%X" has a zero bit, then the output of the 'and' operation will always be a zero, no matter what the corresponding bit from the undef is. As such, it is unsafe to optimize or assume that the result of the and is undef. -However, it is safe to assume that all bits of the undef could be 0, and -optimize the and to 0. Likewise, it is safe to assume that all the bits of -the undef operand to the or could be set, allowing the or to be folded to +However, it is safe to assume that all bits of the undef could be 0, and +optimize the and to 0. Likewise, it is safe to assume that all the bits of +the undef operand to the or could be set, allowing the or to be folded to -1.%A = xor undef, undef - + %B = undef %C = xor %B, %B @@ -2118,7 +2257,7 @@ number of reasons, but the short answer is that an undef "variable" can arbitrarily change its value over its "live range". This is true because the "variable" doesn't actually have a live range. Instead, the value is logically read from arbitrary registers that happen to be around when needed, -so the value is not neccesarily consistent over time. In fact, %A and %C need +so the value is not necessarily consistent over time. In fact, %A and %C need to have the same semantics or the core LLVM "replace all uses with" concept would not hold. @@ -2144,7 +2283,7 @@ does not execute at all. This allows us to delete the divide and all code after it: since the undefined operation "can't happen", the optimizer can assume that it occurs in dead code. - +a: store undef -> %X @@ -2156,13 +2295,41 @@ b: unreachableThese examples reiterate the fdiv example: a store "of" an undefined value -can be assumed to not have any effect: we can assume that the value is +can be assumed to not have any effect: we can assume that the value is overwritten with bits that happen to match what was already there. 
However, a store "to" an undefined location could clobber arbitrary memory, therefore, it has undefined behavior.
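The earlier rule about folding 'and' and 'or' with an undef operand can be checked by brute force. This is a minimal sketch, assuming a 4-bit integer width so that every bit pattern can be enumerated. The function names are hypothetical. It lists every result the operation could produce over all choices of the undef bits.

```python
WIDTH = 4                      # small width so every value can be enumerated
ALL_ONES = (1 << WIDTH) - 1    # the bit pattern written as -1 in the text

def possible_and(x):
    """All results of 'and x, undef' over every choice of the undef bits."""
    return {x & u for u in range(ALL_ONES + 1)}

def possible_or(x):
    """All results of 'or x, undef' over every choice of the undef bits."""
    return {x | u for u in range(ALL_ONES + 1)}

every_value = set(range(ALL_ONES + 1))
x = 0b0101  # an operand with some zero bits
```

For every x, 0 is a possible result of the 'and' and all-ones is a possible result of the 'or', so those folds are safe. For x = 0b0101 the 'and' can only yield 0, 1, 4 or 5, which is why folding it to undef (any value) would be unsafe.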
blockaddress(@function, %block)
+ +The 'blockaddress' constant computes the address of the specified + basic block in the specified function, and always has an i8* type. Taking + the address of the entry block is illegal.
+ +This value only has defined behavior when used as an operand to the + 'indirectbr' instruction or for comparisons + against null. Pointer equality tests between label addresses result in undefined + behavior - though, again, comparison against null is ok, and no label is + equal to the null pointer. This may also be passed around as an opaque + pointer sized value as long as the bits are not inspected. This allows + ptrtoint and arithmetic to be performed on these values so long as + the original value is reconstituted before the indirectbr.
+ +Finally, some targets may provide defined semantics when + using the value as the operand to an inline assembly, but that is target + specific. +
+ +Embedded metadata provides a way to attach arbitrary data to the instruction - stream without affecting the behaviour of the program. There are two - metadata primitives, strings and nodes. All metadata has the - metadata type and is identified in syntax by a preceding exclamation - point ('!').
- -A metadata string is a string surrounded by double quotes. It can contain - any character by escaping non-printable characters with "\xx" where "xx" is - the two digit hex code. For example: "!"test\00"".
- -Metadata nodes are represented with notation similar to structure constants - (a comma separated list of elements, surrounded by braces and preceeded by an - exclamation point). For example: "!{ metadata !"test\00", i32 - 10}".
- -A metadata node will attempt to track changes to the values it holds. In the - event that a value is deleted, it will be replaced with a typeless - "null", such as "metadata !{null, i32 10}".
- -Optimizations may rely on metadata to provide additional information about - the program that isn't available in the instructions, or that isn't easily - computable. Similarly, the code generator may expect a certain metadata - format to be used to express debugging information.
- -@@ -2359,11 +2496,98 @@ call void asm sideeffect "eieio", ""()
In some cases inline asms will contain code that will not work unless the + stack is aligned in some way, such as calls or SSE instructions on x86, + yet will not contain code that does that alignment within the asm. + The compiler should make conservative assumptions about what the asm might + contain and should generate its usual stack alignment code in the prologue + if the 'alignstack' keyword is present:
+ ++call void asm alignstack "eieio", ""() ++
If both keywords appear the 'sideeffect' keyword must come + first.
+TODO: The format of the asm and constraints string still need to be documented here. Constraints on what can be done (e.g. duplication, moving, etc need to be documented). This is probably best done by reference to another document that covers inline asm from a holistic perspective.
+The call instructions that wrap inline asm nodes may have a "!srcloc" MDNode + attached to them that contains a constant integer. If present, the code + generator will use the integer as the location cookie value when reporting + errors through the LLVMContext error reporting mechanisms. This allows a + front-end to correlate backend errors that occur with inline asm back to the + source code that produced it. For example:
+ ++call void asm sideeffect "something bad", ""(), !srcloc !42 +... +!42 = !{ i32 1234567 } ++
It is up to the front-end to make sense of the magic numbers it places in the + IR.
+ +LLVM IR allows metadata to be attached to instructions in the program that + can convey extra information about the code to the optimizers and code + generator. One example application of metadata is source-level debug + information. There are two metadata primitives: strings and nodes. All + metadata has the metadata type and is identified in syntax by a + preceding exclamation point ('!').
+ +A metadata string is a string surrounded by double quotes. It can contain + any character by escaping non-printable characters with "\xx" where "xx" is + the two digit hex code. For example: "!"test\00"".
+ +Metadata nodes are represented with notation similar to structure constants + (a comma separated list of elements, surrounded by braces and preceded by an + exclamation point). For example: "!{ metadata !"test\00", i32 + 10}". Metadata nodes can have any values as their operands.
+ +A named metadata is a collection of + metadata nodes, which can be looked up in the module symbol table. For + example: "!foo = metadata !{!4, !3}". + +
Metadata can be used as function arguments. Here the llvm.dbg.value + function takes two metadata arguments. +
+ call void @llvm.dbg.value(metadata !24, i64 0, metadata !25) ++
Metadata can be attached to an instruction. Here metadata !21 is + attached to the add instruction using the !dbg identifier. + +
+ %indvar.next = add i64 %indvar, 1, !dbg !21 ++
There are six different terminator instructions: the +
There are seven different terminator instructions: the 'ret' instruction, the 'br' instruction, the 'switch' instruction, the + 'indirectbr' instruction, the 'invoke' instruction, the 'unwind' instruction, and the 'unreachable' instruction.
@@ -2541,14 +2766,6 @@ Instruction ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2 -Note that the code generator does not yet fully support large - return values. The specific sizes that are currently supported are - dependent on the target. For integers, on 32-bit targets the limit - is often 64 bits, and on 64-bit targets the limit is often 128 bits. - For aggregate types, the current limits are dependent on the element - types; for example targets are often limited to 2 total integer - elements and 2 total floating-point elements.
- @@ -2619,8 +2836,8 @@ IfUnequal:The switch instruction specifies a table of values and destinations. When the 'switch' instruction is executed, this table is searched for the given value. If the value is found, control flow is - transfered to the corresponding destination; otherwise, control flow is - transfered to the default destination.
+ transferred to the corresponding destination; otherwise, control flow is + transferred to the default destination.Depending on properties of the target machine and the particular @@ -2645,6 +2862,55 @@ IfUnequal: + + +
+ ++ indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ] ++ +
The 'indirectbr' instruction implements an indirect branch to a label + within the current function, whose address is specified by + "address". Address must be derived from a blockaddress constant.
+ +The 'address' argument is the address of the label to jump to. The + rest of the arguments indicate the full set of possible destinations that the + address may point to. Blocks are allowed to occur multiple times in the + destination list, though this isn't particularly useful.
+ +This destination list is required so that dataflow analysis has an accurate + understanding of the CFG.
+ +Control transfers to the block specified in the address argument. All + possible destination blocks must be listed in the label list, otherwise this + instruction has undefined behavior. This implies that jumps to labels + defined in other functions have undefined behavior as well.
+ +This is typically implemented with a jump through a register.
+ ++ indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ] ++ +
Note that the code generator does not yet completely support unwind, and +that the invoke/unwind semantics are likely to change in future versions.
+%retval = invoke i32 @Test(i32 15) to label %Continue @@ -2756,6 +3026,9 @@ Instruction
Note that the code generator does not yet completely support unwind, and +that the invoke/unwind semantics are likely to change in future versions.
+ @@ -2979,7 +3252,7 @@ InstructionThe two arguments to the 'mul' instruction must be integer or vector of integer values. Both arguments must have identical types.
- +The value produced is the integer product of the two operands.
@@ -3051,7 +3324,7 @@ InstructionThe 'udiv' instruction returns the quotient of its two operands.
The two arguments to the 'udiv' instruction must be +
The two arguments to the 'udiv' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3086,7 +3359,7 @@ InstructionThe 'sdiv' instruction returns the quotient of its two operands.
The two arguments to the 'sdiv' instruction must be +
The two arguments to the 'sdiv' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3157,7 +3430,7 @@ Instruction division of its two arguments.The two arguments to the 'urem' instruction must be +
The two arguments to the 'urem' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3197,7 +3470,7 @@ Instruction elements must be integers.The two arguments to the 'srem' instruction must be +
The two arguments to the 'srem' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3292,7 +3565,7 @@ InstructionBoth arguments to the 'shl' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.
- +The value produced is op1 * 2^op2 mod 2^n, where n is the width of the result. If op2 @@ -3328,7 +3601,7 @@ Instruction operand shifted to the right a specified number of bits with zero fill.
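The 'shl' result formula and the zero-fill right shift described above can be written out directly. This is only a sketch of the arithmetic, modelling n-bit values as non-negative Python integers. The function names follow the instruction names, but the code is not LLVM's implementation. Rejecting shift amounts outside 0..n-1 is an assumption of this sketch, since the text describing that case is cut off in this hunk.

```python
def shl(op1, op2, n):
    """Model of 'shl': op1 * 2**op2 reduced modulo 2**n (n = result width)."""
    if not 0 <= op2 < n:
        # Assumption of this sketch: out-of-range shift amounts are rejected.
        raise ValueError("shift amount out of range for this sketch")
    return (op1 * 2 ** op2) % 2 ** n

def lshr(op1, op2, n):
    """Model of 'lshr': shift right, filling the vacated high bits with zero."""
    if not 0 <= op2 < n:
        raise ValueError("shift amount out of range for this sketch")
    return (op1 % 2 ** n) >> op2
```

For example, shl(0x80000000, 1, 32) is 0 because the only set bit is shifted out of the 32-bit result.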
Both arguments to the 'lshr' instruction must be the same +
Both arguments to the 'lshr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.
@@ -3368,7 +3641,7 @@ Instruction extension.Both arguments to the 'ashr' instruction must be the same +
Both arguments to the 'ashr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.
@@ -3408,7 +3681,7 @@ Instruction operands.The two arguments to the 'and' instruction must be +
The two arguments to the 'and' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3467,7 +3740,7 @@ Instruction two operands.The two arguments to the 'or' instruction must be +
The two arguments to the 'or' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3530,7 +3803,7 @@ Instruction complement" operation, which is the "~" operator in C.The two arguments to the 'xor' instruction must be +
The two arguments to the 'xor' instruction must be integer or vector of integer values. Both arguments must have identical types.
@@ -3578,7 +3851,7 @@ Instruction -- %result = extractelement <4 x i32> %vec, i32 0 ; yields i32 + <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32@@ -3660,7 +3933,7 @@ Instruction
- %result = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32> + <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>@@ -3701,26 +3974,27 @@ Instruction
- %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, + <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, <4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32> - %result = shufflevector <4 x i32> %v1, <4 x i32> undef, + <result> = shufflevector <4 x i32> %v1, <4 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle. - %result = shufflevector <8 x i32> %v1, <8 x i32> undef, + <result> = shufflevector <8 x i32> %v1, <8 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, + <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32>-
LLVM supports several instructions for working with aggregate values.
+LLVM supports several instructions for working with + aggregate values.
The 'extractvalue' instruction extracts the value of a struct field - or array element from an aggregate value.
+The 'extractvalue' instruction extracts the value of a member field + from an aggregate value.
The first operand of an 'extractvalue' instruction is a value - of struct or array type. The - operands are constant indices to specify which value to extract in a similar - manner as indices in a + of struct, union or + array type. The operands are constant indices to + specify which value to extract in a similar manner as indices in a 'getelementptr' instruction.
- %result = extractvalue {i32, float} %agg, 0 ; yields i32 + <result> = extractvalue {i32, float} %agg, 0 ; yields i32@@ -3767,20 +4041,19 @@ Instruction
- <result> = insertvalue <aggregate type> <val>, <ty> <val>, <idx> ; yields <n x <ty>> + <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx> ; yields <aggregate type>
The 'insertvalue' instruction inserts a value into a struct field or - array element in an aggregate.
- +The 'insertvalue' instruction inserts a value into a member field + in an aggregate value.
The first operand of an 'insertvalue' instruction is a value - of struct or array type. The - second operand is a first-class value to insert. The following operands are - constant indices indicating the position at which to insert the value in a - similar manner as indices in a + of struct, union or + array type. The second operand is a first-class + value to insert. The following operands are constant indices indicating + the position at which to insert the value in a similar manner as indices in a 'getelementptr' instruction. The value to insert must have the same type as the value identified by the indices.
@@ -3792,14 +4065,15 @@ Instruction- %result = insertvalue {i32, float} %agg, i32 1, 0 ; yields {i32, float} + %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef} + %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}-
A key design point of an SSA-based representation is how it represents memory. In LLVM, no memory locations are in SSA form, which makes things - very simple. This section describes how to read, write, allocate, and free + very simple. This section describes how to read, write, and allocate memory in LLVM.
- - - -- <result> = malloc <type>[, i32 <NumElements>][, align <alignment>] ; yields {type*}:result -- -
The 'malloc' instruction allocates memory from the system heap and - returns a pointer to it. The object is always allocated in the generic - address space (address space zero).
- -The 'malloc' instruction allocates - sizeof(<type>)*NumElements bytes of memory from the operating - system and returns a pointer of the appropriate type to the program. If - "NumElements" is specified, it is the number of elements allocated, otherwise - "NumElements" is defaulted to be one. If a constant alignment is specified, - the value result of the allocation is guaranteed to be aligned to at least - that boundary. If not specified, or if zero, the target can choose to align - the allocation on any convenient boundary compatible with the type.
- -'type' must be a sized type.
- -Memory is allocated using the system "malloc" function, and a - pointer is returned. The result of a zero byte allocation is undefined. The - result is null if there is insufficient memory available.
- -- %array = malloc [4 x i8] ; yields {[%4 x i8]*}:array - - %size = add i32 2, 2 ; yields {i32}:size = i32 4 - %array1 = malloc i8, i32 4 ; yields {i8*}:array1 - %array2 = malloc [12 x i8], i32 %size ; yields {[12 x i8]*}:array2 - %array3 = malloc i32, i32 4, align 1024 ; yields {i32*}:array3 - %array4 = malloc i32, align 1024 ; yields {i32*}:array4 -- -
Note that the code generator does not yet respect the alignment value.
- -- free <type> <value> ; yields {void} -- -
The 'free' instruction returns memory back to the unused memory heap - to be reallocated in the future.
- -'value' shall be a pointer value that points to a value that was - allocated with the 'malloc' instruction.
- -Access to the memory pointed to by the pointer is no longer defined after - this instruction executes. If the pointer is null, the operation is a - noop.
- -- %array = malloc [4 x i8] ; yields {[4 x i8]*}:array - free [4 x i8]* %array -- -
- <result> = load <ty>* <pointer>[, align <alignment>] - <result> = volatile load <ty>* <pointer>[, align <alignment>] + <result> = load <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] + <result> = volatile load <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] + !<index> = !{ i32 1 }
The optional constant "align" argument specifies the alignment of the +
The optional constant align argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an - omitted "align" argument means that the operation has the preferential + omitted align argument means that the operation has the preferential alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the - alignment results in an undefined behavior. Underestimating the alignment may + alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.
+The optional !nontemporal metadata must reference a single + metadata name <index> corresponding to a metadata node with + one i32 entry of value 1. The existence of + the !nontemporal metadata on the instruction tells the optimizer + and code generator that this load is not expected to be reused in the cache. + The code generator may select special instructions to save cache bandwidth, + such as the MOVNT instruction on x86.
+The location of memory pointed to is loaded. If the value being loaded is of scalar type then the number of bytes read does not exceed the minimum number @@ -4003,8 +4204,8 @@ Instruction
- store <ty> <value>, <ty>* <pointer>[, align <alignment>] ; yields {void} - volatile store <ty> <value>, <ty>* <pointer>[, align <alignment>] ; yields {void} + store <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !] ; yields {void} + volatile store <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal ! ] ; yields {void}
The optional !nontemporal metadata must reference a single metadata
+ name
The contents of memory are updated to contain '<value>' at the location specified by the '<pointer>' operand. If @@ -4063,8 +4273,8 @@ Instruction
The 'getelementptr' instruction is used to get the address of a - subelement of an aggregate data structure. It performs address calculation - only and does not access memory.
+ subelement of an aggregate data structure. + It performs address calculation only and does not access memory.The first argument is always a pointer, and forms the basis of the @@ -4074,15 +4284,15 @@ Instruction indexes the pointer value given as the first argument, the second index indexes a value of the type pointed to (not necessarily the value directly pointed to, since the first index can be non-zero), etc. The first type - indexed into must be a pointer value, subsequent types can be arrays, vectors - and structs. Note that subsequent types being indexed into can never be - pointers, since that would require loading the pointer before continuing - calculation.
+ indexed into must be a pointer value, subsequent types can be arrays, + vectors, structs and unions. Note that subsequent types being indexed into + can never be pointers, since that would require loading the pointer before + continuing calculation.The type of each index argument depends on the type it is indexing into. - When indexing into a (optionally packed) structure, only i32 integer - constants are allowed. When indexing into an array, pointer or - vector, integers of any width are allowed, and they are not required to be + When indexing into a (optionally packed) structure or union, only i32 + integer constants are allowed. When indexing into an array, pointer + or vector, integers of any width are allowed, and they are not required to be constant.
For example, let's consider a C code fragment and how it gets compiled to @@ -4227,7 +4437,7 @@ entry:
%X = trunc i32 257 to i8 ; yields i8:1 %Y = trunc i32 123 to i1 ; yields i1:true - %Y = trunc i32 122 to i1 ; yields i1:false + %Z = trunc i32 122 to i1 ; yields i1:false@@ -4244,15 +4454,15 @@ entry:
The 'zext' instruction zero extends its operand to type +
The 'zext' instruction zero extends its operand to type ty2.
The 'zext' instruction takes a value to cast, which must be of +
The 'zext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, which must also be of integer type. The bit size of the - value must be smaller than the bit size of the destination type, + value must be smaller than the bit size of the destination type, ty2.
The 'sext' sign extends value to the type ty2.
The 'sext' instruction takes a value to cast, which must be of +
The 'sext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, which must also be of integer type. The bit size of the - value must be smaller than the bit size of the destination type, + value must be smaller than the bit size of the destination type, ty2.
The 'fptrunc' instruction takes a floating point value to cast and a floating point type to cast it to. The size of value must be larger than the size of - ty2. This implies that fptrunc cannot be used to make a + ty2. This implies that fptrunc cannot be used to make a no-op cast.
The 'fptrunc' instruction truncates a value from a larger - floating point type to a smaller + floating point type to a smaller floating point type. If the value cannot fit within the destination type, ty2, then the results are undefined.
@@ -4359,7 +4569,7 @@ entry: floating point value.The 'fpext' instruction takes a +
The 'fpext' instruction takes a floating point value to cast, and a floating point type to cast it to. The source type must be smaller than the destination type.
@@ -4402,7 +4612,7 @@ entry: vector integer type with the same number of elements as tyThe 'fptoui' instruction converts its +
The 'fptoui' instruction converts its floating point operand into the nearest (rounding towards zero) unsigned integer value. If the value cannot fit in ty2, the results are undefined.
@@ -4411,7 +4621,7 @@ entry:%X = fptoui double 123.0 to i32 ; yields i32:123 %Y = fptoui float 1.0E+300 to i1 ; yields undefined:1 - %X = fptoui float 1.04E+17 to i8 ; yields undefined:1 + %Z = fptoui float 1.04E+17 to i8 ; yields undefined:1@@ -4428,7 +4638,7 @@ entry:
The 'fptosi' instruction converts +
The 'fptosi' instruction converts floating point value to type ty2.
@@ -4440,7 +4650,7 @@ entry: vector integer type with the same number of elements as tyThe 'fptosi' instruction converts its +
The 'fptosi' instruction converts its floating point operand into the nearest (rounding towards zero) signed integer value. If the value cannot fit in ty2, the results are undefined.
@@ -4449,7 +4659,7 @@ entry:%X = fptosi double -123.0 to i32 ; yields i32:-123 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1 - %X = fptosi float 1.04E+17 to i8 ; yields undefined:1 + %Z = fptosi float 1.04E+17 to i8 ; yields undefined:1@@ -4593,8 +4803,8 @@ entry:
%X = inttoptr i32 255 to i32* ; yields zero extension on 64-bit architecture - %X = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture - %Y = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture + %Y = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture + %Z = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture@@ -4637,7 +4847,7 @@ entry:
%X = bitcast i8 255 to i8 ; yields i8 :-1 %Y = bitcast i32* %x to sint* ; yields sint*:%x - %Z = bitcast <2 x int> %V to i64; ; yields i64: %V + %Z = bitcast <2 x int> %V to i64; ; yields i64: %V@@ -4697,11 +4907,11 @@ entry: result, as follows:
This instruction requires several arguments:
llvm::GuaranteedTailCallOpt
is true
.To learn how to add an intrinsic function, please see the +
To learn how to add an intrinsic function, please see the Extending LLVM Guide.
@@ -5607,7 +5836,7 @@ LLVM.This intrinsic does not modify the behavior of the program. Backends that do - not support this intrinisic may ignore it.
+ not support this intrinsic may ignore it. @@ -5661,17 +5890,14 @@ LLVM.This is an overloaded intrinsic. You can use llvm.memcpy on any - integer bit width. Not all targets support all bit widths however.
+ integer bit width and for different address spaces. Not all targets support + all bit widths however.- declare void @llvm.memcpy.i8(i8 * <dest>, i8 * <src>, - i8 <len>, i32 <align>) - declare void @llvm.memcpy.i16(i8 * <dest>, i8 * <src>, - i16 <len>, i32 <align>) - declare void @llvm.memcpy.i32(i8 * <dest>, i8 * <src>, - i32 <len>, i32 <align>) - declare void @llvm.memcpy.i64(i8 * <dest>, i8 * <src>, - i64 <len>, i32 <align>) + declare void @llvm.memcpy.p0i8.p0i8.i32(i8 * <dest>, i8 * <src>, + i32 <len>, i32 <align>, i1 <isvolatile>) + declare void @llvm.memcpy.p0i8.p0i8.i64(i8 * <dest>, i8 * <src>, + i64 <len>, i32 <align>, i1 <isvolatile>)
Note that, unlike the standard libc function, the llvm.memcpy.* - intrinsics do not return a value, and takes an extra alignment argument.
+ intrinsics do not return a value, take extra alignment/isvolatile arguments + and the pointers can be in specified address spaces. The first argument is a pointer to the destination, the second is a pointer to the source. The third argument is an integer argument specifying the - number of bytes to copy, and the fourth argument is the alignment of the - source and destination locations.
+ number of bytes to copy, the fourth argument is the alignment of the + source and destination locations, and the fifth is a boolean indicating a + volatile access. -If the call to this intrinisic has an alignment value that is not 0 or 1, +
If the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that both the source and destination pointers are aligned to that boundary.
+Volatile accesses should not be deleted if dead, but the access behavior is + not very cleanly specified and it is unwise to depend on it.
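A call using the five-argument form described above could look like the sketch below. The function @copy16 and its parameters are hypothetical; the length, alignment and volatile flag are example values only.

```llvm
declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

; Hypothetical helper: copies 16 bytes from %src to %dst.
; The caller promises that both pointers are 4-byte aligned and
; that the two 16-byte ranges do not overlap.
define void @copy16(i8* %dst, i8* %src) {
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src, i32 16, i32 4, i1 false)
  ret void
}
```

Passing i1 false as the last argument requests an ordinary, non-volatile copy.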
+The 'llvm.memcpy.*' intrinsics copy a block of memory from the source location to the destination location, which are not allowed to overlap. It copies "len" bytes of memory over. If the argument is known to @@ -5709,17 +5942,14 @@ LLVM.
This is an overloaded intrinsic. You can use llvm.memmove on any integer bit - width. Not all targets support all bit widths however.
+ width and for different address spaces. Not all targets support all bit + widths however.- declare void @llvm.memmove.i8(i8 * <dest>, i8 * <src>, - i8 <len>, i32 <align>) - declare void @llvm.memmove.i16(i8 * <dest>, i8 * <src>, - i16 <len>, i32 <align>) - declare void @llvm.memmove.i32(i8 * <dest>, i8 * <src>, - i32 <len>, i32 <align>) - declare void @llvm.memmove.i64(i8 * <dest>, i8 * <src>, - i64 <len>, i32 <align>) + declare void @llvm.memmove.p0i8.p0i8.i32(i8 * <dest>, i8 * <src>, + i32 <len>, i32 <align>, i1 <isvolatile>) + declare void @llvm.memmove.p0i8.p0i8.i64(i8 * <dest>, i8 * <src>, + i64 <len>, i32 <align>, i1 <isvolatile>)
Note that, unlike the standard libc function, the llvm.memmove.* - intrinsics do not return a value, and takes an extra alignment argument.
+ intrinsics do not return a value, take extra alignment/isvolatile arguments + and the pointers can be in specified address spaces. The first argument is a pointer to the destination, the second is a pointer to the source. The third argument is an integer argument specifying the - number of bytes to copy, and the fourth argument is the alignment of the - source and destination locations.
+ number of bytes to copy, the fourth argument is the alignment of the + source and destination locations, and the fifth is a boolean indicating a + volatile access. -If the call to this intrinisic has an alignment value that is not 0 or 1, +
If the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that the source and destination pointers are aligned to that boundary.
+Volatile accesses should not be deleted if dead, but the access behavior is + not very cleanly specified and it is unwise to depend on it.
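Because llvm.memmove permits overlapping ranges, it can shift data within a single buffer. The sketch below assumes a hypothetical function @shift_down whose argument points to at least 16 bytes.

```llvm
declare void @llvm.memmove.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

; Hypothetical helper: moves bytes 4..15 of %buf down to bytes 0..11.
; Source and destination overlap, which memmove allows.
define void @shift_down(i8* %buf) {
  %src = getelementptr i8* %buf, i32 4
  ; An alignment of 1 makes no promise about either pointer.
  call void @llvm.memmove.p0i8.p0i8.i32(i8* %buf, i8* %src, i32 12, i32 1, i1 false)
  ret void
}
```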
+The 'llvm.memmove.*' intrinsics copy a block of memory from the source location to the destination location, which may overlap. It copies "len" bytes of memory over. If the argument is known to be aligned to some @@ -5759,17 +5996,14 @@ LLVM.
This is an overloaded intrinsic. You can use llvm.memset on any integer bit - width. Not all targets support all bit widths however.
+ width and for different address spaces. Not all targets support all bit + widths however.- declare void @llvm.memset.i8(i8 * <dest>, i8 <val>, - i8 <len>, i32 <align>) - declare void @llvm.memset.i16(i8 * <dest>, i8 <val>, - i16 <len>, i32 <align>) - declare void @llvm.memset.i32(i8 * <dest>, i8 <val>, - i32 <len>, i32 <align>) - declare void @llvm.memset.i64(i8 * <dest>, i8 <val>, - i64 <len>, i32 <align>) + declare void @llvm.memset.p0i8.i32(i8 * <dest>, i8 <val>, + i32 <len>, i32 <align>, i1 <isvolatile>) + declare void @llvm.memset.p0i8.i64(i8 * <dest>, i8 <val>, + i64 <len>, i32 <align>, i1 <isvolatile>)
Note that, unlike the standard libc function, the llvm.memset - intrinsic does not return a value, and takes an extra alignment argument.
+ intrinsic does not return a value, takes extra alignment/volatile arguments, + and the destination can be in an arbitrary address space.The first argument is a pointer to the destination to fill, the second is the @@ -5785,10 +6020,13 @@ LLVM.
specifying the number of bytes to fill, and the fourth argument is the known alignment of destination location. -If the call to this intrinisic has an alignment value that is not 0 or 1, +
If the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that the destination pointer is aligned to that boundary.
+Volatile accesses should not be deleted if dead, but the access behavior is + not very cleanly specified and it is unwise to depend on it.
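For comparison with the two copy intrinsics, a volatile fill might be written as in this sketch. The function @clear32 is hypothetical, and the byte count and alignment are example values.

```llvm
declare void @llvm.memset.p0i8.i32(i8*, i8, i32, i32, i1)

; Hypothetical helper: writes 32 zero bytes starting at %dst.
; The caller promises that %dst is 8-byte aligned. The final
; i1 true marks the fill as volatile.
define void @clear32(i8* %dst) {
  call void @llvm.memset.p0i8.i32(i8* %dst, i8 0, i32 32, i32 8, i1 true)
  ret void
}
```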
+The 'llvm.memset.*' intrinsics fill "len" bytes of memory starting at the destination location. If the argument is known to be aligned to some @@ -6408,6 +6646,97 @@ LLVM.
+ + + +Half precision floating point is a storage-only format. This means that it is + a dense encoding (in memory) but does not support computation in the + format.
+ +This means that code must first load the half-precision floating point + value as an i16, then convert it to float with llvm.convert.from.fp16. + Computation can then be performed on the float value (including extending to + double etc). To store the value back to memory, it is first converted to + float if needed, then converted to i16 with + llvm.convert.to.fp16, and then + stored as an i16 value.
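The load, convert, compute, convert, store sequence described above can be sketched as a single function. The global @x and the function @double_half are hypothetical, and the intrinsics are declared with the float type that the prose names.

```llvm
@x = global i16 0          ; storage for one half-precision value

declare float @llvm.convert.from.fp16(i16)
declare i16 @llvm.convert.to.fp16(float)

; Hypothetical example: doubles the half-precision value stored in @x.
define void @double_half() {
  %h = load i16* @x, align 2                        ; raw half bits
  %f = call float @llvm.convert.from.fp16(i16 %h)   ; widen to float
  %sum = fadd float %f, %f                          ; compute in float
  %r = call i16 @llvm.convert.to.fp16(float %sum)   ; narrow to half
  store i16 %r, i16* @x, align 2
  ret void
}
```

No arithmetic is ever performed on the i16 value itself; it only carries the encoded bits between memory and the conversion intrinsics.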
++ declare i16 @llvm.convert.to.fp16(float %a) ++ +
The 'llvm.convert.to.fp16' intrinsic function performs + a conversion from single precision floating point format to half precision + floating point format.
+ +The intrinsic function takes a single argument: the value to be + converted.
+ +The 'llvm.convert.to.fp16' intrinsic function performs + a conversion from single precision floating point format to half precision + floating point format. The return value is an i16 which + contains the converted number.
+ ++ %res = call i16 @llvm.convert.to.fp16(float %a) + store i16 %res, i16* @x, align 2 ++ +
+ declare float @llvm.convert.from.fp16(i16 %a) ++ +
The 'llvm.convert.from.fp16' intrinsic function performs + a conversion from half precision floating point format to single precision + floating point format.
+ +The intrinsic function takes a single argument: the value to be + converted.
+ +The 'llvm.convert.from.fp16' intrinsic function performs a + conversion from half precision floating point format to single + precision floating point format. The input half-float value is represented by + an i16 value.
+ ++ %a = load i16* @x, align 2 + %res = call float @llvm.convert.from.fp16(i16 %a) ++ +
The llvm.memory.barrier intrinsic requires five boolean arguments. The first four arguments enables a specific barrier as listed below. The - fith argument specifies that the barrier applies to io or device or uncached + fifth argument specifies that the barrier applies to io or device or uncached memory.
-%ptr = malloc i32 +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* store i32 4, %ptr %result1 = load i32* %ptr ; yields {i32}:result1 = 4 @@ -6649,7 +6979,8 @@ LLVM.Examples:
-%ptr = malloc i32 +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* store i32 4, %ptr %val1 = add i32 4, 4 @@ -6704,7 +7035,8 @@ LLVM.Examples:
-%ptr = malloc i32 +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* store i32 4, %ptr %val1 = add i32 4, 4 @@ -6759,8 +7091,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 4, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 4, %ptr %result1 = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 4 ) ; yields {i32}:result1 = 4 %result2 = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 2 ) @@ -6793,7 +7126,7 @@ LLVM.Overview:
-This intrinsic subtracts delta to the value stored in memory at +
This intrinsic subtracts delta from the value stored in memory at ptr. It yields the original value at ptr.
Arguments:
@@ -6810,8 +7143,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 8, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 8, %ptr %result1 = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 4 ) ; yields {i32}:result1 = 8 %result2 = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 2 ) @@ -6887,8 +7221,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 0x0F0F, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 0x0F0F, %ptr %result0 = call i32 @llvm.atomic.load.nand.i32.p0i32( i32* %ptr, i32 0xFF ) ; yields {i32}:result0 = 0x0F0F %result1 = call i32 @llvm.atomic.load.and.i32.p0i32( i32* %ptr, i32 0xFF ) @@ -6947,7 +7282,7 @@ LLVM.Overview:
-These intrinsics takes the signed or unsigned minimum or maximum of +
These intrinsics take the signed or unsigned minimum or maximum of delta and the value stored in memory at ptr. They yield the original value at ptr.
@@ -6965,8 +7300,9 @@ LLVM.Examples:
-%ptr = malloc i32 - store i32 7, %ptr +%mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32)) +%ptr = bitcast i8* %mallocP to i32* + store i32 7, %ptr %result0 = call i32 @llvm.atomic.load.min.i32.p0i32( i32* %ptr, i32 -2 ) ; yields {i32}:result0 = 7 %result1 = call i32 @llvm.atomic.load.max.i32.p0i32( i32* %ptr, i32 8 ) @@ -6980,6 +7316,133 @@ LLVM.
This class of intrinsics exists to provide information about the lifetime of memory + objects and ranges where variables are immutable.
+ ++ declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>) ++ +
The 'llvm.lifetime.start' intrinsic specifies the start of a memory + object's lifetime.
+ +The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.
+ +This intrinsic indicates that before this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. A load from the pointer that + precedes this intrinsic can be replaced with + 'undef'.
+ ++ declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>) ++ +
The 'llvm.lifetime.end' intrinsic specifies the end of a memory + object's lifetime.
+ +The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.
+ +This intrinsic indicates that after this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. Any stores into the memory object + following this intrinsic may be removed as dead. + +
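The two lifetime markers are normally used as a pair around the live range of a stack object. The following is a minimal sketch; the function @scratch and its stack slot are hypothetical.

```llvm
declare void @llvm.lifetime.start(i64, i8* nocapture)
declare void @llvm.lifetime.end(i64, i8* nocapture)

; Hypothetical example: a 4-byte stack slot that is live only
; between the two markers.
define i32 @scratch() {
  %slot = alloca i32
  %p = bitcast i32* %slot to i8*
  call void @llvm.lifetime.start(i64 4, i8* %p)
  store i32 42, i32* %slot
  %v = load i32* %slot
  call void @llvm.lifetime.end(i64 4, i8* %p)
  ; The slot is dead from here on, so this store may be removed.
  store i32 0, i32* %slot
  ret i32 %v
}
```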
+ declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>) readonly ++ +
The 'llvm.invariant.start' intrinsic specifies that the contents of + a memory object will not change.
+ +The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.
+ +This intrinsic indicates that until an llvm.invariant.end that uses + the return value, the referenced memory location is constant and + unchanging.
+ ++ declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>) ++ +
The 'llvm.invariant.end' intrinsic specifies that the contents of + a memory object are mutable.
+ +The first argument is the matching llvm.invariant.start intrinsic. + The second argument is a constant integer representing the size of the + object, or -1 if it is variable sized and the third argument is a pointer + to the object.
+ +This intrinsic indicates that the memory is mutable again.
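A matching start/end pair might be used as in the sketch below, where the value returned by llvm.invariant.start is passed to llvm.invariant.end. The function @sum_twice and its parameter are hypothetical.

```llvm
declare {}* @llvm.invariant.start(i64, i8* nocapture) readonly
declare void @llvm.invariant.end({}*, i64, i8* nocapture)

; Hypothetical example: the 4 bytes at %obj are declared unchanging
; between the two calls, so the second load may reuse the first.
define i32 @sum_twice(i32* %obj) {
  %p = bitcast i32* %obj to i8*
  %inv = call {}* @llvm.invariant.start(i64 4, i8* %p)
  %a = load i32* %obj
  %b = load i32* %obj
  call void @llvm.invariant.end({}* %inv, i64 4, i8* %p)
  %sum = add i32 %a, %b
  ret i32 %sum
}
```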
+ ++ declare i32 @llvm.objectsize.i32( i8* <object>, i1 <type> ) + declare i64 @llvm.objectsize.i64( i8* <object>, i1 <type> ) ++ +
The llvm.objectsize intrinsic is designed to provide information + to the optimizers to discover at compile time either a) when an + operation like memcpy will overflow a buffer that corresponds to + an object, or b) that a runtime check for overflow isn't + necessary. An object in this context means an allocation of a + specific class, structure, array, or other object.
+ +The llvm.objectsize intrinsic takes two arguments. The first + argument is a pointer to or into the object. The second argument + is a boolean 0 or 1. This argument determines whether you want the + maximum (0) or minimum (1) bytes remaining. This needs to be a literal 0 or + 1; variables are not allowed.
+ +The llvm.objectsize intrinsic is lowered to either a constant + representing the size of the object concerned or i32/i64 -1 or 0 + (depending on the type argument) if the size cannot be determined + at compile time.
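A call that asks how many bytes remain from a pointer into a fixed-size stack array could be sketched as follows. The function @remaining is hypothetical. Whether the call folds to a constant depends on what the optimizer can see, so no particular result is claimed here.

```llvm
declare i32 @llvm.objectsize.i32(i8*, i1)

; Hypothetical example: %p points 4 bytes into a 16-byte array.
; The literal i1 false (0) requests the maximum bytes remaining.
define i32 @remaining() {
  %buf = alloca [16 x i8]
  %p = getelementptr [16 x i8]* %buf, i32 0, i32 4
  %n = call i32 @llvm.objectsize.i32(i8* %p, i1 false)
  ret i32 %n
}
```

If the size cannot be determined, the call is lowered to the -1 or 0 fallback described above instead of an object size.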
+ +