From: Chris Lattner
Date: Sat, 8 Apr 2006 23:07:04 +0000 (+0000)
Subject: Move the vector instructions to their own subsection.
X-Git-Url: http://plrg.eecs.uci.edu/git/?a=commitdiff_plain;h=3df241e4b2b93329fdeea16a4eb366347442f99f;p=oota-llvm.git

Move the vector instructions to their own subsection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27545 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/docs/LangRef.html b/docs/LangRef.html
index af50b9c6b45..bf638843571 100644
--- a/docs/LangRef.html
+++ b/docs/LangRef.html
@@ -91,6 +91,16 @@
  • 'shr' Instruction
+ • Vector Operations
+   1. 'extractelement' Instruction
+   2. 'insertelement' Instruction
+   3. 'shufflevector' Instruction
+   4. 'vsetint' Instruction
+   5. 'vsetfp' Instruction
+   6. 'vselect' Instruction
  • Memory Access Operations
    1. 'malloc' Instruction
@@ -106,12 +116,6 @@
    1. 'phi' Instruction
    2. 'cast .. to' Instruction
    3. 'select' Instruction
-   4. 'vsetint' Instruction
-   5. 'vsetfp' Instruction
-   6. 'vselect' Instruction
-   7. 'extractelement' Instruction
-   8. 'insertelement' Instruction
-   9. 'shufflevector' Instruction
    10. 'call' Instruction
    11. 'va_arg' Instruction
@@ -1896,21 +1900,23 @@ positions.

    -

    A key design point of an SSA-based representation is how it -represents memory. In LLVM, no memory locations are in SSA form, which -makes things very simple. This section describes how to read, write, -allocate, and free memory in LLVM.

    +

    LLVM supports several instructions to represent vector operations in a +target-independent manner. These instructions cover the element-access and +vector-specific operations needed to process vectors effectively. While LLVM +does directly support these vector operations, many sophisticated algorithms +will want to use target-specific intrinsics to take full advantage of a specific +target.

    @@ -1918,48 +1924,45 @@ allocate, and free memory in LLVM.

    Syntax:
    -  <result> = malloc <type>[, uint <NumElements>][, align <alignment>]     ; yields {type*}:result
    +  <result> = extractelement <n x <ty>> <val>, uint <idx>    ; yields <ty>
     
    Overview:
    -

    The 'malloc' instruction allocates memory from the system -heap and returns a pointer to it.

    +

    +The 'extractelement' instruction extracts a single scalar +element from a packed vector at a specified index. +

    -
    Arguments:
    -

    The 'malloc' instruction allocates -sizeof(<type>)*NumElements -bytes of memory from the operating system and returns a pointer of the -appropriate type to the program. If "NumElements" is specified, it is the -number of elements allocated. If an alignment is specified, the value result -of the allocation is guaranteed to be aligned to at least that boundary. If -not specified, or if zero, the target can choose to align the allocation on any -convenient boundary.

    +
    Arguments:
    -

    'type' must be a sized type.

    +

    +The first operand of an 'extractelement' instruction is a +value of packed type. The second operand is +an index indicating the position from which to extract the element. +The index may be a variable.

    Semantics:
    -

    Memory is allocated using the system "malloc" function, and -a pointer is returned.

    +

    +The result is a scalar of the same type as the element type of +val. Its value is the value at position idx of +val. If idx exceeds the length of val, the +results are undefined. +

    Example:
    -  %array  = malloc [4 x ubyte ]                    ; yields {[%4 x ubyte]*}:array
    -
    -  %size   = add uint 2, 2                          ; yields {uint}:size = uint 4
    -  %array1 = malloc ubyte, uint 4                   ; yields {ubyte*}:array1
    -  %array2 = malloc [12 x ubyte], uint %size        ; yields {[12 x ubyte]*}:array2
    -  %array3 = malloc int, uint 4, align 1024         ; yields {int*}:array3
    -  %array4 = malloc int, align 1024                 ; yields {int*}:array4
    +  %result = extractelement <4 x int> %vec, uint 0    ; yields int
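
The 'extractelement' semantics added above can be modeled outside LLVM. The following is a minimal Python sketch, not LLVM code: a packed vector is represented as a list, the helper name `extractelement` is hypothetical, and raising an exception for an out-of-range index is an assumption of the sketch (the text only says the result is undefined).

```python
def extractelement(val, idx):
    """Model of 'extractelement': return the scalar at position idx of val."""
    # The reference text leaves an out-of-range index undefined; this sketch
    # chooses to raise rather than return an arbitrary value.
    if not 0 <= idx < len(val):
        raise IndexError("idx out of range: the result is undefined in LLVM")
    return val[idx]


vec = [10, 20, 30, 40]           # stands in for a <4 x int> value
result = extractelement(vec, 0)  # mirrors: extractelement <4 x int> %vec, uint 0
```

The result is a plain scalar of the element type, matching the "yields <ty>" annotation in the syntax line.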
     
    +
    @@ -1967,36 +1970,45 @@ a pointer is returned.

    Syntax:
    -  free <type> <value>                              ; yields {void}
    +  <result> = insertelement <n x <ty>> <val>, <ty> <elt>, uint <idx>    ; yields <n x <ty>>
     
    Overview:
    -

    The 'free' instruction returns memory back to the unused -memory heap to be reallocated in the future.

    +

    +The 'insertelement' instruction inserts a scalar +element into a packed vector at a specified index. +

    +
    Arguments:
    -

    'value' shall be a pointer value that points to a value -that was allocated with the 'malloc' -instruction.

    +

    +The first operand of an 'insertelement' instruction is a +value of packed type. The second operand is a +scalar value whose type must equal the element type of the first +operand. The third operand is an index indicating the position at +which to insert the value. The index may be a variable.

    Semantics:
    -

    Access to the memory pointed to by the pointer is no longer defined -after this instruction executes.

    +

    +The result is a packed vector of the same type as val. Its +element values are those of val except at position +idx, where it gets the value elt. If idx +exceeds the length of val, the results are undefined. +

    Example:
    -  %array  = malloc [4 x ubyte]                    ; yields {[4 x ubyte]*}:array
    -            free   [4 x ubyte]* %array
    +  %result = insertelement <4 x int> %vec, int 1, uint 0    ; yields <4 x int>
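
As a rough model of the 'insertelement' semantics above, here is a small Python sketch. It is not LLVM code: vectors are lists, the helper name `insertelement` is hypothetical, and raising on an out-of-range index is an assumption standing in for "the results are undefined".

```python
def insertelement(val, elt, idx):
    """Model of 'insertelement': copy val with position idx replaced by elt."""
    # Out-of-range indices are undefined in the text; the sketch raises.
    if not 0 <= idx < len(val):
        raise IndexError("idx out of range: the result is undefined in LLVM")
    # SSA values are never modified in place, so build a new vector.
    result = list(val)
    result[idx] = elt
    return result


vec = [5, 6, 7, 8]                  # stands in for a <4 x int> value
result = insertelement(vec, 1, 0)   # mirrors: insertelement <4 x int> %vec, int 1, uint 0
```

Note that `vec` is left unchanged; the instruction yields a new vector value rather than updating its operand.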
     
    @@ -2004,243 +2016,255 @@ after this instruction executes.

    Syntax:
    -  <result> = alloca <type>[, uint <NumElements>][, align <alignment>]     ; yields {type*}:result
    +  <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x uint> <mask>    ; yields <n x <ty>>
     
    Overview:
    -

    The 'alloca' instruction allocates memory on the current -stack frame of the procedure that is live until the current function -returns to its caller.

    +

    +The 'shufflevector' instruction constructs a permutation of elements +from two input vectors, returning a vector of the same type. +

    Arguments:
    -

    The 'alloca' instruction allocates sizeof(<type>)*NumElements -bytes of memory on the runtime stack, returning a pointer of the -appropriate type to the program. If "NumElements" is specified, it is the -number of elements allocated. If an alignment is specified, the value result -of the allocation is guaranteed to be aligned to at least that boundary. If -not specified, or if zero, the target can choose to align the allocation on any -convenient boundary.

    +

    +The first two operands of a 'shufflevector' instruction are vectors +with types that match each other and types that match the result of the +instruction. The third argument is a shuffle mask, which has the same number +of elements as the other vector type, but whose element type is always 'uint'. +

    -

    'type' may be any sized type.

    +

    +The shuffle mask operand is required to be a constant vector with either +constant integer or undef values. +

    Semantics:
    -

    Memory is allocated; a pointer is returned. 'alloca'd -memory is automatically released when the function returns. The 'alloca' -instruction is commonly used to represent automatic variables that must -have an address available. When the function returns (either with the ret or unwind -instructions), the memory is reclaimed.

    +

    +The elements of the two input vectors are numbered from left to right across +both of the vectors. The shuffle mask operand specifies, for each element of +the result vector, which element of the two input registers the result element +gets. The element selector may be undef (meaning "don't care") and the second +operand may be undef if performing a shuffle from only one vector. +

    Example:
    -  %ptr = alloca int                              ; yields {int*}:ptr
    -  %ptr = alloca int, uint 4                      ; yields {int*}:ptr
    -  %ptr = alloca int, uint 4, align 1024          ; yields {int*}:ptr
    -  %ptr = alloca int, align 1024                  ; yields {int*}:ptr
    +  %result = shufflevector <4 x int> %v1, <4 x int> %v2, 
    +                          <4 x uint> <uint 0, uint 4, uint 1, uint 5>    ; yields <4 x int>
    +  %result = shufflevector <4 x int> %v1, <4 x int> undef, 
    +                          <4 x uint> <uint 0, uint 1, uint 2, uint 3>  ; yields <4 x int> - Identity shuffle.
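
The element numbering described above (left to right across both inputs) can be sketched in Python. This is an illustrative model only: the names `shufflevector` and `UNDEF` are hypothetical, undef is represented by `None`, and selecting an element from an undef second operand is assumed to produce undef.

```python
UNDEF = None  # hypothetical stand-in for an LLVM 'undef' value


def shufflevector(v1, v2, mask):
    """Model of 'shufflevector': pick each result element via the mask."""
    n = len(v1)
    if len(mask) != n or (v2 is not UNDEF and len(v2) != n):
        raise ValueError("v1, v2 and mask must have the same number of elements")
    result = []
    for sel in mask:
        if sel is UNDEF:
            result.append(UNDEF)        # "don't care" selector
        elif not 0 <= sel < 2 * n:
            raise ValueError("mask element out of range")
        elif sel < n:
            result.append(v1[sel])      # elements 0..n-1 come from v1
        else:
            # elements n..2n-1 come from v2
            result.append(UNDEF if v2 is UNDEF else v2[sel - n])
    return result


interleaved = shufflevector([1, 2, 3, 4], [5, 6, 7, 8], [0, 4, 1, 5])
identity = shufflevector([1, 2, 3, 4], UNDEF, [0, 1, 2, 3])
```

The two calls mirror the two examples above: the first interleaves the low halves of both inputs, the second is the identity shuffle with an undef second operand.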
     
    + -
    'load' +
    Syntax:
    -
      <result> = load <ty>* <pointer>
    <result> = volatile load <ty>* <pointer>
    -
    Overview:
    -

    The 'load' instruction is used to read from memory.

    -
    Arguments:
    -

    The argument to the 'load' instruction specifies the memory -address from which to load. The pointer must point to a first class type. If the load is -marked as volatile, then the optimizer is not allowed to modify -the number or order of execution of this load with other -volatile load and store -instructions.

    -
    Semantics:
    -

    The location of memory pointed to is loaded.

    -
    Examples:
    -
      %ptr = alloca int                               ; yields {int*}:ptr
    -  store int 3, int* %ptr                          ; yields {void}
    -  %val = load int* %ptr                           ; yields {int}:val = int 3
    -
    -
    - - -
    Syntax:
    -
      store <ty> <value>, <ty>* <pointer>                   ; yields {void}
    -  volatile store <ty> <value>, <ty>* <pointer>                   ; yields {void}
    +
    <result> = vsetint <op>, <n x <ty>> <var1>, <var2>   ; yields <n x bool>
     
    +
    Overview:
    -

    The 'store' instruction is used to write to memory.

    + +

    The 'vsetint' instruction takes two integer vectors and +returns a vector of boolean values representing, at each position, the +result of the comparison between the values at that position in the +two operands.

    +
    Arguments:
    -

    There are two arguments to the 'store' instruction: a value -to store and an address in which to store it. The type of the '<pointer>' -operand must be a pointer to the type of the '<value>' -operand. If the store is marked as volatile, then the -optimizer is not allowed to modify the number or order of execution of -this store with other volatile load and store instructions.

    + +

    The arguments to a 'vsetint' instruction are a comparison +operation and two value arguments. The value arguments must be of integral packed type, +and they must have identical types. The operation argument must be +one of eq, ne, slt, sgt, +sle, sge, ult, ugt, ule, +uge, true, and false. The result is a +packed bool value with the same length as each operand.

    +
    Semantics:
    -

    The contents of memory are updated to contain '<value>' -at the location specified by the '<pointer>' operand.

    + +

    The following table shows the semantics of 'vsetint'. For +each position of the result, the comparison is done on the +corresponding positions of the two value arguments. Note that the +signedness of the comparison depends on the comparison opcode and +not on the signedness of the value operands. E.g., vsetint +slt <4 x unsigned> %x, %y does an elementwise signed +comparison of %x and %y.

    + + + + + + + + + + + + + + + + + +
    Operation  Result is true iff  Comparison is
    eq         var1 == var2        --
    ne         var1 != var2        --
    slt        var1 < var2         signed
    sgt        var1 > var2         signed
    sle        var1 <= var2        signed
    sge        var1 >= var2        signed
    ult        var1 < var2         unsigned
    ugt        var1 > var2         unsigned
    ule        var1 <= var2        unsigned
    uge        var1 >= var2        unsigned
    true       always              --
    false      never               --
    +
    Example:
    -
      %ptr = alloca int                               ; yields {int*}:ptr
    -  store int 3, int* %ptr                          ; yields {void}
    -  %val = load int* %ptr                           ; yields {int}:val = int 3
    +
      <result> = vsetint eq <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = false, false
    +  <result> = vsetint ne <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = true, true
    +  <result> = vsetint slt <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = true, false
    +  <result> = vsetint sgt <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = false, true
    +  <result> = vsetint sle <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = true, false
    +  <result> = vsetint sge <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = false, true
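
The point that signedness comes from the opcode rather than from the operand type can be illustrated with a short Python sketch. It is not LLVM code: the helper names are hypothetical, and a 32-bit element width is an assumption made only so that the same bit pattern can be viewed as signed or unsigned.

```python
import operator

BITS = 32                 # assumed element width for this sketch
MASK = (1 << BITS) - 1


def as_unsigned(x):
    """Interpret the low BITS bits of x as an unsigned integer."""
    return x & MASK


def as_signed(x):
    """Interpret the low BITS bits of x as a two's-complement integer."""
    x &= MASK
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


ORDERING = {"lt": operator.lt, "gt": operator.gt,
            "le": operator.le, "ge": operator.ge}


def vsetint(op, var1, var2):
    """Model of 'vsetint': elementwise integer comparison."""
    if len(var1) != len(var2):
        raise ValueError("operands must have identical types")
    if op in ("true", "false"):
        return [op == "true"] * len(var1)
    if op in ("eq", "ne"):
        view, cmp = as_unsigned, (operator.eq if op == "eq" else operator.ne)
    elif op[:1] in ("s", "u") and op[1:] in ORDERING:
        # The opcode prefix, not the operand type, selects the interpretation.
        view = as_signed if op[0] == "s" else as_unsigned
        cmp = ORDERING[op[1:]]
    else:
        raise ValueError("unknown comparison: " + op)
    return [cmp(view(a), view(b)) for a, b in zip(var1, var2)]


print(vsetint("slt", [0, 1], [1, 0]))      # prints [True, False]
print(vsetint("slt", [0xFFFFFFFF], [0]))   # prints [True]: the pattern reads as -1
print(vsetint("ult", [0xFFFFFFFF], [0]))   # prints [False]: the pattern reads as 4294967295
```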
     
    - - + +
    Syntax:
    -
    -  <result> = getelementptr <ty>* <ptrval>{, <ty> <idx>}*
    +
    <result> = vsetfp <op>, <n x <ty>> <var1>, <var2>   ; yields <n x bool>
     
    Overview:
    -

    -The 'getelementptr' instruction is used to get the address of a -subelement of an aggregate data structure.

    +

    The 'vsetfp' instruction takes two floating point vector +arguments and returns a vector of boolean values representing, at each +position, the result of the comparison between the values at that +position in the two operands.

    Arguments:
    -

    This instruction takes a list of integer constants that indicate what -elements of the aggregate object to index to. The actual types of the arguments -provided depend on the type of the first pointer argument. The -'getelementptr' instruction is used to index down through the type -levels of a structure or to a specific index in an array. When indexing into a -structure, only uint -integer constants are allowed. When indexing into an array or pointer, -int and long indexes are allowed of any sign.

    +

    The arguments to a 'vsetfp' instruction are a comparison +operation and two value arguments. The value arguments must be of floating point packed +type, and they must have identical types. The operation argument must +be one of eq, ne, lt, gt, +le, ge, oeq, one, olt, +ogt, ole, oge, ueq, une, +ult, ugt, ule, uge, o, +u, true, and false. The result is a packed +bool value with the same length as each operand.

    -

    For example, let's consider a C code fragment and how it gets -compiled to LLVM:

    +
    Semantics:
    -
    -  struct RT {
    -    char A;
    -    int B[10][20];
    -    char C;
    -  };
    -  struct ST {
    -    int X;
    -    double Y;
    -    struct RT Z;
    -  };
    +

    The following table shows the semantics of 'vsetfp' for +floating point types. If either operand is a floating point Not a +Number (NaN) value, the operation is unordered, and the value in the +first column below is produced at that position. Otherwise, the +operation is ordered, and the value in the second column is +produced.

    - int *foo(struct ST *s) { - return &s[1].Z.B[5][13]; - } + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Operation  If unordered  Otherwise true iff
    eq         undefined     var1 == var2
    ne         undefined     var1 != var2
    lt         undefined     var1 < var2
    gt         undefined     var1 > var2
    le         undefined     var1 <= var2
    ge         undefined     var1 >= var2
    oeq        false         var1 == var2
    one        false         var1 != var2
    olt        false         var1 < var2
    ogt        false         var1 > var2
    ole        false         var1 <= var2
    oge        false         var1 >= var2
    ueq        true          var1 == var2
    une        true          var1 != var2
    ult        true          var1 < var2
    ugt        true          var1 > var2
    ule        true          var1 <= var2
    uge        true          var1 >= var2
    o          false         always
    u          true          never
    true       true          always
    false      false         never
    + +
    Example:
    +
      <result> = vsetfp eq <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = false, false
    +  <result> = vsetfp ne <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = true, true
    +  <result> = vsetfp lt <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = true, false
    +  <result> = vsetfp gt <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = false, true
    +  <result> = vsetfp le <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = true, false
    +  <result> = vsetfp ge <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = false, true
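
The ordered/unordered table above can be read as a two-step rule: first decide whether either operand is NaN, then either return the "if unordered" value or perform the comparison. A minimal Python sketch of that rule follows; the helper names are hypothetical, and representing the "undefined" result as `None` is an assumption of the sketch.

```python
import math
import operator

BASE = {"eq": operator.eq, "ne": operator.ne,
        "lt": operator.lt, "gt": operator.gt,
        "le": operator.le, "ge": operator.ge}


def compare_fp(op, a, b):
    """Apply one 'vsetfp' comparison to a single pair of elements."""
    unordered = math.isnan(a) or math.isnan(b)
    if op in ("true", "false"):
        return op == "true"
    if op == "o":
        return not unordered          # true iff neither operand is NaN
    if op == "u":
        return unordered              # true iff some operand is NaN
    if op[:1] in ("o", "u") and op[1:] in BASE:
        # o* forms yield false on NaN, u* forms yield true on NaN.
        return (op[0] == "u") if unordered else BASE[op[1:]](a, b)
    if op in BASE:
        # Plain forms are undefined on NaN; the sketch returns None for that.
        return None if unordered else BASE[op](a, b)
    raise ValueError("unknown comparison: " + op)


def vsetfp(op, var1, var2):
    """Model of 'vsetfp': elementwise floating point comparison."""
    if len(var1) != len(var2):
        raise ValueError("operands must have identical types")
    return [compare_fp(op, a, b) for a, b in zip(var1, var2)]


nan = float("nan")
print(vsetfp("lt", [0.0, 1.0], [1.0, 0.0]))   # prints [True, False]
print(vsetfp("oeq", [nan, 1.0], [nan, 1.0]))  # prints [False, True]
print(vsetfp("ueq", [nan, 1.0], [nan, 2.0]))  # prints [True, False]
```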
     
    +
    -

    The LLVM code generated by the GCC frontend is:

    + + -
    -  %RT = type { sbyte, [10 x [20 x int]], sbyte }
    -  %ST = type { int, double, %RT }
    +
    - implementation +
    Syntax:
    - int* %foo(%ST* %s) { - entry: - %reg = getelementptr %ST* %s, int 1, uint 2, uint 1, int 5, int 13 - ret int* %reg - } +
    +  <result> = vselect <n x bool> <cond>, <n x <ty>> <val1>, <n x <ty>> <val2> ; yields <n x <ty>>
     
    -
    Semantics:
    +
    Overview:
    -

    The index types specified for the 'getelementptr' instruction depend -on the pointer type that is being indexed into. Pointer -and array types require uint, int, -ulong, or long values, and structure -types require uint constants.

    +

    +The 'vselect' instruction chooses one value at each position +of a vector based on a condition. +

    -

    In the example above, the first index is indexing into the '%ST*' -type, which is a pointer, yielding a '%ST' = '{ int, double, %RT -}' type, a structure. The second index indexes into the third element of -the structure, yielding a '%RT' = '{ sbyte, [10 x [20 x int]], -sbyte }' type, another structure. The third index indexes into the second -element of the structure, yielding a '[10 x [20 x int]]' type, an -array. The two dimensions of the array are subscripted into, yielding an -'int' type. The 'getelementptr' instruction returns a pointer -to this element, thus computing a value of 'int*' type.

    -

    Note that it is perfectly legal to index partially through a -structure, returning a pointer to an inner element. Because of this, -the LLVM code for the given testcase is equivalent to:

    +
    Arguments:
    -
    -  int* %foo(%ST* %s) {
    -    %t1 = getelementptr %ST* %s, int 1                        ; yields %ST*:%t1
    -    %t2 = getelementptr %ST* %t1, int 0, uint 2               ; yields %RT*:%t2
    -    %t3 = getelementptr %RT* %t2, int 0, uint 1               ; yields [10 x [20 x int]]*:%t3
    -    %t4 = getelementptr [10 x [20 x int]]* %t3, int 0, int 5  ; yields [20 x int]*:%t4
    -    %t5 = getelementptr [20 x int]* %t4, int 0, int 13        ; yields int*:%t5
    -    ret int* %t5
    -  }
    -
    +

    +The 'vselect' instruction requires a packed bool value indicating the +condition at each vector position, and two values of the same packed +type. All three operands must have the same length. The type of the +result is the same as the type of the two value operands.

    -

    Note that it is undefined to access an array out of bounds: array and -pointer indexes must always be within the defined bounds of the array type. -The one exception for this rules is zero length arrays. These arrays are -defined to be accessible as variable length arrays, which requires access -beyond the zero'th element.

    +
    Semantics:
    + +

    +At each position where the bool vector is true, that position +of the result gets its value from the first value argument; otherwise, +it gets its value from the second value argument. +

    Example:
    -    ; yields [12 x ubyte]*:aptr
    -    %aptr = getelementptr {int, [12 x ubyte]}* %sptr, long 0, uint 1
    +  %X = vselect <2 x bool> <bool true, bool false>, <2 x ubyte> <ubyte 17, ubyte 17>, 
    +    <2 x ubyte> <ubyte 42, ubyte 42>      ; yields <2 x ubyte>:17, 42
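
The per-position choice made by 'vselect' can be sketched in a few lines of Python. This is an illustrative model, not LLVM code; the helper name `vselect` is hypothetical and vectors are represented as lists.

```python
def vselect(cond, val1, val2):
    """Model of 'vselect': take val1[i] where cond[i] is true, else val2[i]."""
    if not (len(cond) == len(val1) == len(val2)):
        raise ValueError("all three operands must have the same length")
    return [a if c else b for c, a, b in zip(cond, val1, val2)]


# Mirrors the example above: <bool true, bool false> selects 17, then 42.
x = vselect([True, False], [17, 17], [42, 42])
```

Unlike a branch, every position is decided independently, so a single instruction can mix elements from both value operands.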
     
    -
    + + + - -
    -

    The instructions in this category are the "miscellaneous" -instructions, which defy better classification.

    + - - +
    -
    Syntax:
    -
      <result> = phi <ty> [ <val0>, <label0>], ...
    -
    Overview:
    -

    The 'phi' instruction is used to implement the φ node in -the SSA graph representing the function.

    -
    Arguments:
    -

    The type of the incoming values are specified with the first type -field. After this, the 'phi' instruction takes a list of pairs -as arguments, with one pair for each predecessor basic block of the -current block. Only values of first class -type may be used as the value arguments to the PHI node. Only labels -may be used as the label arguments.

    -

    There must be no non-phi instructions between the start of a basic -block and the PHI instructions: i.e. PHI instructions must be first in -a basic block.

    -
    Semantics:
    -

    At runtime, the 'phi' instruction logically takes on the -value specified by the parameter, depending on which basic block we -came from in the last terminator instruction.

    -
    Example:
    -
    Loop:       ; Infinite loop that counts from 0 on up...
    %indvar = phi uint [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
    %nextindvar = add uint %indvar, 1
    br label %Loop
    + +

    A key design point of an SSA-based representation is how it +represents memory. In LLVM, no memory locations are in SSA form, which +makes things very simple. This section describes how to read, write, +allocate, and free memory in LLVM.

    +
    @@ -2248,58 +2272,48 @@ came from in the last terminator instruction.

    Syntax:
    -  <result> = cast <ty> <value> to <ty2>             ; yields ty2
    +  <result> = malloc <type>[, uint <NumElements>][, align <alignment>]     ; yields {type*}:result
     
    Overview:
    -

    -The 'cast' instruction is used as the primitive means to convert -integers to floating point, change data type sizes, and break type safety (by -casting pointers). -

    - +

    The 'malloc' instruction allocates memory from the system +heap and returns a pointer to it.

    Arguments:
    -

    -The 'cast' instruction takes a value to cast, which must be a first -class value, and a type to cast it to, which must also be a first class type. -

    - -
    Semantics:
    +

    The 'malloc' instruction allocates +sizeof(<type>)*NumElements +bytes of memory from the operating system and returns a pointer of the +appropriate type to the program. If "NumElements" is specified, it is the +number of elements allocated. If an alignment is specified, the value result +of the allocation is guaranteed to be aligned to at least that boundary. If +not specified, or if zero, the target can choose to align the allocation on any +convenient boundary.

    -

    -This instruction follows the C rules for explicit casts when determining how the -data being cast must change to fit in its new container. -

    +

    'type' must be a sized type.

    -

    -When casting to bool, any value that would be considered true in the context of -a C 'if' condition is converted to the boolean 'true' values, -all else are 'false'. -

    +
    Semantics:
    -

    -When extending an integral value from a type of one signness to another (for -example 'sbyte' to 'ulong'), the value is sign-extended if the -source value is signed, and zero-extended if the source value is -unsigned. bool values are always zero extended into either zero or -one. -

    +

    Memory is allocated using the system "malloc" function, and +a pointer is returned.

    Example:
    -  %X = cast int 257 to ubyte              ; yields ubyte:1
    -  %Y = cast int 123 to bool               ; yields bool:true
    +  %array  = malloc [4 x ubyte ]                    ; yields {[%4 x ubyte]*}:array
    +
    +  %size   = add uint 2, 2                          ; yields {uint}:size = uint 4
    +  %array1 = malloc ubyte, uint 4                   ; yields {ubyte*}:array1
    +  %array2 = malloc [12 x ubyte], uint %size        ; yields {[12 x ubyte]*}:array2
    +  %array3 = malloc int, uint 4, align 1024         ; yields {int*}:array3
    +  %array4 = malloc int, align 1024                 ; yields {int*}:array4
     
    @@ -2307,271 +2321,280 @@ one.
    Syntax:
    -  <result> = select bool <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
    +  free <type> <value>                              ; yields {void}
     
    Overview:
    -

    -The 'select' instruction is used to choose one value based on a -condition, without branching. -

    - +

    The 'free' instruction returns memory back to the unused +memory heap to be reallocated in the future.

    Arguments:
    -

    -The 'select' instruction requires a boolean value indicating the condition, and two values of the same first class type. -

    +

    'value' shall be a pointer value that points to a value +that was allocated with the 'malloc' +instruction.

    Semantics:
    -

    -If the boolean condition evaluates to true, the instruction returns the first -value argument; otherwise, it returns the second value argument. -

    +

    Access to the memory pointed to by the pointer is no longer defined +after this instruction executes.

    Example:
    -  %X = select bool true, ubyte 17, ubyte 42          ; yields ubyte:17
    +  %array  = malloc [4 x ubyte]                    ; yields {[4 x ubyte]*}:array
    +            free   [4 x ubyte]* %array
     
    - + +
    +
    Syntax:
    -
    <result> = vsetint <op>, <n x <ty>> <var1>, <var2>   ; yields <n x bool>
    +
    +
    +  <result> = alloca <type>[, uint <NumElements>][, align <alignment>]     ; yields {type*}:result
     
    Overview:
    -

    The 'vsetint' instruction takes two integer vectors and -returns a vector of boolean values representing, at each position, the -result of the comparison between the values at that position in the -two operands.

    +

    The 'alloca' instruction allocates memory on the current +stack frame of the procedure that is live until the current function +returns to its caller.

    Arguments:
    -

    The arguments to a 'vsetint' instruction are a comparison -operation and two value arguments. The value arguments must be of integral packed type, -and they must have identical types. The operation argument must be -one of eq, ne, slt, sgt, -sle, sge, ult, ugt, ule, -uge, true, and false. The result is a -packed bool value with the same length as each operand.

    +

    The 'alloca' instruction allocates sizeof(<type>)*NumElements +bytes of memory on the runtime stack, returning a pointer of the +appropriate type to the program. If "NumElements" is specified, it is the +number of elements allocated. If an alignment is specified, the value result +of the allocation is guaranteed to be aligned to at least that boundary. If +not specified, or if zero, the target can choose to align the allocation on any +convenient boundary.

    -
    Semantics:
    +

    'type' may be any sized type.

    -

    The following table shows the semantics of 'vsetint'. For -each position of the result, the comparison is done on the -corresponding positions of the two value arguments. Note that the -signedness of the comparison depends on the comparison opcode and -not on the signedness of the value operands. E.g., vsetint -slt <4 x unsigned> %x, %y does an elementwise signed -comparison of %x and %y.

    +
    Semantics:
    - - - - - - - - - - - - - - - - -
    Operation  Result is true iff  Comparison is
    eq         var1 == var2        --
    ne         var1 != var2        --
    slt        var1 < var2         signed
    sgt        var1 > var2         signed
    sle        var1 <= var2        signed
    sge        var1 >= var2        signed
    ult        var1 < var2         unsigned
    ugt        var1 > var2         unsigned
    ule        var1 <= var2        unsigned
    uge        var1 >= var2        unsigned
    true       always              --
    false      never               --
    +

    Memory is allocated; a pointer is returned. 'alloca'd +memory is automatically released when the function returns. The 'alloca' +instruction is commonly used to represent automatic variables that must +have an address available. When the function returns (either with the ret or unwind +instructions), the memory is reclaimed.

    Example:
    -
      <result> = vsetint eq <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = false, false
    -  <result> = vsetint ne <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = true, true
    -  <result> = vsetint slt <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = true, false
    -  <result> = vsetint sgt <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = false, true
    -  <result> = vsetint sle <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = true, false
    -  <result> = vsetint sge <2 x int> <int 0, int 1>, <int 1, int 0>      ; yields {<2 x bool>}:result = false, true
    +
    +
    +  %ptr = alloca int                              ; yields {int*}:ptr
    +  %ptr = alloca int, uint 4                      ; yields {int*}:ptr
    +  %ptr = alloca int, uint 4, align 1024          ; yields {int*}:ptr
    +  %ptr = alloca int, align 1024                  ; yields {int*}:ptr
     
    -
    'vsetfp' +
    Syntax:
    -
    <result> = vsetfp <op>, <n x <ty>> <var1>, <var2>   ; yields <n x bool>
    -
    - +
      <result> = load <ty>* <pointer>
    <result> = volatile load <ty>* <pointer>
    Overview:
    - -

    The 'vsetfp' instruction takes two floating point vector -arguments and returns a vector of boolean values representing, at each -position, the result of the comparison between the values at that -position in the two operands.

    - +

    The 'load' instruction is used to read from memory.

    Arguments:
    - -

    The arguments to a 'vsetfp' instruction are a comparison -operation and two value arguments. The value arguments must be of floating point packed -type, and they must have identical types. The operation argument must -be one of eq, ne, lt, gt, -le, ge, oeq, one, olt, -ogt, ole, oge, ueq, une, -ult, ugt, ule, uge, o, -u, true, and false. The result is a packed -bool value with the same length as each operand.

    - +

    The argument to the 'load' instruction specifies the memory +address from which to load. The pointer must point to a first class type. If the load is +marked as volatile, then the optimizer is not allowed to modify +the number or order of execution of this load with other +volatile load and store +instructions.

    Semantics:
    - -

    The following table shows the semantics of 'vsetfp' for -floating point types. If either operand is a floating point Not a -Number (NaN) value, the operation is unordered, and the value in the -first column below is produced at that position. Otherwise, the -operation is ordered, and the value in the second column is -produced.

    - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Operation  If unordered  Otherwise true iff
    eq         undefined     var1 == var2
    ne         undefined     var1 != var2
    lt         undefined     var1 < var2
    gt         undefined     var1 > var2
    le         undefined     var1 <= var2
    ge         undefined     var1 >= var2
    oeq        false         var1 == var2
    one        false         var1 != var2
    olt        false         var1 < var2
    ogt        false         var1 > var2
    ole        false         var1 <= var2
    oge        false         var1 >= var2
    ueq        true          var1 == var2
    une        true          var1 != var2
    ult        true          var1 < var2
    ugt        true          var1 > var2
    ule        true          var1 <= var2
    uge        true          var1 >= var2
    o          false         always
    u          true          never
    true       true          always
    false      false         never
    - +

    The location of memory pointed to is loaded.

    +
    Examples:
    +
      %ptr = alloca int                               ; yields {int*}:ptr
    +  store int 3, int* %ptr                          ; yields {void}
    +  %val = load int* %ptr                           ; yields {int}:val = int 3
    +
    +
    + + +
    Syntax:
    +
      store <ty> <value>, <ty>* <pointer>                   ; yields {void}
    +  volatile store <ty> <value>, <ty>* <pointer>                   ; yields {void}
    +
    +
    Overview:
    +

    The 'store' instruction is used to write to memory.

    +
    Arguments:
    +

    There are two arguments to the 'store' instruction: a value +to store and an address in which to store it. The type of the '<pointer>' +operand must be a pointer to the type of the '<value>' +operand. If the store is marked as volatile, then the +optimizer is not allowed to modify the number or order of execution of +this store with other volatile load and store instructions.

    +
    Semantics:
    +

    The contents of memory are updated to contain '<value>' +at the location specified by the '<pointer>' operand.

    Example:
    -
      <result> = vsetfp eq <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = false, false
    -  <result> = vsetfp ne <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = true, true
    -  <result> = vsetfp lt <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = true, false
    -  <result> = vsetfp gt <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = false, true
    -  <result> = vsetfp le <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = true, false
    -  <result> = vsetfp ge <2 x float> <float 0.0, float 1.0>, <float 1.0, float 0.0>      ; yields {<2 x bool>}:result = false, true
    +
      %ptr = alloca int                               ; yields {int*}:ptr
    +  store int 3, int* %ptr                          ; yields {void}
    +  %val = load int* %ptr                           ; yields {int}:val = int 3
     
    -
    -
    -
    Syntax:
    -
    -  <result> = vselect <n x bool> <cond>, <n x <ty>> <val1>, <n x <ty>> <val2> ; yields <n x <ty>>
    +  <result> = getelementptr <ty>* <ptrval>{, <ty> <idx>}*
     
    Overview:

    -The 'vselect' instruction chooses one value at each position -of a vector based on a condition. -

    - +The 'getelementptr' instruction is used to get the address of a +subelement of an aggregate data structure.

    Arguments:
    -

    -The 'vselect' instruction requires a packed bool value indicating the -condition at each vector position, and two values of the same packed -type. All three operands must have the same length. The type of the -result is the same as the type of the two value operands.

    - -
    Semantics:
    - -

    -At each position where the bool vector is true, that position -of the result gets its value from the first value argument; otherwise, -it gets its value from the second value argument. -
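The per-position choice made by 'vselect' can be sketched in C as a loop over the vector elements. The function name vselect_ubyte is hypothetical, and 'unsigned char' is assumed to stand in for the LLVM ubyte type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* result[i] takes val1[i] where cond[i] is true, and val2[i] otherwise.
   All four arrays are assumed to have the same length n. */
static void vselect_ubyte(const bool *cond,
                          const unsigned char *val1,
                          const unsigned char *val2,
                          unsigned char *result,
                          size_t n) {
    assert(cond != NULL && val1 != NULL && val2 != NULL && result != NULL);
    for (size_t i = 0; i < n; i++)
        result[i] = cond[i] ? val1[i] : val2[i];
}
```

Called with the condition <true, false> and the operands <17, 17> and <42, 42>, the sketch produces <17, 42>.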

    +

    This instruction takes a list of integer constants that indicate which +elements of the aggregate object to index. The actual types of the arguments +provided depend on the type of the first pointer argument. The +'getelementptr' instruction is used to index down through the type +levels of a structure or to a specific index in an array. When indexing into a +structure, only uint +integer constants are allowed. When indexing into an array or pointer, +int and long indexes of either signedness are allowed.

    -
    Example:
    +

    For example, let's consider a C code fragment and how it gets +compiled to LLVM:

    -  %X = vselect <2 x bool> <bool true, bool false>, <2 x ubyte> <ubyte 17, ubyte 17>, 
    -    <2 x ubyte> <ubyte 42, ubyte 42>      ; yields <2 x ubyte>:17, 42
    +  struct RT {
    +    char A;
    +    int B[10][20];
    +    char C;
    +  };
    +  struct ST {
    +    int X;
    +    double Y;
    +    struct RT Z;
    +  };
    +
    +  int *foo(struct ST *s) {
    +    return &s[1].Z.B[5][13];
    +  }
     
    -
    - - +

    The LLVM code generated by the GCC frontend is:

    -
    +
    +  %RT = type { sbyte, [10 x [20 x int]], sbyte }
    +  %ST = type { int, double, %RT }
     
    -
    Syntax:
    + implementation -
    -  <result> = extractelement <n x <ty>> <val>, uint <idx>    ; yields <ty>
    +  int* %foo(%ST* %s) {
    +  entry:
    +    %reg = getelementptr %ST* %s, int 1, uint 2, uint 1, int 5, int 13
    +    ret int* %reg
    +  }
     
    -
    Overview:
    - -

    -The 'extractelement' instruction extracts a single scalar -element from a packed vector at a specified index. -

    +
    Semantics:
    +

    The index types specified for the 'getelementptr' instruction depend +on the pointer type that is being indexed into. Pointer +and array types require uint, int, +ulong, or long values, and structure +types require uint constants.

    -
    Arguments:
    +

    In the example above, the first index is indexing into the '%ST*' +type, which is a pointer, yielding a '%ST' = '{ int, double, %RT +}' type, a structure. The second index indexes into the third element of +the structure, yielding a '%RT' = '{ sbyte, [10 x [20 x int]], +sbyte }' type, another structure. The third index indexes into the second +element of the structure, yielding a '[10 x [20 x int]]' type, an +array. The two dimensions of the array are subscripted into, yielding an +'int' type. The 'getelementptr' instruction returns a pointer +to this element, thus computing a value of 'int*' type.
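The index walk described above amounts to summing one byte offset per index. The following C sketch, written against the same struct RT and struct ST declarations, makes that arithmetic explicit. The helper names gep_offset and check_gep are hypothetical, and the sketch relies on the C compiler's own layout (via sizeof and offsetof) rather than on any particular LLVM target layout.

```c
#include <assert.h>
#include <stddef.h>

struct RT { char A; int B[10][20]; char C; };
struct ST { int X; double Y; struct RT Z; };

/* The function from the example: address of s[1].Z.B[5][13]. */
static int *foo(struct ST *s) {
    return &s[1].Z.B[5][13];
}

/* Byte offset contributed by each getelementptr index, in order. */
static size_t gep_offset(void) {
    size_t off = 0;
    off += 1 * sizeof(struct ST);      /* int 1:  step over one whole %ST    */
    off += offsetof(struct ST, Z);     /* uint 2: third field of %ST (Z)     */
    off += offsetof(struct RT, B);     /* uint 1: second field of %RT (B)    */
    off += 5 * sizeof(int[20]);        /* int 5:  sixth row of the array     */
    off += 13 * sizeof(int);           /* int 13: fourteenth int in that row */
    return off;
}

/* The summed offsets must land on the same address that foo() computes.
   s must point to at least two struct ST objects. */
static void check_gep(struct ST *s) {
    assert((char *)foo(s) == (char *)s + gep_offset());
}
```

No memory is read or written along the way: like 'getelementptr', the sketch only computes an address.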

    -

    -The first operand of an 'extractelement' instruction is a -value of packed type. The second operand is -an index indicating the position from which to extract the element. -The index may be a variable.

    +

    Note that it is perfectly legal to index partially through a +structure, returning a pointer to an inner element. Because of this, +the LLVM code for the given testcase is equivalent to:

    -
    Semantics:
    +
    +  int* %foo(%ST* %s) {
    +    %t1 = getelementptr %ST* %s, int 1                        ; yields %ST*:%t1
    +    %t2 = getelementptr %ST* %t1, int 0, uint 2               ; yields %RT*:%t2
    +    %t3 = getelementptr %RT* %t2, int 0, uint 1               ; yields [10 x [20 x int]]*:%t3
    +    %t4 = getelementptr [10 x [20 x int]]* %t3, int 0, int 5  ; yields [20 x int]*:%t4
    +    %t5 = getelementptr [20 x int]* %t4, int 0, int 13        ; yields int*:%t5
    +    ret int* %t5
    +  }
    +
    -

    -The result is a scalar of the same type as the element type of -val. Its value is the value at position idx of -val. If idx exceeds the length of val, the -results are undefined. -

    +

    Note that it is undefined to access an array out of bounds: array and +pointer indexes must always be within the defined bounds of the array type. +The one exception to this rule is zero-length arrays. These arrays are +defined to be accessible as variable-length arrays, which requires access +beyond the zeroth element.

    Example:
    -  %result = extractelement <4 x int> %vec, uint 0    ; yields int
    +    ; yields [12 x ubyte]*:aptr
    +    %aptr = getelementptr {int, [12 x ubyte]}* %sptr, long 0, uint 1
     
    -
    +
    + + +
    +

    The instructions in this category are the "miscellaneous" +instructions, which defy better classification.

    +
    + + +
    +
    Syntax:
    +
      <result> = phi <ty> [ <val0>, <label0>], ...
    +
    Overview:
    +

    The 'phi' instruction is used to implement the φ node in +the SSA graph representing the function.

    +
    Arguments:
    +

    The type of the incoming values is specified with the first type +field. After this, the 'phi' instruction takes a list of pairs +as arguments, with one pair for each predecessor basic block of the +current block. Only values of first class +type may be used as the value arguments to the PHI node. Only labels +may be used as the label arguments.

    +

    There must be no non-phi instructions between the start of a basic +block and the PHI instructions: i.e. PHI instructions must be first in +a basic block.

    +
    Semantics:
    +

    At runtime, the 'phi' instruction logically takes on the +value specified by the pair whose label names the predecessor basic block +that executed immediately before the current block.
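One way to picture this is to track the predecessor block explicitly. The C sketch below emulates a counting loop whose induction variable is a 'phi' of 0 (on entry from a header block) and the incremented value (on the back edge). The names run_loop, LOOP_HEADER and LOOP are hypothetical, and the loop is bounded by an iteration count so that the sketch terminates.

```c
#include <assert.h>

/* The two possible predecessors of the loop block. */
enum block { LOOP_HEADER, LOOP };

/* Runs the loop body 'iterations' times and returns the value that
   %indvar held during the final iteration. */
static unsigned run_loop(unsigned iterations) {
    assert(iterations > 0);            /* the body runs at least once */

    enum block pred = LOOP_HEADER;     /* first entry comes from the header */
    unsigned indvar = 0;
    unsigned nextindvar = 0;           /* not read on the first entry */

    for (unsigned i = 0; i < iterations; i++) {
        /* %indvar = phi uint [ 0, %LoopHeader ], [ %nextindvar, %Loop ] */
        indvar = (pred == LOOP_HEADER) ? 0u : nextindvar;
        /* %nextindvar = add uint %indvar, 1 */
        nextindvar = indvar + 1u;
        /* br label %Loop: the next entry arrives along the back edge */
        pred = LOOP;
    }
    return indvar;
}
```

The value chosen by the phi depends only on which block control arrived from, not on any runtime comparison inside the loop block itself.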

    +
    Example:
    +
    Loop:       ; Infinite loop that counts from 0 on up...
    %indvar = phi uint [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
    %nextindvar = add uint %indvar, 1
    br label %Loop
    +
    @@ -2579,45 +2602,58 @@ results are undefined.
    Syntax:
    -  <result> = insertelement <n x <ty>> <val>, <ty> <elt>, uint <idx>    ; yields <n x <ty>>
    +  <result> = cast <ty> <value> to <ty2>             ; yields ty2
     
    Overview:

    -The 'insertelement' instruction inserts a scalar -element into a packed vector at a specified index. +The 'cast' instruction is used as the primitive means to convert +integers to floating point, change data type sizes, and break type safety (by +casting pointers).

    Arguments:

    -The first operand of an 'insertelement' instruction is a -value of packed type. The second operand is a -scalar value whose type must equal the element type of the first -operand. The third operand is an index indicating the position at -which to insert the value. The index may be a variable.

    +The 'cast' instruction takes a value to cast, which must be a first +class value, and a type to cast it to, which must also be a first class type. +

    Semantics:

    -The result is a packed vector of the same type as val. Its -element values are those of val except at position -idx, where it gets the value elt. If idx -exceeds the length of val, the results are undefined. +This instruction follows the C rules for explicit casts when determining how the +data being cast must change to fit in its new container. +

    + +

    +When casting to bool, any value that would be considered true in the context of +a C 'if' condition is converted to the boolean 'true' value; +all other values are converted to 'false'. +

    + +

    +When extending an integral value from a type of one signedness to another (for +example 'sbyte' to 'ulong'), the value is sign-extended if the +source value is signed, and zero-extended if the source value is +unsigned. bool values are always zero-extended to either zero or +one.
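Because 'cast' follows the C rules for explicit casts, the cases above can be checked directly in C. The sketch assumes the LLVM types map onto fixed-width C types (sbyte as int8_t, ubyte as uint8_t, int as int32_t, ulong as uint64_t); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Truncation: only the low 8 bits survive. */
static uint8_t cast_int_to_ubyte(int32_t v) { return (uint8_t)v; }

/* Cast to bool: any nonzero value becomes true. */
static bool cast_int_to_bool(int32_t v) { return (bool)v; }

/* Signed source: the value is sign-extended before the reinterpretation. */
static uint64_t cast_sbyte_to_ulong(int8_t v) { return (uint64_t)v; }

/* Unsigned source: the value is zero-extended. */
static uint64_t cast_ubyte_to_ulong(uint8_t v) { return (uint64_t)v; }

/* bool source: always extends to exactly zero or one. */
static uint32_t cast_bool_to_uint(bool v) { return (uint32_t)v; }

/* Checks a truncating cast and a cast to bool. */
static void check_cast_examples(void) {
    assert(cast_int_to_ubyte(257) == 1);     /* 257 mod 256 */
    assert(cast_int_to_bool(123) == true);
}
```

Under these assumptions, casting the sbyte value -1 to ulong gives the all-ones 64-bit value, while casting the ubyte value 255 to ulong gives 255.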

    Example:
    -  %result = insertelement <4 x int> %vec, int 1, uint 0    ; yields <4 x int>
    +  %X = cast int 257 to ubyte              ; yields ubyte:1
    +  %Y = cast int 123 to bool               ; yields bool:true
     
    @@ -2625,47 +2661,34 @@ exceeds the length of val, the results are undefined.
    Syntax:
    -  <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x uint> <mask>    ; yields <n x <ty>>
    +  <result> = select bool <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
     
    Overview:

    -The 'shufflevector' instruction constructs a permutation of elements -from two input vectors, returning a vector of the same type. +The 'select' instruction is used to choose one value based on a +condition, without branching.

    -
    Arguments:
    -

    -The first two operands of a 'shufflevector' instruction are vectors -with types that match each other and types that match the result of the -instruction. The third argument is a shuffle mask, which has the same number -of elements as the other vector type, but whose element type is always 'uint'. -

    +
    Arguments:

    -The shuffle mask operand is required to be a constant vector with either -constant integer or undef values. +The 'select' instruction requires a boolean value indicating the condition, and two values of the same first class type.

    Semantics:

    -The elements of the two input vectors are numbered from left to right across -both of the vectors. The shuffle mask operand specifies, for each element of -the result vector, which element of the two input registers the result element -gets. The element selector may be undef (meaning "don't care") and the second -operand may be undef if performing a shuffle from only one vector. +If the boolean condition evaluates to true, the instruction returns the first +value argument; otherwise, it returns the second value argument.
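The element numbering used by the 'shufflevector' mask can be sketched in C: indexes 0 through n-1 name elements of the first vector and indexes n through 2n-1 name elements of the second. The function name shufflevector_int is hypothetical, 'int' and 'unsigned' are assumed to stand in for the LLVM int and uint types, and undef mask elements or an undef second operand are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* result[i] = element mask[i] of the concatenation (v1, v2).
   v1, v2, mask and result all have n elements. */
static void shufflevector_int(const int *v1, const int *v2,
                              const unsigned *mask, int *result, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned m = mask[i];
        assert(m < 2 * n);                       /* must name a real element */
        result[i] = (m < n) ? v1[m] : v2[m - n]; /* pick from v1 or from v2  */
    }
}
```

With n = 4, the mask <0, 4, 1, 5> interleaves the first two elements of each input, and the mask <0, 1, 2, 3> reproduces the first input unchanged.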

    Example:
    -  %result = shufflevector <4 x int> %v1, <4 x int> %v2, 
    -                          <4 x uint> <uint 0, uint 4, uint 1, uint 5>    ; yields <4 x int>
    -  %result = shufflevector <4 x int> %v1, <4 x int> undef, 
    -                          <4 x uint> <uint 0, uint 1, uint 2, uint 3>  ; yields <4 x int> - Identity shuffle.
    +  %X = select bool true, ubyte 17, ubyte 42          ; yields ubyte:17