X-Git-Url: http://plrg.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FLangRef.html;h=fed2f80696c3cbcfb4b06dbaee8d5ae39cea1f51;hb=778086caf71596d61b70db168e9f4b6598049cf0;hp=0ff7dd3338c73dbf40da1c4ba4985d7218174dcc;hpb=871eb0ab23a5d47da71c5584fae7ab9f64a0a4f6;p=oota-llvm.git
diff --git a/docs/LangRef.html b/docs/LangRef.html
index 0ff7dd3338c..fed2f80696c 100644
--- a/docs/LangRef.html
+++ b/docs/LangRef.html
@@ -5,7 +5,7 @@
 LLVM Assembly Language Reference Manual
-
@@ -20,32 +20,52 @@
  • High Level Structure
    1. Module Structure
    2. -
    3. Linkage Types
    4. +
    5. Linkage Types +
        +
      1. 'private' Linkage
      2. +
      3. 'linker_private' Linkage
      4. +
      5. 'internal' Linkage
      6. +
      7. 'available_externally' Linkage
      8. +
      9. 'linkonce' Linkage
      10. +
      11. 'common' Linkage
      12. +
      13. 'weak' Linkage
      14. +
      15. 'appending' Linkage
      16. +
      17. 'extern_weak' Linkage
      18. +
      19. 'linkonce_odr' Linkage
      20. +
      21. 'weak_odr' Linkage
      22. +
      23. 'externally visible' Linkage
      24. +
      25. 'dllimport' Linkage
      26. +
      27. 'dllexport' Linkage
      28. +
      +
    6. Calling Conventions
    7. Named Types
    8. Global Variables
    9. Functions
    10. Aliases
    11. +
    12. Named Metadata
    13. Parameter Attributes
    14. Function Attributes
    15. Garbage Collector Names
    16. Module-Level Inline Assembly
    17. Data Layout
    18. +
    19. Pointer Aliasing Rules
  • Type System
    1. Type Classifications
    2. -
    3. Primitive Types +
    4. Primitive Types
        +
      1. Integer Type
      2. Floating Point Types
      3. Void Type
      4. Label Type
      5. +
      6. Metadata Type
    5. Derived Types
        -
      1. Integer Type
      2. Array Type
      3. Function Type
      4. Pointer Type
      5. @@ -64,13 +84,25 @@
      6. Complex Constants
      7. Global Variable and Function Addresses
      8. Undefined Values
      9. +
      10. Addresses of Basic Blocks
      11. Constant Expressions
      12. -
      13. Embedded Metadata
    6. Other Values
      1. Inline Assembler Expressions
      2. +
      3. Metadata Nodes and Metadata Strings
      4. +
      +
    7. +
    8. Intrinsic Global Variables +
        +
      1. The 'llvm.used' Global Variable
      2. +
      3. The 'llvm.compiler.used' + Global Variable
      4. +
      5. The 'llvm.global_ctors' + Global Variable
      6. +
      7. The 'llvm.global_dtors' + Global Variable
    9. Instruction Reference @@ -80,6 +112,7 @@
    10. 'ret' Instruction
    11. 'br' Instruction
    12. 'switch' Instruction
    13. +
    14. 'indirectbr' Instruction
    15. 'invoke' Instruction
    16. 'unwind' Instruction
    17. 'unreachable' Instruction
    18. @@ -88,8 +121,11 @@
    19. Binary Operations
      1. 'add' Instruction
      2. +
      3. 'fadd' Instruction
      4. 'sub' Instruction
      5. +
      6. 'fsub' Instruction
      7. 'mul' Instruction
      8. +
      9. 'fmul' Instruction
      10. 'udiv' Instruction
      11. 'sdiv' Instruction
      12. 'fdiv' Instruction
      13. @@ -123,8 +159,6 @@
      14. Memory Access and Addressing Operations
          -
        1. 'malloc' Instruction
        2. -
        3. 'free' Instruction
        4. 'alloca' Instruction
        5. 'load' Instruction
        6. 'store' Instruction
        7. @@ -151,8 +185,6 @@
          1. 'icmp' Instruction
          2. 'fcmp' Instruction
          3. -
          4. 'vicmp' Instruction
          5. -
          6. 'vfcmp' Instruction
          7. 'phi' Instruction
          8. 'select' Instruction
          9. 'call' Instruction
          10. @@ -206,8 +238,6 @@
          11. 'llvm.ctpop.*' Intrinsic
          12. 'llvm.ctlz.*' Intrinsic
          13. 'llvm.cttz.*' Intrinsic
          14. -
          15. 'llvm.part.select.*' Intrinsic
          16. -
          17. 'llvm.part.set.*' Intrinsic
        8. Arithmetic with Overflow Intrinsics @@ -244,6 +274,14 @@
        9. llvm.atomic.load.umin
      15. +
      16. Memory Use Markers +
          +
        1. llvm.lifetime.start
        2. +
        3. llvm.lifetime.end
        4. +
        5. llvm.invariant.start
        6. +
        7. llvm.invariant.end
        8. +
        +
      17. General intrinsics
        1. @@ -254,6 +292,8 @@ 'llvm.trap' Intrinsic
        2. 'llvm.stackprotector' Intrinsic
        3. +
        4. + 'llvm.objectsize' Intrinsic
      @@ -270,12 +310,13 @@
      -

      This document is a reference manual for the LLVM assembly language. -LLVM is a Static Single Assignment (SSA) based representation that provides -type safety, low-level operations, flexibility, and the capability of -representing 'all' high-level languages cleanly. It is the common code -representation used throughout all phases of the LLVM compilation -strategy.

      + +

      This document is a reference manual for the LLVM assembly language. LLVM is + a Static Single Assignment (SSA) based representation that provides type + safety, low-level operations, flexibility, and the capability of representing + 'all' high-level languages cleanly. It is the common code representation + used throughout all phases of the LLVM compilation strategy.

      +
      @@ -284,26 +325,24 @@ strategy.

      -

      The LLVM code representation is designed to be used in three -different forms: as an in-memory compiler IR, as an on-disk bitcode -representation (suitable for fast loading by a Just-In-Time compiler), -and as a human readable assembly language representation. This allows -LLVM to provide a powerful intermediate representation for efficient -compiler transformations and analysis, while providing a natural means -to debug and visualize the transformations. The three different forms -of LLVM are all equivalent. This document describes the human readable -representation and notation.

      +

      The LLVM code representation is designed to be used in three different forms: + as an in-memory compiler IR, as an on-disk bitcode representation (suitable + for fast loading by a Just-In-Time compiler), and as a human readable + assembly language representation. This allows LLVM to provide a powerful + intermediate representation for efficient compiler transformations and + analysis, while providing a natural means to debug and visualize the + transformations. The three different forms of LLVM are all equivalent. This + document describes the human readable representation and notation.

      -

      The LLVM representation aims to be light-weight and low-level -while being expressive, typed, and extensible at the same time. It -aims to be a "universal IR" of sorts, by being at a low enough level -that high-level ideas may be cleanly mapped to it (similar to how -microprocessors are "universal IR's", allowing many source languages to -be mapped to them). By providing type information, LLVM can be used as -the target of optimizations: for example, through pointer analysis, it -can be proven that a C automatic variable is never accessed outside of -the current function... allowing it to be promoted to a simple SSA -value instead of a memory location.

      +

      The LLVM representation aims to be light-weight and low-level while being + expressive, typed, and extensible at the same time. It aims to be a + "universal IR" of sorts, by being at a low enough level that high-level ideas + may be cleanly mapped to it (similar to how microprocessors are "universal + IRs", allowing many source languages to be mapped to them). By providing + type information, LLVM can be used as the target of optimizations: for + example, through pointer analysis, it can be proven that a C automatic + variable is never accessed outside of the current function, allowing it to + be promoted to a simple SSA value instead of a memory location.

      @@ -312,10 +351,10 @@ value instead of a memory location.

      -

      It is important to note that this document describes 'well formed' -LLVM assembly language. There is a difference between what the parser -accepts and what is considered 'well formed'. For example, the -following instruction is syntactically okay, but not well formed:

      +

      It is important to note that this document describes 'well formed' LLVM + assembly language. There is a difference between what the parser accepts and + what is considered 'well formed'. For example, the following instruction is + syntactically okay, but not well formed:

      @@ -323,13 +362,13 @@ following instruction is syntactically okay, but not well formed:

      -

      ...because the definition of %x does not dominate all of -its uses. The LLVM infrastructure provides a verification pass that may -be used to verify that an LLVM module is well formed. This pass is -automatically run by the parser after parsing input assembly and by -the optimizer before it outputs bitcode. The violations pointed out -by the verifier pass indicate bugs in transformation passes or input to -the parser.

      +

      because the definition of %x does not dominate all of its uses. The + LLVM infrastructure provides a verification pass that may be used to verify + that an LLVM module is well formed. This pass is automatically run by the + parser after parsing input assembly and by the optimizer before it outputs + bitcode. The violations pointed out by the verifier pass indicate bugs in + transformation passes or input to the parser.

      +
      @@ -340,44 +379,47 @@ the parser.

      -

      LLVM identifiers come in two basic types: global and local. Global - identifiers (functions, global variables) begin with the @ character. Local - identifiers (register names, types) begin with the % character. Additionally, - there are three different formats for identifiers, for different purposes:

      +

      LLVM identifiers come in two basic types: global and local. Global + identifiers (functions, global variables) begin with the '@' + character. Local identifiers (register names, types) begin with + the '%' character. Additionally, there are three different formats + for identifiers, for different purposes:

      1. Named values are represented as a string of characters with their prefix. - For example, %foo, @DivisionByZero, %a.really.long.identifier. The actual - regular expression used is '[%@][a-zA-Z$._][a-zA-Z$._0-9]*'. - Identifiers which require other characters in their names can be surrounded - with quotes. Special characters may be escaped using "\xx" where xx is the - ASCII code for the character in hexadecimal. In this way, any character can - be used in a name value, even quotes themselves. + For example, %foo, @DivisionByZero, + %a.really.long.identifier. The actual regular expression used is + '[%@][a-zA-Z$._][a-zA-Z$._0-9]*'. Identifiers which require + other characters in their names can be surrounded with quotes. Special + characters may be escaped using "\xx" where xx is the + ASCII code for the character in hexadecimal. In this way, any character + can be used in a name value, even quotes themselves.
      2. Unnamed values are represented as an unsigned numeric value with their - prefix. For example, %12, @2, %44.
      3. + prefix. For example, %12, @2, %44.
      4. Constants, which are described in a section about - constants, below.
      5. + constants, below.
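
The identifier formats listed above can be illustrated with a small sketch. All names here are hypothetical, and the explicit `entry:` label is an assumption made so that the first unnamed temporary in the function is numbered %0.

```llvm
; Named global values: the '@' prefix followed by a name.
@DivisionByZero = global i32 0

; Names needing other characters are surrounded with quotes.
; "\22" is the hexadecimal ASCII code for a double quote.
@"say \22hi\22" = global i32 0

define i32 @identifier_demo(i32 %foo) {
entry:
  ; Unnamed value: an unsigned number with the '%' prefix.
  %0 = add i32 %foo, %foo
  ; Named local value.
  %a.really.long.identifier = add i32 %0, %0
  ret i32 %a.really.long.identifier
}
```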

      LLVM requires that values start with a prefix for two reasons: Compilers -don't need to worry about name clashes with reserved words, and the set of -reserved words may be expanded in the future without penalty. Additionally, -unnamed identifiers allow a compiler to quickly come up with a temporary -variable without having to avoid symbol table conflicts.

      + don't need to worry about name clashes with reserved words, and the set of + reserved words may be expanded in the future without penalty. Additionally, + unnamed identifiers allow a compiler to quickly come up with a temporary + variable without having to avoid symbol table conflicts.

      Reserved words in LLVM are very similar to reserved words in other -languages. There are keywords for different opcodes -('add', - 'bitcast', - 'ret', etc...), for primitive type names ('void', 'i32', etc...), -and others. These reserved words cannot conflict with variable names, because -none of them start with a prefix character ('%' or '@').

      + languages. There are keywords for different opcodes + ('add', + 'bitcast', + 'ret', etc...), for primitive type names + ('void', + 'i32', etc...), and others. These + reserved words cannot conflict with variable names, because none of them + start with a prefix character ('%' or '@').

      Here is an example of LLVM code to multiply the integer variable -'%X' by 8:

      + '%X' by 8:

      The easy way:

      @@ -399,31 +441,29 @@ none of them start with a prefix character ('%' or '@').

      -add i32 %X, %X           ; yields {i32}:%0
      -add i32 %0, %0           ; yields {i32}:%1
      +%0 = add i32 %X, %X           ; yields {i32}:%0
      +%1 = add i32 %0, %0           ; yields {i32}:%1
       %result = add i32 %1, %1
       
      -

      This last way of multiplying %X by 8 illustrates several -important lexical features of LLVM:

      +

      This last way of multiplying %X by 8 illustrates several important + lexical features of LLVM:

        -
      1. Comments are delimited with a ';' and go until the end of - line.
      2. + line.
      3. Unnamed temporaries are created when the result of a computation is not - assigned to a named value.
      4. + assigned to a named value.
      5. Unnamed temporaries are numbered sequentially
      6. -
      -

      ...and it also shows a convention that we follow in this document. When -demonstrating instructions, we will follow an instruction with a comment that -defines the type and name of value produced. Comments are shown in italic -text.

      +

      It also shows a convention that we follow in this document. When + demonstrating instructions, we will follow an instruction with a comment that + defines the type and name of value produced. Comments are shown in italic + text.

      @@ -437,45 +477,47 @@ text.

      -

      LLVM programs are composed of "Module"s, each of which is a -translation unit of the input programs. Each module consists of -functions, global variables, and symbol table entries. Modules may be -combined together with the LLVM linker, which merges function (and -global variable) definitions, resolves forward declarations, and merges -symbol table entries. Here is an example of the "hello world" module:

      +

      LLVM programs are composed of "Module"s, each of which is a translation unit + of the input programs. Each module consists of functions, global variables, + and symbol table entries. Modules may be combined together with the LLVM + linker, which merges function (and global variable) definitions, resolves + forward declarations, and merges symbol table entries. Here is an example of + the "hello world" module:

      -
      ; Declare the string constant as a global constant...
      -@.LC0 = internal constant [13 x i8] c"hello world\0A\00"          ; [13 x i8]*
      +
      +; Declare the string constant as a global constant.
      +@.LC0 = internal constant [13 x i8] c"hello world\0A\00"    ; [13 x i8]*
       
       ; External declaration of the puts function
      -declare i32 @puts(i8 *)                                            ; i32(i8 *)* 
      +declare i32 @puts(i8 *)                                     ; i32(i8 *)* 
       
       ; Definition of main function
      -define i32 @main() {                                                 ; i32()* 
      -        ; Convert [13 x i8]* to i8  *...
      -        %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 *
      +define i32 @main() {                                        ; i32()* 
      +  ; Convert [13 x i8]* to i8  *...
      +  %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0   ; i8 *
      +
      +  ; Call puts function to write out the string to stdout.
      +  call i32 @puts(i8 * %cast210)                             ; i32
      +  ret i32 0
      }
      -        ; Call puts function to write out the string to stdout...
      -        call i32 @puts(i8 * %cast210)                              ; i32
      -        ret i32 0
      }
      +; Named metadata
      +!1 = metadata !{i32 41}
      +!foo = !{!1, null}
      -

      This example is made up of a global variable -named ".LC0", an external declaration of the "puts" -function, and a function definition -for "main".

      +

      This example is made up of a global variable named + ".LC0", an external declaration of the "puts" function, + a function definition for + "main" and named metadata + "foo".

      -

      In general, a module is made up of a list of global values, -where both functions and global variables are global values. Global values are -represented by a pointer to a memory location (in this case, a pointer to an -array of char, and a pointer to a function), and have one of the following linkage types.

      +

      In general, a module is made up of a list of global values, where both + functions and global variables are global values. Global values are + represented by a pointer to a memory location (in this case, a pointer to an + array of char, and a pointer to a function), and have one of the + following linkage types.

      @@ -486,139 +528,133 @@ href="#linkage">linkage types.

      -

      -All Global Variables and Functions have one of the following types of linkage: -

      +

      All Global Variables and Functions have one of the following types of + linkage:

      - -
      private:
      - -
      Global values with private linkage are only directly accessible by - objects in the current module. In particular, linking code into a module with - an private global value may cause the private to be renamed as necessary to - avoid collisions. Because the symbol is private to the module, all - references can be updated. This doesn't show up in any symbol table in the - object file. -
      - -
      internal:
      - -
      Similar to private, but the value shows as a local symbol (STB_LOCAL in - the case of ELF) in the object file. This corresponds to the notion of the - 'static' keyword in C. -
      - -
      available_externally: -
      - +
      private
      +
      Global values with private linkage are only directly accessible by objects + in the current module. In particular, linking code into a module with a + private global value may cause the private to be renamed as necessary to + avoid collisions. Because the symbol is private to the module, all + references can be updated. This doesn't show up in any symbol table in the + object file.
      + +
      linker_private
      +
      Similar to private, but the symbol is passed through the assembler and + removed by the linker after evaluation. Note that (unlike private + symbols) linker_private symbols are subject to coalescing by the linker: + weak symbols get merged and redefinitions are rejected. However, unlike + normal strong symbols, they are removed by the linker from the final + linked image (executable or dynamic library).
      + +
      internal
      +
      Similar to private, but the value shows as a local symbol + (STB_LOCAL in the case of ELF) in the object file. This + corresponds to the notion of the 'static' keyword in C.
      + +
      available_externally
      Globals with "available_externally" linkage are never emitted - into the object file corresponding to the LLVM module. They exist to - allow inlining and other optimizations to take place given knowledge of the - definition of the global, which is known to be somewhere outside the module. - Globals with available_externally linkage are allowed to be discarded - at will, and are otherwise the same as linkonce_odr. This linkage - type is only allowed on definitions, not declarations.
      - -
      linkonce:
      - + into the object file corresponding to the LLVM module. They exist to + allow inlining and other optimizations to take place given knowledge of + the definition of the global, which is known to be somewhere outside the + module. Globals with available_externally linkage are allowed to + be discarded at will, and are otherwise the same as linkonce_odr. + This linkage type is only allowed on definitions, not declarations. + +
      linkonce
      Globals with "linkonce" linkage are merged with other globals of - the same name when linkage occurs. This is typically used to implement - inline functions, templates, or other code which must be generated in each - translation unit that uses it. Unreferenced linkonce globals are - allowed to be discarded. -
      - -
      common:
      - -
      "common" linkage is exactly the same as linkonce - linkage, except that unreferenced common globals may not be - discarded. This is used for globals that may be emitted in multiple - translation units, but that are not guaranteed to be emitted into every - translation unit that uses them. One example of this is tentative - definitions in C, such as "int X;" at global scope. -
      - -
      weak:
      - -
      "weak" linkage is the same as common linkage, except - that some targets may choose to emit different assembly sequences for them - for target-dependent reasons. This is used for globals that are declared - "weak" in C source code. -
      - -
      appending:
      - + the same name when linkage occurs. This can be used to implement + some forms of inline functions, templates, or other code which must be + generated in each translation unit that uses it, but where the body may + be overridden with a more definitive definition later. Unreferenced + linkonce globals are allowed to be discarded. Note that + linkonce linkage does not actually allow the optimizer to + inline the body of this function into callers because it doesn't know if + this definition of the function is the definitive definition within the + program or whether it will be overridden by a stronger definition. + To enable inlining and other optimizations, use "linkonce_odr" + linkage. + +
      weak
      +
      "weak" linkage has the same merging semantics as + linkonce linkage, except that unreferenced globals with + weak linkage may not be discarded. This is used for globals that + are declared "weak" in C source code.
      + +
      common
      +
      "common" linkage is most similar to "weak" linkage, but + it is used for tentative definitions in C, such as "int X;" at + global scope. + Symbols with "common" linkage are merged in the same way as + weak symbols, and they may not be deleted if unreferenced. + common symbols may not have an explicit section, + must have a zero initializer, and may not be marked 'constant'. Functions and aliases may not + have common linkage.
      + + +
      appending
      "appending" linkage may only be applied to global variables of - pointer to array type. When two global variables with appending linkage are - linked together, the two global arrays are appended together. This is the - LLVM, typesafe, equivalent of having the system linker append together - "sections" with identical names when .o files are linked. -
      - -
      extern_weak:
      - -
      The semantics of this linkage follow the ELF object file model: the - symbol is weak until linked, if not linked, the symbol becomes null instead - of being an undefined reference. -
      - -
      linkonce_odr:
      -
      weak_odr:
      -
      Some languages allow differing globals to be merged, such as two - functions with different semantics. Other languages, such as C++, - ensure that only equivalent globals are ever merged (the "one definition - rule" - "ODR"). Such languages can use the linkonce_odr - and weak_odr linkage types to indicate that the global will only - be merged with equivalent globals. These linkage types are otherwise the - same as their non-odr versions. -
      + pointer to array type. When two global variables with appending linkage + are linked together, the two global arrays are appended together. This is + LLVM's typesafe equivalent of having the system linker append together + "sections" with identical names when .o files are linked. +
      extern_weak
      +
      The semantics of this linkage follow the ELF object file model: the symbol + is weak until linked; if not linked, the symbol becomes null instead of + being an undefined reference.
      + +
      linkonce_odr
      +
      weak_odr
      +
      Some languages allow differing globals to be merged, such as two functions + with different semantics. Other languages, such as C++, ensure + that only equivalent globals are ever merged (the "one definition rule" - + "ODR"). Such languages can use the linkonce_odr + and weak_odr linkage types to indicate that the global will only + be merged with equivalent globals. These linkage types are otherwise the + same as their non-odr versions.
      externally visible:
      -
      If none of the above identifiers are used, the global is externally - visible, meaning that it participates in linkage and can be used to resolve - external symbol references. -
      + visible, meaning that it participates in linkage and can be used to + resolve external symbol references.
      -

      - The next two types of linkage are targeted for Microsoft Windows platform - only. They are designed to support importing (exporting) symbols from (to) - DLLs (Dynamic Link Libraries). -

      - -
      -
      dllimport:
      +

      The next two types of linkage target the Microsoft Windows platform + only. They are designed to support importing (exporting) symbols from (to) + DLLs (Dynamic Link Libraries).

      +
      +
      dllimport
      "dllimport" linkage causes the compiler to reference a function - or variable via a global pointer to a pointer that is set up by the DLL - exporting the symbol. On Microsoft Windows targets, the pointer name is - formed by combining __imp_ and the function or variable name. -
      - -
      dllexport:
      + or variable via a global pointer to a pointer that is set up by the DLL + exporting the symbol. On Microsoft Windows targets, the pointer name is + formed by combining __imp_ and the function or variable + name. +
      dllexport
      "dllexport" linkage causes the compiler to provide a global - pointer to a pointer in a DLL, so that it can be referenced with the - dllimport attribute. On Microsoft Windows targets, the pointer - name is formed by combining __imp_ and the function or variable - name. -
      - + pointer to a pointer in a DLL, so that it can be referenced with the + dllimport attribute. On Microsoft Windows targets, the pointer + name is formed by combining __imp_ and the function or + variable name.
      -

      For example, since the ".LC0" -variable is defined to be internal, if another module defined a ".LC0" -variable and was linked with this one, one of the two would be renamed, -preventing a collision. Since "main" and "puts" are -external (i.e., lacking any linkage declarations), they are accessible -outside of the current module.

      -

      It is illegal for a function declaration -to have any linkage type other than "externally visible", dllimport -or extern_weak.

      +

      For example, since the ".LC0" variable is defined to be internal, if + another module defined a ".LC0" variable and was linked with this + one, one of the two would be renamed, preventing a collision. Since + "main" and "puts" are external (i.e., lacking any linkage + declarations), they are accessible outside of the current module.

      + +

      It is illegal for a function declaration to have any linkage type + other than "externally visible", dllimport + or extern_weak.

      +

      Aliases can have only external, internal, weak -or weak_odr linkages.

      + or weak_odr linkages.
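
As a rough illustration of the linkage types described in this section, here is a minimal sketch of a module that uses several of them. Every name is hypothetical, and the comments only restate the rules given above.

```llvm
; Hypothetical module; all names are illustrative.

@.str      = private constant [3 x i8] c"hi\00"  ; not in any symbol table of the object file
@counter   = internal global i32 0               ; local symbol, like 'static' in C
@tentative = common global i32 0                 ; like "int tentative;" at global scope in C
@visible   = global i32 7                        ; no linkage keyword: externally visible

; A declaration may only be externally visible, dllimport, or extern_weak.
declare extern_weak i32 @optional_hook()

; May be merged with another definition of the same name at link time.
define weak i32 @default_handler() {
entry:
  ret i32 0
}

; Only ever merged with equivalent definitions (the one definition rule).
define linkonce_odr i32 @inline_helper(i32 %x) {
entry:
  %r = add i32 %x, 1
  ret i32 %r
}
```

Note that `@tentative` has a zero initializer, no explicit section, and is not marked 'constant', as the rules for common symbols require.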

      +
      @@ -629,55 +665,48 @@ or weak_odr linkages.

      LLVM functions, calls -and invokes can all have an optional calling convention -specified for the call. The calling convention of any pair of dynamic -caller/callee must match, or the behavior of the program is undefined. The -following calling conventions are supported by LLVM, and more may be added in -the future:

      + and invokes can all have an optional calling + convention specified for the call. The calling convention of any pair of + dynamic caller/callee must match, or the behavior of the program is + undefined. The following calling conventions are supported by LLVM, and more + may be added in the future:

      "ccc" - The C calling convention:
      -
      This calling convention (the default if no other calling convention is - specified) matches the target C calling conventions. This calling convention - supports varargs function calls and tolerates some mismatch in the declared - prototype and implemented declaration of the function (as does normal C). -
      + specified) matches the target C calling conventions. This calling + convention supports varargs function calls and tolerates some mismatch in + the declared prototype and implemented declaration of the function (as + does normal C).
      "fastcc" - The fast calling convention:
      -
      This calling convention attempts to make calls as fast as possible - (e.g. by passing things in registers). This calling convention allows the - target to use whatever tricks it wants to produce fast code for the target, - without having to conform to an externally specified ABI (Application Binary - Interface). Implementations of this convention should allow arbitrary - tail call optimization to be - supported. This calling convention does not support varargs and requires the - prototype of all callees to exactly match the prototype of the function - definition. -
      + (e.g. by passing things in registers). This calling convention allows the + target to use whatever tricks it wants to produce fast code for the + target, without having to conform to an externally specified ABI + (Application Binary Interface). + Tail calls can only be optimized + when this convention is used. This calling convention does not + support varargs and requires the prototype of all callees to exactly match + the prototype of the function definition.
      "coldcc" - The cold calling convention:
      -
      This calling convention attempts to make code in the caller as efficient - as possible under the assumption that the call is not commonly executed. As - such, these calls often preserve all registers so that the call does not break - any live ranges in the caller side. This calling convention does not support - varargs and requires the prototype of all callees to exactly match the - prototype of the function definition. -
      + as possible under the assumption that the call is not commonly executed. + As such, these calls often preserve all registers so that the call does + not break any live ranges in the caller side. This calling convention + does not support varargs and requires the prototype of all callees to + exactly match the prototype of the function definition.
      "cc <n>" - Numbered convention:
      -
      Any calling convention may be specified by number, allowing - target-specific calling conventions to be used. Target specific calling - conventions start at 64. -
      + target-specific calling conventions to be used. Target specific calling + conventions start at 64.

      More calling conventions can be added/defined on an as-needed basis, to -support pascal conventions or any other well-known target-independent -convention.

      + support Pascal conventions or any other well-known target-independent + convention.
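
The matching rule for caller and callee can be sketched as follows. The function names are hypothetical, and the numbered convention is shown only for its syntax; whether a given target defines convention 64 is an assumption.

```llvm
; Callee defined with the fast calling convention.
define fastcc i32 @add_one(i32 %x) {
entry:
  %r = add i32 %x, 1
  ret i32 %r
}

; No convention written, so @caller itself uses the default "ccc".
define i32 @caller() {
entry:
  ; The call names the same convention as the callee's definition;
  ; a mismatch would make the behavior of the program undefined.
  %v = call fastcc i32 @add_one(i32 41)
  ret i32 %v
}

; A convention specified by number (target-specific ones start at 64).
declare cc 64 void @target_specific_routine()
```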

      @@ -688,37 +717,29 @@ convention.

      -

      -All Global Variables and Functions have one of the following visibility styles: -

      +

      All Global Variables and Functions have one of the following visibility + styles:

      "default" - Default style:
      -
      On targets that use the ELF object file format, default visibility means - that the declaration is visible to other - modules and, in shared libraries, means that the declared entity may be - overridden. On Darwin, default visibility means that the declaration is - visible to other modules. Default visibility corresponds to "external - linkage" in the language. -
      + that the declaration is visible to other modules and, in shared libraries, + means that the declared entity may be overridden. On Darwin, default + visibility means that the declaration is visible to other modules. Default + visibility corresponds to "external linkage" in the language.
      "hidden" - Hidden style:
      -
      Two declarations of an object with hidden visibility refer to the same - object if they are in the same shared object. Usually, hidden visibility - indicates that the symbol will not be placed into the dynamic symbol table, - so no other module (executable or shared library) can reference it - directly. -
      + object if they are in the same shared object. Usually, hidden visibility + indicates that the symbol will not be placed into the dynamic symbol + table, so no other module (executable or shared library) can reference it + directly.
      "protected" - Protected style:
      -
      On ELF, protected visibility indicates that the symbol will be placed in - the dynamic symbol table, but that references within the defining module will - bind to the local symbol. That is, the symbol cannot be overridden by another - module. -
      + the dynamic symbol table, but that references within the defining module + will bind to the local symbol. That is, the symbol cannot be overridden by + another module.
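The three styles can be sketched as follows. All symbol names are hypothetical, and the comments restate the behavior described above rather than guaranteeing it for any particular target:

```llvm
; Default visibility: no keyword is needed.
@exported = global i32 0

; Hidden: usually kept out of the dynamic symbol table.
@internal_state = hidden global i32 0

; Protected: references within the defining module bind locally.
define protected void @refresh() {
entry:
  ret void
}
```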
      @@ -731,9 +752,8 @@ All Global Variables and Functions have one of the following visibility styles:

      LLVM IR allows you to specify name aliases for certain types. This can make -it easier to read the IR and make the IR more condensed (particularly when -recursive types are involved). An example of a name specification is: -

      + it easier to read the IR and make the IR more condensed (particularly when + recursive types are involved). An example of a name specification is:

      @@ -741,19 +761,19 @@ recursive types are involved).  An example of a name specification is:
       
      -

      You may give a name to any type except "void". Type name aliases may be used anywhere a type is -expected with the syntax "%mytype".

      +

      You may give a name to any type except + "void". Type name aliases may be used anywhere a type + is expected with the syntax "%mytype".

      Note that type names are aliases for the structural type that they indicate, -and that you can therefore specify multiple names for the same type. This often -leads to confusing behavior when dumping out a .ll file. Since LLVM IR uses -structural typing, the name is not part of the type. When printing out LLVM IR, -the printer will pick one name to render all types of a particular -shape. This means that if you have code where two different source types end up -having the same LLVM type, that the dumper will sometimes print the "wrong" or -unexpected type. This is an important design point and isn't going to -change.

      + and that you can therefore specify multiple names for the same type. This + often leads to confusing behavior when dumping out a .ll file. Since LLVM IR + uses structural typing, the name is not part of the type. When printing out + LLVM IR, the printer will pick one name to render all types of a + particular shape. This means that if you have code where two different + source types end up having the same LLVM type, the dumper will sometimes + print the "wrong" or unexpected type. This is an important design point and + isn't going to change.
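A minimal sketch of the situation described above, assuming the structural typing this section documents (the names %pair, %point, @a and @b are hypothetical):

```llvm
; Two names for the same structural type { i32, i32 }.
%pair  = type { i32, i32 }
%point = type { i32, i32 }

; Both globals have the same type, so a printer that picks a single name
; for this shape may render @b as %pair (or @a as %point).
@a = global %pair zeroinitializer
@b = global %point zeroinitializer
```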

      @@ -765,48 +785,47 @@ change.

      Global variables define regions of memory allocated at compilation time -instead of run-time. Global variables may optionally be initialized, may have -an explicit section to be placed in, and may have an optional explicit alignment -specified. A variable may be defined as "thread_local", which means that it -will not be shared by threads (each thread will have a separated copy of the -variable). A variable may be defined as a global "constant," which indicates -that the contents of the variable will never be modified (enabling better -optimization, allowing the global data to be placed in the read-only section of -an executable, etc). Note that variables that need runtime initialization -cannot be marked "constant" as there is a store to the variable.

      - -

      -LLVM explicitly allows declarations of global variables to be marked -constant, even if the final definition of the global is not. This capability -can be used to enable slightly better optimization of the program, but requires -the language definition to guarantee that optimizations based on the -'constantness' are valid for the translation units that do not include the -definition. -

      - -

      As SSA values, global variables define pointer values that are in -scope (i.e. they dominate) all basic blocks in the program. Global -variables always define a pointer to their "content" type because they -describe a region of memory, and all memory objects in LLVM are -accessed through pointers.

      - -

      A global variable may be declared to reside in a target-specifc numbered -address space. For targets that support them, address spaces may affect how -optimizations are performed and/or what target instructions are used to access -the variable. The default address space is zero. The address space qualifier -must precede any other attributes.

      + instead of run-time. Global variables may optionally be initialized, may + have an explicit section to be placed in, and may have an optional explicit + alignment specified. A variable may be defined as "thread_local", which + means that it will not be shared by threads (each thread will have a + separate copy of the variable). A variable may be defined as a global + "constant," which indicates that the contents of the variable + will never be modified (enabling better optimization, allowing the + global data to be placed in the read-only section of an executable, etc). + Note that variables that need runtime initialization cannot be marked + "constant" as there is a store to the variable.

      + +

      LLVM explicitly allows declarations of global variables to be marked + constant, even if the final definition of the global is not. This capability + can be used to enable slightly better optimization of the program, but + requires the language definition to guarantee that optimizations based on the + 'constantness' are valid for the translation units that do not include the + definition.
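The markers discussed above can be sketched as follows; every global name here is hypothetical:

```llvm
; Each thread gets its own copy of this variable.
@counter = thread_local global i32 0

; The contents are never modified, so the data may be placed in a read-only section.
@table = constant [3 x i32] [i32 1, i32 2, i32 3]

; A declaration marked constant; the definition lives in another module.
@limit = external constant i32
```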

      + +

      As SSA values, global variables define pointer values that are in scope + (i.e. they dominate) all basic blocks in the program. Global variables + always define a pointer to their "content" type because they describe a + region of memory, and all memory objects in LLVM are accessed through + pointers.

      + +

      A global variable may be declared to reside in a target-specific numbered + address space. For targets that support them, address spaces may affect how + optimizations are performed and/or what target instructions are used to + access the variable. The default address space is zero. The address space + qualifier must precede any other attributes.

      LLVM allows an explicit section to be specified for globals. If the target -supports it, it will emit globals to the section specified.

      + supports it, it will emit globals to the section specified.

      An explicit alignment may be specified for a global. If not present, or if -the alignment is set to zero, the alignment of the global is set by the target -to whatever it feels convenient. If an explicit alignment is specified, the -global is forced to have at least that much alignment. All alignments must be -a power of 2.

      + the alignment is set to zero, the alignment of the global is set by the + target to whatever it feels convenient. If an explicit alignment is + specified, the global is forced to have at least that much alignment. All + alignments must be a power of 2.

      -

      For example, the following defines a global in a numbered address space with -an initializer, section, and alignment:

      +

      For example, the following defines a global in a numbered address space with + an initializer, section, and alignment:

      @@ -824,74 +843,72 @@ an initializer, section, and alignment:

      -

      LLVM function definitions consist of the "define" keyord, -an optional linkage type, an optional -visibility style, an optional -calling convention, a return type, an optional -parameter attribute for the return type, a function -name, a (possibly empty) argument list (each with optional -parameter attributes), optional -function attributes, an optional section, -an optional alignment, an optional garbage collector name, -an opening curly brace, a list of basic blocks, and a closing curly brace. +

      LLVM function definitions consist of the "define" keyword, an + optional linkage type, an optional + visibility style, an optional + calling convention, a return type, an optional + parameter attribute for the return type, a function + name, a (possibly empty) argument list (each with optional + parameter attributes), optional + function attributes, an optional section, an optional + alignment, an optional garbage collector name, an opening + curly brace, a list of basic blocks, and a closing curly brace.

      -LLVM function declarations consist of the "declare" keyword, an -optional linkage type, an optional -visibility style, an optional -calling convention, a return type, an optional -parameter attribute for the return type, a function -name, a possibly empty list of arguments, an optional alignment, and an optional -garbage collector name.

      +

      LLVM function declarations consist of the "declare" keyword, an + optional linkage type, an optional + visibility style, an optional + calling convention, a return type, an optional + parameter attribute for the return type, a function + name, a possibly empty list of arguments, an optional alignment, and an + optional garbage collector name.

      A function definition contains a list of basic blocks, forming the CFG -(Control Flow Graph) for -the function. Each basic block may optionally start with a label (giving the -basic block a symbol table entry), contains a list of instructions, and ends -with a terminator instruction (such as a branch or -function return).

      + (Control Flow Graph) for the function. Each basic block may optionally start + with a label (giving the basic block a symbol table entry), contains a list + of instructions, and ends with a terminator + instruction (such as a branch or function return).

      The first basic block in a function is special in two ways: it is immediately -executed on entrance to the function, and it is not allowed to have predecessor -basic blocks (i.e. there can not be any branches to the entry block of a -function). Because the block can have no predecessors, it also cannot have any -PHI nodes.

      + executed on entrance to the function, and it is not allowed to have + predecessor basic blocks (i.e. there can not be any branches to the entry + block of a function). Because the block can have no predecessors, it also + cannot have any PHI nodes.
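To illustrate the rules for basic blocks, here is a small sketch with a hypothetical function @abs. The entry block has no predecessors and therefore no PHI node; the PHI node appears only in a block that is reached by branches:

```llvm
define i32 @abs(i32 %x) {
entry:                          ; executed on entrance; nothing branches here
  %neg = icmp slt i32 %x, 0
  br i1 %neg, label %negate, label %done

negate:
  %n = sub i32 0, %x
  br label %done

done:                           ; two predecessors, so a PHI node is allowed
  %r = phi i32 [ %x, %entry ], [ %n, %negate ]
  ret i32 %r
}
```

Each block ends with a terminator instruction (br or ret), as described above.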

      LLVM allows an explicit section to be specified for functions. If the target -supports it, it will emit functions to the section specified.

      + supports it, it will emit functions to the section specified.

      An explicit alignment may be specified for a function. If not present, or if -the alignment is set to zero, the alignment of the function is set by the target -to whatever it feels convenient. If an explicit alignment is specified, the -function is forced to have at least that much alignment. All alignments must be -a power of 2.

      - -
      Syntax:
      + the alignment is set to zero, the alignment of the function is set by the + target to whatever it feels convenient. If an explicit alignment is + specified, the function is forced to have at least that much alignment. All + alignments must be a power of 2.

      +
      Syntax:
      - +
       define [linkage] [visibility]
      -      [cconv] [ret attrs]
      -      <ResultType> @<FunctionName> ([argument list])
      -      [fn Attrs] [section "name"] [align N]
      -      [gc] { ... }
      -
      +       [cconv] [ret attrs]
      +       <ResultType> @<FunctionName> ([argument list])
      +       [fn Attrs] [section "name"] [align N]
      +       [gc] { ... }
      +
      - +
      -

      Aliases act as "second name" for the aliasee value (which can be either - function, global variable, another alias or bitcast of global value). Aliases - may have an optional linkage type, and an - optional visibility style.

      -
      Syntax:
      +

      Aliases act as a "second name" for the aliasee value (which can be either a + function, a global variable, another alias, or a bitcast of a global value). Aliases + may have an optional linkage type, and an + optional visibility style.

      +
      Syntax:
       @<Name> = alias [Linkage] [Visibility] <AliaseeTy> @<Aliasee>
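A sketch written against the syntax line above, with hypothetical names. It assumes the aliasee type is written as a pointer type, which is how this revision of the syntax reads; other revisions of the assembly syntax may differ:

```llvm
@counter = global i32 0

define i32 @f(i32 %x) {
entry:
  ret i32 %x
}

; A second name for a global variable.
@counter_alias = alias i32* @counter

; A second name for a function, with an explicit linkage type.
@f_weak = alias weak i32 (i32)* @f
```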
      @@ -900,21 +917,42 @@ define [linkage] [visibility]
       
       
      + + + +
      + +

      Named metadata is a collection of metadata. Metadata + nodes (but not metadata strings) and null are the only valid operands for + a named metadata.

      + +
      Syntax:
      +
      +
      +!1 = metadata !{metadata !"one"}
      +!name = !{null, !1}
      +
      +
      +
      +
      -

      The return type and each parameter of a function type may have a set of - parameter attributes associated with them. Parameter attributes are - used to communicate additional information about the result or parameters of - a function. Parameter attributes are considered to be part of the function, - not of the function type, so functions with different parameter attributes - can have the same function type.

      -

      Parameter attributes are simple keywords that follow the type specified. If - multiple parameter attributes are needed, they are space separated. For - example:

      +

      The return type and each parameter of a function type may have a set of + parameter attributes associated with them. Parameter attributes are + used to communicate additional information about the result or parameters of + a function. Parameter attributes are considered to be part of the function, + not of the function type, so functions with different parameter attributes + can have the same function type.

      + +

      Parameter attributes are simple keywords that follow the type specified. If + multiple parameter attributes are needed, they are space separated. For + example:

      @@ -924,71 +962,72 @@ declare signext i8 @returns_signed_char()
       
      -

      Note that any attributes for the function result (nounwind, - readonly) come immediately after the argument list.

      - -

      Currently, only the following parameter attributes are defined:

      -
      -
      zeroext
      -
      This indicates to the code generator that the parameter or return value - should be zero-extended to a 32-bit value by the caller (for a parameter) - or the callee (for a return value).
      - -
      signext
      -
      This indicates to the code generator that the parameter or return value - should be sign-extended to a 32-bit value by the caller (for a parameter) - or the callee (for a return value).
      - -
      inreg
      -
      This indicates that this parameter or return value should be treated - in a special target-dependent fashion during while emitting code for a - function call or return (usually, by putting it in a register as opposed - to memory, though some targets use it to distinguish between two different - kinds of registers). Use of this attribute is target-specific.
      - -
      byval
      -
      This indicates that the pointer parameter should really be passed by - value to the function. The attribute implies that a hidden copy of the - pointee is made between the caller and the callee, so the callee is unable - to modify the value in the callee. This attribute is only valid on LLVM - pointer arguments. It is generally used to pass structs and arrays by - value, but is also valid on pointers to scalars. The copy is considered to - belong to the caller not the callee (for example, - readonly functions should not write to - byval parameters). This is not a valid attribute for return - values. The byval attribute also supports specifying an alignment with the - align attribute. This has a target-specific effect on the code generator - that usually indicates a desired alignment for the synthesized stack - slot.
      - -
      sret
      -
      This indicates that the pointer parameter specifies the address of a - structure that is the return value of the function in the source program. - This pointer must be guaranteed by the caller to be valid: loads and stores - to the structure may be assumed by the callee to not to trap. This may only - be applied to the first parameter. This is not a valid attribute for - return values.
      - -
      noalias
      -
      This indicates that the pointer does not alias any global or any other - parameter. The caller is responsible for ensuring that this is the - case. On a function return value, noalias additionally indicates - that the pointer does not alias any other pointers visible to the - caller. For further details, please see the discussion of the NoAlias - response in - alias - analysis.
      - -
      nocapture
      -
      This indicates that the callee does not make any copies of the pointer - that outlive the callee itself. This is not a valid attribute for return - values.
      - -
      nest
      -
      This indicates that the pointer parameter can be excised using the - trampoline intrinsics. This is not a valid - attribute for return values.
      -
      +

      Note that any attributes for the function result (nounwind, + readonly) come immediately after the argument list.

      + +

      Currently, only the following parameter attributes are defined:

      + +
      +
      zeroext
      +
      This indicates to the code generator that the parameter or return value + should be zero-extended to a 32-bit value by the caller (for a parameter) + or the callee (for a return value).
      + +
      signext
      +
      This indicates to the code generator that the parameter or return value + should be sign-extended to a 32-bit value by the caller (for a parameter) + or the callee (for a return value).
      + +
      inreg
      +
      This indicates that this parameter or return value should be treated in a + special target-dependent fashion while emitting code for a function + call or return (usually, by putting it in a register as opposed to memory, + though some targets use it to distinguish between two different kinds of + registers). Use of this attribute is target-specific.
      + +
      byval
      +
      This indicates that the pointer parameter should really be passed by value + to the function. The attribute implies that a hidden copy of the pointee + is made between the caller and the callee, so the callee is unable to + modify the value in the caller. This attribute is only valid on LLVM + pointer arguments. It is generally used to pass structs and arrays by + value, but is also valid on pointers to scalars. The copy is considered + to belong to the caller not the callee (for example, + readonly functions should not write to + byval parameters). This is not a valid attribute for return + values. The byval attribute also supports specifying an alignment with + the align attribute. This has a target-specific effect on the code + generator that usually indicates a desired alignment for the synthesized + stack slot.
      + +
      sret
      +
      This indicates that the pointer parameter specifies the address of a + structure that is the return value of the function in the source program. + This pointer must be guaranteed by the caller to be valid: loads and + stores to the structure may be assumed by the callee not to trap. This + may only be applied to the first parameter. This is not a valid attribute + for return values.
      + +
      noalias
      +
      This indicates that the pointer does not alias any global or any other + parameter. The caller is responsible for ensuring that this is the + case. On a function return value, noalias additionally indicates + that the pointer does not alias any other pointers visible to the + caller. For further details, please see the discussion of the NoAlias + response in + alias + analysis.
      + +
      nocapture
      +
      This indicates that the callee does not make any copies of the pointer + that outlive the callee itself. This is not a valid attribute for return + values.
      + +
      nest
      +
      This indicates that the pointer parameter can be excised using the + trampoline intrinsics. This is not a valid + attribute for return values.
      +
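The pointer-related attributes above might be combined as in the following sketch. The type and function names are hypothetical, and the declarations use the pointer syntax of this revision of the document:

```llvm
%struct.point = type { i32, i32 }

; sret: the first parameter is the address of the returned structure.
; noalias: the caller guarantees %out aliases no global or other parameter.
declare void @make_point(%struct.point* noalias sret %out, i32 %x, i32 %y)

; byval: the callee receives a hidden copy; align requests a stack slot alignment.
declare i32 @sum_point(%struct.point* byval align 4 %p)

; nocapture: the callee keeps no copy of %s that outlives the call.
declare i32 @length(i8* nocapture %s)
```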
      @@ -998,15 +1037,20 @@ declare signext i8 @returns_signed_char()
      +

      Each function may specify a garbage collector name, which is simply a -string.

      + string:

      -
      define void @f() gc "name" { ...
      +
      +
      +define void @f() gc "name" { ... }
      +
      +

      The compiler declares the supported values of name. Specifying a -collector which will cause the compiler to alter its output in order to support -the named garbage collection algorithm.

      + collector will cause the compiler to alter its output in order to + support the named garbage collection algorithm.

      +
      @@ -1016,90 +1060,107 @@ the named garbage collection algorithm.

      -

      Function attributes are set to communicate additional information about - a function. Function attributes are considered to be part of the function, - not of the function type, so functions with different parameter attributes - can have the same function type.

      +

      Function attributes are set to communicate additional information about a + function. Function attributes are considered to be part of the function, not + of the function type, so functions with different function attributes can + have the same function type.

      -

      Function attributes are simple keywords that follow the type specified. If - multiple attributes are needed, they are space separated. For - example:

      +

      Function attributes are simple keywords that follow the type specified. If + multiple attributes are needed, they are space separated. For example:

       define void @f() noinline { ... }
       define void @f() alwaysinline { ... }
       define void @f() alwaysinline optsize { ... }
      -define void @f() optsize
      +define void @f() optsize { ... }
       
      -
      alwaysinline
      -
      This attribute indicates that the inliner should attempt to inline this -function into callers whenever possible, ignoring any active inlining size -threshold for this caller.
      - -
      noinline
      -
      This attribute indicates that the inliner should never inline this function -in any situation. This attribute may not be used together with the -alwaysinline attribute.
      - -
      optsize
      -
      This attribute suggests that optimization passes and code generator passes -make choices that keep the code size of this function low, and otherwise do -optimizations specifically to reduce code size.
      - -
      noreturn
      -
      This function attribute indicates that the function never returns normally. -This produces undefined behavior at runtime if the function ever does -dynamically return.
      - -
      nounwind
      -
      This function attribute indicates that the function never returns with an -unwind or exceptional control flow. If the function does unwind, its runtime -behavior is undefined.
      - -
      readnone
      -
      This attribute indicates that the function computes its result (or decides to -unwind an exception) based strictly on its arguments, without dereferencing any -pointer arguments or otherwise accessing any mutable state (e.g. memory, control -registers, etc) visible to caller functions. It does not write through any -pointer arguments (including byval arguments) and -never changes any state visible to callers. This means that it cannot unwind -exceptions by calling the C++ exception throwing methods, but could -use the unwind instruction.
      - -
      readonly
      -
      This attribute indicates that the function does not write through any -pointer arguments (including byval arguments) -or otherwise modify any state (e.g. memory, control registers, etc) visible to -caller functions. It may dereference pointer arguments and read state that may -be set in the caller. A readonly function always returns the same value (or -unwinds an exception identically) when called with the same set of arguments -and global state. It cannot unwind an exception by calling the C++ -exception throwing methods, but may use the unwind instruction.
      - -
      ssp
      -
      This attribute indicates that the function should emit a stack smashing -protector. It is in the form of a "canary"—a random value placed on the -stack before the local variables that's checked upon return from the function to -see if it has been overwritten. A heuristic is used to determine if a function -needs stack protectors or not. - -

      If a function that has an ssp attribute is inlined into a function -that doesn't have an ssp attribute, then the resulting function will -have an ssp attribute.

      - -
      sspreq
      -
      This attribute indicates that the function should always emit a -stack smashing protector. This overrides the ssp -function attribute. - -

      If a function that has an sspreq attribute is inlined into a -function that doesn't have an sspreq attribute or which has -an ssp attribute, then the resulting function will have -an sspreq attribute.

      +
      alwaysinline
      +
      This attribute indicates that the inliner should attempt to inline this + function into callers whenever possible, ignoring any active inlining size + threshold for this caller.
      + +
      inlinehint
      +
      This attribute indicates that the source code contained a hint that inlining + this function is desirable (such as the "inline" keyword in C/C++). It + is just a hint; it imposes no requirements on the inliner.
      + +
      noinline
      +
      This attribute indicates that the inliner should never inline this + function in any situation. This attribute may not be used together with + the alwaysinline attribute.
      + +
      optsize
      +
      This attribute suggests that optimization passes and code generator passes + make choices that keep the code size of this function low, and otherwise + do optimizations specifically to reduce code size.
      + +
      noreturn
      +
      This function attribute indicates that the function never returns + normally. This produces undefined behavior at runtime if the function + ever does dynamically return.
      + +
      nounwind
      +
      This function attribute indicates that the function never returns with an + unwind or exceptional control flow. If the function does unwind, its + runtime behavior is undefined.
      + +
      readnone
      +
      This attribute indicates that the function computes its result (or decides + to unwind an exception) based strictly on its arguments, without + dereferencing any pointer arguments or otherwise accessing any mutable + state (e.g. memory, control registers, etc) visible to caller functions. + It does not write through any pointer arguments + (including byval arguments) and never + changes any state visible to callers. This means that it cannot unwind + exceptions by calling the C++ exception throwing methods, but + could use the unwind instruction.
      + +
      readonly
      +
      This attribute indicates that the function does not write through any + pointer arguments (including byval + arguments) or otherwise modify any state (e.g. memory, control registers, + etc) visible to caller functions. It may dereference pointer arguments + and read state that may be set in the caller. A readonly function always + returns the same value (or unwinds an exception identically) when called + with the same set of arguments and global state. It cannot unwind an + exception by calling the C++ exception throwing methods, but may + use the unwind instruction.
      + +
      ssp
      +
      This attribute indicates that the function should emit a stack smashing + protector. It is in the form of a "canary"—a random value placed on + the stack before the local variables that's checked upon return from the + function to see if it has been overwritten. A heuristic is used to + determine if a function needs stack protectors or not.
      +
      + If a function that has an ssp attribute is inlined into a + function that doesn't have an ssp attribute, then the resulting + function will have an ssp attribute.
      + +
      sspreq
      +
      This attribute indicates that the function should always emit a + stack smashing protector. This overrides + the ssp function attribute.
      +
      + If a function that has an sspreq attribute is inlined into a + function that doesn't have an sspreq attribute or which has + an ssp attribute, then the resulting function will have + an sspreq attribute.
      + +
      noredzone
      +
      This attribute indicates that the code generator should not use a red + zone, even if the target-specific ABI normally permits it.
      + +
      noimplicitfloat
      +
      This attribute disables implicit floating point instructions.
      + +
      naked
      +
      This attribute disables prologue / epilogue emission for the function. + This can have very system-specific consequences.
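A few of the attributes above, sketched on hypothetical functions. The attributes follow the argument list and are space separated:

```llvm
; Result depends only on the argument; no memory is read or written.
declare i32 @square(i32) nounwind readnone

; Never returns normally and never unwinds.
declare void @fatal() noreturn nounwind

; No red zone and no implicit floating point instructions in this body.
define void @handler() noredzone noimplicitfloat {
entry:
  ret void
}
```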
      @@ -1110,12 +1171,11 @@ an sspreq attribute.

      -

      -Modules may contain "module-level inline asm" blocks, which corresponds to the -GCC "file scope inline asm" blocks. These blocks are internally concatenated by -LLVM and treated as a single unit, but may be separated in the .ll file if -desired. The syntax is very simple: -

      + +

      Modules may contain "module-level inline asm" blocks, which correspond to + the GCC "file scope inline asm" blocks. These blocks are internally + concatenated by LLVM and treated as a single unit, but may be separated in + the .ll file if desired. The syntax is very simple:

      @@ -1126,13 +1186,11 @@ module asm "more can go here"
       
       

      The strings can contain any character by escaping non-printable characters. The escape sequence used is simply "\xx" where "xx" is the two digit hex code - for the number. -

      + for the number.
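As a sketch of the escape sequence (the assembler directives and the symbol name are illustrative and target-dependent), "\09" encodes a tab character and "\22" encodes a double quote:

```llvm
; "\09" is the two-digit hex escape for a tab.
module asm "\09.globl marker"

; "\22" is the two-digit hex escape for a double quote inside the string.
module asm ".ascii \22hello\22"
```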

      + +

      The inline asm code is simply printed to the machine code .s file when + assembly code is generated.

      -

      - The inline asm code is simply printed to the machine code .s file when - assembly code is generated. -

      @@ -1141,43 +1199,72 @@ module asm "more can go here"
      +

      A module may specify a target specific data layout string that specifies how -data is to be laid out in memory. The syntax for the data layout is simply:

      -
          target datalayout = "layout specification"
      -

      The layout specification consists of a list of specifications -separated by the minus sign character ('-'). Each specification starts with a -letter and may include other information after the letter to define some -aspect of the data layout. The specifications accepted are as follows:

      + data is to be laid out in memory. The syntax for the data layout is + simply:

      + +
      +
      +target datalayout = "layout specification"
      +
      +
      + +

      The layout specification consists of a list of specifications + separated by the minus sign character ('-'). Each specification starts with + a letter and may include other information after the letter to define some + aspect of the data layout. The specifications accepted are as follows:

      +
      E
      Specifies that the target lays out data in big-endian form. That is, the - bits with the most significance have the lowest address location.
      + bits with the most significance have the lowest address location. +
      e
      Specifies that the target lays out data in little-endian form. That is, - the bits with the least significance have the lowest address location.
      + the bits with the least significance have the lowest address + location. +
      p:size:abi:pref
      -
      This specifies the size of a pointer and its abi and - preferred alignments. All sizes are in bits. Specifying the pref - alignment is optional. If omitted, the preceding : should be omitted - too.
      +
      This specifies the size of a pointer and its abi and + preferred alignments. All sizes are in bits. Specifying + the pref alignment is optional. If omitted, the + preceding : should be omitted too.
      +
      isize:abi:pref
      This specifies the alignment for an integer type of a given bit - size. The value of size must be in the range [1,2^23).
      + size. The value of size must be in the range [1,2^23). +
      vsize:abi:pref
      -
      This specifies the alignment for a vector type of a given bit - size.
      +
      This specifies the alignment for a vector type of a given bit + size.
      +
      fsize:abi:pref
      -
      This specifies the alignment for a floating point type of a given bit - size. The value of size must be either 32 (float) or 64 - (double).
      +
      This specifies the alignment for a floating point type of a given bit + size. The value of size must be either 32 (float) or 64 + (double).
      +
      asize:abi:pref
      This specifies the alignment for an aggregate type of a given bit - size.
      + size. + +
      ssize:abi:pref
      +
      This specifies the alignment for a stack object of a given bit + size.
      + +
      nsize1:size2:size3...
      +
      This specifies a set of native integer widths for the target CPU + in bits. For example, it might contain "n32" for 32-bit PowerPC, + "n32:64" for PowerPC 64, or "n8:16:32:64" for X86-64. Elements of + this set are considered to support most general arithmetic + operations efficiently.
      +
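As a rough sketch of how a layout specification string breaks down, the following splits it on '-' and interprets each specification by its leading letter. The function name parse_datalayout and the shape of the returned dictionary are hypothetical, and falling back to the abi alignment when pref is omitted is an assumption; the text only says that pref is optional.

```python
def parse_datalayout(layout):
    """Parse a layout specification such as "E-p:32:64:64-i8:8:8-n32:64"."""
    result = {"endian": None, "pointer": None, "align": {}, "native_ints": []}
    for spec in layout.split("-"):
        if not spec:
            continue
        kind, rest = spec[0], spec[1:]
        if kind == "E":
            result["endian"] = "big"
        elif kind == "e":
            result["endian"] = "little"
        elif kind == "p":
            # p:size:abi[:pref], all in bits
            fields = [int(x) for x in rest.lstrip(":").split(":")]
            size, abi = fields[0], fields[1]
            pref = fields[2] if len(fields) > 2 else abi  # assumption
            result["pointer"] = (size, abi, pref)
        elif kind in "ivfas":
            # <letter>size:abi[:pref]
            fields = [int(x) for x in rest.split(":")]
            size, abi = fields[0], fields[1]
            pref = fields[2] if len(fields) > 2 else abi  # assumption
            result["align"][(kind, size)] = (abi, pref)
        elif kind == "n":
            # nsize1:size2:... lists the native integer widths
            result["native_ints"] = [int(x) for x in rest.split(":")]
        else:
            raise ValueError("unknown specification: %r" % spec)
    return result


if __name__ == "__main__":
    print(parse_datalayout("E-p:32:64:64-i8:8:8-a0:0:1-n32:64"))
```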

      When constructing the data layout for a given target, LLVM starts with a -default set of specifications which are then (possibly) overriden by the -specifications in the datalayout keyword. The default specifications -are given in this list:

      + default set of specifications which are then (possibly) overridden by the + specifications in the datalayout keyword. The default specifications + are given in this list:

      +
      • E - big endian
      • p:32:64:64 - 32-bit pointers with 64-bit alignment
      • @@ -1192,23 +1279,82 @@ are given in this list:

      • v64:64:64 - 64-bit vector is 64-bit aligned
      • v128:128:128 - 128-bit vector is 128-bit aligned
      • a0:0:1 - aggregates are 8-bit aligned
      • +
      • s0:64:64 - stack objects are 64-bit aligned
      -

      When LLVM is determining the alignment for a given type, it uses the -following rules:

      + +

      When LLVM is determining the alignment for a given type, it uses the + following rules:

      +
      1. If the type sought is an exact match for one of the specifications, that - specification is used.
      2. + specification is used. +
      3. If no match is found, and the type sought is an integer type, then the - smallest integer type that is larger than the bitwidth of the sought type is - used. If none of the specifications are larger than the bitwidth then the the - largest integer type is used. For example, given the default specifications - above, the i7 type will use the alignment of i8 (next largest) while both - i65 and i256 will use the alignment of i64 (largest specified).
      + smallest integer type that is larger than the bitwidth of the sought type + is used. If none of the specifications are larger than the bitwidth then + the largest integer type is used. For example, given the default + specifications above, the i7 type will use the alignment of i8 (next + largest) while both i65 and i256 will use the alignment of i64 (largest + specified). +
      5. If no match is found, and the type sought is a vector type, then the - largest vector type that is smaller than the sought vector type will be used - as a fall back. This happens because <128 x double> can be implemented - in terms of 64 <2 x double>, for example.
      6. + largest vector type that is smaller than the sought vector type will be + used as a fall back. This happens because <128 x double> can be + implemented in terms of 64 <2 x double>, for example.
      + +
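The integer case of the rules above can be sketched like this. The function name integer_alignment and the table ILLUSTRATIVE_SPECS are hypothetical; the table is only an example set of integer specifications and is not claimed to be LLVM's default list.

```python
def integer_alignment(bits, specs):
    """Pick the (abi, pref) alignment for an integer type of width `bits`.

    `specs` maps a specified bit size to its (abi, pref) alignments.
    """
    # Rule 1: an exact match is used directly.
    if bits in specs:
        return specs[bits]
    # Rule 2: otherwise use the smallest specified size that is larger.
    larger = [size for size in specs if size > bits]
    if larger:
        return specs[min(larger)]
    # Rule 2, continued: nothing larger, so use the largest specified size.
    return specs[max(specs)]


# Illustrative table only (an assumption, not taken from the text).
ILLUSTRATIVE_SPECS = {8: (8, 8), 16: (16, 16), 32: (32, 32), 64: (32, 64)}

if __name__ == "__main__":
    for width in (7, 32, 65, 256):
        print(width, integer_alignment(width, ILLUSTRATIVE_SPECS))
```

With this table, i7 resolves to the i8 entry, while i65 and i256 both resolve to the i64 entry, matching the example in the rule.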
      + + + + +
      + +

      Any memory access must be done through a pointer value associated +with an address range of the memory access, otherwise the behavior +is undefined. Pointer values are associated with address ranges +according to the following rules:

      + +
        +
      • A pointer value formed from a + getelementptr instruction + is associated with the addresses associated with the first operand + of the getelementptr.
      • +
      • An address of a global variable is associated with the address + range of the variable's storage.
      • +
      • The result value of an allocation instruction is associated with + the address range of the allocated storage.
      • +
      • A null pointer in the default address-space is associated with + no address.
      • +
      • A pointer value formed by an + inttoptr is associated with all + address ranges of all pointer values that contribute (directly or + indirectly) to the computation of the pointer's value.
      • +
      • The result value of a + bitcast is associated with all + addresses associated with the operand of the bitcast.
      • +
      • An integer constant other than zero or a pointer value returned + from a function not defined within LLVM may be associated with address + ranges allocated through mechanisms other than those provided by + LLVM. Such ranges shall not overlap with any ranges of addresses + allocated by mechanisms provided by LLVM.
      • +
      + +

      LLVM IR does not associate types with memory. The result type of a +load merely indicates the size and +alignment of the memory from which to load, as well as the +interpretation of the value. The first operand of a +store similarly only indicates the size +and alignment of the store.

      + +

      Consequently, type-based alias analysis, aka TBAA, aka +-fstrict-aliasing, is not applicable to general unadorned +LLVM IR. Metadata may be used to encode +additional information which specialized optimization passes may use +to implement type-based alias analysis.

      +
      @@ -1218,22 +1364,22 @@ following rules:

      The LLVM type system is one of the most important features of the -intermediate representation. Being typed enables a number of -optimizations to be performed on the intermediate representation directly, -without having to do -extra analyses on the side before the transformation. A strong type -system makes it easier to read the generated code and enables novel -analyses and transformations that are not feasible to perform on normal -three address code representations.

      + intermediate representation. Being typed enables a number of optimizations + to be performed on the intermediate representation directly, without having + to do extra analyses on the side before the transformation. A strong type + system makes it easier to read the generated code and enables novel analyses + and transformations that are not feasible to perform on normal three address + code representations.

      +
      -

      The types fall into a few useful -classifications:

      + +

      The types fall into a few useful classifications:

      @@ -1254,14 +1400,16 @@ classifications:

      vector, structure, array, - label. + label, + metadata. + floating point, + metadata. @@ -1278,18 +1426,55 @@ classifications:

      primitive label, void, - floating point.
      derived
      -

      The first class types are perhaps the -most important. Values of these types are the only ones which can be -produced by instructions, passed as arguments, or used as operands to -instructions.

      +

      The first class types are perhaps the most + important. Values of these types are the only ones which can be produced by + instructions.

      +
      +

      The primitive types are the fundamental building blocks of the LLVM -system.

      + system.

      + +
      + + + + +
      + +
      Overview:
      +

      The integer type is a very simple type that simply specifies an arbitrary + bit width for the integer type desired. Any bit width from 1 bit to + 2^23-1 (about 8 million) can be specified.

      + +
      Syntax:
      +
      +  iN
      +
      + +

      The number of bits the integer will occupy is specified by the N + value.

      + +
      Examples:
      + + + + + + + + + + + + + +
      i1a single-bit integer.
      i32a 32-bit integer.
      i1942652a really big integer of over 1 million bits.
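A small sketch of the width limit described above, assuming a hypothetical helper integer_type_width that is not part of LLVM. It accepts the iN spelling and checks that N lies between 1 and 2^23-1.

```python
import re

MAX_INT_BITS = 2**23 - 1  # upper bound stated in the overview


def integer_type_width(name):
    """Return N for a valid integer type name "iN", or None otherwise."""
    match = re.fullmatch(r"i([1-9][0-9]*)", name)
    if match is None:
        return None
    width = int(match.group(1))
    return width if width <= MAX_INT_BITS else None


if __name__ == "__main__":
    for name in ("i1", "i32", "i1942652", "i0", "i8388608"):
        print(name, integer_type_width(name))
```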
      @@ -1297,99 +1482,79 @@ system.

      - - - - - - - - - -
      TypeDescription
      float32-bit floating point value
      double64-bit floating point value
      fp128128-bit floating point value (112-bit mantissa)
      x86_fp8080-bit floating point value (X87)
      ppc_fp128128-bit floating point value (two 64-bits)
      + + + + + + + + + + +
      TypeDescription
      float32-bit floating point value
      double64-bit floating point value
      fp128128-bit floating point value (112-bit mantissa)
      x86_fp8080-bit floating point value (X87)
      ppc_fp128128-bit floating point value (two 64-bits)
      +
      +
      Overview:

      The void type does not represent any value and has no size.

      Syntax:
      -
         void
       
      +
      +
      Overview:

      The label type represents code labels.

      Syntax:
      -
         label
       
      -
      - - - - - -
      - -

      The real power in LLVM comes from the derived types in the system. -This is what allows a programmer to represent arrays, functions, -pointers, and other useful types. Note that these derived types may be -recursive: For example, it is possible to have a two dimensional array.

      - +
      Overview:
      -

      The integer type is a very simple derived type that simply specifies an -arbitrary bit width for the integer type desired. Any bit width from 1 bit to -2^23-1 (about 8 million) can be specified.

      +

      The metadata type represents embedded metadata. No derived types may be + created from metadata except for function + arguments.

      Syntax:
      -
      -  iN
      +  metadata
       
      -

      The number of bits the integer will occupy is specified by the N -value.

      +
      -
      Examples:
      - - - - - - - - - - - - - -
      i1a single-bit integer.
      i32a 32-bit integer.
      i1942652a really big integer of over 1 million bits.
      -

      Note that the code generator does not yet support large integer types -to be used as function return types. The specific limit on how large a -return type the code generator can currently handle is target-dependent; -currently it's often 64 bits for 32-bit targets and 128 bits for 64-bit -targets.

      + + + +
      + +

      The real power in LLVM comes from the derived types in the system. This is + what allows a programmer to represent arrays, functions, pointers, and other + useful types. Each of these types contain one or more element types which + may be a primitive type, or another derived type. For example, it is + possible to have a two dimensional array, using an array as the element type + of another array.

      @@ -1399,19 +1564,17 @@ targets.

      Overview:
      -

      The array type is a very simple derived type that arranges elements -sequentially in memory. The array type requires a size (number of -elements) and an underlying data type.

      + sequentially in memory. The array type requires a size (number of elements) + and an underlying data type.

      Syntax:
      -
         [<# elements> x <elementtype>]
       
      -

      The number of elements is a constant integer value; elementtype may -be any type with a size.

      +

      The number of elements is a constant integer value; elementtype may + be any type with a size.

      Examples:
      @@ -1444,45 +1607,39 @@ be any type with a size.

      -

      Note that 'variable sized arrays' can be implemented in LLVM with a zero -length array. Normally, accesses past the end of an array are undefined in -LLVM (e.g. it is illegal to access the 5th element of a 3 element array). -As a special case, however, zero length arrays are recognized to be variable -length. This allows implementation of 'pascal style arrays' with the LLVM -type "{ i32, [0 x float]}", for example.

      - -

      Note that the code generator does not yet support large aggregate types -to be used as function return types. The specific limit on how large an -aggregate return type the code generator can currently handle is -target-dependent, and also dependent on the aggregate element types.

      +

      There is no restriction on indexing beyond the end of the array implied by + a static type (though there are restrictions on indexing beyond the bounds + of an allocated object in some cases). This means that single-dimension + 'variable sized array' addressing can be implemented in LLVM with a zero + length array type. An implementation of 'pascal style arrays' in LLVM could + use the type "{ i32, [0 x float]}", for example.

      +
      Overview:
      - -

      The function type can be thought of as a function signature. It -consists of a return type and a list of formal parameter types. The -return type of a function type is a scalar type, a void type, or a struct type. -If the return type is a struct type then all struct elements must be of first -class types, and the struct must have at least one element.

      +

      The function type can be thought of as a function signature. It consists of + a return type and a list of formal parameter types. The return type of a + function type is a scalar type, a void type, or a struct type. If the return + type is a struct type then all struct elements must be of first class types, + and the struct must have at least one element.

      Syntax:
      -
      -  <returntype list> (<parameter list>)
      +  <returntype> (<parameter list>)
       

      ...where '<parameter list>' is a comma-separated list of type -specifiers. Optionally, the parameter list may include a type ..., -which indicates that the function takes a variable number of arguments. -Variable argument functions can access their arguments with the variable argument handling intrinsic functions. -'<returntype list>' is a comma-separated list of -first class type specifiers.

      + specifiers. Optionally, the parameter list may include a type ..., + which indicates that the function takes a variable number of arguments. + Variable argument functions can access their arguments with + the variable argument handling intrinsic + functions. '<returntype>' is any type except + label.

      Examples:
      @@ -1493,41 +1650,50 @@ Variable argument functions can access their arguments with the - - -
      float (i16 signext, i32 *) * Pointer to a function that takes - an i16 that should be sign extended and a - pointer to i32, returning + Pointer to a function that takes + an i16 that should be sign extended and a + pointer to i32, returning float.
      i32 (i8*, ...)A vararg function that takes at least one - pointer to i8 (char in C), - which returns an integer. This is the signature for printf in + A vararg function that takes at least one + pointer to i8 (char in C), + which returns an integer. This is the signature for printf in LLVM.
      {i32, i32} (i32)A function taking an i32, returning two - i32 values as an aggregate of type { i32, i32 } + A function taking an i32, returning a + structure containing two i32 values
      + +
      +
      Overview:
      -

      The structure type is used to represent a collection of data members -together in memory. The packing of the field types is defined to match -the ABI of the underlying processor. The elements of a structure may -be any type that has a size.

      -

      Structures are accessed using 'load -and 'store' by getting a pointer to a -field with the 'getelementptr' -instruction.

      +

      The structure type is used to represent a collection of data members together + in memory. The packing of the field types is defined to match the ABI of the + underlying processor. The elements of a structure may be any type that has a + size.

      + +

      Structures in memory are accessed using 'load' + and 'store' by getting a pointer to a field + with the 'getelementptr' instruction. + Structures in registers are accessed using the + 'extractvalue' and + 'insertvalue' instructions.

      Syntax:
      -
        { <type list> }
      +
      +  { <type list> }
      +
      +
      Examples:
      @@ -1542,28 +1708,29 @@ instruction.

      -

      Note that the code generator does not yet support large aggregate types -to be used as function return types. The specific limit on how large an -aggregate return type the code generator can currently handle is -target-dependent, and also dependent on the aggregate element types.

      -
      +
      +
      Overview:

      The packed structure type is used to represent a collection of data members -together in memory. There is no padding between fields. Further, the alignment -of a packed structure is 1 byte. The elements of a packed structure may -be any type that has a size.

      -

      Structures are accessed using 'load -and 'store' by getting a pointer to a -field with the 'getelementptr' -instruction.

      + together in memory. There is no padding between fields. Further, the + alignment of a packed structure is 1 byte. The elements of a packed + structure may be any type that has a size.

      + +

      Structures are accessed using 'load' and + 'store' by getting a pointer to a field with + the 'getelementptr' instruction.

      +
      Syntax:
      -
        < { <type list> } > 
      +
      +  < { <type list> } >
      +
      +
      Examples:
      @@ -1578,23 +1745,28 @@ instruction.

      an i32.
      +
      +
      +
      Overview:
      -

      As in many languages, the pointer type represents a pointer or -reference to another object, which must live in memory. Pointer types may have -an optional address space attribute defining the target-specific numbered -address space where the pointed-to object resides. The default address space is -zero.

      +

      As in many languages, the pointer type represents a pointer or reference to + another object, which must live in memory. Pointer types may have an optional + address space attribute defining the target-specific numbered address space + where the pointed-to object resides. The default address space is zero.

      -

      Note that LLVM does not permit pointers to void (void*) nor does -it permit pointers to labels (label*). Use i8* instead.

      +

      Note that LLVM does not permit pointers to void (void*) nor does it + permit pointers to labels (label*). Use i8* instead.

      Syntax:
      -
        <type> *
      +
      +  <type> *
      +
      +
      Examples:
      @@ -1614,33 +1786,30 @@ it permit pointers to labels (label*). Use i8* instead.

      that resides in address space #5.
      +
      +
      Overview:
      - -

      A vector type is a simple derived type that represents a vector -of elements. Vector types are used when multiple primitive data -are operated in parallel using a single instruction (SIMD). -A vector type requires a size (number of -elements) and an underlying primitive data type. Vectors must have a power -of two length (1, 2, 4, 8, 16 ...). Vector types are -considered first class.

      +

      A vector type is a simple derived type that represents a vector of elements. + Vector types are used when multiple primitive data are operated in parallel + using a single instruction (SIMD). A vector type requires a size (number of + elements) and an underlying primitive data type. Vector types are considered + first class.

      Syntax:
      -
         < <# elements> x <elementtype> >
       
      -

      The number of elements is a constant integer value; elementtype may -be any integer or floating point type.

      +

      The number of elements is a constant integer value; elementtype may be any + integer or floating point type.

      Examples:
      - @@ -1656,11 +1825,6 @@ be any integer or floating point type.

      <4 x i32>
      -

      Note that the code generator does not yet support large vector types -to be used as function return types. The specific limit on how large a -vector return type codegen can currently handle is target-dependent; -currently it's often a few times longer than a hardware vector register.

      -
      @@ -1668,26 +1832,24 @@ currently it's often a few times longer than a hardware vector register.

      Overview:
      -

      Opaque types are used to represent unknown types in the system. This -corresponds (for example) to the C notion of a forward declared structure type. -In LLVM, opaque types can eventually be resolved to any type (not just a -structure type).

      + corresponds (for example) to the C notion of a forward declared structure + type. In LLVM, opaque types can eventually be resolved to any type (not just + a structure type).

      Syntax:
      -
         opaque
       
      Examples:
      -
      opaque An opaque type.
      +
      @@ -1696,12 +1858,13 @@ structure type).

      +
      Overview:
      -

      -An "up reference" allows you to refer to a lexically enclosing type without -requiring it to have a name. For instance, a structure declaration may contain a -pointer to any of the types it is lexically a member of. Example of up -references (with their equivalent as named type declarations) include:

      +

      An "up reference" allows you to refer to a lexically enclosing type without + requiring it to have a name. For instance, a structure declaration may + contain a pointer to any of the types it is lexically a member of. Examples + of up references (with their equivalents as named type declarations) + include:

          { \2 * }                %x = type { %x* }
      @@ -1709,24 +1872,20 @@ references (with their equivalent as named type declarations) include:

      \1* %z = type %z*
      -

      -An up reference is needed by the asmprinter for printing out cyclic types when -there is no declared name for a type in the cycle. Because the asmprinter does -not want to print out an infinite type string, it needs a syntax to handle -recursive types that have no names (all names are optional in llvm IR). -

      +

      An up reference is needed by the asmprinter for printing out cyclic types + when there is no declared name for a type in the cycle. Because the + asmprinter does not want to print out an infinite type string, it needs a + syntax to handle recursive types that have no names (all names are optional + in llvm IR).

      Syntax:
          \<level>
       
      -

      -The level is the count of the lexical type that is being referred to. -

      +

      The level is the count of the lexical type that is being referred to.

      Examples:
      - @@ -1738,8 +1897,8 @@ The level is the count of the lexical type that is being referred to. structure.
      \1*
      -
      + @@ -1748,7 +1907,7 @@ The level is the count of the lexical type that is being referred to.

      LLVM has several different basic types of constants. This section describes -them all and their syntax.

      + them all and their syntax.

      @@ -1759,117 +1918,103 @@ them all and their syntax.

      Boolean constants
      -
      The two strings 'true' and 'false' are both valid - constants of the i1 type. -
      + constants of the i1 type.
      Integer constants
      - -
      Standard integers (such as '4') are constants of the integer type. Negative numbers may be used with - integer types. -
      +
      Standard integers (such as '4') are constants of + the integer type. Negative numbers may be used + with integer types.
      Floating point constants
      -
      Floating point constants use standard decimal notation (e.g. 123.421), - exponential notation (e.g. 1.23421e+2), or a more precise hexadecimal - notation (see below). The assembler requires the exact decimal value of - a floating-point constant. For example, the assembler accepts 1.25 but - rejects 1.3 because 1.3 is a repeating decimal in binary. Floating point - constants must have a floating point type.
      + exponential notation (e.g. 1.23421e+2), or a more precise hexadecimal + notation (see below). The assembler requires the exact decimal value of a + floating-point constant. For example, the assembler accepts 1.25 but + rejects 1.3 because 1.3 is a repeating decimal in binary. Floating point + constants must have a floating point type.
      Null pointer constants
      -
      The identifier 'null' is recognized as a null pointer constant - and must be of pointer type.
      - + and must be of pointer type.
      -
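The "exact decimal value" requirement for floating point constants can be modelled roughly as below. This is a sketch of the stated rule, not the assembler's actual check: the helper is_exactly_representable is hypothetical, and it tests against IEEE double precision only.

```python
from fractions import Fraction


def is_exactly_representable(literal):
    """True if the decimal literal equals its nearest double exactly."""
    # Fraction(str) keeps the exact decimal value; Fraction(float) keeps
    # the exact value of the rounded binary double.
    return Fraction(literal) == Fraction(float(literal))


if __name__ == "__main__":
    for text in ("1.25", "1.3"):
        print(text, is_exactly_representable(text))
```

Here 1.25 is 5/4, which has a finite binary expansion, while 1.3 does not and so differs from its rounded double.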

      The one non-intuitive notation for constants is the hexadecimal form -of floating point constants. For example, the form 'double -0x432ff973cafa8000' is equivalent to (but harder to read than) 'double -4.5e+15'. The only time hexadecimal floating point constants are required -(and the only time that they are generated by the disassembler) is when a -floating point constant must be emitted but it cannot be represented as a -decimal floating point number in a reasonable number of digits. For example, -NaN's, infinities, and other -special values are represented in their IEEE hexadecimal format so that -assembly and disassembly do not cause any bits to change in the constants.

      +

      The one non-intuitive notation for constants is the hexadecimal form of + floating point constants. For example, the form 'double + 0x432ff973cafa8000' is equivalent to (but harder to read than) + 'double 4.5e+15'. The only time hexadecimal floating point + constants are required (and the only time that they are generated by the + disassembler) is when a floating point constant must be emitted but it cannot + be represented as a decimal floating point number in a reasonable number of + digits. For example, NaN's, infinities, and other special values are + represented in their IEEE hexadecimal format so that assembly and disassembly + do not cause any bits to change in the constants.
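To make the hexadecimal form concrete, the following sketch converts between a double and its 16-digit big-endian IEEE bit pattern using Python's struct module. The helper names double_to_hex and hex_to_double are hypothetical, and only the plain double form is covered (not the 0xK, 0xM or 0xL forms).

```python
import struct


def double_to_hex(value):
    """Return the 16-digit hexadecimal form of a double, sign bit first."""
    return "0x" + struct.pack(">d", value).hex()


def hex_to_double(text):
    """Inverse of double_to_hex: decode "0x" followed by 16 hex digits."""
    return struct.unpack(">d", bytes.fromhex(text[2:]))[0]


if __name__ == "__main__":
    print(double_to_hex(4.5e15))
    print(hex_to_double("0x432ff973cafa8000"))
    print(double_to_hex(float("inf")))
```

Round-tripping through the bit pattern changes no bits, which is why special values such as infinities and NaNs are written this way.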

      +

      When using the hexadecimal form, constants of types float and double are -represented using the 16-digit form shown above (which matches the IEEE754 -representation for double); float values must, however, be exactly representable -as IEE754 single precision. -Hexadecimal format is always used for long -double, and there are three forms of long double. The 80-bit -format used by x86 is represented as 0xK -followed by 20 hexadecimal digits. -The 128-bit format used by PowerPC (two adjacent doubles) is represented -by 0xM followed by 32 hexadecimal digits. The IEEE 128-bit -format is represented -by 0xL followed by 32 hexadecimal digits; no currently supported -target uses this format. Long doubles will only work if they match -the long double format on your target. All hexadecimal formats are big-endian -(sign bit at the left).

      + represented using the 16-digit form shown above (which matches the IEEE754 + representation for double); float values must, however, be exactly + representable as IEEE754 single precision. Hexadecimal format is always used + for long double, and there are three forms of long double. The 80-bit format + used by x86 is represented as 0xK followed by 20 hexadecimal digits. + The 128-bit format used by PowerPC (two adjacent doubles) is represented + by 0xM followed by 32 hexadecimal digits. The IEEE 128-bit format + is represented by 0xL followed by 32 hexadecimal digits; no + currently supported target uses this format. Long doubles will only work if + they match the long double format on your target. All hexadecimal formats + are big-endian (sign bit at the left).

      +
      +

      Complex constants are a (potentially recursive) combination of simple -constants and smaller complex constants.

      + constants and smaller complex constants.

      Structure constants
      -
      Structure constants are represented with notation similar to structure - type definitions (a comma separated list of elements, surrounded by braces - ({})). For example: "{ i32 4, float 17.0, i32* @G }", - where "@G" is declared as "@G = external global i32". Structure constants - must have structure type, and the number and - types of elements must match those specified by the type. -
      + type definitions (a comma separated list of elements, surrounded by braces + ({})). For example: "{ i32 4, float 17.0, i32* @G }", + where "@G" is declared as "@G = external global i32". + Structure constants must have structure type, and + the number and types of elements must match those specified by the + type.
      Array constants
      -
      Array constants are represented with notation similar to array type - definitions (a comma separated list of elements, surrounded by square brackets - ([])). For example: "[ i32 42, i32 11, i32 74 ]". Array - constants must have array type, and the number and - types of elements must match those specified by the type. -
      + definitions (a comma separated list of elements, surrounded by square + brackets ([])). For example: "[ i32 42, i32 11, i32 74 + ]". Array constants must have array type, and + the number and types of elements must match those specified by the + type.
      Vector constants
      -
      Vector constants are represented with notation similar to vector type - definitions (a comma separated list of elements, surrounded by - less-than/greater-than's (<>)). For example: "< i32 42, - i32 11, i32 74, i32 100 >". Vector constants must have vector type, and the number and types of elements must - match those specified by the type. -
      + definitions (a comma separated list of elements, surrounded by + less-than/greater-than's (<>)). For example: "< i32 + 42, i32 11, i32 74, i32 100 >". Vector constants must + have vector type, and the number and types of + elements must match those specified by the type.
      Zero initialization
      -
      The string 'zeroinitializer' can be used to zero initialize a - value to zero of any type, including scalar and aggregate types. - This is often used to avoid having to print large zero initializers (e.g. for - large arrays) and is always exactly equivalent to using explicit zero - initializers. -
      + value to zero of any type, including scalar and aggregate types. + This is often used to avoid having to print large zero initializers + (e.g. for large arrays) and is always exactly equivalent to using explicit + zero initializers.
      Metadata node
      - -
      A metadata node is a structure-like constant with the type of an empty - struct. For example: "{ } !{ i32 0, { } !"test" }". Unlike other - constants that are meant to be interpreted as part of the instruction stream, - metadata is a place to attach additional information such as debug info. -
      +
      A metadata node is a structure-like constant with + metadata type. For example: "metadata !{ + i32 0, metadata !"test" }". Unlike other constants that are meant to + be interpreted as part of the instruction stream, metadata is a place to + attach additional information such as debug info.
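The structure, array, vector and zero-initializer notations described above can be seen together in a small module. This is a minimal sketch in the syntax of this revision of the manual; the global names (@S, @A, @V, @Z1, @Z2) are invented for illustration, and @G is declared as in the structure example.

```llvm
; Hypothetical globals, named only for illustration.
@G = external global i32

; Structure constant: the number and types of elements match the type.
@S = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }

; Array and vector constants.
@A = global [3 x i32] [ i32 42, i32 11, i32 74 ]
@V = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

; These two initializers are exactly equivalent.
@Z1 = global [4 x i32] zeroinitializer
@Z2 = global [4 x i32] [ i32 0, i32 0, i32 0, i32 0 ]
```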
      @@ -1881,12 +2026,12 @@ constants and smaller complex constants.

      -

      The addresses of global variables and functions are always implicitly valid (link-time) -constants. These constants are explicitly referenced when the identifier for the global is used and always have pointer type. For example, the following is a legal LLVM -file:

      +

      The addresses of global variables + and functions are always implicitly valid + (link-time) constants. These constants are explicitly referenced when + the identifier for the global is used and always + have pointer type. For example, the following is a + legal LLVM file:

      @@ -1901,87 +2046,256 @@ file:

      -

      The string 'undef' is recognized as a type-less constant that has - no specific value. Undefined values may be of any type and be used anywhere - a constant is permitted.

      -

      Undefined values indicate to the compiler that the program is well defined - no matter what value is used, giving the compiler more freedom to optimize. -

      -
      +

      The string 'undef' can be used anywhere a constant is expected, and + indicates that the user of the value may receive an unspecified bit-pattern. + Undefined values may be of any type (other than label or void) and be used + anywhere a constant is permitted.

      - - +

      Undefined values are useful because they indicate to the compiler that the + program is well defined no matter what value is used. This gives the + compiler more freedom to optimize. Here are some examples of (potentially + surprising) transformations that are valid (in pseudo IR):

      -
      -

      Constant expressions are used to allow expressions involving other constants -to be used as constants. Constant expressions may be of any first class type and may involve any LLVM operation -that does not have side effects (e.g. load and call are not supported). The -following is the syntax for constant expressions:

      +
      +
      +  %A = add %X, undef
      +  %B = sub %X, undef
      +  %C = xor %X, undef
      +Safe:
      +  %A = undef
      +  %B = undef
      +  %C = undef
      +
      +
      -
      +

      This is safe because all of the output bits are affected by the undef bits. +Any output bit can have a zero or one depending on the input bits.

      + +
      +
      +  %A = or %X, undef
      +  %B = and %X, undef
      +Safe:
      +  %A = -1
      +  %B = 0
      +Unsafe:
      +  %A = undef
      +  %B = undef
      +
      +
      + +

      These logical operations have bits that are not always affected by the input. +For example, if "%X" has a zero bit, then the output of the 'and' operation will +always be a zero, no matter what the corresponding bit from the undef is. As +such, it is unsafe to optimize or assume that the result of the and is undef. +However, it is safe to assume that all bits of the undef could be 0, and +optimize the and to 0. Likewise, it is safe to assume that all the bits of +the undef operand to the or could be set, allowing the or to be folded to +-1.

      + +
      +
      +  %A = select undef, %X, %Y
      +  %B = select undef, 42, %Y
      +  %C = select %X, %Y, undef
      +Safe:
      +  %A = %X     (or %Y)
      +  %B = 42     (or %Y)
      +  %C = %Y
      +Unsafe:
      +  %A = undef
      +  %B = undef
      +  %C = undef
      +
      +
      + +

      This set of examples shows that undefined select (and conditional branch) +conditions can go "either way", but the result has to come from one of the two +operands. In the %A example, if %X and %Y were both known to have a clear low +bit, then %A would have to have a cleared low bit. However, in the %C example, +the optimizer is allowed to assume that the undef operand could be the same as +%Y, allowing the whole select to be eliminated.

      + + +
      +
      +  %A = xor undef, undef
      +
      +  %B = undef
      +  %C = xor %B, %B
      +
      +  %D = undef
      +  %E = icmp lt %D, 4
      +  %F = icmp gte %D, 4
      +
      +Safe:
      +  %A = undef
      +  %B = undef
      +  %C = undef
      +  %D = undef
      +  %E = undef
      +  %F = undef
      +
      +
      + +

      This example points out that two undef operands are not necessarily the same. +This can be surprising to people who assume that "X^X" is always zero, even +if X is undef (although it matches C semantics). This isn't true for a +number of reasons, but the short answer is that an undef "variable" can +arbitrarily change its value over its "live range". This is true because the +"variable" doesn't actually have a live range. Instead, the value is +logically read from arbitrary registers that happen to be around when needed, +so the value is not necessarily consistent over time. In fact, %A and %C need +to have the same semantics or the core LLVM "replace all uses with" concept +would not hold.

      + +
      +
      +  %A = fdiv undef, %X
      +  %B = fdiv %X, undef
      +Safe:
      +  %A = undef
      +b: unreachable
      +
      +
      + +

      These examples show the crucial difference between an undefined +value and undefined behavior. An undefined value (like undef) is +allowed to have an arbitrary bit-pattern. This means that the %A operation +can be constant folded to undef because the undef could be an SNaN, and fdiv is +not (currently) defined on SNaN's. However, in the second example, we can make +a more aggressive assumption: because the undef is allowed to be an arbitrary +value, we are allowed to assume that it could be zero. Since a divide by zero +has undefined behavior, we are allowed to assume that the operation +does not execute at all. This allows us to delete the divide and all code after +it: since the undefined operation "can't happen", the optimizer can assume that +it occurs in dead code. +

      + +
      +
      +a:  store undef -> %X
      +b:  store %X -> undef
      +Safe:
      +a: <deleted>
      +b: unreachable
      +
      +
      + +

      These examples reiterate the fdiv example: a store "of" an undefined value +can be assumed to not have any effect: we can assume that the value is +overwritten with bits that happen to match what was already there. However, a +store "to" an undefined location could clobber arbitrary memory; therefore, it +has undefined behavior.

      + +
      + + + +
      + +

      blockaddress(@function, %block)

      + +

      The 'blockaddress' constant computes the address of the specified + basic block in the specified function, and always has an i8* type. Taking + the address of the entry block is illegal.

      + +

      This value only has defined behavior when used as an operand to the + 'indirectbr' instruction or for comparisons + against null. Pointer equality tests between label addresses have undefined + behavior - though, again, comparison against null is ok, and no label is + equal to the null pointer. This may also be passed around as an opaque + pointer sized value as long as the bits are not inspected. This allows + ptrtoint and arithmetic to be performed on these values so long as + the original value is reconstituted before the indirectbr.

      + +

      Finally, some targets may provide defined semantics when + using the value as the operand to an inline assembly, but that is target + specific. +
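As an illustration of the intended use, the sketch below stores two block addresses in a table and jumps through it with 'indirectbr'. It is written in the syntax of this revision; the function name @dispatch, the table @targets and the block names are hypothetical. Note that neither address refers to the entry block, and that every possible destination is listed on the 'indirectbr'.

```llvm
; Hypothetical jump table holding the addresses of two blocks in @dispatch.
@targets = internal constant [2 x i8*] [
  i8* blockaddress(@dispatch, %even),
  i8* blockaddress(@dispatch, %odd)
]

define i32 @dispatch(i32 %n) {
entry:
  ; Select a table slot from the low bit of %n.
  %bit = and i32 %n, 1
  %slot = getelementptr [2 x i8*]* @targets, i32 0, i32 %bit
  %addr = load i8** %slot
  ; All possible destinations must appear in the label list.
  indirectbr i8* %addr, [ label %even, label %odd ]
even:
  ret i32 0
odd:
  ret i32 1
}
```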

      + +
      + + + + + +
      + +

      Constant expressions are used to allow expressions involving other constants + to be used as constants. Constant expressions may be of + any first class type and may involve any LLVM + operation that does not have side effects (e.g. load and call are not + supported). The following is the syntax for constant expressions:

      + +
      trunc ( CST to TYPE )
      -
      Truncate a constant to another type. The bit size of CST must be larger - than the bit size of TYPE. Both types must be integers.
      +
      Truncate a constant to another type. The bit size of CST must be larger + than the bit size of TYPE. Both types must be integers.
      zext ( CST to TYPE )
      -
      Zero extend a constant to another type. The bit size of CST must be - smaller or equal to the bit size of TYPE. Both types must be integers.
      +
      Zero extend a constant to another type. The bit size of CST must be + smaller or equal to the bit size of TYPE. Both types must be + integers.
      sext ( CST to TYPE )
      -
      Sign extend a constant to another type. The bit size of CST must be - smaller or equal to the bit size of TYPE. Both types must be integers.
      +
      Sign extend a constant to another type. The bit size of CST must be + smaller or equal to the bit size of TYPE. Both types must be + integers.
      fptrunc ( CST to TYPE )
      -
      Truncate a floating point constant to another floating point type. The - size of CST must be larger than the size of TYPE. Both types must be - floating point.
      +
      Truncate a floating point constant to another floating point type. The + size of CST must be larger than the size of TYPE. Both types must be + floating point.
      fpext ( CST to TYPE )
      -
      Floating point extend a constant to another type. The size of CST must be - smaller or equal to the size of TYPE. Both types must be floating point.
      +
      Floating point extend a constant to another type. The size of CST must be + smaller or equal to the size of TYPE. Both types must be floating + point.
      fptoui ( CST to TYPE )
      Convert a floating point constant to the corresponding unsigned integer - constant. TYPE must be a scalar or vector integer type. CST must be of scalar - or vector floating point type. Both CST and TYPE must be scalars, or vectors - of the same number of elements. If the value won't fit in the integer type, - the results are undefined.
      + constant. TYPE must be a scalar or vector integer type. CST must be of + scalar or vector floating point type. Both CST and TYPE must be scalars, + or vectors of the same number of elements. If the value won't fit in the + integer type, the results are undefined.
      fptosi ( CST to TYPE )
      Convert a floating point constant to the corresponding signed integer - constant. TYPE must be a scalar or vector integer type. CST must be of scalar - or vector floating point type. Both CST and TYPE must be scalars, or vectors - of the same number of elements. If the value won't fit in the integer type, - the results are undefined.
      + constant. TYPE must be a scalar or vector integer type. CST must be of + scalar or vector floating point type. Both CST and TYPE must be scalars, + or vectors of the same number of elements. If the value won't fit in the + integer type, the results are undefined.
      uitofp ( CST to TYPE )
      Convert an unsigned integer constant to the corresponding floating point - constant. TYPE must be a scalar or vector floating point type. CST must be of - scalar or vector integer type. Both CST and TYPE must be scalars, or vectors - of the same number of elements. If the value won't fit in the floating point - type, the results are undefined.
      + constant. TYPE must be a scalar or vector floating point type. CST must be + of scalar or vector integer type. Both CST and TYPE must be scalars, or + vectors of the same number of elements. If the value won't fit in the + floating point type, the results are undefined.
      sitofp ( CST to TYPE )
      Convert a signed integer constant to the corresponding floating point - constant. TYPE must be a scalar or vector floating point type. CST must be of - scalar or vector integer type. Both CST and TYPE must be scalars, or vectors - of the same number of elements. If the value won't fit in the floating point - type, the results are undefined.
      + constant. TYPE must be a scalar or vector floating point type. CST must be + of scalar or vector integer type. Both CST and TYPE must be scalars, or + vectors of the same number of elements. If the value won't fit in the + floating point type, the results are undefined.
      ptrtoint ( CST to TYPE )
      Convert a pointer typed constant to the corresponding integer constant - TYPE must be an integer type. CST must be of pointer type. The CST value is - zero extended, truncated, or unchanged to make it fit in TYPE.
      + TYPE must be an integer type. CST must be of pointer + type. The CST value is zero extended, truncated, or unchanged to + make it fit in TYPE.
      inttoptr ( CST to TYPE )
      -
      Convert a integer constant to a pointer constant. TYPE must be a - pointer type. CST must be of integer type. The CST value is zero extended, - truncated, or unchanged to make it fit in a pointer size. This one is - really dangerous!
      +
      Convert an integer constant to a pointer constant. TYPE must be a pointer + type. CST must be of integer type. The CST value is zero extended, + truncated, or unchanged to make it fit in a pointer size. This one is + really dangerous!
      bitcast ( CST to TYPE )
      Convert a constant, CST, to another TYPE. The constraints of the operands @@ -1989,16 +2303,14 @@ following is the syntax for constant expressions:

      instruction.
      getelementptr ( CSTPTR, IDX0, IDX1, ... )
      - +
      getelementptr inbounds ( CSTPTR, IDX0, IDX1, ... )
      Perform the getelementptr operation on - constants. As with the getelementptr - instruction, the index list may have zero or more indexes, which are required - to make sense for the type of "CSTPTR".
      + constants. As with the getelementptr + instruction, the index list may have zero or more indexes, which are + required to make sense for the type of "CSTPTR".
      select ( COND, VAL1, VAL2 )
      - -
      Perform the select operation on - constants.
      +
      Perform the select operation on constants.
      icmp COND ( VAL1, VAL2 )
      Performs the icmp operation on constants.
      @@ -2006,65 +2318,26 @@ following is the syntax for constant expressions:

      fcmp COND ( VAL1, VAL2 )
      Performs the fcmp operation on constants.
      -
      vicmp COND ( VAL1, VAL2 )
      -
      Performs the vicmp operation on constants.
      - -
      vfcmp COND ( VAL1, VAL2 )
      -
      Performs the vfcmp operation on constants.
      -
      extractelement ( VAL, IDX )
      - -
      Perform the extractelement - operation on constants.
      +
      Perform the extractelement operation on + constants.
      insertelement ( VAL, ELT, IDX )
      - -
      Perform the insertelement - operation on constants.
      - +
      Perform the insertelement operation on + constants.
      shufflevector ( VEC1, VEC2, IDXMASK )
      - -
      Perform the shufflevector - operation on constants.
      +
      Perform the shufflevector operation on + constants.
      OPCODE ( LHS, RHS )
      - -
      Perform the specified operation of the LHS and RHS constants. OPCODE may - be any of the binary or bitwise - binary operations. The constraints on operands are the same as those for - the corresponding instruction (e.g. no bitwise operations on floating point - values are allowed).
      +
      Perform the specified operation of the LHS and RHS constants. OPCODE may + be any of the binary + or bitwise binary operations. The constraints + on operands are the same as those for the corresponding instruction + (e.g. no bitwise operations on floating point values are allowed).
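A few of these constant expressions are shown below as global initializers. This is only a sketch in the syntax of this revision; all global names are invented for illustration.

```llvm
; Hypothetical array used as the base for the expressions below.
@G = global [4 x i32] zeroinitializer

; getelementptr on a constant: the address of element 2 of @G.
@P = global i32* getelementptr ([4 x i32]* @G, i32 0, i32 2)

; Conversions: pointer to integer, zero extension, and bitcast.
@Off  = global i64 ptrtoint (i32* getelementptr ([4 x i32]* @G, i32 0, i32 1) to i64)
@Wide = global i64 zext (i32 7 to i64)
@Raw  = global i8* bitcast ([4 x i32]* @G to i8*)

; OPCODE ( LHS, RHS ) form with a binary operation.
@Sum = global i32 add (i32 40, i32 2)
```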
      -
      - - - -
      - -

      Embedded metadata provides a way to attach arbitrary data to the -instruction stream without affecting the behaviour of the program. There are -two metadata primitives, strings and nodes. All metadata has the type of an -empty struct and is identified in syntax by a preceding exclamation point -('!'). -

      - -

      A metadata string is a string surrounded by double quotes. It can contain -any character by escaping non-printable characters with "\xx" where "xx" is -the two digit hex code. For example: "!"test\00"". -

      - -

      Metadata nodes are represented with notation similar to structure constants -(a comma separated list of elements, surrounded by braces and preceeded by an -exclamation point). For example: "!{ { } !"test\00", i32 10}". -

      - -

      Optimizations may rely on metadata to provide additional information about -the program that isn't available in the instructions, or that isn't easily -computable. Similarly, the code generator may expect a certain metadata format -to be used to express debugging information.

      @@ -2078,14 +2351,14 @@ to be used to express debugging information.

      -

      -LLVM supports inline assembler expressions (as opposed to -Module-Level Inline Assembly) through the use of a special value. This -value represents the inline assembler as a string (containing the instructions -to emit), a list of operand constraints (stored as a string), and a flag that -indicates whether or not the inline asm expression has side effects. An example -inline assembler expression is: -

      +

      LLVM supports inline assembler expressions (as opposed + to Module-Level Inline Assembly) through the use of + a special value. This value represents the inline assembler as a string + (containing the instructions to emit), a list of operand constraints (stored + as a string), a flag that indicates whether or not the inline asm + expression has side effects, and a flag indicating whether the function + containing the asm needs to align its stack conservatively. An example + inline assembler expression is:

      @@ -2093,10 +2366,9 @@ i32 (i32) asm "bswap $0", "=r,r"
       
      -

      -Inline assembler expressions may only be used as the callee operand of -a call instruction. Thus, typically we have: -

      +

      Inline assembler expressions may only be used as the callee operand of + a call instruction. Thus, typically we + have:

      @@ -2104,11 +2376,9 @@ a call instruction.  Thus, typically we have:
       
      -

      -Inline asms with side effects not visible in the constraint list must be marked -as having side effects. This is done through the use of the -'sideeffect' keyword, like so: -

      +

      Inline asms with side effects not visible in the constraint list must be + marked as having side effects. This is done through the use of the + 'sideeffect' keyword, like so:

      @@ -2116,26 +2386,159 @@ call void asm sideeffect "eieio", ""()
       
      +

      In some cases inline asms will contain code that will not work unless the + stack is aligned in some way, such as calls or SSE instructions on x86, + yet will not contain code that does that alignment within the asm. + The compiler should make conservative assumptions about what the asm might + contain and should generate its usual stack alignment code in the prologue + if the 'alignstack' keyword is present:

      + +
      +
      +call void asm alignstack "eieio", ""()
      +
      +
      + +

      If both keywords appear the 'sideeffect' keyword must come + first.
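For instance, reusing the "eieio" string from the examples above purely as a placeholder, a call that carries both keywords would be written with 'sideeffect' first:

```llvm
; Both flags set; 'sideeffect' precedes 'alignstack'.
call void asm sideeffect alignstack "eieio", ""()
```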

      +

      TODO: The format of the asm and constraints string still need to be -documented here. Constraints on what can be done (e.g. duplication, moving, etc -need to be documented). This is probably best done by reference to another -document that covers inline asm from a holistic perspective. -

      + documented here. Constraints on what can be done (e.g. duplication, moving, + etc need to be documented). This is probably best done by reference to + another document that covers inline asm from a holistic perspective.

      + +
      + + + + +
      + +

      LLVM IR allows metadata to be attached to instructions in the program that + can convey extra information about the code to the optimizers and code + generator. One example application of metadata is source-level debug + information. There are two metadata primitives: strings and nodes. All + metadata has the metadata type and is identified in syntax by a + preceding exclamation point ('!').

      + +

      A metadata string is a string surrounded by double quotes. It can contain + any character by escaping non-printable characters with "\xx" where "xx" is + the two digit hex code. For example: "!"test\00"".

      + +

      Metadata nodes are represented with notation similar to structure constants + (a comma separated list of elements, surrounded by braces and preceded by an + exclamation point). For example: "!{ metadata !"test\00", i32 + 10}". Metadata nodes can have any values as their operand.

      + +

      A named metadata is a collection of + metadata nodes, which can be looked up in the module symbol table. For + example: "!foo = metadata !{!4, !3}". + +
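Put together, the pieces might look like the sketch below. It follows the notation used in this section; the node numbers and their contents are invented for illustration, and the exact spelling accepted by a given LLVM release may differ.

```llvm
; Two hypothetical metadata nodes, each holding a string and an integer.
!3 = metadata !{ metadata !"first", i32 1 }
!4 = metadata !{ metadata !"second", i32 2 }

; A named metadata that collects both nodes.
!foo = metadata !{ !4, !3 }
```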

      + + + + + + +

      LLVM has a number of "magic" global variables that contain data that affect +code generation or other IR semantics. These are documented here. All globals +of this sort should have a section specified as "llvm.metadata". This +section and all globals that start with "llvm." are reserved for use +by LLVM.

      + + + + +
      + +

      The @llvm.used global is an array with i8* element type which has appending linkage. This array contains a list of +pointers to global variables and functions which may optionally have a pointer +cast formed of bitcast or getelementptr. For example, a legal use of it is:

      + +
      +  @X = global i8 4
      +  @Y = global i32 123
      +
      +  @llvm.used = appending global [2 x i8*] [
      +     i8* @X,
      +     i8* bitcast (i32* @Y to i8*)
      +  ], section "llvm.metadata"
      +
      + +

      If a global variable appears in the @llvm.used list, then the +compiler, assembler, and linker are required to treat the symbol as if there is +a reference to the global that it cannot see. For example, if a variable has +internal linkage and no references other than that from the @llvm.used +list, it cannot be deleted. This is commonly used to represent references from +inline asms and other things the compiler cannot "see", and corresponds to +"__attribute__((used))" in GNU C.

      + +

      On some targets, the code generator must emit a directive to the assembler or +object file to prevent the assembler and linker from modifying the symbol.

      + +
      + + + + +
      + +

      The @llvm.compiler.used directive is the same as the +@llvm.used directive, except that it only prevents the compiler from +touching the symbol. On targets that support it, this allows an intelligent +linker to optimize references to the symbol without being impeded as it would be +by @llvm.used.

      + +

      This construct should be used only in rare circumstances, and +should not be exposed to source languages.
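By analogy with the @llvm.used example, a use of @llvm.compiler.used could be sketched as follows. The global @Z is hypothetical, and the array shape is assumed to mirror that of @llvm.used.

```llvm
; Hypothetical internal global that the compiler must not delete.
@Z = internal global i32 7

@llvm.compiler.used = appending global [1 x i8*] [
   i8* bitcast (i32* @Z to i8*)
], section "llvm.metadata"
```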

      + +
      + + + + +
      + +

      TODO: Describe this.

      + +
      + + + + +
      + +

      TODO: Describe this.

      +
      -

      The LLVM instruction set consists of several different -classifications of instructions: terminator -instructions, binary instructions, -bitwise binary instructions, memory instructions, and other -instructions.

      +

      The LLVM instruction set consists of several different classifications of + instructions: terminator + instructions, binary instructions, + bitwise binary instructions, + memory instructions, and + other instructions.

      @@ -2145,25 +2548,30 @@ Instructions
      -

      As mentioned previously, every -basic block in a program ends with a "Terminator" instruction, which -indicates which block should be executed after the current block is -finished. These terminator instructions typically yield a 'void' -value: they produce control flow, not values (the one exception being -the 'invoke' instruction).

      -

      There are six different terminator instructions: the 'ret' instruction, the 'br' -instruction, the 'switch' instruction, -the 'invoke' instruction, the 'unwind' instruction, and the 'unreachable' instruction.

      +

      As mentioned previously, every basic block + in a program ends with a "Terminator" instruction, which indicates which + block should be executed after the current block is finished. These + terminator instructions typically yield a 'void' value: they produce + control flow, not values (the one exception being the + 'invoke' instruction).

      + +

      There are seven different terminator instructions: the + 'ret' instruction, the + 'br' instruction, the + 'switch' instruction, the + 'indirectbr' instruction, the + 'invoke' instruction, the + 'unwind' instruction, and the + 'unreachable' instruction.

      +
      +
      Syntax:
         ret <type> <value>       ; Return a value from a non-void function
      @@ -2171,122 +2579,121 @@ Instruction 
      Overview:
      +

      The 'ret' instruction is used to return control flow (and optionally + a value) from a function back to the caller.

      -

      The 'ret' instruction is used to return control flow (and -optionally a value) from a function back to the caller.

      -

      There are two forms of the 'ret' instruction: one that -returns a value and then causes control flow, and one that just causes -control flow to occur.

      +

      There are two forms of the 'ret' instruction: one that returns a + value and then causes control flow, and one that just causes control flow to + occur.

      Arguments:
      +

      The 'ret' instruction optionally accepts a single argument, the + return value. The type of the return value must be a + 'first class' type.

      -

      The 'ret' instruction optionally accepts a single argument, -the return value. The type of the return value must be a -'first class' type.

      - -

      A function is not well formed if -it it has a non-void return type and contains a 'ret' -instruction with no return value or a return value with a type that -does not match its type, or if it has a void return type and contains -a 'ret' instruction with a return value.

      +

      A function is not well formed if it has a + non-void return type and contains a 'ret' instruction with no return + value or a return value with a type that does not match its type, or if it + has a void return type and contains a 'ret' instruction with a + return value.

      Semantics:
      - -

      When the 'ret' instruction is executed, control flow -returns back to the calling function's context. If the caller is a "call" instruction, execution continues at -the instruction after the call. If the caller was an "invoke" instruction, execution continues -at the beginning of the "normal" destination block. If the instruction -returns a value, that value shall set the call or invoke instruction's -return value.

      +

      When the 'ret' instruction is executed, control flow returns back to + the calling function's context. If the caller is a + "call" instruction, execution continues at the + instruction after the call. If the caller was an + "invoke" instruction, execution continues at + the beginning of the "normal" destination block. If the instruction returns + a value, that value shall set the call or invoke instruction's return + value.

      Example:
      -
         ret i32 5                       ; Return an integer value of 5
         ret void                        ; Return from a void function
         ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
       
      -

      Note that the code generator does not yet fully support large - return values. The specific sizes that are currently supported are - dependent on the target. For integers, on 32-bit targets the limit - is often 64 bits, and on 64-bit targets the limit is often 128 bits. - For aggregate types, the current limits are dependent on the element - types; for example targets are often limited to 2 total integer - elements and 2 total floating-point elements.

      -
      +
      +
      Syntax:
      -
        br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest> ; Unconditional branch +
      +  br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest> ; Unconditional branch
      +
      Overview:
      -

      The 'br' instruction is used to cause control flow to -transfer to a different basic block in the current function. There are -two forms of this instruction, corresponding to a conditional branch -and an unconditional branch.

      +

      The 'br' instruction is used to cause control flow to transfer to a + different basic block in the current function. There are two forms of this + instruction, corresponding to a conditional branch and an unconditional + branch.

      +
      Arguments:
      -

      The conditional branch form of the 'br' instruction takes a -single 'i1' value and two 'label' values. The -unconditional form of the 'br' instruction takes a single -'label' value as a target.

      +

      The conditional branch form of the 'br' instruction takes a single + 'i1' value and two 'label' values. The unconditional form + of the 'br' instruction takes a single 'label' value as a + target.

      +
      Semantics:

      Upon execution of a conditional 'br' instruction, the 'i1' -argument is evaluated. If the value is true, control flows -to the 'iftrue' label argument. If "cond" is false, -control flows to the 'iffalse' label argument.

      + argument is evaluated. If the value is true, control flows to the + 'iftrue' label argument. If "cond" is false, + control flows to the 'iffalse' label argument.

      +
      Example:
      -
      Test:
      %cond = icmp eq, i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
      IfEqual:
      ret i32 1
      IfUnequal:
      ret i32 0
      +
      +Test:
      +  %cond = icmp eq i32 %a, %b
      +  br i1 %cond, label %IfEqual, label %IfUnequal
      +IfEqual:
      +  ret i32 1
      +IfUnequal:
      +  ret i32 0
      +
      +
      +
      -
      Syntax:
      +
      Syntax:
         switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
       
      Overview:
      -

      The 'switch' instruction is used to transfer control flow to one of -several different places. It is a generalization of the 'br' -instruction, allowing a branch to occur to one of many possible -destinations.

      - + several different places. It is a generalization of the 'br' + instruction, allowing a branch to occur to one of many possible + destinations.

      Arguments:
      -

      The 'switch' instruction uses three parameters: an integer -comparison value 'value', a default 'label' destination, and -an array of pairs of comparison value constants and 'label's. The -table is not allowed to contain duplicate constant entries.

      + comparison value 'value', a default 'label' destination, + and an array of pairs of comparison value constants and 'label's. + The table is not allowed to contain duplicate constant entries.

      Semantics:
      -

      The switch instruction specifies a table of values and -destinations. When the 'switch' instruction is executed, this -table is searched for the given value. If the value is found, control flow is -transfered to the corresponding destination; otherwise, control flow is -transfered to the default destination.

      + destinations. When the 'switch' instruction is executed, this table + is searched for the given value. If the value is found, control flow is + transferred to the corresponding destination; otherwise, control flow is + transferred to the default destination.

      Implementation:
      -

      Depending on properties of the target machine and the particular -switch instruction, this instruction may be code generated in different -ways. For example, it could be generated as a series of chained conditional -branches or with a lookup table.

      + switch instruction, this instruction may be code generated in + different ways. For example, it could be generated as a series of chained + conditional branches or with a lookup table.

      Example:
      -
        ; Emulate a conditional br instruction
        %Val = zext i1 %value to i32
      @@ -2300,84 +2707,135 @@ branches or with a lookup table.

      i32 1, label %onone i32 2, label %ontwo ]
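The table lookup described under Semantics can be sketched outside of LLVM. The following is a minimal Python model, not LLVM code: the function name `switch` and the label strings are hypothetical and chosen only for illustration. Because the table may not contain duplicate constants, a dictionary is a reasonable stand-in for it.

```python
def switch(value, table, default):
    """Model of 'switch': return the destination paired with value, else the default."""
    # Duplicate constants are not allowed in the table, so a dict models it faithfully.
    return table.get(value, default)


# Labels are plain strings in this sketch; the names are illustrative only.
table = {0: "onzero", 1: "onone", 2: "ontwo"}
print(switch(1, table, "otherwise"))   # value found in the table
print(switch(7, table, "otherwise"))   # value not found, default destination taken
```

How the lookup is actually lowered (chained branches, lookup table) is left to the code generator, as noted under Implementation.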
      +
      +
      Syntax:
      +
      +  indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
      +
      + +
      Overview:
      +

      The 'indirectbr' instruction implements an indirect branch to a label + within the current function, whose address is specified by + "address". Address must be derived from a blockaddress constant.

      + +
      Arguments:
      + +

      The 'address' argument is the address of the label to jump to. The + rest of the arguments indicate the full set of possible destinations that the + address may point to. Blocks are allowed to occur multiple times in the + destination list, though this isn't particularly useful.

      + +

      This destination list is required so that dataflow analysis has an accurate + understanding of the CFG.

      + +
      Semantics:
      + +

      Control transfers to the block specified in the address argument. All + possible destination blocks must be listed in the label list, otherwise this + instruction has undefined behavior. This implies that jumps to labels + defined in other functions have undefined behavior as well.

      + +
      Implementation:
      + +

      This is typically implemented with a jump through a register.

      + +
      Example:
      +
      + indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
      +
      + +
      + + + + + +
      + +
      Syntax:
         <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs]
                       to label <normal label> unwind label <exception label>
       
      Overview:
      -

      The 'invoke' instruction causes control to transfer to a specified -function, with the possibility of control flow transfer to either the -'normal' label or the -'exception' label. If the callee function returns with the -"ret" instruction, control flow will return to the -"normal" label. If the callee (or any indirect callees) returns with the "unwind" instruction, control is interrupted and -continued at the dynamically nearest "exception" label.

      + function, with the possibility of control flow transfer to either the + 'normal' label or the 'exception' label. If the callee + function returns with the "ret" instruction, + control flow will return to the "normal" label. If the callee (or any + indirect callees) returns with the "unwind" + instruction, control is interrupted and continued at the dynamically nearest + "exception" label.

      Arguments:
      -

      This instruction requires several arguments:

        -
      1. - The optional "cconv" marker indicates which calling - convention the call should use. If none is specified, the call defaults - to using C calling conventions. -
      2. +
      3. The optional "cconv" marker indicates which calling + convention the call should use. If none is specified, the call + defaults to using C calling conventions.
      4. The optional Parameter Attributes list for - return values. Only 'zeroext', 'signext', - and 'inreg' attributes are valid here.
      5. + return values. Only 'zeroext', 'signext', and + 'inreg' attributes are valid here.
      6. 'ptr to function ty': shall be the signature of the pointer to - function value being invoked. In most cases, this is a direct function - invocation, but indirect invokes are just as possible, branching off - an arbitrary pointer to function value. -
      7. + function value being invoked. In most cases, this is a direct function + invocation, but indirect invokes are just as possible, branching + off an arbitrary pointer to function value.
      8. 'function ptr val': An LLVM value containing a pointer to a - function to be invoked.
      9. + function to be invoked.
      10. 'function args': argument list whose types match the function - signature argument types. If the function signature indicates the function - accepts a variable number of arguments, the extra arguments can be - specified.
      11. + signature argument types. If the function signature indicates the + function accepts a variable number of arguments, the extra arguments can + be specified.
      12. 'normal label': the label reached when the called function - executes a 'ret' instruction.
      13. + executes a 'ret' instruction.
      14. 'exception label': the label reached when a callee returns with - the unwind instruction.
      15. + the unwind instruction.
      16. The optional function attributes list. Only - 'noreturn', 'nounwind', 'readonly' and - 'readnone' attributes are valid here.
      17. + 'noreturn', 'nounwind', 'readonly' and + 'readnone' attributes are valid here.
      Semantics:
      - -

      This instruction is designed to operate as a standard 'call' instruction in most regards. The primary -difference is that it establishes an association with a label, which is used by -the runtime library to unwind the stack.

      +

      This instruction is designed to operate as a standard + 'call' instruction in most regards. The + primary difference is that it establishes an association with a label, which + is used by the runtime library to unwind the stack.

      This instruction is used in languages with destructors to ensure that proper -cleanup is performed in the case of either a longjmp or a thrown -exception. Additionally, this is important for implementation of -'catch' clauses in high-level languages that support them.

      + cleanup is performed in the case of either a longjmp or a thrown + exception. Additionally, this is important for implementation of + 'catch' clauses in high-level languages that support them.

      + +

      For the purposes of the SSA form, the definition of the value returned by the + 'invoke' instruction is deemed to occur on the edge from the current + block to the "normal" label. If the callee unwinds then no return value is + available.

      + +

      Note that the code generator does not yet completely support unwind, and +that the invoke/unwind semantics are likely to change in future versions.

      Example:
      @@ -2386,8 +2844,8 @@ exception.  Additionally, this is important for implementation of
         %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                     unwind label %TestCleanup              ; {i32}:retval set
       
      -
      + @@ -2398,305 +2856,448 @@ Instruction
      Syntax:
      -  unwind
      +  unwind
      +
      + +
      Overview:
      +

      The 'unwind' instruction unwinds the stack, continuing control flow + at the first callee in the dynamic call stack which used + an invoke instruction to perform the call. + This is primarily used to implement exception handling.

      + +
      Semantics:
      +

      The 'unwind' instruction causes execution of the current function to + immediately halt. The dynamic call stack is then searched for the + first invoke instruction on the call stack. + Once found, execution continues at the "exceptional" destination block + specified by the invoke instruction. If there is no invoke + instruction in the dynamic call chain, undefined behavior results.

      + +

      Note that the code generator does not yet completely support unwind, and +that the invoke/unwind semantics are likely to change in future versions.

      + + + + + + + +
      + +
      Syntax:
      +
      +  unreachable
      +
      + +
      Overview:
      +

      The 'unreachable' instruction has no defined semantics. This + instruction is used to inform the optimizer that a particular portion of the + code is not reachable. This can be used to indicate that the code after a + no-return function cannot be reached, and other facts.

      + +
      Semantics:
      +

      The 'unreachable' instruction has no defined semantics.

      + +
      + + + + +
      + +

      Binary operators are used to do most of the computation in a program. They + require two operands of the same type, execute an operation on them, and + produce a single value. The operands might represent multiple data, as is + the case with the vector data type. The result value + has the same type as its operands.

      + +

      There are several different binary operators:

      + +
      + + + + +
      + +
      Syntax:
      +
      +  <result> = add <ty> <op1>, <op2>          ; yields {ty}:result
      +  <result> = add nuw <ty> <op1>, <op2>      ; yields {ty}:result
      +  <result> = add nsw <ty> <op1>, <op2>      ; yields {ty}:result
      +  <result> = add nuw nsw <ty> <op1>, <op2>  ; yields {ty}:result
      +
      + +
      Overview:
      +

      The 'add' instruction returns the sum of its two operands.

      + +
      Arguments:
      +

      The two arguments to the 'add' instruction must + be integer or vector of + integer values. Both arguments must have identical types.

      + +
      Semantics:
      +

      The value produced is the integer sum of the two operands.

      + +

      If the sum has unsigned overflow, the result returned is the mathematical + result modulo 2^n, where n is the bit width of the result.

      + +

      Because LLVM integers use a two's complement representation, this instruction + is appropriate for both signed and unsigned integers.

      + +

      nuw and nsw stand for "No Unsigned Wrap" + and "No Signed Wrap", respectively. If the nuw and/or + nsw keywords are present, the result value of the add + is undefined if unsigned and/or signed overflow, respectively, occurs.

      + +
      Example:
      +
      +  <result> = add i32 4, %var          ; yields {i32}:result = 4 + %var
      +
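The wrapping rule and the nuw/nsw keywords can be illustrated with a small model. This is a sketch in Python, not LLVM code: the function name `add`, the `bits` parameter, and the use of `None` to stand for an undefined result are all assumptions made for illustration.

```python
def add(op1, op2, bits=32, nuw=False, nsw=False):
    """Model of integer 'add' on bit patterns; None stands for an undefined result."""
    mask = (1 << bits) - 1
    a, b = op1 & mask, op2 & mask

    def signed(x):
        # Reinterpret an unsigned bit pattern as a two's complement value.
        return x - (1 << bits) if x >> (bits - 1) else x

    unsigned_overflow = (a + b) > mask
    signed_sum = signed(a) + signed(b)
    signed_overflow = not (-(1 << (bits - 1)) <= signed_sum < (1 << (bits - 1)))

    if (nuw and unsigned_overflow) or (nsw and signed_overflow):
        return None
    # Without the keywords the result is the mathematical sum modulo 2^bits.
    return (a + b) & mask


print(add(0xFFFFFFFF, 1))              # wraps modulo 2^32
print(add(0xFFFFFFFF, 1, nuw=True))    # unsigned overflow with nuw: undefined
print(add(0x7FFFFFFF, 1, nsw=True))    # signed overflow with nsw: undefined
```

The same bit pattern serves signed and unsigned callers, which is why a single instruction covers both.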
      + +
      + + + + +
      + +
      Syntax:
      +
      +  <result> = fadd <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      +

      The 'fadd' instruction returns the sum of its two operands.

      -

      The 'unwind' instruction unwinds the stack, continuing control flow -at the first callee in the dynamic call stack which used an invoke instruction to perform the call. This is -primarily used to implement exception handling.

      +
      Arguments:
      +

      The two arguments to the 'fadd' instruction must be + floating point or vector of + floating point values. Both arguments must have identical types.

      Semantics:
      +

      The value produced is the floating point sum of the two operands.

      + +
      Example:
      +
      +  <result> = fadd float 4.0, %var          ; yields {float}:result = 4.0 + %var
      +
      -

      The 'unwind' instruction causes execution of the current function to -immediately halt. The dynamic call stack is then searched for the first invoke instruction on the call stack. Once found, -execution continues at the "exceptional" destination block specified by the -invoke instruction. If there is no invoke instruction in the -dynamic call chain, undefined behavior results.

      - - +
      Syntax:
      -  unreachable
      +  <result> = sub <ty> <op1>, <op2>          ; yields {ty}:result
      +  <result> = sub nuw <ty> <op1>, <op2>      ; yields {ty}:result
      +  <result> = sub nsw <ty> <op1>, <op2>      ; yields {ty}:result
      +  <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields {ty}:result
       
      Overview:
      +

      The 'sub' instruction returns the difference of its two + operands.

      -

      The 'unreachable' instruction has no defined semantics. This -instruction is used to inform the optimizer that a particular portion of the -code is not reachable. This can be used to indicate that the code after a -no-return function cannot be reached, and other facts.

      +

      Note that the 'sub' instruction is used to represent the + 'neg' instruction present in most other intermediate + representations.

      + +
      Arguments:
      +

      The two arguments to the 'sub' instruction must + be integer or vector of + integer values. Both arguments must have identical types.

      Semantics:
      +

      The value produced is the integer difference of the two operands.

      -

      The 'unreachable' instruction has no defined semantics.

      -
      +

      If the difference has unsigned overflow, the result returned is the + mathematical result modulo 2^n, where n is the bit width of the + result.

      +

      Because LLVM integers use a two's complement representation, this instruction + is appropriate for both signed and unsigned integers.

      +

      nuw and nsw stand for "No Unsigned Wrap" + and "No Signed Wrap", respectively. If the nuw and/or + nsw keywords are present, the result value of the sub + is undefined if unsigned and/or signed overflow, respectively, occurs.

      + +
      Example:
      +
      +  <result> = sub i32 4, %var          ; yields {i32}:result = 4 - %var
      +  <result> = sub i32 0, %val          ; yields {i32}:result = -%val
      +
      - - -
      -

      Binary operators are used to do most of the computation in a -program. They require two operands of the same type, execute an operation on them, and -produce a single value. The operands might represent -multiple data, as is the case with the vector data type. -The result value has the same type as its operands.

      -

      There are several different binary operators:

      +
      Syntax:
      -
      -  <result> = add <ty> <op1>, <op2>   ; yields {ty}:result
      +  <result> = fsub <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      +

      The 'fsub' instruction returns the difference of its two + operands.

      -

      The 'add' instruction returns the sum of its two operands.

      +

      Note that the 'fsub' instruction is used to represent the + 'fneg' instruction present in most other intermediate + representations.

      Arguments:
      - -

      The two arguments to the 'add' instruction must be integer, floating point, or - vector values. Both arguments must have identical - types.

      +

      The two arguments to the 'fsub' instruction must be + floating point or vector of + floating point values. Both arguments must have identical types.

      Semantics:
      - -

      The value produced is the integer or floating point sum of the two -operands.

      - -

      If an integer sum has unsigned overflow, the result returned is the -mathematical result modulo 2^n, where n is the bit width of -the result.

      - -

      Because LLVM integers use a two's complement representation, this -instruction is appropriate for both signed and unsigned integers.

      +

      The value produced is the floating point difference of the two operands.

      Example:
      -
      -  <result> = add i32 4, %var          ; yields {i32}:result = 4 + %var
      +  <result> = fsub float 4.0, %var           ; yields {float}:result = 4.0 - %var
      +  <result> = fsub float -0.0, %val          ; yields {float}:result = -%val
       
      +
      +
      Syntax:
      -
      -  <result> = sub <ty> <op1>, <op2>   ; yields {ty}:result
      +  <result> = mul <ty> <op1>, <op2>          ; yields {ty}:result
      +  <result> = mul nuw <ty> <op1>, <op2>      ; yields {ty}:result
      +  <result> = mul nsw <ty> <op1>, <op2>      ; yields {ty}:result
      +  <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields {ty}:result
       
      Overview:
      - -

      The 'sub' instruction returns the difference of its two -operands.

      - -

      Note that the 'sub' instruction is used to represent the -'neg' instruction present in most other intermediate -representations.

      +

      The 'mul' instruction returns the product of its two operands.

      Arguments:
      - -

      The two arguments to the 'sub' instruction must be integer, floating point, - or vector values. Both arguments must have identical - types.

      +

      The two arguments to the 'mul' instruction must + be integer or vector of + integer values. Both arguments must have identical types.

      Semantics:
      +

      The value produced is the integer product of the two operands.

      -

      The value produced is the integer or floating point difference of -the two operands.

      +

      If the result of the multiplication has unsigned overflow, the result + returned is the mathematical result modulo 2^n, where n is the bit + width of the result.

      -

      If an integer difference has unsigned overflow, the result returned is the -mathematical result modulo 2^n, where n is the bit width of -the result.

      +

      Because LLVM integers use a two's complement representation, and the result + is the same width as the operands, this instruction returns the correct + result for both signed and unsigned integers. If a full product + (e.g. i32xi32->i64) is needed, the operands should + be sign-extended or zero-extended as appropriate to the width of the full + product.

      -

      Because LLVM integers use a two's complement representation, this -instruction is appropriate for both signed and unsigned integers.

      +

      nuw and nsw stand for "No Unsigned Wrap" + and "No Signed Wrap", respectively. If the nuw and/or + nsw keywords are present, the result value of the mul + is undefined if unsigned and/or signed overflow, respectively, occurs.

      Example:
      -  <result> = sub i32 4, %var          ; yields {i32}:result = 4 - %var
      -  <result> = sub i32 0, %val          ; yields {i32}:result = -%var
      +  <result> = mul i32 4, %var          ; yields {i32}:result = 4 * %var
       
      +
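The remark about full products can be made concrete with a short model. This is a sketch in Python rather than LLVM code: the helper names `mul_i32`, `sext32` and `full_product_i64` are hypothetical. It shows that the low 32 bits of a product do not depend on signedness, while the 64-bit product does, which is why the operands have to be extended before multiplying.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul_i32(a, b):
    """Low 32 bits of the product, i.e. the bit pattern a 32-bit 'mul' yields."""
    return (a * b) & MASK32


def sext32(x):
    """Interpret a 32-bit pattern as a signed value (models sign extension)."""
    x &= MASK32
    return x - (1 << 32) if x >> 31 else x


def full_product_i64(a, b, signed):
    """Extend both 32-bit operands first, then multiply at 64 bits."""
    if signed:
        a, b = sext32(a), sext32(b)
    else:
        a, b = a & MASK32, b & MASK32   # models zero extension
    return (a * b) & MASK64


# 0xFFFFFFFF is 4294967295 unsigned and -1 signed; the narrow product agrees either way.
print(hex(mul_i32(0xFFFFFFFF, 2)))
print(hex(full_product_i64(0xFFFFFFFF, 2, signed=False)))
print(hex(full_product_i64(0xFFFFFFFF, 2, signed=True)))
```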
      Syntax:
      -
        <result> = mul <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = fmul <ty> <op1>, <op2>   ; yields {ty}:result
       
      +
      Overview:
      -

      The 'mul' instruction returns the product of its two -operands.

      +

      The 'fmul' instruction returns the product of its two operands.

      Arguments:
      +

      The two arguments to the 'fmul' instruction must be + floating point or vector of + floating point values. Both arguments must have identical types.

      -

      The two arguments to the 'mul' instruction must be integer, floating point, -or vector values. Both arguments must have identical -types.

      -
      Semantics:
      +

      The value produced is the floating point product of the two operands.

      -

      The value produced is the integer or floating point product of the -two operands.

      - -

      If the result of an integer multiplication has unsigned overflow, -the result returned is the mathematical result modulo -2^n, where n is the bit width of the result.

      -

      Because LLVM integers use a two's complement representation, and the -result is the same width as the operands, this instruction returns the -correct result for both signed and unsigned integers. If a full product -(e.g. i32xi32->i64) is needed, the operands -should be sign-extended or zero-extended as appropriate to the -width of the full product.

      Example:
      -
        <result> = mul i32 4, %var          ; yields {i32}:result = 4 * %var
      +
      +  <result> = fmul float 4.0, %var          ; yields {float}:result = 4.0 * %var
       
      +
      +
      +
      Syntax:
      -
        <result> = udiv <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = udiv <ty> <op1>, <op2>   ; yields {ty}:result
       
      +
      Overview:
      -

      The 'udiv' instruction returns the quotient of its two -operands.

      +

      The 'udiv' instruction returns the quotient of its two operands.

      Arguments:
      - -

      The two arguments to the 'udiv' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      +

      The two arguments to the 'udiv' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      Semantics:
      -

      The value produced is the unsigned integer quotient of the two operands.

      +

      Note that unsigned integer division and signed integer division are distinct -operations; for signed integer division, use 'sdiv'.

      + operations; for signed integer division, use 'sdiv'.

      +

      Division by zero leads to undefined behavior.

      +
      Example:
      -
        <result> = udiv i32 4, %var          ; yields {i32}:result = 4 / %var
      +
      +  <result> = udiv i32 4, %var          ; yields {i32}:result = 4 / %var
       
      +
      + +
      +
      Syntax:
      -  <result> = sdiv <ty> <op1>, <op2>   ; yields {ty}:result
      +  <result> = sdiv <ty> <op1>, <op2>         ; yields {ty}:result
      +  <result> = sdiv exact <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      - -

      The 'sdiv' instruction returns the quotient of its two -operands.

      +

      The 'sdiv' instruction returns the quotient of its two operands.

      Arguments:
      - -

      The two arguments to the 'sdiv' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      +

      The two arguments to the 'sdiv' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      Semantics:
      -

      The value produced is the signed integer quotient of the two operands rounded towards zero.

      +

      The value produced is the signed integer quotient of the two operands rounded + towards zero.

      +

      Note that signed integer division and unsigned integer division are distinct -operations; for unsigned integer division, use 'udiv'.

      + operations; for unsigned integer division, use 'udiv'.

      +

      Division by zero leads to undefined behavior. Overflow also leads to -undefined behavior; this is a rare case, but can occur, for example, -by doing a 32-bit division of -2147483648 by -1.

      + undefined behavior; this is a rare case, but can occur, for example, by doing + a 32-bit division of -2147483648 by -1.

      + +

      If the exact keyword is present, the result value of the + sdiv is undefined if the result would be rounded or if overflow + would occur.

      +
      Example:
      -
        <result> = sdiv i32 4, %var          ; yields {i32}:result = 4 / %var
      +
      +  <result> = sdiv i32 4, %var          ; yields {i32}:result = 4 / %var
       
      +
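The rounding, overflow and exact rules for 'sdiv' can be sketched as follows. This is a Python model, not LLVM code: the function name `sdiv`, the `bits` parameter, and the use of `None` are assumptions for illustration. The model does not distinguish undefined behavior (division by zero, overflow) from an undefined result value (a violated exact); both are reported as `None`.

```python
def sdiv(op1, op2, bits=32, exact=False):
    """Model of 'sdiv' on signed values; None marks the undefined cases."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if op2 == 0:
        return None                      # division by zero
    # Divide magnitudes, then restore the sign: this rounds towards zero.
    quotient = abs(op1) // abs(op2)
    if (op1 < 0) != (op2 < 0):
        quotient = -quotient
    if not lo <= quotient <= hi:
        return None                      # overflow, e.g. -2147483648 / -1
    if exact and quotient * op2 != op1:
        return None                      # 'exact' given but the result was rounded
    return quotient


print(sdiv(-7, 2))                 # rounds towards zero
print(sdiv(-2147483648, -1))       # overflow case from the text
print(sdiv(7, 2, exact=True))      # rounded result under 'exact'
```

Note that Python's own `//` operator rounds towards negative infinity, so it cannot be used directly to model 'sdiv'.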
      + +
      +
      Syntax:
         <result> = fdiv <ty> <op1>, <op2>   ; yields {ty}:result
       
      -
      Overview:
      -

      The 'fdiv' instruction returns the quotient of its two -operands.

      +
      Overview:
      +

      The 'fdiv' instruction returns the quotient of its two operands.

      Arguments:
      -

      The two arguments to the 'fdiv' instruction must be -floating point or vector -of floating point values. Both arguments must have identical types.

      + floating point or vector of + floating point values. Both arguments must have identical types.

      Semantics:
      -

      The value produced is the floating point quotient of the two operands.

      Example:
      -
         <result> = fdiv float 4.0, %var          ; yields {float}:result = 4.0 / %var
       
      +
      +
      +
      Syntax:
      -
        <result> = urem <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = urem <ty> <op1>, <op2>   ; yields {ty}:result
       
      +
      Overview:
      -

      The 'urem' instruction returns the remainder from the -unsigned division of its two arguments.

      +

      The 'urem' instruction returns the remainder from the unsigned + division of its two arguments.

      +
      Arguments:
      -

      The two arguments to the 'urem' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      +

      The two arguments to the 'urem' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      +
      Semantics:

      This instruction returns the unsigned integer remainder of a division. -This instruction always performs an unsigned division to get the remainder.

      + This instruction always performs an unsigned division to get the + remainder.

      +

      Note that unsigned integer remainder and signed integer remainder are -distinct operations; for signed integer remainder, use 'srem'.

      + distinct operations; for signed integer remainder, use 'srem'.

      +

      Taking the remainder of a division by zero leads to undefined behavior.

      +
      Example:
      -
        <result> = urem i32 4, %var          ; yields {i32}:result = 4 % %var
      +
      +  <result> = urem i32 4, %var          ; yields {i32}:result = 4 % %var
       
      +
      'srem' Instruction @@ -2705,47 +3306,48 @@ distinct operations; for signed integer remainder, use 'srem'.

      Syntax:
      -
         <result> = srem <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      - -

      The 'srem' instruction returns the remainder from the -signed division of its two operands. This instruction can also take -vector versions of the values in which case -the elements must be integers.

      +

      The 'srem' instruction returns the remainder from the signed + division of its two operands. This instruction can also take + vector versions of the values in which case the + elements must be integers.

      Arguments:
      - -

      The two arguments to the 'srem' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      +

      The two arguments to the 'srem' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      Semantics:
      -

      This instruction returns the remainder of a division (where the result -has the same sign as the dividend, op1), not the modulo -operator (where the result has the same sign as the divisor, op2) of -a value. For more information about the difference, see The -Math Forum. For a table of how this is implemented in various languages, -please see -Wikipedia: modulo operation.

      + has the same sign as the dividend, op1), not the modulo + operator (where the result has the same sign as the divisor, op2) of + a value. For more information about the difference, + see The + Math Forum. For a table of how this is implemented in various languages, + please see + Wikipedia: modulo operation.

      +

      Note that signed integer remainder and unsigned integer remainder are -distinct operations; for unsigned integer remainder, use 'urem'.

      + distinct operations; for unsigned integer remainder, use 'urem'.

      +

      Taking the remainder of a division by zero leads to undefined behavior. -Overflow also leads to undefined behavior; this is a rare case, but can occur, -for example, by taking the remainder of a 32-bit division of -2147483648 by -1. -(The remainder doesn't actually overflow, but this rule lets srem be -implemented using instructions that return both the result of the division -and the remainder.)

      + Overflow also leads to undefined behavior; this is a rare case, but can + occur, for example, by taking the remainder of a 32-bit division of + -2147483648 by -1. (The remainder doesn't actually overflow, but this rule + lets srem be implemented using instructions that return both the result of + the division and the remainder.)

      +
      Example:
      -
        <result> = srem i32 4, %var          ; yields {i32}:result = 4 % %var
      +
      +  <result> = srem i32 4, %var          ; yields {i32}:result = 4 % %var
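The difference between remainder and modulo described under Semantics is easy to see in a language whose built-in operator is a modulo. The following is a Python sketch, not LLVM code: the function name `srem` and the use of `None` for the undefined cases are assumptions for illustration. Python's `%` takes the sign of the divisor, so it is the "modulo operator" of the text, while the model takes the sign of the dividend.

```python
def srem(op1, op2, bits=32):
    """Model of 'srem': remainder with the sign of the dividend; None if undefined."""
    lo = -(1 << (bits - 1))
    if op2 == 0:
        return None                  # remainder of a division by zero
    if op1 == lo and op2 == -1:
        return None                  # the corresponding division overflows
    remainder = abs(op1) % abs(op2)
    return -remainder if op1 < 0 else remainder


print(srem(-7, 2), -7 % 2)    # remainder follows the dividend, modulo follows the divisor
print(srem(7, -2), 7 % -2)
print(srem(-2147483648, -1))  # undefined per the overflow rule
```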
       
      + @@ -2753,99 +3355,110 @@ and the remainder.)

      Syntax:
      -
        <result> = frem <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = frem <ty> <op1>, <op2>   ; yields {ty}:result
       
      +
      Overview:
      -

      The 'frem' instruction returns the remainder from the -division of its two operands.

      +

      The 'frem' instruction returns the remainder from the division of + its two operands.

      +
      Arguments:

      The two arguments to the 'frem' instruction must be -floating point or vector -of floating point values. Both arguments must have identical types.

      + floating point or vector of + floating point values. Both arguments must have identical types.

      Semantics:
      - -

      This instruction returns the remainder of a division. -The remainder has the same sign as the dividend.

      +

      This instruction returns the remainder of a division. The remainder + has the same sign as the dividend.

      Example:
      -
         <result> = frem float 4.0, %var          ; yields {float}:result = 4.0 % %var
       
      +
      +
      -

      Bitwise binary operators are used to do various forms of -bit-twiddling in a program. They are generally very efficient -instructions and can commonly be strength reduced from other -instructions. They require two operands of the same type, execute an operation on them, -and produce a single value. The resulting value is the same type as its operands.

      + +

      Bitwise binary operators are used to do various forms of bit-twiddling in a + program. They are generally very efficient instructions and can commonly be + strength reduced from other instructions. They require two operands of the + same type, execute an operation on them, and produce a single value. The + resulting value is the same type as its operands.

      +
      +
      +
      Syntax:
      -
        <result> = shl <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = shl <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      - -

      The 'shl' instruction returns the first operand shifted to -the left a specified number of bits.

      +

      The 'shl' instruction returns the first operand shifted to the left + a specified number of bits.

      Arguments:
      +

      Both arguments to the 'shl' instruction must be the + same integer or vector of + integer type. 'op2' is treated as an unsigned value.

      -

      Both arguments to the 'shl' instruction must be the same integer or vector of integer -type. 'op2' is treated as an unsigned value.

      -
      Semantics:
      +

      The value produced is op1 * 2^op2 mod + 2^n, where n is the width of the result. If op2 + is (statically or dynamically) negative or equal to or larger than the number + of bits in op1, the result is undefined. If the arguments are + vectors, each vector element of op1 is shifted by the corresponding + shift amount in op2.

      -

      The value produced is op1 * 2^op2 mod 2^n, -where n is the width of the result. If op2 is (statically or dynamically) negative or -equal to or larger than the number of bits in op1, the result is undefined. -If the arguments are vectors, each vector element of op1 is shifted by the -corresponding shift amount in op2.

      - -
      Example:
      +
      Example:
      +
         <result> = shl i32 4, %var   ; yields {i32}: 4 << %var
         <result> = shl i32 4, 2      ; yields {i32}: 16
         <result> = shl i32 1, 10     ; yields {i32}: 1024
         <result> = shl i32 1, 32     ; undefined
         <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
       
      +
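The 'shl' semantics described above can be modeled outside of LLVM. The following Python sketch is illustrative only and is not part of LLVM: the helper names `shl` and `shl_vector` are hypothetical, the default width of 32 bits is an assumption, and `None` is used here as a stand-in for an undefined result.

```python
def shl(op1, op2, n=32):
    """Model 'shl' on n-bit integers (hypothetical helper, not an LLVM API)."""
    # A shift amount that is negative or >= the bit width gives an undefined result.
    if op2 < 0 or op2 >= n:
        return None
    # The value produced is op1 * 2^op2 mod 2^n.
    return (op1 * 2 ** op2) % (2 ** n)


def shl_vector(v1, v2, n=32):
    """Shift each element of v1 by the corresponding amount in v2."""
    return [shl(a, b, n) for a, b in zip(v1, v2)]
```

Under these assumptions, `shl(4, 2)` and `shl(1, 10)` reproduce the scalar examples above, and `shl(1, 32)` reports the undefined case.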
      + +
      +
      Syntax:
      -
        <result> = lshr <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = lshr <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      -

      The 'lshr' instruction (logical shift right) returns the first -operand shifted to the right a specified number of bits with zero fill.

      +

      The 'lshr' instruction (logical shift right) returns the first + operand shifted to the right a specified number of bits with zero fill.

      Arguments:
      -

      Both arguments to the 'lshr' instruction must be the same -integer or vector of integer -type. 'op2' is treated as an unsigned value.

      +

      Both arguments to the 'lshr' instruction must be the same + integer or vector of integer + type. 'op2' is treated as an unsigned value.

      Semantics:
      -

      This instruction always performs a logical shift right operation. The most -significant bits of the result will be filled with zero bits after the -shift. If op2 is (statically or dynamically) equal to or larger than -the number of bits in op1, the result is undefined. If the arguments are -vectors, each vector element of op1 is shifted by the corresponding shift -amount in op2.

      + significant bits of the result will be filled with zero bits after the shift. + If op2 is (statically or dynamically) equal to or larger than the + number of bits in op1, the result is undefined. If the arguments are + vectors, each vector element of op1 is shifted by the corresponding + shift amount in op2.

      Example:
      @@ -2856,6 +3469,7 @@ amount in op2.

         <result> = lshr i32 1, 32   ; undefined
         <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
      +
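As a rough illustration of the zero-fill behavior, the Python sketch below models 'lshr' on n-bit values. It is not part of LLVM: the name `lshr` is a hypothetical helper, operands are treated as n-bit patterns (so a negative Python integer is first reduced modulo 2^n), and `None` stands in for an undefined result.

```python
def lshr(op1, op2, n=32):
    """Model 'lshr' on n-bit integers (hypothetical helper, not an LLVM API)."""
    # A shift amount equal to or larger than the bit width gives an undefined result.
    if op2 < 0 or op2 >= n:
        return None
    # Reduce to the unsigned n-bit pattern, then shift; vacated high bits are zero.
    return (op1 % (1 << n)) >> op2
```

With this model, shifting the i32 pattern of -2 right by one bit gives 0x7FFFFFFF, which matches the vector example above.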
      @@ -2864,25 +3478,27 @@ Instruction
      Syntax:
      -
        <result> = ashr <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = ashr <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      -

      The 'ashr' instruction (arithmetic shift right) returns the first -operand shifted to the right a specified number of bits with sign extension.

      +

      The 'ashr' instruction (arithmetic shift right) returns the first + operand shifted to the right a specified number of bits with sign + extension.

      Arguments:
      -

      Both arguments to the 'ashr' instruction must be the same -integer or vector of integer -type. 'op2' is treated as an unsigned value.

      +

      Both arguments to the 'ashr' instruction must be the same + integer or vector of integer + type. 'op2' is treated as an unsigned value.

      Semantics:
      -

      This instruction always performs an arithmetic shift right operation. -The most significant bits of the result will be filled with the sign bit -of op1. If op2 is (statically or dynamically) equal to or -larger than the number of bits in op1, the result is undefined. If the -arguments are vectors, each vector element of op1 is shifted by the -corresponding shift amount in op2.

      +

      This instruction always performs an arithmetic shift right operation. The + most significant bits of the result will be filled with the sign bit + of op1. If op2 is (statically or dynamically) equal to or + larger than the number of bits in op1, the result is undefined. If + the arguments are vectors, each vector element of op1 is shifted by + the corresponding shift amount in op2.

      Example:
      @@ -2893,6 +3509,7 @@ corresponding shift amount in op2.

         <result> = ashr i32 1, 32   ; undefined
         <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
      +
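The sign-filling behavior of 'ashr' can be sketched in Python as follows. This is an illustrative model only, not LLVM code: `ashr` is a hypothetical helper, the operand is reinterpreted as a signed n-bit value, the result is returned as a signed Python integer, and `None` stands in for an undefined result.

```python
def ashr(op1, op2, n=32):
    """Model 'ashr' on n-bit integers (hypothetical helper, not an LLVM API)."""
    # A shift amount equal to or larger than the bit width gives an undefined result.
    if op2 < 0 or op2 >= n:
        return None
    # Reinterpret the n-bit pattern of op1 as a signed value.
    value = op1 % (1 << n)
    if value >= 1 << (n - 1):
        value -= 1 << n
    # Python's >> on a negative int fills with the sign bit, like 'ashr'.
    return value >> op2
```

Under these assumptions the model reproduces the vector example above: -2 shifted by 1 stays negative (-1), while 4 shifted by 3 becomes 0.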
      @@ -2902,26 +3519,22 @@ Instruction
      Syntax:
      -
         <result> = and <ty> <op1>, <op2>   ; yields {ty}:result
       
      Overview:
      - -

      The 'and' instruction returns the bitwise logical and of -its two operands.

      +

      The 'and' instruction returns the bitwise logical and of its two + operands.

      Arguments:
      - -

      The two arguments to the 'and' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      +

      The two arguments to the 'and' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      Semantics:

      The truth table used for the 'and' instruction is:

      -

      -
      + @@ -2951,7 +3564,7 @@ values. Both arguments must have identical types.

      -
      +
      Example:
         <result> = and i32 4, %var         ; yields {i32}:result = 4 & %var
      @@ -2961,22 +3574,26 @@ values.  Both arguments must have identical types.

      +
      +
      Syntax:
      -
        <result> = or <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = or <ty> <op1>, <op2>   ; yields {ty}:result
       
      +
      Overview:
      -

      The 'or' instruction returns the bitwise logical inclusive -or of its two operands.

      +

      The 'or' instruction returns the bitwise logical inclusive or of its + two operands.

      +
      Arguments:
      +

      The two arguments to the 'or' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      -

      The two arguments to the 'or' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      Semantics:

      The truth table used for the 'or' instruction is:

      -

      -
      + @@ -3006,34 +3623,40 @@ values. Both arguments must have identical types.

      -
      +
      Example:
      -
        <result> = or i32 4, %var         ; yields {i32}:result = 4 | %var
      +
      +  <result> = or i32 4, %var         ; yields {i32}:result = 4 | %var
         <result> = or i32 15, 40          ; yields {i32}:result = 47
         <result> = or i32 4, 8            ; yields {i32}:result = 12
       
      +
      + +
      +
      Syntax:
      -
        <result> = xor <ty> <op1>, <op2>   ; yields {ty}:result
      +
      +  <result> = xor <ty> <op1>, <op2>   ; yields {ty}:result
       
      +
      Overview:
      -

      The 'xor' instruction returns the bitwise logical exclusive -or of its two operands. The xor is used to implement the -"one's complement" operation, which is the "~" operator in C.

      +

      The 'xor' instruction returns the bitwise logical exclusive or of + its two operands. The xor is used to implement the "one's + complement" operation, which is the "~" operator in C.

      +
      Arguments:
      -

      The two arguments to the 'xor' instruction must be -integer or vector of integer -values. Both arguments must have identical types.

      +

      The two arguments to the 'xor' instruction must be + integer or vector of integer + values. Both arguments must have identical types.

      Semantics:
      -

      The truth table used for the 'xor' instruction is:

      -

      -
      + @@ -3063,29 +3686,30 @@ values. Both arguments must have identical types.

      -
      -

      +
      Example:
      -
        <result> = xor i32 4, %var         ; yields {i32}:result = 4 ^ %var
      +
      +  <result> = xor i32 4, %var         ; yields {i32}:result = 4 ^ %var
         <result> = xor i32 15, 40          ; yields {i32}:result = 39
         <result> = xor i32 4, 8            ; yields {i32}:result = 12
         <result> = xor i32 %V, -1          ; yields {i32}:result = ~%V
       
      +
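The 'xor' examples above, including the use of -1 to form the one's complement, can be checked with a small Python sketch. The names `MASK32`, `xor_i32` and `not_i32` are hypothetical and exist only for this illustration; i32 values are modeled here as unsigned 32-bit patterns, which is an assumption of the sketch rather than anything LLVM prescribes.

```python
MASK32 = 0xFFFFFFFF  # i32 values are modeled as unsigned 32-bit patterns


def xor_i32(a, b):
    """Bitwise exclusive or of two i32 values (hypothetical helper)."""
    return (a ^ b) & MASK32


def not_i32(v):
    """One's complement of v, expressed as 'xor i32 %V, -1'."""
    return xor_i32(v, -1)
```

For instance, `xor_i32(15, 40)` gives 39 as in the example above, and `not_i32(0)` gives the all-ones pattern.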
      -
      +

      LLVM supports several instructions to represent vector operations in a -target-independent manner. These instructions cover the element-access and -vector-specific operations needed to process vectors effectively. While LLVM -does directly support these vector operations, many sophisticated algorithms -will want to use target-specific intrinsics to take full advantage of a specific -target.

      + target-independent manner. These instructions cover the element-access and + vector-specific operations needed to process vectors effectively. While LLVM + does directly support these vector operations, many sophisticated algorithms + will want to use target-specific intrinsics to take full advantage of a + specific target.

      @@ -3097,365 +3721,216 @@ target.

      Syntax:
      - -
      -  <result> = extractelement <n x <ty>> <val>, i32 <idx>    ; yields <ty>
      -
      - -
      Overview:
      - -

      -The 'extractelement' instruction extracts a single scalar -element from a vector at a specified index. -

      - - -
      Arguments:
      - -

      -The first operand of an 'extractelement' instruction is a -value of vector type. The second operand is -an index indicating the position from which to extract the element. -The index may be a variable.

      - -
      Semantics:
      - -

      -The result is a scalar of the same type as the element type of -val. Its value is the value at position idx of -val. If idx exceeds the length of val, the -results are undefined. -

      - -
      Example:
      - -
      -  %result = extractelement <4 x i32> %vec, i32 0    ; yields i32
      -
      -
      - - - - - -
      - -
      Syntax:
      - -
      -  <result> = insertelement <n x <ty>> <val>, <ty> <elt>, i32 <idx>    ; yields <n x <ty>>
      -
      - -
      Overview:
      - -

      -The 'insertelement' instruction inserts a scalar -element into a vector at a specified index. -

      - - -
      Arguments:
      - -

      -The first operand of an 'insertelement' instruction is a -value of vector type. The second operand is a -scalar value whose type must equal the element type of the first -operand. The third operand is an index indicating the position at -which to insert the value. The index may be a variable.

      - -
      Semantics:
      - -

      -The result is a vector of the same type as val. Its -element values are those of val except at position -idx, where it gets the value elt. If idx -exceeds the length of val, the results are undefined. -

      - -
      Example:
      - -
      -  %result = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
      -
      -
      - - - - -
      - -
      Syntax:
      - -
      -  <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      -
      - -
      Overview:
      - -

      -The 'shufflevector' instruction constructs a permutation of elements -from two input vectors, returning a vector with the same element type as -the input and length that is the same as the shuffle mask. -

      - -
      Arguments:
      - -

      -The first two operands of a 'shufflevector' instruction are vectors -with types that match each other. The third argument is a shuffle mask whose -element type is always 'i32'. The result of the instruction is a vector whose -length is the same as the shuffle mask and whose element type is the same as -the element type of the first two operands. -

      - -

      -The shuffle mask operand is required to be a constant vector with either -constant integer or undef values. -

      - -
      Semantics:
      - -

      -The elements of the two input vectors are numbered from left to right across -both of the vectors. The shuffle mask operand specifies, for each element of -the result vector, which element of the two input vectors the result element -gets. The element selector may be undef (meaning "don't care") and the second -operand may be undef if performing a shuffle from only one vector. -

      - -
      Example:
      - -
      -  %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, 
      -                          <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      -  %result = shufflevector <4 x i32> %v1, <4 x i32> undef, 
      -                          <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      -  %result = shufflevector <8 x i32> %v1, <8 x i32> undef, 
      -                          <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      -  %result = shufflevector <4 x i32> %v1, <4 x i32> %v2, 
      -                          <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
      +
      +  <result> = extractelement <n x <ty>> <val>, i32 <idx>    ; yields <ty>
       
      -
      +
      Overview:
      +

      The 'extractelement' instruction extracts a single scalar element + from a vector at a specified index.

      - - -
      +
      Arguments:
      +

      The first operand of an 'extractelement' instruction is a value + of vector type. The second operand is an index + indicating the position from which to extract the element. The index may be + a variable.

      -

      LLVM supports several instructions for working with aggregate values. -

      +
      Semantics:
      +

      The result is a scalar of the same type as the element type of + val. Its value is the value at position idx of + val. If idx exceeds the length of val, the + results are undefined.

      + +
      Example:
      +
      +  <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
      +
      Syntax:
      -
      -  <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
      +  <result> = insertelement <n x <ty>> <val>, <ty> <elt>, i32 <idx>    ; yields <n x <ty>>
       
      Overview:
      - -

      -The 'extractvalue' instruction extracts the value of a struct field -or array element from an aggregate value. -

      - +

      The 'insertelement' instruction inserts a scalar element into a + vector at a specified index.

      Arguments:
      - -

      -The first operand of an 'extractvalue' instruction is a -value of struct or array -type. The operands are constant indices to specify which value to extract -in a similar manner as indices in a -'getelementptr' instruction. -

      +

      The first operand of an 'insertelement' instruction is a value + of vector type. The second operand is a scalar value + whose type must equal the element type of the first operand. The third + operand is an index indicating the position at which to insert the value. + The index may be a variable.

      Semantics:
      - -

      -The result is the value at the position in the aggregate specified by -the index operands. -

      +

      The result is a vector of the same type as val. Its element values + are those of val except at position idx, where it gets the + value elt. If idx exceeds the length of val, the + results are undefined.

      Example:
      -
      -  %result = extractvalue {i32, float} %agg, 0    ; yields i32
      +  <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
       
      -
      +
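The element-access semantics of 'extractelement' and 'insertelement' can be sketched in Python, with vectors modeled as lists. This is a hedged illustration, not LLVM code: both function names are hypothetical helpers, any out-of-range index is treated as the undefined case, and `None` stands in for an undefined result.

```python
def extractelement(vec, idx):
    """Return the element of vec at position idx (hypothetical helper)."""
    if not 0 <= idx < len(vec):
        return None  # index outside the vector: result is undefined
    return vec[idx]


def insertelement(vec, elt, idx):
    """Return a copy of vec with position idx replaced by elt (hypothetical helper)."""
    if not 0 <= idx < len(vec):
        return None  # index outside the vector: result is undefined
    result = list(vec)  # the input vector itself is left unchanged
    result[idx] = elt
    return result
```

Note that `insertelement` returns a new list, mirroring the way the instruction yields a new vector value instead of modifying its operand.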
      Syntax:
      -
      -  <result> = insertvalue <aggregate type> <val>, <ty> <val>, <idx>    ; yields <n x <ty>>
      +  <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
       
      Overview:
      - -

      -The 'insertvalue' instruction inserts a value -into a struct field or array element in an aggregate. -

      - +

      The 'shufflevector' instruction constructs a permutation of elements + from two input vectors, returning a vector with the same element type as the + input and length that is the same as the shuffle mask.

      Arguments:
      +

      The first two operands of a 'shufflevector' instruction are vectors + with types that match each other. The third argument is a shuffle mask whose + element type is always 'i32'. The result of the instruction is a vector + whose length is the same as the shuffle mask and whose element type is the + same as the element type of the first two operands.

      -

      -The first operand of an 'insertvalue' instruction is a -value of struct or array type. -The second operand is a first-class value to insert. -The following operands are constant indices -indicating the position at which to insert the value in a similar manner as -indices in a -'getelementptr' instruction. -The value to insert must have the same type as the value identified -by the indices. -

      +

      The shuffle mask operand is required to be a constant vector with either + constant integer or undef values.

      Semantics:
      - -

      -The result is an aggregate of the same type as val. Its -value is that of val except that the value at the position -specified by the indices is that of elt. -

      +

      The elements of the two input vectors are numbered from left to right across + both of the vectors. The shuffle mask operand specifies, for each element of + the result vector, which element of the two input vectors the result element + gets. The element selector may be undef (meaning "don't care") and the + second operand may be undef if performing a shuffle from only one vector.

      Example:
      -
      -  %result = insertvalue {i32, float} %agg, i32 1, 0    ; yields {i32, float}
      +  <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
      +                          <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      +  <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
      +                          <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      +  <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
      +                          <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      +  <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
      +                          <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
       
      -
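The element numbering used by 'shufflevector' may be easier to follow in a small model. The Python sketch below is an assumption-laden illustration, not LLVM code: `shufflevector` is a hypothetical helper, vectors are lists, and `None` represents undef both in the mask and in the input vectors.

```python
def shufflevector(v1, v2, mask):
    """Model 'shufflevector' on Python lists (hypothetical helper)."""
    # Elements are numbered left to right across both input vectors.
    combined = list(v1) + list(v2)
    # Each mask entry selects one element; an undef selector yields undef.
    return [None if m is None else combined[m] for m in mask]
```

With two 4-element inputs, the mask `[0, 4, 1, 5]` interleaves the first two elements of each vector, and the result length always equals the mask length.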
      + -
      - Memory Access and Addressing Operations +
      -

      A key design point of an SSA-based representation is how it -represents memory. In LLVM, no memory locations are in SSA form, which -makes things very simple. This section describes how to read, write, -allocate, and free memory in LLVM.

      +

      LLVM supports several instructions for working with aggregate values.

      Syntax:
      -
      -  <result> = malloc <type>[, i32 <NumElements>][, align <alignment>]     ; yields {type*}:result
      +  <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
       
      Overview:
      - -

      The 'malloc' instruction allocates memory from the system -heap and returns a pointer to it. The object is always allocated in the generic -address space (address space zero).

      +

      The 'extractvalue' instruction extracts the value of a struct field + or array element from an aggregate value.

      Arguments:
      - -

      The 'malloc' instruction allocates -sizeof(<type>)*NumElements -bytes of memory from the operating system and returns a pointer of the -appropriate type to the program. If "NumElements" is specified, it is the -number of elements allocated, otherwise "NumElements" is defaulted to be one. -If a constant alignment is specified, the value result of the allocation is guaranteed to -be aligned to at least that boundary. If not specified, or if zero, the target can -choose to align the allocation on any convenient boundary.

      - -

      'type' must be a sized type.

      +

      The first operand of an 'extractvalue' instruction is a value + of struct or array type. The + operands are constant indices to specify which value to extract in a similar + manner as indices in a + 'getelementptr' instruction.

      Semantics:
      - -

      Memory is allocated using the system "malloc" function, and -a pointer is returned. The result of a zero byte allocation is undefined. The -result is null if there is insufficient memory available.

      +

      The result is the value at the position in the aggregate specified by the + index operands.

      Example:
      -
      -  %array  = malloc [4 x i8]                     ; yields {[%4 x i8]*}:array
      -
      -  %size   = add i32 2, 2                        ; yields {i32}:size = i32 4
      -  %array1 = malloc i8, i32 4                    ; yields {i8*}:array1
      -  %array2 = malloc [12 x i8], i32 %size         ; yields {[12 x i8]*}:array2
      -  %array3 = malloc i32, i32 4, align 1024       ; yields {i32*}:array3
      -  %array4 = malloc i32, align 1024              ; yields {i32*}:array4
      +  <result> = extractvalue {i32, float} %agg, 0    ; yields i32
       
      -

      Note that the code generator does not yet respect the - alignment value.

      -
      Syntax:
      -
      -  free <type> <value>                           ; yields {void}
      +  <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>    ; yields <aggregate type>
       
      Overview:
      +

      The 'insertvalue' instruction inserts a value into a struct field or + array element in an aggregate.

      -

      The 'free' instruction returns memory back to the unused -memory heap to be reallocated in the future.

      Arguments:
      - -

      'value' shall be a pointer value that points to a value -that was allocated with the 'malloc' -instruction.

      +

      The first operand of an 'insertvalue' instruction is a value + of struct or array type. The + second operand is a first-class value to insert. The following operands are + constant indices indicating the position at which to insert the value in a + similar manner as indices in a + 'getelementptr' instruction. The + value to insert must have the same type as the value identified by the + indices.

      Semantics:
      - -

      Access to the memory pointed to by the pointer is no longer defined -after this instruction executes. If the pointer is null, the operation -is a noop.

      +

      The result is an aggregate of the same type as val. Its value is + that of val except that the value at the position specified by the + indices is that of elt.

      Example:
      -
      -  %array  = malloc [4 x i8]                     ; yields {[4 x i8]*}:array
      -            free   [4 x i8]* %array
      +  %agg1 = insertvalue {i32, float} undef, i32 1, 0         ; yields {i32 1, float undef}
      +  %agg2 = insertvalue {i32, float} %agg1, float %val, 1    ; yields {i32 1, float %val}
       
      + +
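The way index operands walk into an aggregate can be sketched in Python, with structs and arrays modeled as nested tuples. This is only an illustration under stated assumptions: `extractvalue` and `insertvalue` are hypothetical helper names, `None` represents undef, and no type checking is performed.

```python
def extractvalue(agg, *indices):
    """Follow the indices into a nested tuple and return the value found there."""
    for i in indices:
        agg = agg[i]
    return agg


def insertvalue(agg, elt, *indices):
    """Return a copy of agg with the value at the indexed position replaced by elt."""
    if not indices:
        return elt
    first, rest = indices[0], indices[1:]
    items = list(agg)
    # Rebuild only the path that leads to the indexed position.
    items[first] = insertvalue(agg[first], elt, *rest)
    return tuple(items)
```

Building `(1, 2.5)` from an undef pair takes two `insertvalue` steps, in the same way that `%agg1` and `%agg2` are built in the example above.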
      + + + + + +
      + +

      A key design point of an SSA-based representation is how it represents + memory. In LLVM, no memory locations are in SSA form, which makes things + very simple. This section describes how to read, write, and allocate + memory in LLVM.

      +
      @@ -3466,136 +3941,150 @@ is a noop.

      Syntax:
      -
         <result> = alloca <type>[, i32 <NumElements>][, align <alignment>]     ; yields {type*}:result
       
      Overview:
      -

      The 'alloca' instruction allocates memory on the stack frame of the -currently executing function, to be automatically released when this function -returns to its caller. The object is always allocated in the generic address -space (address space zero).

      + currently executing function, to be automatically released when this function + returns to its caller. The object is always allocated in the generic address + space (address space zero).

      Arguments:
      - -

      The 'alloca' instruction allocates sizeof(<type>)*NumElements -bytes of memory on the runtime stack, returning a pointer of the -appropriate type to the program. If "NumElements" is specified, it is the -number of elements allocated, otherwise "NumElements" is defaulted to be one. -If a constant alignment is specified, the value result of the allocation is guaranteed -to be aligned to at least that boundary. If not specified, or if zero, the target -can choose to align the allocation on any convenient boundary.

      +

      The 'alloca' instruction + allocates sizeof(<type>)*NumElements bytes of memory on the + runtime stack, returning a pointer of the appropriate type to the program. + If "NumElements" is specified, it is the number of elements allocated, + otherwise "NumElements" defaults to one. If a constant alignment is + specified, the result of the allocation is guaranteed to be aligned to + at least that boundary. If not specified, or if zero, the target can choose + to align the allocation on any convenient boundary compatible with the + type.

      'type' may be any sized type.

      Semantics:
      -

      Memory is allocated; a pointer is returned. The operation is undefined if -there is insufficient stack space for the allocation. 'alloca'd -memory is automatically released when the function returns. The 'alloca' -instruction is commonly used to represent automatic variables that must -have an address available. When the function returns (either with the ret or unwind -instructions), the memory is reclaimed. Allocating zero bytes -is legal, but the result is undefined.

      + there is insufficient stack space for the allocation. 'alloca'd + memory is automatically released when the function returns. The + 'alloca' instruction is commonly used to represent automatic + variables that must have an address available. When the function returns + (either with the ret + or unwind instructions), the memory is + reclaimed. Allocating zero bytes is legal, but the result is undefined.

      Example:
      -
         %ptr = alloca i32                             ; yields {i32*}:ptr
         %ptr = alloca i32, i32 4                      ; yields {i32*}:ptr
         %ptr = alloca i32, i32 4, align 1024          ; yields {i32*}:ptr
         %ptr = alloca i32, align 1024                 ; yields {i32*}:ptr
       
      +
      +
      +
      Syntax:
      -
        <result> = load <ty>* <pointer>[, align <alignment>]
      <result> = volatile load <ty>* <pointer>[, align <alignment>]
      +
      +  <result> = load <ty>* <pointer>[, align <alignment>]
      +  <result> = volatile load <ty>* <pointer>[, align <alignment>]
      +
      +
      Overview:

      The 'load' instruction is used to read from memory.

      +
      Arguments:
      -

      The argument to the 'load' instruction specifies the memory -address from which to load. The pointer must point to a first class type. If the load is -marked as volatile, then the optimizer is not allowed to modify -the number or order of execution of this load with other -volatile load and store -instructions.

      -

      -The optional constant "align" argument specifies the alignment of the operation -(that is, the alignment of the memory address). A value of 0 or an -omitted "align" argument means that the operation has the preferential -alignment for the target. It is the responsibility of the code emitter -to ensure that the alignment information is correct. Overestimating -the alignment results in an undefined behavior. Underestimating the -alignment may produce less efficient code. An alignment of 1 is always -safe. -

      +

      The argument to the 'load' instruction specifies the memory address + from which to load. The pointer must point to + a first class type. If the load is + marked as volatile, then the optimizer is not allowed to modify the + number or order of execution of this load with other + volatile load and store + instructions.

      + +

      The optional constant "align" argument specifies the alignment of the + operation (that is, the alignment of the memory address). A value of 0 or an + omitted "align" argument means that the operation has the preferential + alignment for the target. It is the responsibility of the code emitter to + ensure that the alignment information is correct. Overestimating the + alignment results in undefined behavior. Underestimating the alignment may + produce less efficient code. An alignment of 1 is always safe.

      +
      Semantics:
      -

      The location of memory pointed to is loaded. If the value being loaded -is of scalar type then the number of bytes read does not exceed the minimum -number of bytes needed to hold all bits of the type. For example, loading an -i24 reads at most three bytes. When loading a value of a type like -i20 with a size that is not an integral number of bytes, the result -is undefined if the value was not originally written using a store of the -same type.

      +

      The location of memory pointed to is loaded. If the value being loaded is of + scalar type then the number of bytes read does not exceed the minimum number + of bytes needed to hold all bits of the type. For example, loading an + i24 reads at most three bytes. When loading a value of a type like + i20 with a size that is not an integral number of bytes, the result + is undefined if the value was not originally written using a store of the + same type.

      +
      Examples:
      -
        %ptr = alloca i32                               ; yields {i32*}:ptr
      -  store i32 3, i32* %ptr                          ; yields {void}
      +
      +  %ptr = alloca i32                               ; yields {i32*}:ptr
      +  store i32 3, i32* %ptr                          ; yields {void}
         %val = load i32* %ptr                           ; yields {i32}:val = i32 3
       
      +
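The byte-count bound mentioned in the semantics above is simple arithmetic. The Python sketch below is illustrative only: `bytes_for_int` is a hypothetical helper that computes the minimum number of bytes needed to hold all bits of an integer type of a given width, which is the upper bound on the bytes read by a scalar load.

```python
def bytes_for_int(bits):
    """Minimum number of bytes needed to hold an integer type of 'bits' bits."""
    # Round up to a whole number of 8-bit bytes.
    return (bits + 7) // 8
```

Under this model an i24 needs three bytes, and so does an i20, whose size is not an integral number of bytes.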
      + +
      +
      Syntax:
      -
        store <ty> <value>, <ty>* <pointer>[, align <alignment>]                   ; yields {void}
      +
      +  store <ty> <value>, <ty>* <pointer>[, align <alignment>]                   ; yields {void}
         volatile store <ty> <value>, <ty>* <pointer>[, align <alignment>]          ; yields {void}
       
      +
      Overview:

      The 'store' instruction is used to write to memory.

      +
      Arguments:
      -

      There are two arguments to the 'store' instruction: a value -to store and an address at which to store it. The type of the '<pointer>' -operand must be a pointer to the first class type -of the '<value>' -operand. If the store is marked as volatile, then the -optimizer is not allowed to modify the number or order of execution of -this store with other volatile load and store instructions.

      -

      -The optional constant "align" argument specifies the alignment of the operation -(that is, the alignment of the memory address). A value of 0 or an -omitted "align" argument means that the operation has the preferential -alignment for the target. It is the responsibility of the code emitter -to ensure that the alignment information is correct. Overestimating -the alignment results in an undefined behavior. Underestimating the -alignment may produce less efficient code. An alignment of 1 is always -safe. -

      +

      There are two arguments to the 'store' instruction: a value to store + and an address at which to store it. The type of the + '<pointer>' operand must be a pointer to + the first class type of the + '<value>' operand. If the store is marked + as volatile, then the optimizer is not allowed to modify the number + or order of execution of this store with other + volatile load and store + instructions.

      + +

      The optional constant "align" argument specifies the alignment of the + operation (that is, the alignment of the memory address). A value of 0 or an + omitted "align" argument means that the operation has the preferential + alignment for the target. It is the responsibility of the code emitter to + ensure that the alignment information is correct. Overestimating the + alignment results in undefined behavior. Underestimating the alignment may + produce less efficient code. An alignment of 1 is always safe.

      +
      Semantics:
      -

      The contents of memory are updated to contain '<value>' -at the location specified by the '<pointer>' operand. -If '<value>' is of scalar type then the number of bytes -written does not exceed the minimum number of bytes needed to hold all -bits of the type. For example, storing an i24 writes at most -three bytes. When writing a value of a type like i20 with a -size that is not an integral number of bytes, it is unspecified what -happens to the extra bits that do not belong to the type, but they will -typically be overwritten.

      +

      The contents of memory are updated to contain '<value>' at the + location specified by the '<pointer>' operand. If + '<value>' is of scalar type then the number of bytes written + does not exceed the minimum number of bytes needed to hold all bits of the + type. For example, storing an i24 writes at most three bytes. When + writing a value of a type like i20 with a size that is not an + integral number of bytes, it is unspecified what happens to the extra bits + that do not belong to the type, but they will typically be overwritten.

      +
      Example:
      -
        %ptr = alloca i32                               ; yields {i32*}:ptr
      +
      +  %ptr = alloca i32                               ; yields {i32*}:ptr
         store i32 3, i32* %ptr                          ; yields {void}
         %val = load i32* %ptr                           ; yields {i32}:val = i32 3
       
      +
      @@ -3604,38 +4093,39 @@ typically be overwritten.

      +
      Syntax:
         <result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}*
      +  <result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}*
       
      Overview:
      - -

      -The 'getelementptr' instruction is used to get the address of a -subelement of an aggregate data structure. It performs address calculation only -and does not access memory.

      +

      The 'getelementptr' instruction is used to get the address of a + subelement of an aggregate data structure. It performs address calculation + only and does not access memory.

      Arguments:
      -

      The first argument is always a pointer, and forms the basis of the -calculation. The remaining arguments are indices, that indicate which of the -elements of the aggregate object are indexed. The interpretation of each index -is dependent on the type being indexed into. The first index always indexes the -pointer value given as the first argument, the second index indexes a value of -the type pointed to (not necessarily the value directly pointed to, since the -first index can be non-zero), etc. The first type indexed into must be a pointer -value, subsequent types can be arrays, vectors and structs. Note that subsequent -types being indexed into can never be pointers, since that would require loading -the pointer before continuing calculation.

      + calculation. The remaining arguments are indices that indicate which of the + elements of the aggregate object are indexed. The interpretation of each + index is dependent on the type being indexed into. The first index always + indexes the pointer value given as the first argument, the second index + indexes a value of the type pointed to (not necessarily the value directly + pointed to, since the first index can be non-zero), etc. The first type + indexed into must be a pointer value, subsequent types can be arrays, vectors + and structs. Note that subsequent types being indexed into can never be + pointers, since that would require loading the pointer before continuing + calculation.

      The type of each index argument depends on the type it is indexing into. -When indexing into a (packed) structure, only i32 integer -constants are allowed. When indexing into an array, pointer or vector, -integers of any width are allowed (also non-constants).

      + When indexing into a (optionally packed) structure, only i32 integer + constants are allowed. When indexing into an array, pointer or + vector, integers of any width are allowed, and they are not required to be + constant.

      -

      For example, let's consider a C code fragment and how it gets -compiled to LLVM:

      +

      For example, let's consider a C code fragment and how it gets compiled to + LLVM:

      @@ -3663,7 +4153,7 @@ int *foo(struct ST *s) {
       %RT = type { i8 , [10 x [20 x i32]], i8  }
       %ST = type { i32, double, %RT }
       
      -define i32* %foo(%ST* %s) {
      +define i32* @foo(%ST* %s) {
       entry:
         %reg = getelementptr %ST* %s, i32 1, i32 2, i32 1, i32 5, i32 13
         ret i32* %reg
      @@ -3672,23 +4162,22 @@ entry:
       
      Semantics:
      -

      In the example above, the first index is indexing into the '%ST*' -type, which is a pointer, yielding a '%ST' = '{ i32, double, %RT -}' type, a structure. The second index indexes into the third element of -the structure, yielding a '%RT' = '{ i8 , [10 x [20 x i32]], -i8 }' type, another structure. The third index indexes into the second -element of the structure, yielding a '[10 x [20 x i32]]' type, an -array. The two dimensions of the array are subscripted into, yielding an -'i32' type. The 'getelementptr' instruction returns a pointer -to this element, thus computing a value of 'i32*' type.

      + type, which is a pointer, yielding a '%ST' = '{ i32, double, %RT + }' type, a structure. The second index indexes into the third element + of the structure, yielding a '%RT' = '{ i8 , [10 x [20 x i32]], + i8 }' type, another structure. The third index indexes into the second + element of the structure, yielding a '[10 x [20 x i32]]' type, an + array. The two dimensions of the array are subscripted into, yielding an + 'i32' type. The 'getelementptr' instruction returns a + pointer to this element, thus computing a value of 'i32*' type.

      -

      Note that it is perfectly legal to index partially through a -structure, returning a pointer to an inner element. Because of this, -the LLVM code for the given testcase is equivalent to:

      +

      Note that it is perfectly legal to index partially through a structure, + returning a pointer to an inner element. Because of this, the LLVM code for + the given testcase is equivalent to:

      -  define i32* %foo(%ST* %s) {
      +  define i32* @foo(%ST* %s) {
           %t1 = getelementptr %ST* %s, i32 1                        ; yields %ST*:%t1
           %t2 = getelementptr %ST* %t1, i32 0, i32 2                ; yields %RT*:%t2
           %t3 = getelementptr %RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      @@ -3698,20 +4187,27 @@ the LLVM code for the given testcase is equivalent to:

      }
      -

      Note that it is undefined to access an array out of bounds: array -and pointer indexes must always be within the defined bounds of the -array type when accessed with an instruction that dereferences the -pointer (e.g. a load or store instruction). The one exception for -this rule is zero length arrays. These arrays are defined to be -accessible as variable length arrays, which requires access beyond the -zero'th element.

      +

      If the inbounds keyword is present, the result value of the + getelementptr is undefined if the base pointer is not an + in bounds address of an allocated object, or if any of the addresses + that would be formed by successive addition of the offsets implied by the + indices to the base address with infinitely precise arithmetic are not an + in bounds address of that allocated object. + The in bounds addresses for an allocated object are all the addresses + that point into the object, plus the address one byte past the end.

      -

      The getelementptr instruction is often confusing. For some more insight -into how it works, see the getelementptr -FAQ.

      +

      If the inbounds keyword is not present, the offsets are added to + the base address with silently-wrapping two's complement arithmetic, and + the result value of the getelementptr may be outside the object + pointed to by the base pointer. The result value may not necessarily be + used to access memory though, even if it happens to point into allocated + storage. See the Pointer Aliasing Rules + section for more information.

      -
      Example:
      +

      The getelementptr instruction is often confusing. For some more insight into + how it works, see the getelementptr FAQ.

      +
      Example:
           ; yields [12 x i8]*:aptr
           %aptr = getelementptr {i32, [12 x i8]}* %saptr, i64 0, i32 1
      @@ -3722,15 +4218,19 @@ FAQ.

           ; yields i32*:iptr
           %iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0
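Because getelementptr only computes an address, the five indices in the foo example amount to a sum of byte offsets. The C sketch below is illustrative only: the two struct definitions are an assumed C layout corresponding to %RT and %ST, and gep_direct, gep_by_hand and gep_paths_agree are hypothetical names, not part of LLVM. Neither function loads or stores through the pointer it forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed C counterparts of %RT = { i8, [10 x [20 x i32]], i8 }
   and %ST = { i32, double, %RT }. */
struct RT { char A; int32_t B[10][20]; char C; };
struct ST { int32_t X; double Y; struct RT Z; };

/* Address formed by the indices 1, 2, 1, 5, 13 in a single step. */
int32_t *gep_direct(struct ST *s) {
    return &s[1].Z.B[5][13];
}

/* The same address built one index at a time with byte arithmetic. */
int32_t *gep_by_hand(struct ST *s) {
    assert(s != NULL);
    unsigned char *p = (unsigned char *)s;
    p += 1 * sizeof(struct ST);        /* i32 1: step over one %ST       */
    p += offsetof(struct ST, Z);       /* i32 2: third field of %ST      */
    p += offsetof(struct RT, B);       /* i32 1: second field of %RT     */
    p += 5 * sizeof(int32_t[20]);      /* i32 5: sixth row of the array  */
    p += 13 * sizeof(int32_t);         /* i32 13: fourteenth element     */
    return (int32_t *)p;
}

/* Returns 1 when both computations yield the same address. */
int gep_paths_agree(void) {
    static struct ST objects[2];       /* s[1] must exist for the address to be in bounds */
    return gep_direct(objects) == gep_by_hand(objects);
}
```

The step-by-step version mirrors the equivalent sequence of partial getelementptr instructions shown earlier: each index contributes one offset, and nothing is dereferenced along the way.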
      +
      +
      +

      The instructions in this category are the conversion instructions (casting) -which all take a single operand and a type. They perform various bit conversions -on the operand.

      + which all take a single operand and a type. They perform various bit + conversions on the operand.

      +
      @@ -3745,31 +4245,30 @@ on the operand.

      Overview:
      -

      -The 'trunc' instruction truncates its operand to the type ty2. -

      +

      The 'trunc' instruction truncates its operand to the + type ty2.

      Arguments:
      -

      -The 'trunc' instruction takes a value to trunc, which must -be an integer type, and a type that specifies the size -and type of the result, which must be an integer -type. The bit size of value must be larger than the bit size of -ty2. Equal sized types are not allowed.

      +

      The 'trunc' instruction takes a value to trunc, which must + be an integer type, and a type that specifies the + size and type of the result, which must be + an integer type. The bit size of value must + be larger than the bit size of ty2. Equal sized types are not + allowed.

      Semantics:
      -

      -The 'trunc' instruction truncates the high order bits in value -and converts the remaining bits to ty2. Since the source size must be -larger than the destination size, trunc cannot be a no-op cast. -It will always truncate bits.

      +

      The 'trunc' instruction truncates the high order bits + in value and converts the remaining bits to ty2. Since the + source size must be larger than the destination size, trunc cannot + be a no-op cast. It will always truncate bits.

      Example:
         %X = trunc i32 257 to i8              ; yields i8:1
         %Y = trunc i32 123 to i1              ; yields i1:true
      -  %Y = trunc i32 122 to i1              ; yields i1:false
      +  %Z = trunc i32 122 to i1              ; yields i1:false
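The truncation rule can be modelled as a bit mask. This is a minimal C sketch, assuming the source is an i32 held in a uint32_t; trunc_i32 is a hypothetical helper name, not an LLVM API.

```c
#include <assert.h>
#include <stdint.h>

/* Model of trunc from i32 to an integer type of `bits` bits:
   the high-order bits are dropped and the low `bits` bits are kept. */
uint32_t trunc_i32(uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits < 32);    /* the destination must be narrower */
    return value & ((UINT32_C(1) << bits) - 1u);
}
```

With this model, trunc_i32(257, 8) keeps only the low byte, and truncating to one bit keeps the parity of the value, which matches the three examples above.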
       
      + @@ -3784,20 +4283,20 @@ It will always truncate bits.

      Overview:
      -

      The 'zext' instruction zero extends its operand to type -ty2.

      +

      The 'zext' instruction zero extends its operand to type + ty2.

      Arguments:
      -

      The 'zext' instruction takes a value to cast, which must be of -integer type, and a type to cast it to, which must -also be of integer type. The bit size of the -value must be smaller than the bit size of the destination type, -ty2.

      +

      The 'zext' instruction takes a value to cast, which must be of + integer type, and a type to cast it to, which must + also be of integer type. The bit size of the + value must be smaller than the bit size of the destination type, + ty2.

      Semantics:

      The zext fills the high order bits of the value with zero -bits until it reaches the size of the destination type, ty2.

      + bits until it reaches the size of the destination type, ty2.

      When zero extending from i1, the result will always be either 0 or 1.

      @@ -3806,6 +4305,7 @@ bits until it reaches the size of the destination type, ty2.

         %X = zext i32 257 to i64              ; yields i64:257
         %Y = zext i1 true to i32              ; yields i32:1
      +
      @@ -3823,18 +4323,16 @@ bits until it reaches the size of the destination type, ty2.

      The 'sext' sign extends value to the type ty2.

      Arguments:
      -

      -The 'sext' instruction takes a value to cast, which must be of -integer type, and a type to cast it to, which must -also be of integer type. The bit size of the -value must be smaller than the bit size of the destination type, -ty2.

      +

      The 'sext' instruction takes a value to cast, which must be of + integer type, and a type to cast it to, which must + also be of integer type. The bit size of the + value must be smaller than the bit size of the destination type, + ty2.

      Semantics:
      -

      -The 'sext' instruction performs a sign extension by copying the sign -bit (highest order bit) of the value until it reaches the bit size of -the type ty2.

      +

      The 'sext' instruction performs a sign extension by copying the sign + bit (highest order bit) of the value until it reaches the bit size + of the type ty2.

      When sign extending from i1, the extension always results in -1 or 0.
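The difference between zext and sext shows up only when the top bit of the source is set. The following C sketch models both for a source of `bits` bits extended to 32 bits; zext_to_i32 and sext_to_i32 are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* zext: the source occupies the low `bits` bits; the new high bits are zero. */
uint32_t zext_to_i32(uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits < 32);
    return value & ((UINT32_C(1) << bits) - 1u);
}

/* sext: the sign bit (bit bits-1) is copied into every new high bit. */
int32_t sext_to_i32(uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits < 32);
    int64_t sign = INT64_C(1) << (bits - 1);
    int64_t low = (int64_t)(value & ((UINT32_C(1) << bits) - 1u));
    /* Flipping the sign bit and then subtracting it spreads it upward. */
    return (int32_t)((low ^ sign) - sign);
}
```

For a one-bit source this gives 0 or 1 under zext and 0 or -1 under sext, as stated above. An i8 holding -1 (bit pattern 0xFF) sign extends to -1, whose 16-bit pattern reads as 65535 when printed as unsigned.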

      @@ -3843,6 +4341,7 @@ the type ty2.

         %X = sext i8 -1 to i16              ; yields i16 :65535
         %Y = sext i1 true to i32             ; yields i32:-1
      +
      @@ -3853,34 +4352,34 @@ the type ty2.

      Syntax:
      -
         <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
       
      Overview:

      The 'fptrunc' instruction truncates value to type -ty2.

      - + ty2.

      Arguments:

      The 'fptrunc' instruction takes a floating - point value to cast and a floating point type to -cast it to. The size of value must be larger than the size of -ty2. This implies that fptrunc cannot be used to make a -no-op cast.

      + point value to cast and a floating point type + to cast it to. The size of value must be larger than the size of + ty2. This implies that fptrunc cannot be used to make a + no-op cast.

      Semantics:
      -

      The 'fptrunc' instruction truncates a value from a larger -floating point type to a smaller -floating point type. If the value cannot fit within -the destination type, ty2, then the results are undefined.

      +

      The 'fptrunc' instruction truncates a value from a larger + floating point type to a smaller + floating point type. If the value cannot fit + within the destination type, ty2, then the results are + undefined.

      Example:
         %X = fptrunc double 123.0 to float         ; yields float:123.0
         %Y = fptrunc double 1.0E+300 to float      ; yields undefined
       
      +
      @@ -3896,26 +4395,27 @@ the destination type, ty2, then the results are undefined.

      Overview:

      The 'fpext' extends a floating point value to a larger -floating point value.

      + floating point value.

      Arguments:
      -

      The 'fpext' instruction takes a -floating point value to cast, -and a floating point type to cast it to. The source -type must be smaller than the destination type.

      +

      The 'fpext' instruction takes a + floating point value to cast, and + a floating point type to cast it to. The source + type must be smaller than the destination type.

      Semantics:

      The 'fpext' instruction extends the value from a smaller -floating point type to a larger -floating point type. The fpext cannot be -used to make a no-op cast because it always changes bits. Use -bitcast to make a no-op cast for a floating point cast.

      + floating point type to a larger + floating point type. The fpext cannot be + used to make a no-op cast because it always changes bits. Use + bitcast to make a no-op cast for a floating point cast.

      Example:
         %X = fpext float 3.1415 to double        ; yields double:3.1415
         %Y = fpext float 1.0 to double           ; yields double:1.0
       
      + @@ -3931,28 +4431,28 @@ used to make a no-op cast because it always changes bits. Use
      Overview:

      The 'fptoui' converts a floating point value to its -unsigned integer equivalent of type ty2. -

      + unsigned integer equivalent of type ty2.

      Arguments:
      -

      The 'fptoui' instruction takes a value to cast, which must be a -scalar or vector floating point value, and a type -to cast it to ty2, which must be an integer -type. If ty is a vector floating point type, ty2 must be a -vector integer type with the same number of elements as ty

      +

      The 'fptoui' instruction takes a value to cast, which must be a + scalar or vector floating point value, and a type + to cast it to ty2, which must be an integer + type. If ty is a vector floating point type, ty2 must be a + vector integer type with the same number of elements as ty.

      Semantics:
      -

      The 'fptoui' instruction converts its -floating point operand into the nearest (rounding -towards zero) unsigned integer value. If the value cannot fit in ty2, -the results are undefined.

      +

      The 'fptoui' instruction converts its + floating point operand into the nearest (rounding + towards zero) unsigned integer value. If the value cannot fit + in ty2, the results are undefined.

      Example:
         %X = fptoui double 123.0 to i32      ; yields i32:123
         %Y = fptoui float 1.0E+300 to i1     ; yields undefined:1
      -  %X = fptoui float 1.04E+17 to i8     ; yields undefined:1
      +  %Z = fptoui float 1.04E+17 to i8     ; yields undefined:1
       
      + @@ -3967,29 +4467,30 @@ the results are undefined.

      Overview:
      -

      The 'fptosi' instruction converts -floating point value to type ty2. -

      +

      The 'fptosi' instruction converts + floating point value to + type ty2.

      Arguments:
      -

      The 'fptosi' instruction takes a value to cast, which must be a -scalar or vector floating point value, and a type -to cast it to ty2, which must be an integer -type. If ty is a vector floating point type, ty2 must be a -vector integer type with the same number of elements as ty

      +

      The 'fptosi' instruction takes a value to cast, which must be a + scalar or vector floating point value, and a type + to cast it to ty2, which must be an integer + type. If ty is a vector floating point type, ty2 must be a + vector integer type with the same number of elements as ty.

      Semantics:
      -

      The 'fptosi' instruction converts its -floating point operand into the nearest (rounding -towards zero) signed integer value. If the value cannot fit in ty2, -the results are undefined.

      +

      The 'fptosi' instruction converts its + floating point operand into the nearest (rounding + towards zero) signed integer value. If the value cannot fit in ty2, + the results are undefined.

      Example:
         %X = fptosi double -123.0 to i32      ; yields i32:-123
         %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
      -  %X = fptosi float 1.04E+17 to i8      ; yields undefined:1
      +  %Z = fptosi float 1.04E+17 to i8      ; yields undefined:1
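Both conversions round toward zero and are only defined when the result fits in the destination type. A C cast from floating point to integer has the same two properties, so the sketch below uses it as a model for an i32 destination. The function names are hypothetical, and the range checks encode the "fits in ty2" condition rather than any LLVM API.

```c
#include <assert.h>
#include <stdint.h>

/* 1 when x, rounded toward zero, fits in i32. NaN fails both comparisons. */
int fptosi_i32_defined(double x) {
    return x > -2147483649.0 && x < 2147483648.0;
}

/* Signed conversion; only call this when the result is defined. */
int32_t fptosi_i32(double x) {
    assert(fptosi_i32_defined(x));
    return (int32_t)x;                 /* the cast discards the fraction */
}

/* 1 when x, rounded toward zero, fits in an unsigned 32-bit integer. */
int fptoui_i32_defined(double x) {
    return x > -1.0 && x < 4294967296.0;
}

/* Unsigned conversion; only call this when the result is defined. */
uint32_t fptoui_i32(double x) {
    assert(fptoui_i32_defined(x));
    return (uint32_t)x;
}
```

A value such as 1.04E+17 fails the range check for any 32-bit (or narrower) integer type, which corresponds to the undefined cases in the examples above.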
       
      + @@ -4005,25 +4506,27 @@ the results are undefined.

      Overview:

      The 'uitofp' instruction regards value as an unsigned -integer and converts that value to the ty2 type.

      + integer and converts that value to the ty2 type.

      Arguments:

      The 'uitofp' instruction takes a value to cast, which must be a -scalar or vector integer value, and a type to cast it -to ty2, which must be an floating point -type. If ty is a vector integer type, ty2 must be a vector -floating point type with the same number of elements as ty

      + scalar or vector integer value, and a type to cast + it to ty2, which must be a floating point + type. If ty is a vector integer type, ty2 must be a vector + floating point type with the same number of elements as ty.

      Semantics:

      The 'uitofp' instruction interprets its operand as an unsigned -integer quantity and converts it to the corresponding floating point value. If -the value cannot fit in the floating point value, the results are undefined.

      + integer quantity and converts it to the corresponding floating point + value. If the value cannot fit in the floating point value, the results are + undefined.

      Example:
         %X = uitofp i32 257 to float         ; yields float:257.0
         %Y = uitofp i8 -1 to double          ; yields double:255.0
       
      + @@ -4038,26 +4541,27 @@ the value cannot fit in the floating point value, the results are undefined.

      Overview:
      -

      The 'sitofp' instruction regards value as a signed -integer and converts that value to the ty2 type.

      +

      The 'sitofp' instruction regards value as a signed integer + and converts that value to the ty2 type.

      Arguments:

      The 'sitofp' instruction takes a value to cast, which must be a -scalar or vector integer value, and a type to cast it -to ty2, which must be an floating point -type. If ty is a vector integer type, ty2 must be a vector -floating point type with the same number of elements as ty

      + scalar or vector integer value, and a type to cast + it to ty2, which must be a floating point + type. If ty is a vector integer type, ty2 must be a vector + floating point type with the same number of elements as ty.

      Semantics:
      -

      The 'sitofp' instruction interprets its operand as a signed -integer quantity and converts it to the corresponding floating point value. If -the value cannot fit in the floating point value, the results are undefined.

      +

      The 'sitofp' instruction interprets its operand as a signed integer + quantity and converts it to the corresponding floating point value. If the + value cannot fit in the floating point value, the results are undefined.

      Example:
         %X = sitofp i32 257 to float         ; yields float:257.0
         %Y = sitofp i8 -1 to double          ; yields double:-1.0
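The uitofp and sitofp examples for i8 -1 start from the same bit pattern (0xFF) and differ only in how it is read. This C sketch, with hypothetical helper names, models both readings for a source of `bits` bits.

```c
#include <assert.h>
#include <stdint.h>

/* uitofp: read the low `bits` bits as an unsigned quantity. */
double uitofp_bits(uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits < 32);
    return (double)(value & ((UINT32_C(1) << bits) - 1u));
}

/* sitofp: read the same bits as a two's complement signed quantity. */
double sitofp_bits(uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits < 32);
    int64_t sign = INT64_C(1) << (bits - 1);
    int64_t low = (int64_t)(value & ((UINT32_C(1) << bits) - 1u));
    return (double)((low ^ sign) - sign);
}
```

For values whose top bit is clear, such as 257 in sixteen bits, the two functions agree.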
       
      + @@ -4072,28 +4576,29 @@ the value cannot fit in the floating point value, the results are undefined.

      Overview:
      -

      The 'ptrtoint' instruction converts the pointer value to -the integer type ty2.

      +

      The 'ptrtoint' instruction converts the pointer value to + the integer type ty2.

      Arguments:
      -

      The 'ptrtoint' instruction takes a value to cast, which -must be a pointer value, and a type to cast it to -ty2, which must be an integer type.

      +

      The 'ptrtoint' instruction takes a value to cast, which + must be a pointer value, and a type to cast it to + ty2, which must be an integer type.

      Semantics:

      The 'ptrtoint' instruction converts value to integer type -ty2 by interpreting the pointer value as an integer and either -truncating or zero extending that value to the size of the integer type. If -value is smaller than ty2 then a zero extension is done. If -value is larger than ty2 then a truncation is done. If they -are the same size, then nothing is done (no-op cast) other than a type -change.

      + ty2 by interpreting the pointer value as an integer and either + truncating or zero extending that value to the size of the integer type. If + value is smaller than ty2 then a zero extension is done. If + value is larger than ty2 then a truncation is done. If they + are the same size, then nothing is done (no-op cast) other than a type + change.

      Example:
         %X = ptrtoint i32* %x to i8           ; yields truncation on 32-bit architecture
         %Y = ptrtoint i32* %x to i64          ; yields zero extension on 32-bit architecture
       
      + @@ -4108,28 +4613,29 @@ change.

      Overview:
      -

      The 'inttoptr' instruction converts an integer value to -a pointer type, ty2.

      +

      The 'inttoptr' instruction converts an integer value to a + pointer type, ty2.

      Arguments:

      The 'inttoptr' instruction takes an integer -value to cast, and a type to cast it to, which must be a -pointer type.

      + value to cast, and a type to cast it to, which must be a + pointer type.

      Semantics:

      The 'inttoptr' instruction converts value to type -ty2 by applying either a zero extension or a truncation depending on -the size of the integer value. If value is larger than the -size of a pointer then a truncation is done. If value is smaller than -the size of a pointer then a zero extension is done. If they are the same size, -nothing is done (no-op cast).

      + ty2 by applying either a zero extension or a truncation depending on + the size of the integer value. If value is larger than the + size of a pointer then a truncation is done. If value is smaller + than the size of a pointer then a zero extension is done. If they are the + same size, nothing is done (no-op cast).

      Example:
         %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
      -  %X = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
      -  %Y = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
      +  %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
      +  %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
       
      + @@ -4144,61 +4650,68 @@ nothing is done (no-op cast).

      Overview:
      -

      The 'bitcast' instruction converts value to type -ty2 without changing any bits.

      + ty2 without changing any bits.

      Arguments:
      - -

      The 'bitcast' instruction takes a value to cast, which must be -a non-aggregate first class value, and a type to cast it to, which must also be -a non-aggregate first class type. The bit sizes of -value -and the destination type, ty2, must be identical. If the source -type is a pointer, the destination type must also be a pointer. This -instruction supports bitwise conversion of vectors to integers and to vectors -of other types (as long as they have the same size).

      +

      The 'bitcast' instruction takes a value to cast, which must be a + non-aggregate first class value, and a type to cast it to, which must also be + a non-aggregate first class type. The bit sizes + of value and the destination type, ty2, must be + identical. If the source type is a pointer, the destination type must also be + a pointer. This instruction supports bitwise conversion of vectors to + integers and to vectors of other types (as long as they have the same + size).

      Semantics:

      The 'bitcast' instruction converts value to type -ty2. It is always a no-op cast because no bits change with -this conversion. The conversion is done as if the value had been -stored to memory and read back as type ty2. Pointer types may only be -converted to other pointer types with this instruction. To convert pointers to -other types, use the inttoptr or -ptrtoint instructions first.

      + ty2. It is always a no-op cast because no bits change with + this conversion. The conversion is done as if the value had been + stored to memory and read back as type ty2. Pointer types may only + be converted to other pointer types with this instruction. To convert + pointers to other types, use the inttoptr or + ptrtoint instructions first.

      Example:
         %X = bitcast i8 255 to i8              ; yields i8 :-1
         %Y = bitcast i32* %x to i8*            ; yields i8*:%x
      -  %Z = bitcast <2 x int> %V to i64;      ; yields i64: %V   
      +  %Z = bitcast <2 x i32> %V to i64;      ; yields i64: %V
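The "stored to memory and read back" description of bitcast corresponds to a memcpy between two objects of the same size. The C sketch below illustrates this for float and a 32-bit integer; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer: store, then read back. */
uint32_t bitcast_float_to_i32(float f) {
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* bit sizes must be identical */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The reverse direction; no bit changes, only the type does. */
float bitcast_i32_to_float(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Round-tripping a value through both functions returns it unchanged. On a host that uses IEEE 754 single precision (an assumption, not something the text guarantees), 1.0f maps to 0x3F800000.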
       
      + +
      -

      The instructions in this category are the "miscellaneous" -instructions, which defy better classification.

      + +

      The instructions in this category are the "miscellaneous" instructions, which + defy better classification.

      +
      +
      +
      Syntax:
      -
        <result> = icmp <cond> <ty> <op1>, <op2>   ; yields {i1} or {<N x i1>}:result
      +
      +  <result> = icmp <cond> <ty> <op1>, <op2>   ; yields {i1} or {<N x i1>}:result
       
      +
      Overview:
      -

      The 'icmp' instruction returns a boolean value or -a vector of boolean values based on comparison -of its two integer, integer vector, or pointer operands.

      +

      The 'icmp' instruction returns a boolean value or a vector of + boolean values based on comparison of its two integer, integer vector, or + pointer operands.

      +
      Arguments:

      The 'icmp' instruction takes three operands. The first operand is -the condition code indicating the kind of comparison to perform. It is not -a value, just a keyword. The possible condition code are: -

      + the condition code indicating the kind of comparison to perform. It is not a + value, just a keyword. The possible condition codes are:

      +
      eq: equal
      ne: not equal
      @@ -4211,48 +4724,63 @@ a value, just a keyword. The possible condition code are:
      slt: signed less than
      sle: signed less or equal
      +

      The remaining two arguments must be integer or -pointer -or integer vector typed. -They must also be identical types.

      + pointer or integer vector + typed. They must also be identical types.

      +
      Semantics:
      -

      The 'icmp' compares op1 and op2 according to -the condition code given as cond. The comparison performed always -yields either an i1 or vector of i1 result, as follows: -

      +

      The 'icmp' compares op1 and op2 according to the + condition code given as cond. The comparison performed always yields + either an i1 or vector of i1 + result, as follows:

      +
      -
      -eq: yields true if the operands are equal,
      - false otherwise. No sign interpretation is necessary or performed.
      -
      -ne: yields true if the operands are unequal,
      - false otherwise. No sign interpretation is necessary or performed.
      +
      +eq: yields true if the operands are equal,
      + false otherwise. No sign interpretation is necessary or
      + performed.
      +
      +ne: yields true if the operands are unequal,
      + false otherwise. No sign interpretation is necessary or
      + performed.
      +
       ugt: interprets the operands as unsigned values and yields
      - true if op1 is greater than op2.
      + true if op1 is greater than op2.
      +
       uge: interprets the operands as unsigned values and yields
      - true if op1 is greater than or equal to op2.
      + true if op1 is greater than or equal
      + to op2.
      +
       ult: interprets the operands as unsigned values and yields
      - true if op1 is less than op2.
      + true if op1 is less than op2.
      +
       ule: interprets the operands as unsigned values and yields
      - true if op1 is less than or equal to op2.
      + true if op1 is less than or equal to op2.
      +
       sgt: interprets the operands as signed values and yields
      - true if op1 is greater than op2.
      + true if op1 is greater than op2.
      +
       sge: interprets the operands as signed values and yields
      - true if op1 is greater than or equal to op2.
      + true if op1 is greater than or equal
      + to op2.
      +
       slt: interprets the operands as signed values and yields
      - true if op1 is less than op2.
      + true if op1 is less than op2.
      +
       sle: interprets the operands as signed values and yields
      - true if op1 is less than or equal to op2.
      + true if op1 is less than or equal to op2.
      +

      If the operands are pointer typed, the pointer -values are compared as if they were integers.

      -

      If the operands are integer vectors, then they are compared -element by element. The result is an i1 vector with -the same number of elements as the values being compared. -Otherwise, the result is an i1. -

      + values are compared as if they were integers.

      + +

      If the operands are integer vectors, then they are compared element by + element. The result is an i1 vector with the same number of elements + as the values being compared. Otherwise, the result is an i1.
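The signed and unsigned condition codes compare the same bits under two different readings. The C sketch below models eq, ult and slt for scalar operands of `bits` bits; all names are hypothetical and the other condition codes follow the same pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Read the low `bits` bits of v as an unsigned value. */
static uint64_t as_unsigned(uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    return (uint64_t)v & ((UINT64_C(1) << bits) - 1u);
}

/* Read the low `bits` bits of v as a two's complement signed value. */
static int64_t as_signed(uint32_t v, unsigned bits) {
    int64_t sign = INT64_C(1) << (bits - 1);
    return ((int64_t)as_unsigned(v, bits) ^ sign) - sign;
}

/* eq: no sign interpretation is needed. */
int icmp_eq(uint32_t a, uint32_t b, unsigned bits) {
    return as_unsigned(a, bits) == as_unsigned(b, bits);
}

/* ult: both operands are interpreted as unsigned. */
int icmp_ult(uint32_t a, uint32_t b, unsigned bits) {
    return as_unsigned(a, bits) < as_unsigned(b, bits);
}

/* slt: both operands are interpreted as signed. */
int icmp_slt(uint32_t a, uint32_t b, unsigned bits) {
    return as_signed(a, bits) < as_signed(b, bits);
}
```

For the i16 bit pattern 0xFFFF compared against 1, slt holds because the pattern reads as -1, while ult does not because it reads as 65535.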

      Example:
      -
        <result> = icmp eq i32 4, 5          ; yields: result=false
      +
      +  <result> = icmp eq i32 4, 5          ; yields: result=false
         <result> = icmp ne float* %X, %X     ; yields: result=false
         <result> = icmp ult i16  4, 5        ; yields: result=true
         <result> = icmp sgt i16  4, 5        ; yields: result=false
      @@ -4268,25 +4796,30 @@ Otherwise, the result is an i1.
       
       
      +
       
      +
      Syntax:
      -
        <result> = fcmp <cond> <ty> <op1>, <op2>     ; yields {i1} or {<N x i1>}:result
      +
      +  <result> = fcmp <cond> <ty> <op1>, <op2>     ; yields {i1} or {<N x i1>}:result
       
      +
      Overview:
      -

      The 'fcmp' instruction returns a boolean value -or vector of boolean values based on comparison -of its operands.

      -

      -If the operands are floating point scalars, then the result -type is a boolean (i1). -

      -

      If the operands are floating point vectors, then the result type -is a vector of boolean with the same number of elements as the -operands being compared.

      +

      The 'fcmp' instruction returns a boolean value or vector of boolean + values based on comparison of its operands.

      + +

      If the operands are floating point scalars, then the result type is a boolean +(i1).

      + +

      If the operands are floating point vectors, then the result type is a vector + of boolean with the same number of elements as the operands being + compared.

      +
      Arguments:

      The 'fcmp' instruction takes three operands. The first operand is -the condition code indicating the kind of comparison to perform. It is not -a value, just a keyword. The possible condition codes are:

      + the condition code indicating the kind of comparison to perform. It is not a + value, just a keyword. The possible condition codes are:

      +
      1. false: no comparison, always returns false
      2. oeq: ordered and equal
      3. @@ -4305,52 +4838,71 @@ a value, just a keyword. The possible condition code are:

      4. uno: unordered (either nans)
      5. true: no comparison, always returns true
      +

      Ordered means that neither operand is a QNAN while -unordered means that either operand may be a QNAN.

      -

      Each of val1 and val2 arguments must be -either a floating point type -or a vector of floating point type. -They must have identical types.

      + unordered means that either operand may be a QNAN.

      + +

      Each of val1 and val2 arguments must be either + a floating point type or + a vector of floating point type. They must have + identical types.

      +
      Semantics:

      The 'fcmp' instruction compares op1 and op2 -according to the condition code given as cond. -If the operands are vectors, then the vectors are compared -element by element. -Each comparison performed -always yields an i1 result, as follows:

      + according to the condition code given as cond. If the operands are + vectors, then the vectors are compared element by element. Each comparison + performed always yields an i1 result, as + follows:

      +
      1. false: always yields false, regardless of operands.
      2. -
      3. oeq: yields true if both operands are not a QNAN and - op1 is equal to op2.
      4. + +
      5. oeq: yields true if both operands are not a QNAN and + op1 is equal to op2.
      6. +
      7. ogt: yields true if both operands are not a QNAN and - op1 is greater than op2.
      8. -
      9. oge: yields true if both operands are not a QNAN and - op1 is greater than or equal to op2.
      10. -
      11. olt: yields true if both operands are not a QNAN and - op1 is less than op2.
      12. -
      13. ole: yields true if both operands are not a QNAN and - op1 is less than or equal to op2.
      14. -
      15. one: yields true if both operands are not a QNAN and - op1 is not equal to op2.
      16. + op1 is greater than op2. +
      17. oge: yields true if both operands are not a QNAN and + op1 is greater than or equal to op2.
      18. + +
      19. olt: yields true if both operands are not a QNAN and + op1 is less than op2.
      20. + +
      21. ole: yields true if both operands are not a QNAN and + op1 is less than or equal to op2.
      22. + +
      23. one: yields true if both operands are not a QNAN and + op1 is not equal to op2.
      24. +
      25. ord: yields true if both operands are not a QNAN.
      26. -
      27. ueq: yields true if either operand is a QNAN or - op1 is equal to op2.
      28. -
      29. ugt: yields true if either operand is a QNAN or - op1 is greater than op2.
      30. -
      31. uge: yields true if either operand is a QNAN or - op1 is greater than or equal to op2.
      32. -
      33. ult: yields true if either operand is a QNAN or - op1 is less than op2.
      34. -
      35. ule: yields true if either operand is a QNAN or - op1 is less than or equal to op2.
      36. -
      37. une: yields true if either operand is a QNAN or - op1 is not equal to op2.
      38. + +
      39. ueq: yields true if either operand is a QNAN or + op1 is equal to op2.
      40. + +
      41. ugt: yields true if either operand is a QNAN or + op1 is greater than op2.
      42. + +
      43. uge: yields true if either operand is a QNAN or + op1 is greater than or equal to op2.
      44. + +
      45. ult: yields true if either operand is a QNAN or + op1 is less than op2.
      46. + +
      47. ule: yields true if either operand is a QNAN or + op1 is less than or equal to op2.
      48. + +
      49. une: yields true if either operand is a QNAN or + op1 is not equal to op2.
      50. +
      51. uno: yields true if either operand is a QNAN.
      52. +
      53. true: always yields true, regardless of operands.
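The ordered/unordered rules above can be sketched in Python. This is a rough model under stated assumptions, not LLVM's implementation: the function name `fcmp` is hypothetical, Python floats stand in for the operands, and `math.isnan` is used as the NaN test, so no distinction between quiet and signaling NaNs is modelled.

```python
import math

def fcmp(cond, op1, op2):
    """Hypothetical model of scalar fcmp; returns a Python bool."""
    if cond == "false":
        return False
    if cond == "true":
        return True
    # "Unordered" means at least one operand is a NaN.
    unordered = math.isnan(op1) or math.isnan(op2)
    if cond == "ord":
        return not unordered
    if cond == "uno":
        return unordered
    relation = {
        "eq": op1 == op2, "gt": op1 > op2, "ge": op1 >= op2,
        "lt": op1 < op2, "le": op1 <= op2, "ne": op1 != op2,
    }[cond[1:]]
    if cond[0] == "o":
        # Ordered codes: neither operand is a NaN and the relation holds.
        return (not unordered) and relation
    # Unordered codes: either operand is a NaN or the relation holds.
    return unordered or relation
```

With a NaN operand every ordered code yields false and every unordered code yields true, which is why `one` and `une` differ only when a NaN is involved.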
      Example:
      -
        <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
      +
      +  <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
         <result> = fcmp one float 4.0, 5.0    ; yields: result=true
         <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
         <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
      @@ -4361,109 +4913,6 @@ always yields an i1 result, as follows:

      - - -
      -
      Syntax:
      -
        <result> = vicmp <cond> <ty> <op1>, <op2>   ; yields {ty}:result
      -
      -
      Overview:
      -

      The 'vicmp' instruction returns an integer vector value based on -element-wise comparison of its two integer vector operands.

      -
      Arguments:
      -

      The 'vicmp' instruction takes three operands. The first operand is -the condition code indicating the kind of comparison to perform. It is not -a value, just a keyword. The possible condition code are:

      -
        -
      1. eq: equal
      2. -
      3. ne: not equal
      4. -
      5. ugt: unsigned greater than
      6. -
      7. uge: unsigned greater or equal
      8. -
      9. ult: unsigned less than
      10. -
      11. ule: unsigned less or equal
      12. -
      13. sgt: signed greater than
      14. -
      15. sge: signed greater or equal
      16. -
      17. slt: signed less than
      18. -
      19. sle: signed less or equal
      20. -
      -

      The remaining two arguments must be vector or -integer typed. They must also be identical types.

      -
      Semantics:
      -

      The 'vicmp' instruction compares op1 and op2 -according to the condition code given as cond. The comparison yields a -vector of integer result, of -identical type as the values being compared. The most significant bit in each -element is 1 if the element-wise comparison evaluates to true, and is 0 -otherwise. All other bits of the result are undefined. The condition codes -are evaluated identically to the 'icmp' -instruction.

      - -
      Example:
      -
      -  <result> = vicmp eq <2 x i32> < i32 4, i32 0>, < i32 5, i32 0>   ; yields: result=<2 x i32> < i32 0, i32 -1 >
      -  <result> = vicmp ult <2 x i8 > < i8 1, i8 2>, < i8 2, i8 2 >        ; yields: result=<2 x i8> < i8 -1, i8 0 >
      -
      -
      - - - -
      -
      Syntax:
      -
        <result> = vfcmp <cond> <ty> <op1>, <op2>
      -
      Overview:
      -

      The 'vfcmp' instruction returns an integer vector value based on -element-wise comparison of its two floating point vector operands. The output -elements have the same width as the input elements.

      -
      Arguments:
      -

      The 'vfcmp' instruction takes three operands. The first operand is -the condition code indicating the kind of comparison to perform. It is not -a value, just a keyword. The possible condition code are:

      -
        -
      1. false: no comparison, always returns false
      2. -
      3. oeq: ordered and equal
      4. -
      5. ogt: ordered and greater than
      6. -
      7. oge: ordered and greater than or equal
      8. -
      9. olt: ordered and less than
      10. -
      11. ole: ordered and less than or equal
      12. -
      13. one: ordered and not equal
      14. -
      15. ord: ordered (no nans)
      16. -
      17. ueq: unordered or equal
      18. -
      19. ugt: unordered or greater than
      20. -
      21. uge: unordered or greater than or equal
      22. -
      23. ult: unordered or less than
      24. -
      25. ule: unordered or less than or equal
      26. -
      27. une: unordered or not equal
      28. -
      29. uno: unordered (either nans)
      30. -
      31. true: no comparison, always returns true
      32. -
      -

      The remaining two arguments must be vector of -floating point typed. They must also be identical -types.

      -
      Semantics:
      -

      The 'vfcmp' instruction compares op1 and op2 -according to the condition code given as cond. The comparison yields a -vector of integer result, with -an identical number of elements as the values being compared, and each element -having identical with to the width of the floating point elements. The most -significant bit in each element is 1 if the element-wise comparison evaluates to -true, and is 0 otherwise. All other bits of the result are undefined. The -condition codes are evaluated identically to the -'fcmp' instruction.

      - -
      Example:
      -
      -  ; yields: result=<2 x i32> < i32 0, i32 -1 >
      -  <result> = vfcmp oeq <2 x float> < float 4, float 0 >, < float 5, float 0 >
      -  
      -  ; yields: result=<2 x i64> < i64 -1, i64 0 >
      -  <result> = vfcmp ult <2 x double> < double 1, double 2 >, < double 2, double 2>
      -
      -
      -
      'phi' Instruction @@ -4472,29 +4921,35 @@ condition codes are evaluated identically to the
      Syntax:
      +
      +  <result> = phi <ty> [ <val0>, <label0>], ...
      +
      -
        <result> = phi <ty> [ <val0>, <label0>], ...
      Overview:
      -

      The 'phi' instruction is used to implement the φ node in -the SSA graph representing the function.

      -
      Arguments:
      - -

      The type of the incoming values is specified with the first type -field. After this, the 'phi' instruction takes a list of pairs -as arguments, with one pair for each predecessor basic block of the -current block. Only values of first class -type may be used as the value arguments to the PHI node. Only labels -may be used as the label arguments.

      +

      The 'phi' instruction is used to implement the φ node in the + SSA graph representing the function.

      -

      There must be no non-phi instructions between the start of a basic -block and the PHI instructions: i.e. PHI instructions must be first in -a basic block.

      +
      Arguments:
      +

      The type of the incoming values is specified with the first type field. After + this, the 'phi' instruction takes a list of pairs as arguments, with + one pair for each predecessor basic block of the current block. Only values + of first class type may be used as the value + arguments to the PHI node. Only labels may be used as the label + arguments.

      + +

      There must be no non-phi instructions between the start of a basic block and + the PHI instructions: i.e. PHI instructions must be first in a basic + block.

      + +

      For the purposes of the SSA form, the use of each incoming value is deemed to + occur on the edge from the corresponding predecessor block to the current + block (but after any definition of an 'invoke' instruction's return + value on the same edge).

      Semantics:
      -

      At runtime, the 'phi' instruction logically takes on the value -specified by the pair corresponding to the predecessor basic block that executed -just prior to the current block.

      + specified by the pair corresponding to the predecessor basic block that + executed just prior to the current block.

      Example:
      @@ -4503,6 +4958,7 @@ Loop:       ; Infinite loop that counts from 0 on up...
         %nextindvar = add i32 %indvar, 1
         br label %Loop
       
      +
      @@ -4513,7 +4969,6 @@ Loop: ; Infinite loop that counts from 0 on up...
      Syntax:
      -
         <result> = select selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
       
      @@ -4521,38 +4976,25 @@ Loop:       ; Infinite loop that counts from 0 on up...
       
      Overview:
      - -

      -The 'select' instruction is used to choose one value based on a -condition, without branching. -

      +

      The 'select' instruction is used to choose one value based on a + condition, without branching.

      Arguments:
      - -

      -The 'select' instruction requires an 'i1' value or -a vector of 'i1' values indicating the -condition, and two values of the same first class -type. If the val1/val2 are vectors and -the condition is a scalar, then entire vectors are selected, not -individual elements. -

      +

      The 'select' instruction requires an 'i1' value or a vector of 'i1' + values indicating the condition, and two values of the + same first class type. If the val1/val2 are + vectors and the condition is a scalar, then entire vectors are selected, not + individual elements.

      Semantics:
      +

      If the condition is an i1 and it evaluates to 1, the instruction returns the + first value argument; otherwise, it returns the second value argument.

      -

      -If the condition is an i1 and it evaluates to 1, the instruction returns the first -value argument; otherwise, it returns the second value argument. -

      -

      -If the condition is a vector of i1, then the value arguments must -be vectors of the same size, and the selection is done element -by element. -

      +

      If the condition is a vector of i1, then the value arguments must be vectors + of the same size, and the selection is done element by element.
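The scalar and vector cases above can be sketched in Python. This is a minimal model, not LLVM code: the function name `select` is hypothetical, vectors are represented as Python lists, and `i1` values as booleans.

```python
def select(cond, val1, val2):
    """Hypothetical model of select for scalar or vector conditions."""
    if isinstance(cond, list):
        # Vector condition: the value arguments must be vectors of the
        # same size, and each lane is chosen independently.
        if not (len(cond) == len(val1) == len(val2)):
            raise ValueError("condition and value vectors must match in size")
        return [a if c else b for c, a, b in zip(cond, val1, val2)]
    # Scalar condition: the whole first or second value is returned,
    # even when the values themselves are vectors.
    return val1 if cond else val2
```

For instance, `select(True, 17, 42)` mirrors the `%X` example and returns 17, while a scalar condition applied to two lists returns one list in its entirety.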

      Example:
      -
         %X = select i1 true, i8 17, i8 42          ; yields i8:17
       
      @@ -4562,7 +5004,6 @@ by element.
      -
      'call' Instruction @@ -4576,75 +5017,76 @@ by element.
      Overview:
      -

      The 'call' instruction represents a simple function call.

      Arguments:
      -

      This instruction requires several arguments:

        -
      1. -

        The optional "tail" marker indicates whether the callee function accesses - any allocas or varargs in the caller. If the "tail" marker is present, the - function call is eligible for tail call optimization. Note that calls may - be marked "tail" even if they do not occur before a ret instruction.

        -
      2. -
      3. -

        The optional "cconv" marker indicates which calling - convention the call should use. If none is specified, the call defaults - to using C calling conventions.

        +
      4. The optional "tail" marker indicates that the callee function does not + access any allocas or varargs in the caller. Note that calls may be + marked "tail" even if they do not occur before + a ret instruction. If the "tail" marker is + present, the function call is eligible for tail call optimization, + but might not in fact be + optimized into a jump. As of this writing, the extra requirements for + a call to actually be optimized are: +
          +
        • Caller and callee both have the calling + convention fastcc.
        • +
        • The call is in tail position (ret immediately follows call and ret + uses value of call or is void).
        • +
        • Option -tailcallopt is enabled, + or llvm::PerformTailCallOpt is true.
        • +
        • Platform specific + constraints are met.
        • +
      5. -
      6. -

        The optional Parameter Attributes list for - return values. Only 'zeroext', 'signext', - and 'inreg' attributes are valid here.

        -
      7. +
      8. The optional "cconv" marker indicates which calling + convention the call should use. If none is specified, the call + defaults to using C calling conventions. The calling convention of the + call must match the calling convention of the target function, or else the + behavior is undefined.
      9. -
      10. -

        'ty': the type of the call instruction itself which is also - the type of the return value. Functions that return no value are marked - void.

        -
      11. -
      12. -

        'fnty': shall be the signature of the pointer to function - value being invoked. The argument types must match the types implied by - this signature. This type can be omitted if the function is not varargs - and if the function type does not return a pointer to a function.

        -
      13. -
      14. -

        'fnptrval': An LLVM value containing a pointer to a function to - be invoked. In most cases, this is a direct function invocation, but - indirect calls are just as possible, calling an arbitrary pointer - to function value.

        -
      15. -
      16. -

        'function args': argument list whose types match the - function signature argument types. All arguments must be of - first class type. If the function signature - indicates the function accepts a variable number of arguments, the extra - arguments can be specified.

        -
      17. -
      18. -

        The optional function attributes list. Only - 'noreturn', 'nounwind', 'readonly' and - 'readnone' attributes are valid here.

        -
      19. +
      20. The optional Parameter Attributes list for + return values. Only 'zeroext', 'signext', and + 'inreg' attributes are valid here.
      21. + +
      22. 'ty': the type of the call instruction itself which is also the + type of the return value. Functions that return no value are marked + void.
      23. + +
      24. 'fnty': shall be the signature of the pointer to function value + being invoked. The argument types must match the types implied by this + signature. This type can be omitted if the function is not varargs and if + the function type does not return a pointer to a function.
      25. + +
      26. 'fnptrval': An LLVM value containing a pointer to a function to + be invoked. In most cases, this is a direct function invocation, but + indirect calls are just as possible, calling an arbitrary pointer + to function value.
      27. + +
      28. 'function args': argument list whose types match the function + signature argument types. All arguments must be of + first class type. If the function signature + indicates the function accepts a variable number of arguments, the extra + arguments can be specified.
      29. + +
      30. The optional function attributes list. Only + 'noreturn', 'nounwind', 'readonly' and + 'readnone' attributes are valid here.
      Semantics:
      - -

      The 'call' instruction is used to cause control flow to -transfer to a specified function, with its incoming arguments bound to -the specified values. Upon a 'ret' -instruction in the called function, control flow continues with the -instruction after the function call, and the return value of the -function is bound to the result argument.

      +

      The 'call' instruction is used to cause control flow to transfer to + a specified function, with its incoming arguments bound to the specified + values. Upon a 'ret' instruction in the called + function, control flow continues with the instruction after the function + call, and the return value of the function is bound to the result + argument.

      Example:
      -
         %retval = call i32 @test(i32 %argc)
         call i32 (i8 *, ...)* @printf(i8 * %msg, i32 12, i8 42)      ; yields i32
      @@ -4660,6 +5102,12 @@ function is bound to the result argument.

      %ZZ = call zeroext i32 @bar() ; Return value is zero extended
      +

      LLVM treats calls to some functions with names and arguments that match the +standard C99 library as being the C99 library functions, and may perform +optimizations or generate code for them under that assumption. This is +something we'd like to change in the future to provide better support for +freestanding environments and non-C-based languages.

      +
      @@ -4670,47 +5118,41 @@ function is bound to the result argument.

      Syntax:
      -
         <resultval> = va_arg <va_list*> <arglist>, <argty>
       
      Overview:
      -

      The 'va_arg' instruction is used to access arguments passed through -the "variable argument" area of a function call. It is used to implement the -va_arg macro in C.

      + the "variable argument" area of a function call. It is used to implement the + va_arg macro in C.

      Arguments:
      - -

      This instruction takes a va_list* value and the type of -the argument. It returns a value of the specified argument type and -increments the va_list to point to the next argument. The -actual type of va_list is target specific.

      +

      This instruction takes a va_list* value and the type of the + argument. It returns a value of the specified argument type and increments + the va_list to point to the next argument. The actual type + of va_list is target specific.

      Semantics:
      - -

      The 'va_arg' instruction loads an argument of the specified -type from the specified va_list and causes the -va_list to point to the next argument. For more information, -see the variable argument handling Intrinsic -Functions.

      +

      The 'va_arg' instruction loads an argument of the specified type + from the specified va_list and causes the va_list to point + to the next argument. For more information, see the variable argument + handling Intrinsic Functions.

      It is legal for this instruction to be called in a function which does not -take a variable number of arguments, for example, the vfprintf -function.

      + take a variable number of arguments, for example, the vfprintf + function.

      -

      va_arg is an LLVM instruction instead of an intrinsic function because it takes a type as an -argument.

      +

      va_arg is an LLVM instruction instead of + an intrinsic function because it takes a type as an + argument.

      Example:
      -

      See the variable argument processing section.

      -

      Note that the code generator does not yet fully support va_arg - on many targets. Also, it does not currently support va_arg with - aggregate types on any target.

      +

      Note that the code generator does not yet fully support va_arg on many + targets. Also, it does not currently support va_arg with aggregate types on + any target.

      @@ -4721,45 +5163,45 @@ argument.

      LLVM supports the notion of an "intrinsic function". These functions have -well known names and semantics and are required to follow certain restrictions. -Overall, these intrinsics represent an extension mechanism for the LLVM -language that does not require changing all of the transformations in LLVM when -adding to the language (or the bitcode reader/writer, the parser, etc...).

      + well known names and semantics and are required to follow certain + restrictions. Overall, these intrinsics represent an extension mechanism for + the LLVM language that does not require changing all of the transformations + in LLVM when adding to the language (or the bitcode reader/writer, the + parser, etc...).

      Intrinsic function names must all start with an "llvm." prefix. This -prefix is reserved in LLVM for intrinsic names; thus, function names may not -begin with this prefix. Intrinsic functions must always be external functions: -you cannot define the body of intrinsic functions. Intrinsic functions may -only be used in call or invoke instructions: it is illegal to take the address -of an intrinsic function. Additionally, because intrinsic functions are part -of the LLVM language, it is required if any are added that they be documented -here.

      - -

      Some intrinsic functions can be overloaded, i.e., the intrinsic represents -a family of functions that perform the same operation but on different data -types. Because LLVM can represent over 8 million different integer types, -overloading is used commonly to allow an intrinsic function to operate on any -integer type. One or more of the argument types or the result type can be -overloaded to accept any integer type. Argument types may also be defined as -exactly matching a previous argument's type or the result type. This allows an -intrinsic function which accepts multiple arguments, but needs all of them to -be of the same type, to only be overloaded with respect to a single argument or -the result.

      - -

      Overloaded intrinsics will have the names of its overloaded argument types -encoded into its function name, each preceded by a period. Only those types -which are overloaded result in a name suffix. Arguments whose type is matched -against another type do not. For example, the llvm.ctpop function can -take an integer of any width and returns an integer of exactly the same integer -width. This leads to a family of functions such as -i8 @llvm.ctpop.i8(i8 %val) and i29 @llvm.ctpop.i29(i29 %val). -Only one type, the return type, is overloaded, and only one type suffix is -required. Because the argument's type is matched against the return type, it -does not require its own name suffix.

      - -

      To learn how to add an intrinsic function, please see the -Extending LLVM Guide. -

      + prefix is reserved in LLVM for intrinsic names; thus, function names may not + begin with this prefix. Intrinsic functions must always be external + functions: you cannot define the body of intrinsic functions. Intrinsic + functions may only be used in call or invoke instructions: it is illegal to + take the address of an intrinsic function. Additionally, because intrinsic + functions are part of the LLVM language, it is required if any are added that + they be documented here.

      + +

      Some intrinsic functions can be overloaded, i.e., the intrinsic represents a + family of functions that perform the same operation but on different data + types. Because LLVM can represent over 8 million different integer types, + overloading is used commonly to allow an intrinsic function to operate on any + integer type. One or more of the argument types or the result type can be + overloaded to accept any integer type. Argument types may also be defined as + exactly matching a previous argument's type or the result type. This allows + an intrinsic function which accepts multiple arguments, but needs all of them + to be of the same type, to only be overloaded with respect to a single + argument or the result.

      + +

      An overloaded intrinsic has the names of its overloaded argument types + encoded into its function name, each preceded by a period. Only those types + which are overloaded result in a name suffix. Arguments whose type is matched + against another type do not. For example, the llvm.ctpop function + can take an integer of any width and returns an integer of exactly the same + integer width. This leads to a family of functions such as + i8 @llvm.ctpop.i8(i8 %val) and i29 @llvm.ctpop.i29(i29 + %val). Only one type, the return type, is overloaded, and only one type + suffix is required. Because the argument's type is matched against the return + type, it does not require its own name suffix.
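The naming rule above can be sketched in Python. The helpers `overloaded_name` and `ctpop_declaration` are hypothetical and exist only to illustrate how the suffix is formed; they are not part of any LLVM API.

```python
def overloaded_name(base, overloaded_types):
    """Append one period-separated suffix per overloaded type."""
    return ".".join([base] + list(overloaded_types))

def ctpop_declaration(width):
    """Build the declaration text for llvm.ctpop at a given integer width.

    Only the return type is overloaded, so there is a single suffix; the
    argument type is matched against it and adds no suffix of its own.
    """
    ty = "i%d" % width
    name = overloaded_name("llvm.ctpop", [ty])
    return "declare %s @%s(%s %%val)" % (ty, name, ty)
```

Calling `ctpop_declaration(8)` and `ctpop_declaration(29)` produces the two family members named in the text, `@llvm.ctpop.i8` and `@llvm.ctpop.i29`.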

      + +

      To learn how to add an intrinsic function, please see the + Extending LLVM Guide.

      @@ -4770,20 +5212,19 @@ does not require its own name suffix.

      -

      Variable argument support is defined in LLVM with the va_arg instruction and these three -intrinsic functions. These functions are related to the similarly -named macros defined in the <stdarg.h> header file.

      +

      Variable argument support is defined in LLVM with + the va_arg instruction and these three + intrinsic functions. These functions are related to the similarly named + macros defined in the <stdarg.h> header file.

      -

      All of these functions operate on arguments that use a -target-specific value type "va_list". The LLVM assembly -language reference manual does not define what this type is, so all -transformations should be prepared to handle these functions regardless of -the type used.

      +

      All of these functions operate on arguments that use a target-specific value + type "va_list". The LLVM assembly language reference manual does + not define what this type is, so all transformations should be prepared to + handle these functions regardless of the type used.

      This example shows how the va_arg -instruction and the variable argument handling intrinsic functions are -used.

      + instruction and the variable argument handling intrinsic functions are + used.

      @@ -4822,25 +5263,27 @@ declare void @llvm.va_end(i8*)
       
       
       
      +
      Syntax:
      -
        declare void %llvm.va_start(i8* <arglist>)
      +
      +  declare void %llvm.va_start(i8* <arglist>)
      +
      +
      Overview:
      -

      The 'llvm.va_start' intrinsic initializes -*<arglist> for subsequent use by va_arg.

      +

      The 'llvm.va_start' intrinsic initializes *<arglist> + for subsequent use by va_arg.

      Arguments:
      -

      The argument is a pointer to a va_list element to initialize.

      Semantics:
      -

      The 'llvm.va_start' intrinsic works just like the va_start -macro available in C. In a target-dependent way, it initializes the -va_list element to which the argument points, so that the next call to -va_arg will produce the first variable argument passed to the function. -Unlike the C va_start macro, this intrinsic does not need to know the -last argument of the function as the compiler can figure that out.

      + macro available in C. In a target-dependent way, it initializes + the va_list element to which the argument points, so that the next + call to va_arg will produce the first variable argument passed to + the function. Unlike the C va_start macro, this intrinsic does not + need to know the last argument of the function as the compiler can figure + that out.

      @@ -4850,26 +5293,28 @@ last argument of the function as the compiler can figure that out.

      +
      Syntax:
      -
        declare void @llvm.va_end(i8* <arglist>)
      -
      Overview:
      +
      +  declare void @llvm.va_end(i8* <arglist>)
      +
      +
      Overview:

      The 'llvm.va_end' intrinsic destroys *<arglist>, -which has been initialized previously with llvm.va_start -or llvm.va_copy.

      + which has been initialized previously + with llvm.va_start + or llvm.va_copy.

      Arguments:
      -

      The argument is a pointer to a va_list to destroy.

      Semantics:
      -

      The 'llvm.va_end' intrinsic works just like the va_end -macro available in C. In a target-dependent way, it destroys the -va_list element to which the argument points. Calls to llvm.va_start and -llvm.va_copy must be matched exactly with calls to -llvm.va_end.

      + macro available in C. In a target-dependent way, it destroys + the va_list element to which the argument points. Calls + to llvm.va_start + and llvm.va_copy must be matched exactly + with calls to llvm.va_end.

      @@ -4881,30 +5326,26 @@ href="#int_va_start">llvm.va_start and @@ -4915,20 +5356,18 @@ example, memory allocation.

      -

      -LLVM support for Accurate Garbage +

      LLVM support for Accurate Garbage Collection (GC) requires the implementation and generation of these -intrinsics. -These intrinsics allow identification of GC roots on the -stack, as well as garbage collector implementations that require read and write barriers. -Front-ends for type-safe garbage collected languages should generate these -intrinsics to make use of the LLVM garbage collectors. For more details, see Accurate Garbage Collection with LLVM. -

      +intrinsics. These intrinsics allow identification of GC +roots on the stack, as well as garbage collector implementations that +require read and write +barriers. Front-ends for type-safe garbage collected languages should generate +these intrinsics to make use of the LLVM garbage collectors. For more details, +see Accurate Garbage Collection with +LLVM.

      -

      The garbage collection intrinsics only operate on objects in the generic - address space (address space zero).

      +

      The garbage collection intrinsics only operate on objects in the generic + address space (address space zero).

      @@ -4940,33 +5379,29 @@ href="GarbageCollection.html">Accurate Garbage Collection with LLVM.
      Syntax:
      -
         declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
       
      Overview:
      -

      The 'llvm.gcroot' intrinsic declares the existence of a GC root to -the code generator, and allows some metadata to be associated with it.

      + the code generator, and allows some metadata to be associated with it.

      Arguments:
      -

      The first argument specifies the address of a stack object that contains the -root pointer. The second pointer (which must be either a constant or a global -value address) contains the meta-data to be associated with the root.

      + root pointer. The second pointer (which must be either a constant or a + global value address) contains the meta-data to be associated with the + root.

      Semantics:
      -

      At runtime, a call to this intrinsic stores a null pointer into the "ptrloc" -location. At compile-time, the code generator generates information to allow -the runtime to find the pointer at GC safe points. The 'llvm.gcroot' -intrinsic may only be used in a function which specifies a GC -algorithm.

      + location. At compile-time, the code generator generates information to allow + the runtime to find the pointer at GC safe points. The 'llvm.gcroot' + intrinsic may only be used in a function which specifies a GC + algorithm.

      -
      'llvm.gcread' Intrinsic @@ -4975,35 +5410,30 @@ algorithm.

      Syntax:
      -
         declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
       
      Overview:
      -

      The 'llvm.gcread' intrinsic identifies reads of references from heap -locations, allowing garbage collector implementations that require read -barriers.

      + locations, allowing garbage collector implementations that require read + barriers.

      Arguments:
      -

      The second argument is the address to read from, which should be an address -allocated from the garbage collector. The first object is a pointer to the -start of the referenced object, if needed by the language runtime (otherwise -null).

      + allocated from the garbage collector. The first object is a pointer to the + start of the referenced object, if needed by the language runtime (otherwise + null).

      Semantics:
      -

      The 'llvm.gcread' intrinsic has the same semantics as a load -instruction, but may be replaced with substantially more complex code by the -garbage collector runtime, as needed. The 'llvm.gcread' intrinsic -may only be used in a function which specifies a GC -algorithm.

      + instruction, but may be replaced with substantially more complex code by the + garbage collector runtime, as needed. The 'llvm.gcread' intrinsic + may only be used in a function which specifies a GC + algorithm.

      -
      'llvm.gcwrite' Intrinsic @@ -5012,46 +5442,39 @@ algorithm.

      Syntax:
      -
         declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
       
      Overview:
      -

      The 'llvm.gcwrite' intrinsic identifies writes of references to heap -locations, allowing garbage collector implementations that require write -barriers (such as generational or reference counting collectors).

      + locations, allowing garbage collector implementations that require write + barriers (such as generational or reference counting collectors).

      Arguments:
      -

      The first argument is the reference to store, the second is the start of the -object to store it to, and the third is the address of the field of Obj to -store to. If the runtime does not require a pointer to the object, Obj may be -null.

      + object to store it to, and the third is the address of the field of Obj to + store to. If the runtime does not require a pointer to the object, Obj may + be null.

      Semantics:
      -

      The 'llvm.gcwrite' intrinsic has the same semantics as a store -instruction, but may be replaced with substantially more complex code by the -garbage collector runtime, as needed. The 'llvm.gcwrite' intrinsic -may only be used in a function which specifies a GC -algorithm.

      + instruction, but may be replaced with substantially more complex code by the + garbage collector runtime, as needed. The 'llvm.gcwrite' intrinsic + may only be used in a function which specifies a GC + algorithm.

      - -
      -

      -These intrinsics are provided by LLVM to expose special features that may only -be implemented with code generator support. -

      + +

      These intrinsics are provided by LLVM to expose special features that may + only be implemented with code generator support.

      @@ -5068,38 +5491,28 @@ be implemented with code generator support.
      Overview:
      - -

      -The 'llvm.returnaddress' intrinsic attempts to compute a -target-specific value indicating the return address of the current function -or one of its callers. -

      +

      The 'llvm.returnaddress' intrinsic attempts to compute a + target-specific value indicating the return address of the current function + or one of its callers.

      Arguments:
      - -

      -The argument to this intrinsic indicates which function to return the address -for. Zero indicates the calling function, one indicates its caller, etc. The -argument is required to be a constant integer value. -

      +

      The argument to this intrinsic indicates which function to return the address + for. Zero indicates the calling function, one indicates its caller, etc. + The argument is required to be a constant integer value.

      Semantics:
      +

      The 'llvm.returnaddress' intrinsic either returns a pointer + indicating the return address of the specified call frame, or zero if it + cannot be identified. The value returned by this intrinsic is likely to be + incorrect or 0 for arguments other than zero, so it should only be used for + debugging purposes.

      -

      -The 'llvm.returnaddress' intrinsic either returns a pointer indicating -the return address of the specified call frame, or zero if it cannot be -identified. The value returned by this intrinsic is likely to be incorrect or 0 -for arguments other than zero, so it should only be used for debugging purposes. -

      +

      Note that calling this intrinsic does not prevent function inlining or other + aggressive transformations, so the value returned may not be that of the + obvious source-language caller.

      -

      -Note that calling this intrinsic does not prevent function inlining or other -aggressive transformations, so the value returned may not be that of the obvious -source-language caller. -

      -
      'llvm.frameaddress' Intrinsic @@ -5113,34 +5526,25 @@ source-language caller.
      Overview:
      - -

      -The 'llvm.frameaddress' intrinsic attempts to return the -target-specific frame pointer value for the specified stack frame. -

      +

      The 'llvm.frameaddress' intrinsic attempts to return the + target-specific frame pointer value for the specified stack frame.

      Arguments:
      - -

      -The argument to this intrinsic indicates which function to return the frame -pointer for. Zero indicates the calling function, one indicates its caller, -etc. The argument is required to be a constant integer value. -

      +

      The argument to this intrinsic indicates which function to return the frame + pointer for. Zero indicates the calling function, one indicates its caller, + etc. The argument is required to be a constant integer value.

      Semantics:
      +

      The 'llvm.frameaddress' intrinsic either returns a pointer + indicating the frame address of the specified call frame, or zero if it + cannot be identified. The value returned by this intrinsic is likely to be + incorrect or 0 for arguments other than zero, so it should only be used for + debugging purposes.

      -

      -The 'llvm.frameaddress' intrinsic either returns a pointer indicating -the frame address of the specified call frame, or zero if it cannot be -identified. The value returned by this intrinsic is likely to be incorrect or 0 -for arguments other than zero, so it should only be used for debugging purposes. -

      +

      Note that calling this intrinsic does not prevent function inlining or other + aggressive transformations, so the value returned may not be that of the + obvious source-language caller.

      -

      -Note that calling this intrinsic does not prevent function inlining or other -aggressive transformations, so the value returned may not be that of the obvious -source-language caller. -

      @@ -5156,25 +5560,20 @@ source-language caller.
      Overview:
      - -

      -The 'llvm.stacksave' intrinsic is used to remember the current state of -the function stack, for use with -llvm.stackrestore. This is useful for implementing language -features like scoped automatic variable sized arrays in C99. -

      +

      The 'llvm.stacksave' intrinsic is used to remember the current state + of the function stack, for use + with llvm.stackrestore. This is + useful for implementing language features like scoped automatic variable + sized arrays in C99.

      Semantics:
      - -

      -This intrinsic returns a opaque pointer value that can be passed to llvm.stackrestore. When an -llvm.stackrestore intrinsic is executed with a value saved from -llvm.stacksave, it effectively restores the state of the stack to the -state it was in when the llvm.stacksave intrinsic executed. In -practice, this pops any alloca blocks from the stack -that were allocated after the llvm.stacksave was executed. -

      +

      This intrinsic returns an opaque pointer value that can be passed + to llvm.stackrestore. When + an llvm.stackrestore intrinsic is executed with a value saved + from llvm.stacksave, it effectively restores the state of the stack + to the state it was in when the llvm.stacksave intrinsic executed. + In practice, this pops any alloca blocks from the + stack that were allocated after the llvm.stacksave was executed.
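The C99 feature mentioned above can be illustrated with a scoped variable-length array. The sketch below is plain C, not LLVM IR, and the function name is hypothetical. A front-end could plausibly bracket the loop body with a stack save and restore so that each iteration releases its array, but how a given compiler lowers this is an assumption here.

```c
/* Hypothetical helper: for each i in [1, n], builds a scoped VLA of i
   elements, fills it with 1..i, and adds the elements to a running total. */
long sum_of_prefix_sums(int n)
{
    long total = 0;

    for (int i = 1; i <= n; i++) {
        /* Scope entry: the stack state could be saved at this point. */
        int scratch[i];

        for (int j = 0; j < i; j++)
            scratch[j] = j + 1;
        for (int j = 0; j < i; j++)
            total += scratch[j];
        /* Scope exit: the storage for `scratch` is released, so the stack
           does not keep growing across iterations. */
    }
    return total;
}
```

With n = 3 the arrays hold {1}, {1, 2} and {1, 2, 3}, so the function returns 10.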

      @@ -5191,24 +5590,18 @@ that were allocated after the llvm.stacksave was executed.
      Overview:
      - -

      -The 'llvm.stackrestore' intrinsic is used to restore the state of -the function stack to the state it was in when the corresponding llvm.stacksave intrinsic executed. This is -useful for implementing language features like scoped automatic variable sized -arrays in C99. -

      +

      The 'llvm.stackrestore' intrinsic is used to restore the state of + the function stack to the state it was in when the + corresponding llvm.stacksave intrinsic + executed. This is useful for implementing language features like scoped + automatic variable sized arrays in C99.

      Semantics:
      - -

      -See the description for llvm.stacksave. -

      +

      See the description + for llvm.stacksave.

      -
      'llvm.prefetch' Intrinsic @@ -5222,34 +5615,23 @@ See the description for llvm.stacksave.
      Overview:
      - - -

      -The 'llvm.prefetch' intrinsic is a hint to the code generator to insert -a prefetch instruction if supported; otherwise, it is a noop. Prefetches have -no -effect on the behavior of the program but can change its performance -characteristics. -

      +

      The 'llvm.prefetch' intrinsic is a hint to the code generator to + insert a prefetch instruction if supported; otherwise, it is a noop. + Prefetches have no effect on the behavior of the program but can change its + performance characteristics.

      Arguments:
      - -

      -address is the address to be prefetched, rw is the specifier -determining if the fetch should be for a read (0) or write (1), and -locality is a temporal locality specifier ranging from (0) - no -locality, to (3) - extremely local keep in cache. The rw and -locality arguments must be constant integers. -

      +

      address is the address to be prefetched, rw is the + specifier determining if the fetch should be for a read (0) or write (1), + and locality is a temporal locality specifier ranging from (0) - no + locality, to (3) - extremely local keep in cache. The rw + and locality arguments must be constant integers.

      Semantics:
      - -

      -This intrinsic does not modify the behavior of the program. In particular, -prefetches cannot trap and do not produce a value. On targets that support this -intrinsic, the prefetch can provide hints to the processor cache for better -performance. -

      +

      This intrinsic does not modify the behavior of the program. In particular, + prefetches cannot trap and do not produce a value. On targets that support + this intrinsic, the prefetch can provide hints to the processor cache for + better performance.
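As a rough illustration of the rw and locality arguments, the C sketch below uses the GCC/Clang extension __builtin_prefetch, which takes the same three pieces of information. The function name and the PREFETCH_DISTANCE value are hypothetical, and whether a compiler maps the builtin to this intrinsic is an assumption rather than something the text states.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 4

long sum_with_prefetch(const int *data, size_t len)
{
    long total = 0;

    for (size_t i = 0; i < len; i++) {
        if (i + PREFETCH_DISTANCE < len) {
            /* rw = 0 (read), locality = 3 (keep in cache).
               Both must be compile-time constants. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

The result is the same with or without the prefetch call, which is the point of the semantics above: the hint can change performance but not behavior.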

      @@ -5266,32 +5648,21 @@ performance.
      Overview:
      - - -

      -The 'llvm.pcmarker' intrinsic is a method to export a Program Counter -(PC) in a region of -code to simulators and other tools. The method is target specific, but it is -expected that the marker will use exported symbols to transmit the PC of the -marker. -The marker makes no guarantees that it will remain with any specific instruction -after optimizations. It is possible that the presence of a marker will inhibit -optimizations. The intended use is to be inserted after optimizations to allow -correlations of simulation runs. -

      +

      The 'llvm.pcmarker' intrinsic is a method to export a Program + Counter (PC) in a region of code to simulators and other tools. The method + is target specific, but it is expected that the marker will use exported + symbols to transmit the PC of the marker. The marker makes no guarantees + that it will remain with any specific instruction after optimizations. It is + possible that the presence of a marker will inhibit optimizations. The + intended use is to be inserted after optimizations to allow correlations of + simulation runs.

      Arguments:
      - -

      -id is a numerical id identifying the marker. -

      +

      id is a numerical id identifying the marker.

      Semantics:
      - -

      -This intrinsic does not modify the behavior of the program. Backends that do not -support this intrinisic may ignore it. -

      +

      This intrinsic does not modify the behavior of the program. Backends that do + not support this intrinsic may ignore it.

      @@ -5308,23 +5679,17 @@ support this intrinisic may ignore it.
      Overview:
      - - -

      -The 'llvm.readcyclecounter' intrinsic provides access to the cycle -counter register (or similar low latency, high accuracy clocks) on those targets -that support it. On X86, it should map to RDTSC. On Alpha, it should map to RPCC. -As the backing counters overflow quickly (on the order of 9 seconds on alpha), this -should only be used for small timings. -

      +

      The 'llvm.readcyclecounter' intrinsic provides access to the cycle + counter register (or similar low-latency, high-accuracy clocks) on those + targets that support it. On X86, it should map to RDTSC. On Alpha, it + should map to RPCC. As the backing counters overflow quickly (on the order + of 9 seconds on Alpha), this should only be used for small timings.

      Semantics:
      - -

      -When directly supported, reading the cycle counter should not modify any memory. -Implementations are allowed to either return a application specific value or a -system wide value. On backends without support, this is lowered to a constant 0. -

      +

      When directly supported, reading the cycle counter should not modify any + memory. Implementations are allowed to return either an application-specific + value or a system-wide value. On backends without support, this is lowered + to a constant 0.

      @@ -5334,12 +5699,11 @@ system wide value. On backends without support, this is lowered to a constant 0
      -

      -LLVM provides intrinsics for a few important standard C library functions. -These intrinsics allow source-language front-ends to pass information about the -alignment of the pointer arguments to the code generator, providing opportunity -for more efficient code generation. -

      + +

      LLVM provides intrinsics for a few important standard C library functions. + These intrinsics allow source-language front-ends to pass information about + the alignment of the pointer arguments to the code generator, providing + opportunity for more efficient code generation.

      @@ -5351,11 +5715,12 @@ for more efficient code generation.
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.memcpy on any integer bit -width. Not all targets support all bit widths however.

      +

      This is an overloaded intrinsic. You can use llvm.memcpy on any + integer bit width. Not all targets support all bit widths however.

      +
         declare void @llvm.memcpy.i8(i8 * <dest>, i8 * <src>,
      -                                i8 <len>, i32 <align>)
      +                               i8 <len>, i32 <align>)
         declare void @llvm.memcpy.i16(i8 * <dest>, i8 * <src>,
                                       i16 <len>, i32 <align>)
         declare void @llvm.memcpy.i32(i8 * <dest>, i8 * <src>,
      @@ -5365,44 +5730,31 @@ width. Not all targets support all bit widths however.

      Overview:
      +

      The 'llvm.memcpy.*' intrinsics copy a block of memory from the + source location to the destination location.

      -

      -The 'llvm.memcpy.*' intrinsics copy a block of memory from the source -location to the destination location. -

      - -

      -Note that, unlike the standard libc function, the llvm.memcpy.* -intrinsics do not return a value, and takes an extra alignment argument. -

      +

      Note that, unlike the standard libc function, the llvm.memcpy.* + intrinsics do not return a value, and take an extra alignment argument.

      Arguments:
      +

      The first argument is a pointer to the destination, the second is a pointer + to the source. The third argument is an integer argument specifying the + number of bytes to copy, and the fourth argument is the alignment of the + source and destination locations.

      -

      -The first argument is a pointer to the destination, the second is a pointer to -the source. The third argument is an integer argument -specifying the number of bytes to copy, and the fourth argument is the alignment -of the source and destination locations. -

      - -

      -If the call to this intrinisic has an alignment value that is not 0 or 1, then -the caller guarantees that both the source and destination pointers are aligned -to that boundary. -

      +

      If the call to this intrinsic has an alignment value that is not 0 or 1, + then the caller guarantees that both the source and destination pointers are + aligned to that boundary.

      Semantics:
      +

      The 'llvm.memcpy.*' intrinsics copy a block of memory from the + source location to the destination location, which are not allowed to + overlap. It copies "len" bytes of memory over. If the argument is known to + be aligned to some boundary, this can be specified as the fourth argument, + otherwise it should be set to 0 or 1.

      -

      -The 'llvm.memcpy.*' intrinsics copy a block of memory from the source -location to the destination location, which are not allowed to overlap. It -copies "len" bytes of memory over. If the argument is known to be aligned to -some boundary, this can be specified as the fourth argument, otherwise it should -be set to 0 or 1. -

      -
      'llvm.memmove' Intrinsic @@ -5412,10 +5764,11 @@ be set to 0 or 1.
      Syntax:

      This is an overloaded intrinsic. You can use llvm.memmove on any integer bit -width. Not all targets support all bit widths however.

      + width. Not all targets support all bit widths however.

      +
         declare void @llvm.memmove.i8(i8 * <dest>, i8 * <src>,
      -                                 i8 <len>, i32 <align>)
      +                                i8 <len>, i32 <align>)
         declare void @llvm.memmove.i16(i8 * <dest>, i8 * <src>,
                                        i16 <len>, i32 <align>)
         declare void @llvm.memmove.i32(i8 * <dest>, i8 * <src>,
      @@ -5425,45 +5778,33 @@ width. Not all targets support all bit widths however.

      Overview:
      +

      The 'llvm.memmove.*' intrinsics move a block of memory from the + source location to the destination location. It is similar to the + 'llvm.memcpy' intrinsic but allows the two memory locations to + overlap.

      -

      -The 'llvm.memmove.*' intrinsics move a block of memory from the source -location to the destination location. It is similar to the -'llvm.memcpy' intrinsic but allows the two memory locations to overlap. -

      - -

      -Note that, unlike the standard libc function, the llvm.memmove.* -intrinsics do not return a value, and takes an extra alignment argument. -

      +

      Note that, unlike the standard libc function, the llvm.memmove.* + intrinsics do not return a value, and take an extra alignment argument.

      Arguments:
      +

      The first argument is a pointer to the destination, the second is a pointer + to the source. The third argument is an integer argument specifying the + number of bytes to copy, and the fourth argument is the alignment of the + source and destination locations.

      -

      -The first argument is a pointer to the destination, the second is a pointer to -the source. The third argument is an integer argument -specifying the number of bytes to copy, and the fourth argument is the alignment -of the source and destination locations. -

      - -

      -If the call to this intrinisic has an alignment value that is not 0 or 1, then -the caller guarantees that the source and destination pointers are aligned to -that boundary. -

      +

      If the call to this intrinsic has an alignment value that is not 0 or 1, + then the caller guarantees that the source and destination pointers are + aligned to that boundary.

      Semantics:
      +

      The 'llvm.memmove.*' intrinsics copy a block of memory from the + source location to the destination location, which may overlap. It copies + "len" bytes of memory over. If the argument is known to be aligned to some + boundary, this can be specified as the fourth argument, otherwise it should + be set to 0 or 1.
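The difference from llvm.memcpy is easiest to see with overlapping ranges. The sketch below uses the standard C memmove, which has the same overlap guarantee; the helper name shift_right is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copies the first `len` bytes of `buf` to the position
   `gap` bytes further along. The caller must ensure that `buf` has room for
   at least len + gap bytes. */
void shift_right(char *buf, size_t len, size_t gap)
{
    /* Source [buf, buf + len) and destination [buf + gap, buf + gap + len)
       overlap whenever gap < len, so memmove is needed here; memcpy on
       overlapping ranges has undefined behavior. */
    memmove(buf + gap, buf, len);
}
```

Starting from the buffer "abcdef", shift_right(buf, 4, 2) copies "abcd" over positions 2 through 5 and leaves "ababcd".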

      -

      -The 'llvm.memmove.*' intrinsics copy a block of memory from the source -location to the destination location, which may overlap. It -copies "len" bytes of memory over. If the argument is known to be aligned to -some boundary, this can be specified as the fourth argument, otherwise it should -be set to 0 or 1. -

      -
      'llvm.memset.*' Intrinsics @@ -5473,10 +5814,11 @@ be set to 0 or 1.
      Syntax:

      This is an overloaded intrinsic. You can use llvm.memset on any integer bit -width. Not all targets support all bit widths however.

      + width. Not all targets support all bit widths however.

      +
         declare void @llvm.memset.i8(i8 * <dest>, i8 <val>,
      -                                i8 <len>, i32 <align>)
      +                               i8 <len>, i32 <align>)
         declare void @llvm.memset.i16(i8 * <dest>, i8 <val>,
                                       i16 <len>, i32 <align>)
         declare void @llvm.memset.i32(i8 * <dest>, i8 <val>,
      @@ -5486,43 +5828,30 @@ width. Not all targets support all bit widths however.

      Overview:
      +

      The 'llvm.memset.*' intrinsics fill a block of memory with a + particular byte value.

      -

      -The 'llvm.memset.*' intrinsics fill a block of memory with a particular -byte value. -

      - -

      -Note that, unlike the standard libc function, the llvm.memset intrinsic -does not return a value, and takes an extra alignment argument. -

      +

      Note that, unlike the standard libc function, the llvm.memset + intrinsic does not return a value, and takes an extra alignment argument.

      Arguments:
      +

      The first argument is a pointer to the destination to fill, the second is the + byte value to fill it with, the third argument is an integer argument + specifying the number of bytes to fill, and the fourth argument is the known + alignment of the destination location.

      -

      -The first argument is a pointer to the destination to fill, the second is the -byte value to fill it with, the third argument is an integer -argument specifying the number of bytes to fill, and the fourth argument is the -known alignment of destination location. -

      - -

      -If the call to this intrinisic has an alignment value that is not 0 or 1, then -the caller guarantees that the destination pointer is aligned to that boundary. -

      +

      If the call to this intrinsic has an alignment value that is not 0 or 1, + then the caller guarantees that the destination pointer is aligned to that + boundary.

      Semantics:
      +

      The 'llvm.memset.*' intrinsics fill "len" bytes of memory starting + at the destination location. If the argument is known to be aligned to some + boundary, this can be specified as the fourth argument, otherwise it should + be set to 0 or 1.

      -

      -The 'llvm.memset.*' intrinsics fill "len" bytes of memory starting at -the -destination location. If the argument is known to be aligned to some boundary, -this can be specified as the fourth argument, otherwise it should be set to 0 or -1. -

      -
      'llvm.sqrt.*' Intrinsic @@ -5531,9 +5860,10 @@ this can be specified as the fourth argument, otherwise it should be set to 0 or
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.sqrt on any -floating point or vector of floating point type. Not all targets support all -types however.

      +

      This is an overloaded intrinsic. You can use llvm.sqrt on any + floating point or vector of floating point type. Not all targets support all + types however.

      +
         declare float     @llvm.sqrt.f32(float %Val)
         declare double    @llvm.sqrt.f64(double %Val)
      @@ -5543,28 +5873,21 @@ types however.

      Overview:
      - -

      -The 'llvm.sqrt' intrinsics return the sqrt of the specified operand, -returning the same value as the libm 'sqrt' functions would. Unlike -sqrt in libm, however, llvm.sqrt has undefined behavior for -negative numbers other than -0.0 (which allows for better optimization, because -there is no need to worry about errno being set). llvm.sqrt(-0.0) is -defined to return -0.0 like IEEE sqrt. -

      +

      The 'llvm.sqrt' intrinsics return the sqrt of the specified operand, + returning the same value as the libm 'sqrt' functions would. + Unlike sqrt in libm, however, llvm.sqrt has undefined + behavior for negative numbers other than -0.0 (which allows for better + optimization, because there is no need to worry about errno being + set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt.

      Arguments:
      - -

      -The argument and return value are floating point numbers of the same type. -

      +

      The argument and return value are floating point numbers of the same + type.

      Semantics:
      +

      This function returns the sqrt of the specified operand if it is a + nonnegative floating point number.

      -

      -This function returns the sqrt of the specified operand if it is a nonnegative -floating point number. -

      @@ -5575,9 +5898,10 @@ floating point number.
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.powi on any -floating point or vector of floating point type. Not all targets support all -types however.

      +

      This is an overloaded intrinsic. You can use llvm.powi on any + floating point or vector of floating point type. Not all targets support all + types however.

      +
         declare float     @llvm.powi.f32(float  %Val, i32 %power)
         declare double    @llvm.powi.f64(double %Val, i32 %power)
      @@ -5587,26 +5911,19 @@ types however.

      Overview:
      - -

      -The 'llvm.powi.*' intrinsics return the first operand raised to the -specified (positive or negative) power. The order of evaluation of -multiplications is not defined. When a vector of floating point type is -used, the second argument remains a scalar integer value. -

      +

      The 'llvm.powi.*' intrinsics return the first operand raised to the + specified (positive or negative) power. The order of evaluation of + multiplications is not defined. When a vector of floating point type is + used, the second argument remains a scalar integer value.

      Arguments:
      - -

      -The second argument is an integer power, and the first is a value to raise to -that power. -

      +

      The second argument is an integer power, and the first is a value to raise to + that power.

      Semantics:
      +

      This function returns the first value raised to the second power with an + unspecified sequence of rounding operations.
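Because the order of multiplications and roundings is left unspecified, an implementation is free to pick any evaluation strategy. The C sketch below shows one possible strategy, exponentiation by squaring; it is an illustration only, not the lowering LLVM uses, and the name powi_sketch is hypothetical.

```c
/* Hypothetical helper: raises `base` to an integer `power`, which may be
   positive, zero, or negative, using exponentiation by squaring. */
double powi_sketch(double base, int power)
{
    /* Take the magnitude in unsigned arithmetic so that the most negative
       int value does not overflow when negated. */
    unsigned int e = power < 0 ? 0u - (unsigned int)power
                               : (unsigned int)power;
    double result = 1.0;
    double factor = base;

    while (e != 0) {
        if (e & 1u)
            result *= factor;   /* this bit of the exponent is set */
        factor *= factor;       /* square for the next bit */
        e >>= 1;
    }
    /* A negative power is the reciprocal of the positive power. */
    return power < 0 ? 1.0 / result : result;
}
```

A different multiplication order, such as a simple left-to-right loop, may round differently for inexact inputs, which is exactly the latitude the semantics allow.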

      -

      -This function returns the first value raised to the second power with an -unspecified sequence of rounding operations.

      @@ -5617,9 +5934,10 @@ unspecified sequence of rounding operations.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.sin on any -floating point or vector of floating point type. Not all targets support all -types however.

      +

      This is an overloaded intrinsic. You can use llvm.sin on any + floating point or vector of floating point type. Not all targets support all + types however.

      +
         declare float     @llvm.sin.f32(float  %Val)
         declare double    @llvm.sin.f64(double %Val)
      @@ -5629,23 +5947,17 @@ types however.

      Overview:
      - -

      -The 'llvm.sin.*' intrinsics return the sine of the operand. -

      +

      The 'llvm.sin.*' intrinsics return the sine of the operand.

      Arguments:
      - -

      -The argument and return value are floating point numbers of the same type. -

      +

      The argument and return value are floating point numbers of the same + type.

      Semantics:
      +

      This function returns the sine of the specified operand, returning the same + values as the libm sin functions would, and handles error conditions + in the same way.

      -

      -This function returns the sine of the specified operand, returning the -same values as the libm sin functions would, and handles error -conditions in the same way.

      @@ -5656,9 +5968,10 @@ conditions in the same way.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.cos on any -floating point or vector of floating point type. Not all targets support all -types however.

      +

      This is an overloaded intrinsic. You can use llvm.cos on any + floating point or vector of floating point type. Not all targets support all + types however.

      +
         declare float     @llvm.cos.f32(float  %Val)
         declare double    @llvm.cos.f64(double %Val)
      @@ -5668,23 +5981,17 @@ types however.

      Overview:
      - -

      -The 'llvm.cos.*' intrinsics return the cosine of the operand. -

      +

      The 'llvm.cos.*' intrinsics return the cosine of the operand.

      Arguments:
      - -

      -The argument and return value are floating point numbers of the same type. -

      +

      The argument and return value are floating point numbers of the same + type.

      Semantics:
      +

      This function returns the cosine of the specified operand, returning the same + values as the libm cos functions would, and handles error conditions + in the same way.

      -

      -This function returns the cosine of the specified operand, returning the -same values as the libm cos functions would, and handles error -conditions in the same way.

      @@ -5695,9 +6002,10 @@ conditions in the same way.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.pow on any -floating point or vector of floating point type. Not all targets support all -types however.

      +

      This is an overloaded intrinsic. You can use llvm.pow on any + floating point or vector of floating point type. Not all targets support all + types however.

      +
         declare float     @llvm.pow.f32(float  %Val, float %Power)
         declare double    @llvm.pow.f64(double %Val, double %Power)
      @@ -5707,39 +6015,29 @@ types however.

      Overview:
      - -

      -The 'llvm.pow.*' intrinsics return the first operand raised to the -specified (positive or negative) power. -

      +

      The 'llvm.pow.*' intrinsics return the first operand raised to the + specified (positive or negative) power.

      Arguments:
      - -

      -The second argument is a floating point power, and the first is a value to -raise to that power. -

      +

      The second argument is a floating point power, and the first is a value to + raise to that power.

      Semantics:
      +

      This function returns the first value raised to the second power, returning + the same values as the libm pow functions would, and handles error + conditions in the same way.

      -

      -This function returns the first value raised to the second power, -returning the -same values as the libm pow functions would, and handles error -conditions in the same way.

      -
      -

      -LLVM provides intrinsics for a few important bit manipulation operations. -These allow efficient code generation for some algorithms. -

      + +

      LLVM provides intrinsics for a few important bit manipulation operations. + These allow efficient code generation for some algorithms.

      @@ -5752,7 +6050,8 @@ These allow efficient code generation for some algorithms.
      Syntax:

      This is an overloaded intrinsic function. You can use bswap on any integer -type that is an even number of bytes (i.e. BitWidth % 16 == 0).

      + type that is an even number of bytes (i.e. BitWidth % 16 == 0).

      +
         declare i16 @llvm.bswap.i16(i16 <id>)
         declare i32 @llvm.bswap.i32(i32 <id>)
      @@ -5760,25 +6059,20 @@ type that is an even number of bytes (i.e. BitWidth % 16 == 0).

      Overview:
      - -

      -The 'llvm.bswap' family of intrinsics is used to byte swap integer -values with an even number of bytes (positive multiple of 16 bits). These are -useful for performing operations on data that is not in the target's native -byte order. -

      +

      The 'llvm.bswap' family of intrinsics is used to byte swap integer + values with an even number of bytes (positive multiple of 16 bits). These + are useful for performing operations on data that is not in the target's + native byte order.

      Semantics:
      - -

      -The llvm.bswap.i16 intrinsic returns an i16 value that has the high -and low byte of the input i16 swapped. Similarly, the llvm.bswap.i32 -intrinsic returns an i32 value that has the four bytes of the input i32 -swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the returned -i32 will have its bytes in 3, 2, 1, 0 order. The llvm.bswap.i48, -llvm.bswap.i64 and other intrinsics extend this concept to -additional even-byte lengths (6 bytes, 8 bytes and more, respectively). -

      +

      The llvm.bswap.i16 intrinsic returns an i16 value that has the high + and low byte of the input i16 swapped. Similarly, + the llvm.bswap.i32 intrinsic returns an i32 value that has the four + bytes of the input i32 swapped, so that if the input bytes are numbered 0, 1, + 2, 3 then the returned i32 will have its bytes in 3, 2, 1, 0 order. + The llvm.bswap.i48, llvm.bswap.i64 and other intrinsics + extend this concept to additional even-byte lengths (6 bytes, 8 bytes and + more, respectively).
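The byte orders described above can be modeled in portable C. This is only a sketch of the semantics, not how LLVM lowers the intrinsic; the helper names `bswap16` and `bswap32` are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Models llvm.bswap.i16: the high and low bytes trade places. */
uint16_t bswap16(uint16_t x) {
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Models llvm.bswap.i32: input bytes 0, 1, 2, 3 come back in 3, 2, 1, 0 order. */
uint32_t bswap32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}
```

Applying either helper twice returns the original value, which is why the same operation converts in both directions between byte orders.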

      @@ -5791,7 +6085,8 @@ additional even-byte lengths (6 bytes, 8 bytes and more, respectively).
      Syntax:

      This is an overloaded intrinsic. You can use llvm.ctpop on any integer bit -width. Not all targets support all bit widths however.

      + width. Not all targets support all bit widths however.

      +
         declare i8 @llvm.ctpop.i8(i8  <src>)
         declare i16 @llvm.ctpop.i16(i16 <src>)
      @@ -5801,24 +6096,16 @@ width. Not all targets support all bit widths however.

      Overview:
      - -

      -The 'llvm.ctpop' family of intrinsics counts the number of bits set in a -value. -

      +

      The 'llvm.ctpop' family of intrinsics counts the number of bits set + in a value.

      Arguments:
      - -

      -The only argument is the value to be counted. The argument may be of any -integer type. The return type must match the argument type. -

      +

      The only argument is the value to be counted. The argument may be of any + integer type. The return type must match the argument type.

      Semantics:
      +

      The 'llvm.ctpop' intrinsic counts the 1's in a variable.

      -

      -The 'llvm.ctpop' intrinsic counts the 1's in a variable. -
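As an illustration of the population count semantics, the following C sketch counts set bits in a 32-bit value. The name `ctpop32` is hypothetical, and the loop is one simple way to compute the count, not a description of the code LLVM generates.

```c
#include <stdint.h>

/* Models llvm.ctpop.i32: returns the number of 1 bits in x. */
uint32_t ctpop32(uint32_t x) {
    uint32_t count = 0;
    while (x != 0) {
        x &= x - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}
```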

      @@ -5829,8 +6116,9 @@ The 'llvm.ctpop' intrinsic counts the 1's in a variable.
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.ctlz on any -integer bit width. Not all targets support all bit widths however.

      +

      This is an overloaded intrinsic. You can use llvm.ctlz on any + integer bit width. Not all targets support all bit widths however.

      +
         declare i8 @llvm.ctlz.i8 (i8  <src>)
         declare i16 @llvm.ctlz.i16(i16 <src>)
      @@ -5840,30 +6128,20 @@ integer bit width. Not all targets support all bit widths however.

      Overview:
      - -

      -The 'llvm.ctlz' family of intrinsic functions counts the number of -leading zeros in a variable. -

      +

      The 'llvm.ctlz' family of intrinsic functions counts the number of + leading zeros in a variable.

      Arguments:
      - -

      -The only argument is the value to be counted. The argument may be of any -integer type. The return type must match the argument type. -

      +

      The only argument is the value to be counted. The argument may be of any + integer type. The return type must match the argument type.

      Semantics:
      +

      The 'llvm.ctlz' intrinsic counts the leading (most significant) + zeros in a variable. If src == 0, the result is the size in bits of + the type of src. For example, llvm.ctlz(i32 2) = 30.

      -

      -The 'llvm.ctlz' intrinsic counts the leading (most significant) zeros -in a variable. If the src == 0 then the result is the size in bits of the type -of src. For example, llvm.ctlz(i32 2) = 30. -

      - -
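The ctlz rule, including the zero-input case, can be sketched in C as follows. The name `ctlz32` is hypothetical; the function only models the documented result for a 32-bit operand.

```c
#include <stdint.h>

/* Models llvm.ctlz.i32: counts zero bits above the most significant 1 bit.
   A zero input yields the bit width of the type (32). */
uint32_t ctlz32(uint32_t x) {
    uint32_t count = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        count++;
    }
    return count;
}
```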
      'llvm.cttz.*' Intrinsic @@ -5872,8 +6150,9 @@ of src. For example, llvm.ctlz(i32 2) = 30.
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.cttz on any -integer bit width. Not all targets support all bit widths however.

      +

      This is an overloaded intrinsic. You can use llvm.cttz on any + integer bit width. Not all targets support all bit widths however.

      +
         declare i8 @llvm.cttz.i8 (i8  <src>)
         declare i16 @llvm.cttz.i16(i16 <src>)
      @@ -5883,130 +6162,17 @@ integer bit width. Not all targets support all bit widths however.

      Overview:
      - -

      -The 'llvm.cttz' family of intrinsic functions counts the number of -trailing zeros. -

      - -
      Arguments:
      - -

      -The only argument is the value to be counted. The argument may be of any -integer type. The return type must match the argument type. -

      - -
      Semantics:
      - -

      -The 'llvm.cttz' intrinsic counts the trailing (least significant) zeros -in a variable. If the src == 0 then the result is the size in bits of the type -of src. For example, llvm.cttz(2) = 1. -

      -
      - - - - -
      - -
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.part.select -on any integer bit width.

      -
      -  declare i17 @llvm.part.select.i17 (i17 %val, i32 %loBit, i32 %hiBit)
      -  declare i29 @llvm.part.select.i29 (i29 %val, i32 %loBit, i32 %hiBit)
      -
      - -
      Overview:
      -

      The 'llvm.part.select' family of intrinsic functions selects a -range of bits from an integer value and returns them in the same bit width as -the original value.

      - -
      Arguments:
      -

      The first argument, %val and the result may be integer types of -any bit width but they must have the same bit width. The second and third -arguments must be i32 type since they specify only a bit index.

      - -
      Semantics:
      -

      The operation of the 'llvm.part.select' intrinsic has two modes -of operation: forwards and reverse. If %loBit is greater than -%hiBits then the intrinsic operates in reverse mode. Otherwise it -operates in forward mode.

      -

      In forward mode, this intrinsic is the equivalent of shifting %val -right by %loBit bits and then ANDing it with a mask with -only the %hiBit - %loBit bits set, as follows:

      -
        -
      1. The %val is shifted right (LSHR) by the number of bits specified - by %loBits. This normalizes the value to the low order bits.
      2. -
      3. The %loBits value is subtracted from the %hiBits value - to determine the number of bits to retain.
      4. -
      5. A mask of the retained bits is created by shifting a -1 value.
      6. -
      7. The mask is ANDed with %val to produce the result.
      8. -
      -

      In reverse mode, a similar computation is made except that the bits are -returned in the reverse order. So, for example, if X has the value -i16 0x0ACF (101011001111) and we apply -part.select(i16 X, 8, 3) to it, we get back the value -i16 0x0026 (000000100110).

      -
      - - - -
      - -
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.part.set -on any integer bit width.

      -
      -  declare i17 @llvm.part.set.i17.i9 (i17 %val, i9 %repl, i32 %lo, i32 %hi)
      -  declare i29 @llvm.part.set.i29.i9 (i29 %val, i9 %repl, i32 %lo, i32 %hi)
      -
      - -
      Overview:
      -

      The 'llvm.part.set' family of intrinsic functions replaces a range -of bits in an integer value with another integer value. It returns the integer -with the replaced bits.

      +

      The 'llvm.cttz' family of intrinsic functions counts the number of + trailing zeros.

      Arguments:
      -

      The first argument, %val, and the result may be integer types of -any bit width, but they must have the same bit width. %val is the value -whose bits will be replaced. The second argument, %repl may be an -integer of any bit width. The third and fourth arguments must be i32 -type since they specify only a bit index.

      +

      The only argument is the value to be counted. The argument may be of any + integer type. The return type must match the argument type.

      Semantics:
      -

      The operation of the 'llvm.part.set' intrinsic has two modes -of operation: forwards and reverse. If %lo is greater than -%hi then the intrinsic operates in reverse mode. Otherwise it -operates in forward mode.

      - -

      For both modes, the %repl value is prepared for use by either -truncating it down to the size of the replacement area or zero extending it -up to that size.

      - -

      In forward mode, the bits between %lo and %hi (inclusive) -are replaced with corresponding bits from %repl. That is the 0th bit -in %repl replaces the %loth bit in %val and etc. up -to the %hith bit.

      - -

      In reverse mode, a similar computation is made except that the bits are -reversed. That is, the 0th bit in %repl replaces the -%hi bit in %val and etc. down to the %loth bit.

      - -
      Examples:
      - -
      -  llvm.part.set(0xFFFF, 0, 4, 7) -> 0xFF0F
      -  llvm.part.set(0xFFFF, 0, 7, 4) -> 0xFF0F
      -  llvm.part.set(0xFFFF, 1, 7, 4) -> 0xFF8F
      -  llvm.part.set(0xFFFF, F, 8, 3) -> 0xFFE7
      -  llvm.part.set(0xFFFF, 0, 3, 8) -> 0xFE07
      -
      +

      The 'llvm.cttz' intrinsic counts the trailing (least significant) + zeros in a variable. If src == 0, the result is the size in bits of + the type of src. For example, llvm.cttz(i32 2) = 1.
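A matching C sketch for the trailing-zero count is shown below. The name `cttz32` is hypothetical, and the code models only the documented result for a 32-bit operand.

```c
#include <stdint.h>

/* Models llvm.cttz.i32: counts zero bits below the least significant 1 bit.
   A zero input yields the bit width of the type (32). */
uint32_t cttz32(uint32_t x) {
    uint32_t count = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 1u) == 0) {
        x >>= 1;
        count++;
    }
    return count;
}
```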

      @@ -6016,9 +6182,8 @@ reversed. That is, the 0th bit in %repl replaces the
      -

      -LLVM provides intrinsics for some arithmetic with overflow operations. -

      + +

      LLVM provides intrinsics for some arithmetic with overflow operations.

      @@ -6030,9 +6195,8 @@ LLVM provides intrinsics for some arithmetic with overflow operations.
      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.sadd.with.overflow -on any integer bit width.

      + on any integer bit width.

         declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      @@ -6041,24 +6205,23 @@ on any integer bit width.

      Overview:
      -

      The 'llvm.sadd.with.overflow' family of intrinsic functions perform -a signed addition of the two arguments, and indicate whether an overflow -occurred during the signed summation.

      + a signed addition of the two arguments, and indicate whether an overflow + occurred during the signed summation.

      Arguments:
      -

      The arguments (%a and %b) and the first element of the result structure may -be of integer types of any bit width, but they must have the same bit width. The -second element of the result structure must be of type i1. %a -and %b are the two values that will undergo signed addition.

      + be of integer types of any bit width, but they must have the same bit + width. The second element of the result structure must be of + type i1. %a and %b are the two values that will + undergo signed addition.

      Semantics:
      -

      The 'llvm.sadd.with.overflow' family of intrinsic functions perform -a signed addition of the two variables. They return a structure — the -first element of which is the signed summation, and the second element of which -is a bit specifying if the signed summation resulted in an overflow.

      + a signed addition of the two arguments. They return a structure, the + first element of which is the signed sum, and the second element of + which is a bit specifying whether the signed addition resulted in an + overflow.
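The {i16, i1} result described above can be modeled in C by adding in a wider type. This is a sketch of the semantics under the assumption of two's complement wraparound for the first element; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C stand-in for the {i16, i1} result structure. */
struct sadd_result_i16 {
    int16_t sum;
    bool overflow;
};

struct sadd_result_i16 sadd_with_overflow_i16(int16_t a, int16_t b) {
    /* Add in a wider type so the exact result is always representable. */
    int32_t wide = (int32_t)a + (int32_t)b;
    /* Keep the low 16 bits, then reinterpret them as two's complement. */
    uint16_t low = (uint16_t)wide;
    struct sadd_result_i16 r;
    r.sum = (low >= 0x8000u) ? (int16_t)((int32_t)low - 0x10000) : (int16_t)low;
    r.overflow = (wide < INT16_MIN) || (wide > INT16_MAX);
    return r;
}
```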

      Examples:
      @@ -6078,9 +6241,8 @@ is a bit specifying if the signed summation resulted in an overflow.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.uadd.with.overflow -on any integer bit width.

      + on any integer bit width.

         declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      @@ -6089,24 +6251,22 @@ on any integer bit width.

      Overview:
      -

      The 'llvm.uadd.with.overflow' family of intrinsic functions perform -an unsigned addition of the two arguments, and indicate whether a carry occurred -during the unsigned summation.

      + an unsigned addition of the two arguments, and indicate whether a carry + occurred during the unsigned summation.

      Arguments:
      -

      The arguments (%a and %b) and the first element of the result structure may -be of integer types of any bit width, but they must have the same bit width. The -second element of the result structure must be of type i1. %a -and %b are the two values that will undergo unsigned addition.

      + be of integer types of any bit width, but they must have the same bit + width. The second element of the result structure must be of + type i1. %a and %b are the two values that will + undergo unsigned addition.

      Semantics:
      -

      The 'llvm.uadd.with.overflow' family of intrinsic functions perform -an unsigned addition of the two arguments. They return a structure — the -first element of which is the sum, and the second element of which is a bit -specifying if the unsigned summation resulted in a carry.

      + an unsigned addition of the two arguments. They return a structure — + the first element of which is the sum, and the second element of which is a + bit specifying if the unsigned summation resulted in a carry.
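For the unsigned case, the carry bit can be recovered from the wrapped sum alone. The C sketch below models the i16 variant; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C stand-in for the {i16, i1} result structure. */
struct uadd_result_i16 {
    uint16_t sum;
    bool carry;
};

struct uadd_result_i16 uadd_with_overflow_i16(uint16_t a, uint16_t b) {
    struct uadd_result_i16 r;
    /* The addition happens in int; the cast keeps the low 16 bits. */
    r.sum = (uint16_t)(a + b);
    /* A wrapped sum is always smaller than either operand. */
    r.carry = r.sum < a;
    return r;
}
```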

      Examples:
      @@ -6126,9 +6286,8 @@ specifying if the unsigned summation resulted in a carry.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.ssub.with.overflow -on any integer bit width.

      + on any integer bit width.

         declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      @@ -6137,24 +6296,23 @@ on any integer bit width.

      Overview:
      -

      The 'llvm.ssub.with.overflow' family of intrinsic functions perform -a signed subtraction of the two arguments, and indicate whether an overflow -occurred during the signed subtraction.

      + a signed subtraction of the two arguments, and indicate whether an overflow + occurred during the signed subtraction.

      Arguments:
      -

      The arguments (%a and %b) and the first element of the result structure may -be of integer types of any bit width, but they must have the same bit width. The -second element of the result structure must be of type i1. %a -and %b are the two values that will undergo signed subtraction.

      + be of integer types of any bit width, but they must have the same bit + width. The second element of the result structure must be of + type i1. %a and %b are the two values that will + undergo signed subtraction.

      Semantics:
      -

      The 'llvm.ssub.with.overflow' family of intrinsic functions perform -a signed subtraction of the two arguments. They return a structure — the -first element of which is the subtraction, and the second element of which is a bit -specifying if the signed subtraction resulted in an overflow.

      + a signed subtraction of the two arguments. They return a structure, + the first element of which is the difference, and the second element of + which is a bit specifying whether the signed subtraction resulted in an + overflow.

      Examples:
      @@ -6174,9 +6332,8 @@ specifying if the signed subtraction resulted in an overflow.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.usub.with.overflow -on any integer bit width.

      + on any integer bit width.

         declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      @@ -6185,24 +6342,23 @@ on any integer bit width.

      Overview:
      -

      The 'llvm.usub.with.overflow' family of intrinsic functions perform -an unsigned subtraction of the two arguments, and indicate whether an overflow -occurred during the unsigned subtraction.

      + an unsigned subtraction of the two arguments, and indicate whether an + overflow occurred during the unsigned subtraction.

      Arguments:
      -

      The arguments (%a and %b) and the first element of the result structure may -be of integer types of any bit width, but they must have the same bit width. The -second element of the result structure must be of type i1. %a -and %b are the two values that will undergo unsigned subtraction.

      + be of integer types of any bit width, but they must have the same bit + width. The second element of the result structure must be of + type i1. %a and %b are the two values that will + undergo unsigned subtraction.

      Semantics:
      -

      The 'llvm.usub.with.overflow' family of intrinsic functions perform -an unsigned subtraction of the two arguments. They return a structure — the -first element of which is the subtraction, and the second element of which is a bit -specifying if the unsigned subtraction resulted in an overflow.

      + an unsigned subtraction of the two arguments. They return a structure, + the first element of which is the difference, and the second element of + which is a bit specifying whether the unsigned subtraction resulted in an + overflow.
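Unsigned subtraction overflows exactly when the second operand is larger than the first. A C sketch of the i16 variant follows; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C stand-in for the {i16, i1} result structure. */
struct usub_result_i16 {
    uint16_t diff;
    bool overflow;
};

struct usub_result_i16 usub_with_overflow_i16(uint16_t a, uint16_t b) {
    struct usub_result_i16 r;
    /* The subtraction happens in int; the cast reduces it modulo 2^16. */
    r.diff = (uint16_t)(a - b);
    /* A borrow is needed whenever b exceeds a. */
    r.overflow = a < b;
    return r;
}
```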

      Examples:
      @@ -6222,9 +6378,8 @@ specifying if the unsigned subtraction resulted in an overflow.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.smul.with.overflow -on any integer bit width.

      + on any integer bit width.

         declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      @@ -6235,23 +6390,22 @@ on any integer bit width.

      Overview:

      The 'llvm.smul.with.overflow' family of intrinsic functions perform -a signed multiplication of the two arguments, and indicate whether an overflow -occurred during the signed multiplication.

      + a signed multiplication of the two arguments, and indicate whether an + overflow occurred during the signed multiplication.

      Arguments:
      -

      The arguments (%a and %b) and the first element of the result structure may -be of integer types of any bit width, but they must have the same bit width. The -second element of the result structure must be of type i1. %a -and %b are the two values that will undergo signed multiplication.

      + be of integer types of any bit width, but they must have the same bit + width. The second element of the result structure must be of + type i1. %a and %b are the two values that will + undergo signed multiplication.

      Semantics:
      -

      The 'llvm.smul.with.overflow' family of intrinsic functions perform -a signed multiplication of the two arguments. They return a structure — -the first element of which is the multiplication, and the second element of -which is a bit specifying if the signed multiplication resulted in an -overflow.

      + a signed multiplication of the two arguments. They return a structure, + the first element of which is the product, and the second element of + which is a bit specifying whether the signed multiplication resulted in an + overflow.

      Examples:
      @@ -6271,9 +6425,8 @@ overflow.

      Syntax:
      -

      This is an overloaded intrinsic. You can use llvm.umul.with.overflow -on any integer bit width.

      + on any integer bit width.

         declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      @@ -6282,29 +6435,23 @@ on any integer bit width.

      Overview:
      - -

      Warning: 'llvm.umul.with.overflow' is badly broken. It is -actively being fixed, but it should not currently be used!

      -

      The 'llvm.umul.with.overflow' family of intrinsic functions perform -a unsigned multiplication of the two arguments, and indicate whether an overflow -occurred during the unsigned multiplication.

      + an unsigned multiplication of the two arguments, and indicate whether an + overflow occurred during the unsigned multiplication.

      Arguments:
      -

      The arguments (%a and %b) and the first element of the result structure may -be of integer types of any bit width, but they must have the same bit width. The -second element of the result structure must be of type i1. %a -and %b are the two values that will undergo unsigned -multiplication.

      + be of integer types of any bit width, but they must have the same bit + width. The second element of the result structure must be of + type i1. %a and %b are the two values that will + undergo unsigned multiplication.

      Semantics:
      -

      The 'llvm.umul.with.overflow' family of intrinsic functions perform -an unsigned multiplication of the two arguments. They return a structure — -the first element of which is the multiplication, and the second element of -which is a bit specifying if the unsigned multiplication resulted in an -overflow.

      + an unsigned multiplication of the two arguments. They return a structure, + the first element of which is the product, and the second + element of which is a bit specifying whether the unsigned multiplication resulted + in an overflow.
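The unsigned multiply semantics can be modeled in C by multiplying in a type twice as wide and checking whether the exact product fits. This is a sketch for the i16 variant only; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C stand-in for the {i16, i1} result structure. */
struct umul_result_i16 {
    uint16_t product;
    bool overflow;
};

struct umul_result_i16 umul_with_overflow_i16(uint16_t a, uint16_t b) {
    /* A 32-bit product of two 16-bit values is always exact. */
    uint32_t wide = (uint32_t)a * (uint32_t)b;
    struct umul_result_i16 r;
    r.product = (uint16_t)wide;         /* low 16 bits of the exact product */
    r.overflow = wide > UINT16_MAX;     /* set when the high bits are nonzero */
    return r;
}
```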

      Examples:
      @@ -6322,14 +6469,13 @@ overflow.

      -

      -The LLVM debugger intrinsics (which all start with llvm.dbg. prefix), -are described in the LLVM Source Level -Debugging document. -

      -
      +

      The LLVM debugger intrinsics (which all start with the llvm.dbg. + prefix) are described in + the LLVM Source + Level Debugging document.

      + +
      @@ -6337,10 +6483,12 @@ Debugging document.
      -

      The LLVM exception handling intrinsics (which all start with -llvm.eh. prefix), are described in the LLVM Exception -Handling document.

      + +

      The LLVM exception handling intrinsics (which all start with the + llvm.eh. prefix) are described in + the LLVM Exception + Handling document.

      +
      @@ -6349,70 +6497,74 @@ Handling document.

      -

      - This intrinsic makes it possible to excise one parameter, marked with - the nest attribute, from a function. The result is a callable - function pointer lacking the nest parameter - the caller does not need - to provide a value for it. Instead, the value to use is stored in - advance in a "trampoline", a block of memory usually allocated - on the stack, which also contains code to splice the nest value into the - argument list. This is used to implement the GCC nested function address - extension. -

      -

      - For example, if the function is - i32 f(i8* nest %c, i32 %x, i32 %y) then the resulting function - pointer has signature i32 (i32, i32)*. It can be created as follows:

      + +

      This intrinsic makes it possible to excise one parameter, marked with + the nest attribute, from a function. The result is a callable + function pointer lacking the nest parameter - the caller does not need to + provide a value for it. Instead, the value to use is stored in advance in a + "trampoline", a block of memory usually allocated on the stack, which also + contains code to splice the nest value into the argument list. This is used + to implement the GCC nested function address extension.

      + +

      For example, if the function is + i32 f(i8* nest %c, i32 %x, i32 %y) then the resulting function + pointer has signature i32 (i32, i32)*. It can be created as + follows:

      + +
         %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
         %tramp1 = getelementptr [10 x i8]* %tramp, i32 0, i32 0
         %p = call i8* @llvm.init.trampoline( i8* %tramp1, i8* bitcast (i32 (i8* nest , i32, i32)* @f to i8*), i8* %nval )
         %fp = bitcast i8* %p to i32 (i32, i32)*
       
      -

      The call %val = call i32 %fp( i32 %x, i32 %y ) is then equivalent - to %val = call i32 %f( i8* %nval, i32 %x, i32 %y ).

      +
      + +

      The call %val = call i32 %fp( i32 %x, i32 %y ) is then equivalent + to %val = call i32 %f( i8* %nval, i32 %x, i32 %y ).
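The equivalence above can be illustrated in C, with one important simplification: a real trampoline is a block of generated code that yields a plain function pointer, which portable C cannot express. The sketch below instead keeps the function and the stored nest value together in a struct. All names (`nested_f`, `struct trampoline`, `init_trampoline`, `call_trampoline`) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for @f: the first parameter plays the role of
   the nest argument. */
int32_t nested_f(void *nest, int32_t x, int32_t y) {
    return *(int32_t *)nest + x + y;
}

/* Models the trampoline memory: the target function plus the value to
   splice in as the nest argument. */
struct trampoline {
    int32_t (*func)(void *, int32_t, int32_t);
    void *nval;
};

struct trampoline init_trampoline(int32_t (*func)(void *, int32_t, int32_t),
                                  void *nval) {
    struct trampoline t;
    t.func = func;
    t.nval = nval;
    return t;
}

/* Calling through the trampoline supplies the stored nest value, so the
   caller passes only x and y. */
int32_t call_trampoline(const struct trampoline *t, int32_t x, int32_t y) {
    return t->func(t->nval, x, y);
}
```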

      +
      +
      +
      Syntax:
      -declare i8* @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
      +  declare i8* @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
       
      +
      Overview:
      -

      - This fills the memory pointed to by tramp with code - and returns a function pointer suitable for executing it. -

      +

      This fills the memory pointed to by tramp with code and returns a + function pointer suitable for executing it.

      +
      Arguments:
      -

      - The llvm.init.trampoline intrinsic takes three arguments, all - pointers. The tramp argument must point to a sufficiently large - and sufficiently aligned block of memory; this memory is written to by the - intrinsic. Note that the size and the alignment are target-specific - LLVM - currently provides no portable way of determining them, so a front-end that - generates this intrinsic needs to have some target-specific knowledge. - The func argument must hold a function bitcast to an i8*. -

      +

      The llvm.init.trampoline intrinsic takes three arguments, all + pointers. The tramp argument must point to a sufficiently large and + sufficiently aligned block of memory; this memory is written to by the + intrinsic. Note that the size and the alignment are target-specific - LLVM + currently provides no portable way of determining them, so a front-end that + generates this intrinsic needs to have some target-specific knowledge. + The func argument must hold a function bitcast to + an i8*.

      +
      Semantics:
      -

      - The block of memory pointed to by tramp is filled with target - dependent code, turning it into a function. A pointer to this function is - returned, but needs to be bitcast to an - appropriate function pointer type - before being called. The new function's signature is the same as that of - func with any arguments marked with the nest attribute - removed. At most one such nest argument is allowed, and it must be - of pointer type. Calling the new function is equivalent to calling - func with the same argument list, but with nval used for the - missing nest argument. If, after calling - llvm.init.trampoline, the memory pointed to by tramp is - modified, then the effect of any later call to the returned function pointer is - undefined. -

      +

      The block of memory pointed to by tramp is filled with target + dependent code, turning it into a function. A pointer to this function is + returned, but needs to be bitcast to an appropriate + function pointer type before being called. The new function's signature + is the same as that of func with any arguments marked with + the nest attribute removed. At most one such nest argument + is allowed, and it must be of pointer type. Calling the new function is + equivalent to calling func with the same argument list, but + with nval used for the missing nest argument. If, after + calling llvm.init.trampoline, the memory pointed to + by tramp is modified, then the effect of any later call to the + returned function pointer is undefined.

      +
      @@ -6421,27 +6573,25 @@ declare i8* @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <n
      -

      - These intrinsic functions expand the "universal IR" of LLVM to represent - hardware constructs for atomic operations and memory synchronization. This - provides an interface to the hardware, not an interface to the programmer. It - is aimed at a low enough level to allow any programming models or APIs - (Application Programming Interfaces) which - need atomic behaviors to map cleanly onto it. It is also modeled primarily on - hardware behavior. Just as hardware provides a "universal IR" for source - languages, it also provides a starting point for developing a "universal" - atomic operation and synchronization IR. -

      -

      - These do not form an API such as high-level threading libraries, - software transaction memory systems, atomic primitives, and intrinsic - functions as found in BSD, GNU libc, atomic_ops, APR, and other system and - application libraries. The hardware interface provided by LLVM should allow - a clean implementation of all of these APIs and parallel programming models. - No one model or paradigm should be selected above others unless the hardware - itself ubiquitously does so. -

      +

      These intrinsic functions expand the "universal IR" of LLVM to represent + hardware constructs for atomic operations and memory synchronization. This + provides an interface to the hardware, not an interface to the programmer. It + is aimed at a low enough level to allow any programming models or APIs + (Application Programming Interfaces) which need atomic behaviors to map + cleanly onto it. It is also modeled primarily on hardware behavior. Just as + hardware provides a "universal IR" for source languages, it also provides a + starting point for developing a "universal" atomic operation and + synchronization IR.

      + +

      These do not form an API such as high-level threading libraries, + software transactional memory systems, atomic primitives, and intrinsic + functions as found in BSD, GNU libc, atomic_ops, APR, and other system and + application libraries. The hardware interface provided by LLVM should allow + a clean implementation of all of these APIs and parallel programming models. + No one model or paradigm should be selected above others unless the hardware + itself ubiquitously does so.

      +
      @@ -6451,62 +6601,60 @@ declare i8* @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <n
      Syntax:
      -declare void @llvm.memory.barrier( i1 <ll>, i1 <ls>, i1 <sl>, i1 <ss>, 
      -i1 <device> )
      -
      +  declare void @llvm.memory.barrier( i1 <ll>, i1 <ls>, i1 <sl>, i1 <ss>, i1 <device> )
       
      +
      Overview:
      -

      - The llvm.memory.barrier intrinsic guarantees ordering between - specific pairs of memory access types. -

      +

      The llvm.memory.barrier intrinsic guarantees ordering between + specific pairs of memory access types.

      +
      Arguments:
      -

      - The llvm.memory.barrier intrinsic requires five boolean arguments. - The first four arguments enables a specific barrier as listed below. The fith - argument specifies that the barrier applies to io or device or uncached memory. +

      The llvm.memory.barrier intrinsic requires five boolean arguments. + The first four arguments each enable a specific barrier as listed below. The + fifth argument specifies that the barrier applies to I/O, device, or uncached + memory.

      + +
        +
      • ll: load-load barrier
      • +
      • ls: load-store barrier
      • +
      • sl: store-load barrier
      • +
      • ss: store-store barrier
      • +
      • device: barrier applies to device and uncached memory also.
      • +
      -

      -
        -
      • ll: load-load barrier
      • -
      • ls: load-store barrier
      • -
      • sl: store-load barrier
      • -
      • ss: store-store barrier
      • -
      • device: barrier applies to device and uncached memory also.
      • -
      Semantics:
      -

      - This intrinsic causes the system to enforce some ordering constraints upon - the loads and stores of the program. This barrier does not indicate - when any events will occur, it only enforces an order in - which they occur. For any of the specified pairs of load and store operations - (f.ex. load-load, or store-load), all of the first operations preceding the - barrier will complete before any of the second operations succeeding the - barrier begin. Specifically the semantics for each pairing is as follows: -

      -
        -
      • ll: All loads before the barrier must complete before any load - after the barrier begins.
      • - -
      • ls: All loads before the barrier must complete before any - store after the barrier begins.
      • -
      • ss: All stores before the barrier must complete before any - store after the barrier begins.
      • -
      • sl: All stores before the barrier must complete before any - load after the barrier begins.
      • -
      -

      - These semantics are applied with a logical "and" behavior when more than one - is enabled in a single memory barrier intrinsic. -

      -

      - Backends may implement stronger barriers than those requested when they do not - support as fine grained a barrier as requested. Some architectures do not - need all types of barriers and on such architectures, these become noops. -

      +

      This intrinsic causes the system to enforce some ordering constraints upon + the loads and stores of the program. This barrier does not + indicate when any events will occur; it only enforces + an order in which they occur. For any of the specified pairs of load + and store operations (e.g. load-load or store-load), all of the first + operations preceding the barrier will complete before any of the second + operations succeeding the barrier begin. Specifically, the semantics for each + pairing are as follows:

      + +
        +
      • ll: All loads before the barrier must complete before any load + after the barrier begins.
      • +
      • ls: All loads before the barrier must complete before any + store after the barrier begins.
      • +
      • ss: All stores before the barrier must complete before any + store after the barrier begins.
      • +
      • sl: All stores before the barrier must complete before any + load after the barrier begins.
      • +
      + +

      These semantics are applied with a logical "and" behavior when more than one + is enabled in a single memory barrier intrinsic.
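As a rough illustration of how the four flags combine, the following Python sketch models which (earlier, later) access pairs a single barrier orders. The function names `ordered_pairs` and `must_complete_first` are hypothetical and not part of LLVM; the sketch ignores the device flag and says nothing about how a backend lowers the barrier.

```python
def ordered_pairs(ll=False, ls=False, sl=False, ss=False):
    """Return the set of (earlier, later) access kinds that the barrier orders."""
    flags = {
        ("load", "load"): ll,
        ("load", "store"): ls,
        ("store", "load"): sl,
        ("store", "store"): ss,
    }
    # Every enabled flag contributes its own constraint; all of them hold at once.
    return {pair for pair, enabled in flags.items() if enabled}


def must_complete_first(earlier, later, **flags):
    """True if an `earlier` access before the barrier must complete
    before a `later` access after the barrier begins."""
    return (earlier, later) in ordered_pairs(**flags)


# A barrier with sl and ss set orders prior stores against later loads and stores,
# but places no constraint on prior loads.
print(must_complete_first("store", "load", sl=True, ss=True))   # True
print(must_complete_first("load", "store", sl=True, ss=True))   # False
```

A backend that cannot express exactly this set may enforce a superset of it, which is consistent with the note on stronger barriers.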

      + +

      Backends may implement stronger barriers than those requested when they do + not support as fine-grained a barrier as requested. Some architectures do + not need all types of barriers, and on such architectures these become + no-ops.

      +
      Example:
      -%ptr      = malloc i32
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
                   store i32 4, i32* %ptr
       
       %result1  = load i32* %ptr      ; yields {i32}:result1 = 4
      @@ -6514,52 +6662,51 @@ i1 <device> )
                                       ; guarantee the above finishes
                   store i32 8, i32* %ptr   ; before this begins
       
      +
      +
      +
      Syntax:
      -

      - This is an overloaded intrinsic. You can use llvm.atomic.cmp.swap on - any integer bit width and for different address spaces. Not all targets - support all bit widths however.

      +

      This is an overloaded intrinsic. You can use llvm.atomic.cmp.swap on + any integer bit width and for different address spaces. Not all targets + support all bit widths however.

      -declare i8 @llvm.atomic.cmp.swap.i8.p0i8( i8* <ptr>, i8 <cmp>, i8 <val> )
      -declare i16 @llvm.atomic.cmp.swap.i16.p0i16( i16* <ptr>, i16 <cmp>, i16 <val> )
      -declare i32 @llvm.atomic.cmp.swap.i32.p0i32( i32* <ptr>, i32 <cmp>, i32 <val> )
      -declare i64 @llvm.atomic.cmp.swap.i64.p0i64( i64* <ptr>, i64 <cmp>, i64 <val> )
      -
      +  declare i8 @llvm.atomic.cmp.swap.i8.p0i8( i8* <ptr>, i8 <cmp>, i8 <val> )
      +  declare i16 @llvm.atomic.cmp.swap.i16.p0i16( i16* <ptr>, i16 <cmp>, i16 <val> )
      +  declare i32 @llvm.atomic.cmp.swap.i32.p0i32( i32* <ptr>, i32 <cmp>, i32 <val> )
      +  declare i64 @llvm.atomic.cmp.swap.i64.p0i64( i64* <ptr>, i64 <cmp>, i64 <val> )
       
      +
      Overview:
      -

      - This loads a value in memory and compares it to a given value. If they are - equal, it stores a new value into the memory. -

      +

      This loads a value in memory and compares it to a given value. If they are + equal, it stores a new value into the memory.

      +
      Arguments:
      -

      - The llvm.atomic.cmp.swap intrinsic takes three arguments. The result as - well as both cmp and val must be integer values with the - same bit width. The ptr argument must be a pointer to a value of - this integer type. While any bit width integer may be used, targets may only - lower representations they support in hardware. +

      The llvm.atomic.cmp.swap intrinsic takes three arguments. The result + as well as both cmp and val must be integer values with the + same bit width. The ptr argument must be a pointer to a value of + this integer type. While any bit width integer may be used, targets may only + lower representations they support in hardware.

      -

      Semantics:
      -

      - This entire intrinsic must be executed atomically. It first loads the value - in memory pointed to by ptr and compares it with the value - cmp. If they are equal, val is stored into the memory. The - loaded value is yielded in all cases. This provides the equivalent of an - atomic compare-and-swap operation within the SSA framework. -

      -
      Examples:
      +

      This entire intrinsic must be executed atomically. It first loads the value + in memory pointed to by ptr and compares it with the + value cmp. If they are equal, val is stored into the + memory. The loaded value is yielded in all cases. This provides the + equivalent of an atomic compare-and-swap operation within the SSA + framework.
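The compare-and-swap behavior described above can be modeled in a few lines of Python. This is only a sketch of the semantics: the `Cell` class and its `cmp_swap` method are hypothetical names, and a lock stands in for the hardware atomicity that the real intrinsic relies on.

```python
import threading


class Cell:
    """A single integer memory location; the lock models atomic execution."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def cmp_swap(self, cmp, val):
        """Store `val` only if the cell currently holds `cmp`.

        The value that was loaded is returned in all cases.
        """
        with self._lock:
            old = self.value
            if old == cmp:
                self.value = val
            return old


cell = Cell(4)
result1 = cell.cmp_swap(4, 8)    # comparison succeeds, so 8 is stored
result2 = cell.cmp_swap(5, 16)   # comparison fails, so memory is unchanged
print(result1, result2, cell.value)  # 4 8 8
```

A caller detects success by comparing the returned value with the `cmp` it passed in, which is the same pattern the `icmp eq` instructions in the IR example follow.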

      +
      Examples:
      -%ptr      = malloc i32
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
                   store i32 4, i32* %ptr
       
       %val1     = add i32 4, 4
      @@ -6575,6 +6722,7 @@ declare i64 @llvm.atomic.cmp.swap.i64.p0i64( i64* <ptr>, i64 <cmp>,
       
       %memval2  = load i32* %ptr                ; yields {i32}:memval2 = 8
       
      +
      @@ -6584,41 +6732,37 @@ declare i64 @llvm.atomic.cmp.swap.i64.p0i64( i64* <ptr>, i64 <cmp>,
      Syntax:
      -

      - This is an overloaded intrinsic. You can use llvm.atomic.swap on any - integer bit width. Not all targets support all bit widths however.

      -
      -declare i8 @llvm.atomic.swap.i8.p0i8( i8* <ptr>, i8 <val> )
      -declare i16 @llvm.atomic.swap.i16.p0i16( i16* <ptr>, i16 <val> )
      -declare i32 @llvm.atomic.swap.i32.p0i32( i32* <ptr>, i32 <val> )
      -declare i64 @llvm.atomic.swap.i64.p0i64( i64* <ptr>, i64 <val> )
      +

      This is an overloaded intrinsic. You can use llvm.atomic.swap on any + integer bit width. Not all targets support all bit widths however.

      +
      +  declare i8 @llvm.atomic.swap.i8.p0i8( i8* <ptr>, i8 <val> )
      +  declare i16 @llvm.atomic.swap.i16.p0i16( i16* <ptr>, i16 <val> )
      +  declare i32 @llvm.atomic.swap.i32.p0i32( i32* <ptr>, i32 <val> )
      +  declare i64 @llvm.atomic.swap.i64.p0i64( i64* <ptr>, i64 <val> )
       
      +
      Overview:
      -

      - This intrinsic loads the value stored in memory at ptr and yields - the value from memory. It then stores the value in val in the memory - at ptr. -

      +

      This intrinsic loads the value stored in memory at ptr and yields + the value from memory. It then stores the value in val in the memory + at ptr.

      +
      Arguments:
      +

      The llvm.atomic.swap intrinsic takes two arguments. Both + the val argument and the result must be integers of the same bit + width. The first argument, ptr, must be a pointer to a value of this + integer type. The targets may only lower integer representations they + support.

      -

      - The llvm.atomic.swap intrinsic takes two arguments. Both the - val argument and the result must be integers of the same bit width. - The first argument, ptr, must be a pointer to a value of this - integer type. The targets may only lower integer representations they - support. -

      Semantics:
      -

      - This intrinsic loads the value pointed to by ptr, yields it, and - stores val back into ptr atomically. This provides the - equivalent of an atomic swap operation within the SSA framework. +

      This intrinsic loads the value pointed to by ptr, yields it, and + stores val back into ptr atomically. This provides the + equivalent of an atomic swap operation within the SSA framework.

      -

      Examples:
      -%ptr      = malloc i32
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
                   store i32 4, i32* %ptr
       
       %val1     = add i32 4, 4
      @@ -6634,6 +6778,7 @@ declare i64 @llvm.atomic.swap.i64.p0i64( i64* <ptr>, i64 <val> )
       %stored2  = icmp eq i32 %result2, 8     ; yields {i1}:stored2 = true
       %memval2  = load i32* %ptr              ; yields {i32}:memval2 = 2
       
      +
      @@ -6641,42 +6786,40 @@ declare i64 @llvm.atomic.swap.i64.p0i64( i64* <ptr>, i64 <val> ) 'llvm.atomic.load.add.*' Intrinsic
      +
      +
      Syntax:
      -

      - This is an overloaded intrinsic. You can use llvm.atomic.load.add on any - integer bit width. Not all targets support all bit widths however.

      -
      -declare i8 @llvm.atomic.load.add.i8..p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.add.i16..p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.add.i32..p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.add.i64..p0i64( i64* <ptr>, i64 <delta> )
      +

      This is an overloaded intrinsic. You can use llvm.atomic.load.add on + any integer bit width. Not all targets support all bit widths however.

      +
      +  declare i8 @llvm.atomic.load.add.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.add.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.add.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.add.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      +
      Overview:
      -

      - This intrinsic adds delta to the value stored in memory at - ptr. It yields the original value at ptr. -

      +

      This intrinsic adds delta to the value stored in memory + at ptr. It yields the original value at ptr.

      +
      Arguments:
      -

      +

      The intrinsic takes two arguments, the first a pointer to an integer value + and the second an integer value. The result is also an integer value. These + integer types can have any bit width, but they must all have the same bit + width. The targets may only lower integer representations they support.

      - The intrinsic takes two arguments, the first a pointer to an integer value - and the second an integer value. The result is also an integer value. These - integer types can have any bit width, but they must all have the same bit - width. The targets may only lower integer representations they support. -

      Semantics:
      -

      - This intrinsic does a series of operations atomically. It first loads the - value stored at ptr. It then adds delta, stores the result - to ptr. It yields the original value stored at ptr. -

      +

      This intrinsic does a series of operations atomically. It first loads the + value stored at ptr. It then adds delta and stores the result + to ptr. It yields the original value stored at ptr.
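A minimal Python model of this load, add, store sequence is sketched below. The `AtomicInt` class and `load_add` method are hypothetical names, the lock only stands in for atomicity, and integer wraparound at a fixed bit width is deliberately left out.

```python
import threading


class AtomicInt:
    """An integer memory location; the lock models atomic execution."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def load_add(self, delta):
        """Add `delta` to the stored value and return the ORIGINAL value."""
        with self._lock:
            old = self.value
            self.value = old + delta
            return old


mem = AtomicInt(4)
result1 = mem.load_add(4)   # returns 4, memory becomes 8
result2 = mem.load_add(2)   # returns 8, memory becomes 10
result3 = mem.load_add(5)   # returns 10, memory becomes 15
print(result1, result2, result3, mem.value)  # 4 8 10 15
```

Returning the original rather than the updated value is what lets a caller recover both: the new value is simply the result plus `delta`.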

      Examples:
      -%ptr      = malloc i32
      -        store i32 4, %ptr
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
      +            store i32 4, i32* %ptr
       %result1  = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 4 )
                                       ; yields {i32}:result1 = 4
       %result2  = call i32 @llvm.atomic.load.add.i32.p0i32( i32* %ptr, i32 2 )
      @@ -6685,6 +6828,7 @@ declare i64 @llvm.atomic.load.add.i64..p0i64( i64* <ptr>, i64 <delta>
                                       ; yields {i32}:result3 = 10
       %memval1  = load i32* %ptr      ; yields {i32}:memval1 = 15
       
      +
      @@ -6692,43 +6836,42 @@ declare i64 @llvm.atomic.load.add.i64..p0i64( i64* <ptr>, i64 <delta> 'llvm.atomic.load.sub.*' Intrinsic
      +
      +
      Syntax:
      -

      - This is an overloaded intrinsic. You can use llvm.atomic.load.sub on - any integer bit width and for different address spaces. Not all targets - support all bit widths however.

      -
      -declare i8 @llvm.atomic.load.sub.i8.p0i32( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.sub.i16.p0i32( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.sub.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.sub.i64.p0i32( i64* <ptr>, i64 <delta> )
      +

      This is an overloaded intrinsic. You can use llvm.atomic.load.sub on + any integer bit width and for different address spaces. Not all targets + support all bit widths however.

      +
      +  declare i8 @llvm.atomic.load.sub.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.sub.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.sub.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.sub.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      +
      Overview:
      -

      - This intrinsic subtracts delta to the value stored in memory at - ptr. It yields the original value at ptr. -

      +

      This intrinsic subtracts delta from the value stored in memory at + ptr. It yields the original value at ptr.

      +
      Arguments:
      -

      +

      The intrinsic takes two arguments, the first a pointer to an integer value + and the second an integer value. The result is also an integer value. These + integer types can have any bit width, but they must all have the same bit + width. The targets may only lower integer representations they support.

      - The intrinsic takes two arguments, the first a pointer to an integer value - and the second an integer value. The result is also an integer value. These - integer types can have any bit width, but they must all have the same bit - width. The targets may only lower integer representations they support. -

      Semantics:
      -

      - This intrinsic does a series of operations atomically. It first loads the - value stored at ptr. It then subtracts delta, stores the - result to ptr. It yields the original value stored at ptr. -

      +

      This intrinsic does a series of operations atomically. It first loads the + value stored at ptr. It then subtracts delta and stores the + result to ptr. It yields the original value stored + at ptr.

      Examples:
      -%ptr      = malloc i32
      -        store i32 8, %ptr
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
      +            store i32 8, i32* %ptr
       %result1  = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 4 )
                                       ; yields {i32}:result1 = 8
       %result2  = call i32 @llvm.atomic.load.sub.i32.p0i32( i32* %ptr, i32 2 )
      @@ -6737,6 +6880,7 @@ declare i64 @llvm.atomic.load.sub.i64.p0i32( i64* <ptr>, i64 <delta>
                                       ; yields {i32}:result3 = 2
       %memval1  = load i32* %ptr      ; yields {i32}:memval1 = -3
       
      +
      @@ -6745,72 +6889,67 @@ declare i64 @llvm.atomic.load.sub.i64.p0i32( i64* <ptr>, i64 <delta> 'llvm.atomic.load.nand.*' Intrinsic
      'llvm.atomic.load.or.*' Intrinsic
      'llvm.atomic.load.xor.*' Intrinsic
      - +
      +
      Syntax:
      -

      - These are overloaded intrinsics. You can use llvm.atomic.load_and, - llvm.atomic.load_nand, llvm.atomic.load_or, and - llvm.atomic.load_xor on any integer bit width and for different - address spaces. Not all targets support all bit widths however.

      -
      -declare i8 @llvm.atomic.load.and.i8.p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.and.i16.p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.and.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.and.i64.p0i64( i64* <ptr>, i64 <delta> )
      +

      These are overloaded intrinsics. You can + use llvm.atomic.load.and, llvm.atomic.load.nand, + llvm.atomic.load.or, and llvm.atomic.load.xor on any integer + bit width and for different address spaces. Not all targets support all bit + widths, however.

      +
      +  declare i8 @llvm.atomic.load.and.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.and.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.and.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.and.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      -declare i8 @llvm.atomic.load.or.i8.p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.or.i16.p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.or.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.or.i64.p0i64( i64* <ptr>, i64 <delta> )
      -
      +  declare i8 @llvm.atomic.load.or.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.or.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.or.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.or.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      -declare i8 @llvm.atomic.load.nand.i8.p0i32( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.nand.i16.p0i32( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.nand.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.nand.i64.p0i32( i64* <ptr>, i64 <delta> )
      -
      +  declare i8 @llvm.atomic.load.nand.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.nand.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.nand.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.nand.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      -declare i8 @llvm.atomic.load.xor.i8.p0i32( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.xor.i16.p0i32( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.xor.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.xor.i64.p0i32( i64* <ptr>, i64 <delta> )
      -
      +  declare i8 @llvm.atomic.load.xor.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.xor.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.xor.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.xor.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      +
      Overview:
      -

      - These intrinsics bitwise the operation (and, nand, or, xor) delta to - the value stored in memory at ptr. It yields the original value - at ptr. -

      +

      These intrinsics apply the bitwise operation (and, nand, or, xor) with delta to + the value stored in memory at ptr. They yield the original value + at ptr.

      +
      Arguments:
      -

      +

      These intrinsics take two arguments, the first a pointer to an integer value + and the second an integer value. The result is also an integer value. These + integer types can have any bit width, but they must all have the same bit + width. The targets may only lower integer representations they support.

      - These intrinsics take two arguments, the first a pointer to an integer value - and the second an integer value. The result is also an integer value. These - integer types can have any bit width, but they must all have the same bit - width. The targets may only lower integer representations they support. -

      Semantics:
      -

      - These intrinsics does a series of operations atomically. They first load the - value stored at ptr. They then do the bitwise operation - delta, store the result to ptr. They yield the original - value stored at ptr. -

      +

      These intrinsics do a series of operations atomically. They first load the + value stored at ptr. They then apply the bitwise + operation with delta and store the result to ptr. They yield the + original value stored at ptr.
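The four bitwise variants differ only in the operation applied, which the Python sketch below makes explicit for 32-bit values. The names `OPS` and `load_bitwise` are hypothetical, atomicity is not modeled, and nand is assumed here to mean `~(old & delta)`; the text above does not spell that definition out.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, matching an i32 operand

# Assumption: nand is the complement of the bitwise and.
OPS = {
    "and": lambda old, delta: old & delta,
    "nand": lambda old, delta: ~(old & delta),
    "or": lambda old, delta: old | delta,
    "xor": lambda old, delta: old ^ delta,
}


def load_bitwise(memory, op, delta):
    """Apply `op` to memory[0] and `delta`, store the result, return the original."""
    old = memory[0]
    memory[0] = OPS[op](old, delta) & MASK32
    return old


memory = [0x0F0F]
result0 = load_bitwise(memory, "nand", 0xFF)  # returns 0x0F0F, memory is 0xFFFFFFF0
result1 = load_bitwise(memory, "and", 0xFF)   # returns 0xFFFFFFF0, memory is 0xF0
result2 = load_bitwise(memory, "or", 0x0F)    # returns 0xF0, memory is 0xFF
result3 = load_bitwise(memory, "xor", 0x0F)   # returns 0xFF, memory is 0xF0
print(hex(result0), hex(result1), hex(result2), hex(result3), hex(memory[0]))
```

The operand values for the or and xor steps are chosen for illustration; only the pattern of "return the old value, store the combined value" is taken from the text.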

      Examples:
      -%ptr      = malloc i32
      -        store i32 0x0F0F, %ptr
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
      +            store i32 0x0F0F, i32* %ptr
       %result0  = call i32 @llvm.atomic.load.nand.i32.p0i32( i32* %ptr, i32 0xFF )
                                       ; yields {i32}:result0 = 0x0F0F
       %result1  = call i32 @llvm.atomic.load.and.i32.p0i32( i32* %ptr, i32 0xFF )
      @@ -6821,8 +6960,8 @@ declare i64 @llvm.atomic.load.xor.i64.p0i32( i64* <ptr>, i64 <delta>
                                       ; yields {i32}:result3 = FF
       %memval1  = load i32* %ptr      ; yields {i32}:memval1 = F0
       
      -
      +
      @@ -6830,73 +6969,66 @@ declare i64 @llvm.atomic.load.xor.i64.p0i32( i64* <ptr>, i64 <delta> 'llvm.atomic.load.min.*' Intrinsic
      'llvm.atomic.load.umax.*' Intrinsic
      'llvm.atomic.load.umin.*' Intrinsic
      -
      +
      +
      Syntax:
      -

      - These are overloaded intrinsics. You can use llvm.atomic.load_max, - llvm.atomic.load_min, llvm.atomic.load_umax, and - llvm.atomic.load_umin on any integer bit width and for different - address spaces. Not all targets - support all bit widths however.

      -
      -declare i8 @llvm.atomic.load.max.i8.p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.max.i16.p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.max.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.max.i64.p0i64( i64* <ptr>, i64 <delta> )
      +

      These are overloaded intrinsics. You can use llvm.atomic.load.max, + llvm.atomic.load.min, llvm.atomic.load.umax, and + llvm.atomic.load.umin on any integer bit width and for different + address spaces. Not all targets support all bit widths, however.

      +
      +  declare i8 @llvm.atomic.load.max.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.max.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.max.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.max.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      -declare i8 @llvm.atomic.load.min.i8.p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.min.i16.p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.min.i32..p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.min.i64..p0i64( i64* <ptr>, i64 <delta> )
      -
      +  declare i8 @llvm.atomic.load.min.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.min.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.min.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.min.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      -declare i8 @llvm.atomic.load.umax.i8.p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.umax.i16.p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.umax.i32.p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.umax.i64.p0i64( i64* <ptr>, i64 <delta> )
      -
      +  declare i8 @llvm.atomic.load.umax.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.umax.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.umax.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.umax.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      -declare i8 @llvm.atomic.load.umin.i8..p0i8( i8* <ptr>, i8 <delta> )
      -declare i16 @llvm.atomic.load.umin.i16.p0i16( i16* <ptr>, i16 <delta> )
      -declare i32 @llvm.atomic.load.umin.i32..p0i32( i32* <ptr>, i32 <delta> )
      -declare i64 @llvm.atomic.load.umin.i64..p0i64( i64* <ptr>, i64 <delta> )
      -
      +  declare i8 @llvm.atomic.load.umin.i8.p0i8( i8* <ptr>, i8 <delta> )
      +  declare i16 @llvm.atomic.load.umin.i16.p0i16( i16* <ptr>, i16 <delta> )
      +  declare i32 @llvm.atomic.load.umin.i32.p0i32( i32* <ptr>, i32 <delta> )
      +  declare i64 @llvm.atomic.load.umin.i64.p0i64( i64* <ptr>, i64 <delta> )
       
      +
      Overview:
      -

      - These intrinsics takes the signed or unsigned minimum or maximum of - delta and the value stored in memory at ptr. It yields the - original value at ptr. -

      +

      These intrinsics take the signed or unsigned minimum or maximum of + delta and the value stored in memory at ptr. They yield the + original value at ptr.

      +
      Arguments:
      -

      +

      These intrinsics take two arguments, the first a pointer to an integer value + and the second an integer value. The result is also an integer value. These + integer types can have any bit width, but they must all have the same bit + width. The targets may only lower integer representations they support.

      - These intrinsics take two arguments, the first a pointer to an integer value - and the second an integer value. The result is also an integer value. These - integer types can have any bit width, but they must all have the same bit - width. The targets may only lower integer representations they support. -

      Semantics:
      -

      - These intrinsics does a series of operations atomically. They first load the - value stored at ptr. They then do the signed or unsigned min or max - delta and the value, store the result to ptr. They yield - the original value stored at ptr. -

      +

      These intrinsics do a series of operations atomically. They first load the + value stored at ptr. They then take the signed or unsigned min or + max of delta and the value, and store the result to ptr. They + yield the original value stored at ptr.
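The difference between the signed and unsigned variants only shows up when the sign bit is set. The Python sketch below models that for 32-bit values; the helper names `to_signed` and `load_minmax` are hypothetical and atomicity is not modeled.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret the low 32 bits of x as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def load_minmax(memory, op, delta):
    """op is one of 'min', 'max', 'umin', 'umax'.

    Stores the chosen value in memory[0] (as a 32-bit pattern) and returns
    the original value, shown in its signed interpretation.
    """
    old = memory[0]
    if op in ("min", "max"):
        a, b = to_signed(old), to_signed(delta)    # signed comparison
    else:
        a, b = old & MASK32, delta & MASK32        # unsigned comparison
    pick = min if op.endswith("min") else max
    memory[0] = pick(a, b) & MASK32
    return to_signed(old)


memory = [7]
result0 = load_minmax(memory, "min", -2)   # returns 7, memory holds -2
result1 = load_minmax(memory, "max", 8)    # returns -2, memory holds 8
result2 = load_minmax(memory, "umin", 10)  # returns 8, memory holds 8
result3 = load_minmax(memory, "umax", 30)  # returns 8, memory holds 30

# Same operands, different interpretation: as unsigned, -2 is 4294967294.
other = [7]
load_minmax(other, "umin", -2)             # 7 is the smaller unsigned value
print(result0, result1, result2, result3, memory[0], other[0])
```

With the signed `min`, the same pair of operands would have stored -2 instead of 7, which is the only behavioral difference between the two families.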

      Examples:
      -%ptr      = malloc i32
      -        store i32 7, %ptr
      +%mallocP  = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
      +%ptr      = bitcast i8* %mallocP to i32*
      +            store i32 7, i32* %ptr
       %result0  = call i32 @llvm.atomic.load.min.i32.p0i32( i32* %ptr, i32 -2 )
                                       ; yields {i32}:result0 = 7
       %result1  = call i32 @llvm.atomic.load.max.i32.p0i32( i32* %ptr, i32 8 )
      @@ -6907,6 +7039,134 @@ declare i64 @llvm.atomic.load.umin.i64..p0i64( i64* <ptr>, i64 <delta&g
                                       ; yields {i32}:result3 = 8
       %memval1  = load i32* %ptr      ; yields {i32}:memval1 = 30
       
      + +
      + + + + + +
      + +

      This class of intrinsics exists to provide information about the lifetime of memory + objects and ranges where variables are immutable.

      + +
      + + + + +
      + +
      Syntax:
      +
      +  declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
      +
      + +
      Overview:
      +

      The 'llvm.lifetime.start' intrinsic specifies the start of a memory + object's lifetime.

      + +
      Arguments:
      +

      The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.

      + +
      Semantics:
      +

      This intrinsic indicates that before this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. A load from the pointer that + precedes this intrinsic can be replaced with + 'undef'.

      + +
      + + + + +
      + +
      Syntax:
      +
      +  declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
      +
      + +
      Overview:
      +

      The 'llvm.lifetime.end' intrinsic specifies the end of a memory + object's lifetime.

      + +
      Arguments:
      +

      The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.

      + +
      Semantics:
      +

      This intrinsic indicates that after this point in the code, the value of the + memory pointed to by ptr is dead. This means that it is known to + never be used and has an undefined value. Any stores into the memory object + following this intrinsic may be removed as dead. + +

      + + + + +
      + +
      Syntax:
      +
      +  declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>) readonly
      +
      + +
      Overview:
      +

      The 'llvm.invariant.start' intrinsic specifies that the contents of + a memory object will not change.

      + +
      Arguments:
      +

      The first argument is a constant integer representing the size of the + object, or -1 if it is variable sized. The second argument is a pointer to + the object.

      + +
      Semantics:
      +

      This intrinsic indicates that until an llvm.invariant.end that uses + the return value, the referenced memory location is constant and + unchanging.

      + +
      + + + + +
      + +
      Syntax:
      +
      +  declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>)
      +
      + +
      Overview:
      +

      The 'llvm.invariant.end' intrinsic specifies that the contents of + a memory object are mutable.

      + +
      Arguments:
      +

      The first argument is the matching llvm.invariant.start intrinsic. + The second argument is a constant integer representing the size of the + object, or -1 if it is variable sized, and the third argument is a pointer + to the object.

      + +
      Semantics:
      +

      This intrinsic indicates that the memory is mutable again.
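As a rough sketch of how llvm.invariant.start and llvm.invariant.end pair up, the fragment below marks four bytes as invariant, reads them twice, and then ends the invariant region. The names @example, %p, %inv, %a and %b are hypothetical, and the syntax follows the typed-pointer form used in this revision of the manual.

```llvm
declare {}* @llvm.invariant.start(i64, i8* nocapture) readonly
declare void @llvm.invariant.end({}*, i64, i8* nocapture)

define i32 @example(i32* %p) {
entry:
  %ptr = bitcast i32* %p to i8*
  %inv = call {}* @llvm.invariant.start(i64 4, i8* %ptr)
  %a = load i32* %p
  %b = load i32* %p          ; may be assumed to equal %a
  ; The end marker names its region through the token %inv.
  call void @llvm.invariant.end({}* %inv, i64 4, i8* %ptr)
  store i32 0, i32* %p       ; the memory is mutable again
  %sum = add i32 %a, %b
  ret i32 %sum
}
```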

      +
      @@ -6915,8 +7175,10 @@ declare i64 @llvm.atomic.load.umin.i64..p0i64( i64* <ptr>, i64 <delta&g
      -

      This class of intrinsics is designed to be generic and has -no specific purpose.

      + +

      This class of intrinsics is designed to be generic and has no specific + purpose.

      +
      @@ -6932,27 +7194,19 @@ no specific purpose.

      Overview:
      - -

      -The 'llvm.var.annotation' intrinsic -

      +

      The 'llvm.var.annotation' intrinsic.

      Arguments:
      - -

      -The first argument is a pointer to a value, the second is a pointer to a -global string, the third is a pointer to a global string which is the source -file name, and the last argument is the line number. -

      +

      The first argument is a pointer to a value, the second is a pointer to a + global string, the third is a pointer to a global string which is the source + file name, and the last argument is the line number.

      Semantics:
      +

      This intrinsic allows annotation of local variables with arbitrary strings. + This can be useful for special purpose optimizations that want to look for + these annotations. These have no other defined use; they are ignored by code + generation and optimization.

      -

      -This intrinsic allows annotation of local variables with arbitrary strings. -This can be useful for special purpose optimizations that want to look for these -annotations. These have no other defined use, they are ignored by code -generation and optimization. -

      @@ -6963,9 +7217,9 @@ generation and optimization.
      Syntax:
      -

      This is an overloaded intrinsic. You can use 'llvm.annotation' on -any integer bit width. -

      +

      This is an overloaded intrinsic. You can use 'llvm.annotation' on + any integer bit width.

      +
         declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int> )
         declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int> )
      @@ -6975,28 +7229,20 @@ any integer bit width.
       
      Overview:
      - -

      -The 'llvm.annotation' intrinsic. -

      +

      The 'llvm.annotation' intrinsic.

      Arguments:
      - -

      -The first argument is an integer value (result of some expression), -the second is a pointer to a global string, the third is a pointer to a global -string which is the source file name, and the last argument is the line number. -It returns the value of the first argument. -

      +

      The first argument is an integer value (result of some expression), the + second is a pointer to a global string, the third is a pointer to a global + string which is the source file name, and the last argument is the line + number. It returns the value of the first argument.

      Semantics:
      +

      This intrinsic allows annotations to be put on arbitrary expressions with + arbitrary strings. This can be useful for special purpose optimizations that + want to look for these annotations. These have no other defined use; they + are ignored by code generation and optimization.
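A minimal sketch of an annotated expression might look like the following. It assumes an i32 overload named llvm.annotation.i32 that follows the same pattern as the i8 and i16 declarations. The globals @.note and @.file, the function @tagged, and the line number 12 are hypothetical.

```llvm
@.note = private constant [4 x i8] c"hot\00"      ; annotation string
@.file = private constant [7 x i8] c"demo.c\00"   ; source file name

declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

define i32 @tagged(i32 %x) {
entry:
  %note = getelementptr [4 x i8]* @.note, i32 0, i32 0
  %file = getelementptr [7 x i8]* @.file, i32 0, i32 0
  ; The call returns its first argument unchanged, so %r equals %x.
  %r = call i32 @llvm.annotation.i32(i32 %x, i8* %note, i8* %file, i32 12)
  ret i32 %r
}
```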

      -

      -This intrinsic allows annotations to be put on arbitrary expressions -with arbitrary strings. This can be useful for special purpose optimizations -that want to look for these annotations. These have no other defined use, they -are ignored by code generation and optimization. -

      @@ -7012,58 +7258,86 @@ are ignored by code generation and optimization.
      Overview:
      - -

      -The 'llvm.trap' intrinsic -

      +

      The 'llvm.trap' intrinsic.

      Arguments:
      - -

      -None -

      +

      None.

      Semantics:
      +

      This intrinsic is lowered to the target dependent trap instruction. If the + target does not have a trap instruction, this intrinsic will be lowered to + a call to the abort() function.

      -

      -This intrinsics is lowered to the target dependent trap instruction. If the -target does not have a trap instruction, this intrinsic will be lowered to the -call of the abort() function. -

      +
      +
      Syntax:
      -declare void @llvm.stackprotector( i8* <guard>, i8** <slot> )
      +  declare void @llvm.stackprotector( i8* <guard>, i8** <slot> )
      +
      + +
      Overview:
      +

      The llvm.stackprotector intrinsic takes the guard and + stores it onto the stack at slot. The stack slot is adjusted to + ensure that it is placed on the stack before local variables.

      + +
      Arguments:
      +

      The llvm.stackprotector intrinsic requires two pointer + arguments. The first argument is the value loaded from the stack + guard @__stack_chk_guard. The second argument is an alloca + that has enough space to hold the value of the guard.

      + +
      Semantics:
      +

      This intrinsic causes the prologue/epilogue inserter to force the position of + the AllocaInst stack slot to be before local variables on the + stack. This is to ensure that if a local variable on the stack is + overwritten, it will destroy the value of the guard. When the function exits, + the guard on the stack is checked against the original guard. If they're + different, then the program aborts by calling the __stack_chk_fail() + function.
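The arguments described above can be sketched as follows. This is only an illustration of the shape of the call, assuming the guard is declared as an external global of type i8*. The names @protected, %slot and %guard are hypothetical. The check against the original guard on function exit is not shown.

```llvm
@__stack_chk_guard = external global i8*

declare void @llvm.stackprotector(i8*, i8**)

define void @protected() ssp {
entry:
  %slot = alloca i8*                      ; room for the guard value
  %guard = load i8** @__stack_chk_guard   ; value loaded from the stack guard
  call void @llvm.stackprotector(i8* %guard, i8** %slot)
  ret void
}
```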

      +
      + + + + +
      + +
      Syntax:
      +
      +  declare i32 @llvm.objectsize.i32( i8* <object>, i1 <type> )
      +  declare i64 @llvm.objectsize.i64( i8* <object>, i1 <type> )
       
      +
      Overview:
      -

      - The llvm.stackprotector intrinsic takes the guard and stores - it onto the stack at slot. The stack slot is adjusted to ensure that - it is placed on the stack before local variables. -

      +

      The llvm.objectsize intrinsic is designed to provide information + to the optimizers so that they can determine at compile time either a) whether an + operation like memcpy will overflow a buffer that corresponds to + an object, or b) that a runtime check for overflow isn't + necessary. An object in this context means an allocation of a + specific class, structure, array, or other object.

      +
      Arguments:
      -

      - The llvm.stackprotector intrinsic requires two pointer arguments. The - first argument is the value loaded from the stack guard - @__stack_chk_guard. The second variable is an alloca that - has enough space to hold the value of the guard. -

      +

      The llvm.objectsize intrinsic takes two arguments. The first + argument is a pointer to or into the object. The second argument + is a boolean 0 or 1. This argument determines whether you want the + maximum (0) or minimum (1) bytes remaining. This needs to be a literal 0 or + 1; variables are not allowed.

      +
      Semantics:
      -

      - This intrinsic causes the prologue/epilogue inserter to force the position of - the AllocaInst stack slot to be before local variables on the - stack. This is to ensure that if a local variable on the stack is overwritten, - it will destroy the value of the guard. When the function exits, the guard on - the stack is checked against the original guard. If they're different, then - the program aborts by calling the __stack_chk_fail() function. -

      +

      The llvm.objectsize intrinsic is lowered to either a constant + representing the size of the object concerned or i32/i64 -1 or 0 + (depending on the type argument) if the size cannot be determined + at compile time.
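As a small sketch of a call, the fragment below asks for the maximum number of bytes remaining from a pointer into a 16-byte stack array. The names @remaining, %buf, %p and %max are hypothetical. What the call is lowered to depends on what the compiler can determine, so no particular result is claimed here.

```llvm
declare i32 @llvm.objectsize.i32(i8*, i1)

define i32 @remaining() {
entry:
  %buf = alloca [16 x i8]
  ; %p points 4 bytes into the object.
  %p = getelementptr [16 x i8]* %buf, i32 0, i32 4
  ; The second argument must be a literal: 0 requests the maximum.
  %max = call i32 @llvm.objectsize.i32(i8* %p, i1 0)
  ret i32 %max
}
```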

      +