A model can also be specified in a alias, but then it only governs how
the alias is accessed. It will not have any effect in the aliasee.
+For platforms without linker support of ELF TLS model, the -femulated-tls
+flag can be used to generate GCC compatible emulated TLS code.
+
.. _namedtypes:
Structure Types
COMDAT groups and COMDATs, at the object file level, are represented by
sections.
-Note that certain IR constructs like global variables and functions may create
-COMDATs in the object file in addition to any which are specified using COMDAT
-IR. This arises, for example, when a global variable has linkonce_odr linkage.
+Note that certain IR constructs like global variables and functions may
+create COMDATs in the object file in addition to any which are specified using
+COMDAT IR. This arises when the code generator is configured to emit globals
+in individual sections (e.g. when `-data-sections` or `-function-sections`
+is supplied to `llc`).
.. _namedmetadatastructure:
On an argument, this attribute indicates that the function does not write
through this pointer argument, even though it may write to the memory that
the pointer points to.
+``argmemonly``
+ This attribute indicates that the only memory accesses inside function are
+ loads and stores from objects pointed to by its pointer-typed arguments,
+ with arbitrary offsets. Or in other words, all memory operations in the
+ function can refer to memory only using pointers based on its function
+ arguments.
+ Note that ``argmemonly`` can be used together with ``readonly`` attribute
+ in order to specify that function reads only from its arguments.
``returns_twice``
This attribute indicates that this function can return twice. The C
``setjmp`` is an example of such a function. The compiler disables
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.
-The inline asm code is simply printed to the machine code .s file when
-assembly code is generated.
+Note that the assembly string *must* be parseable by LLVM's integrated assembler
+(unless it is disabled), even when emitting a ``.s`` file.
.. _langref_datalayout:
LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
-:ref:`frem <i_frem>`) have the following flags that can be set to enable
-otherwise unsafe floating point operations
+:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) have the following flags that can
+be set to enable otherwise unsafe floating point operations
``nnan``
No NaNs - Allow optimizations to assume the arguments and result are not
----------------------------
LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
-Inline Assembly <moduleasm>`) through the use of a special value. This
-value represents the inline assembler as a string (containing the
-instructions to emit), a list of operand constraints (stored as a
-string), a flag that indicates whether or not the inline asm expression
-has side effects, and a flag indicating whether the function containing
-the asm needs to align its stack conservatively. An example inline
-assembler expression is:
+Inline Assembly <moduleasm>`) through the use of a special value. This value
+represents the inline assembler as a template string (containing the
+instructions to emit), a list of operand constraints (stored as a string), a
+flag that indicates whether or not the inline asm expression has side effects,
+and a flag indicating whether the function containing the asm needs to align its
+stack conservatively.
+
+The template string supports argument substitution of the operands using "``$``"
+followed by a number, to indicate substitution of the given register/memory
+location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
+be used, where ``MODIFIER`` is a target-specific annotation for how to print the
+operand (See :ref:`inline-asm-modifiers`).
+
+A literal "``$``" may be included by using "``$$``" in the template. To include
+other special characters into the output, the usual "``\XX``" escapes may be
+used, just as in other strings. Note that after template substitution, the
+resulting assembly string is parsed by LLVM's integrated assembler unless it is
+disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
+syntax known to LLVM.
+
+LLVM's support for inline asm is modeled closely on the requirements of Clang's
+GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
+modifier codes listed here are similar or identical to those in GCC's inline asm
+support. However, to be clear, the syntax of the template and constraint strings
+described here is *not* the same as the syntax accepted by GCC and Clang, and,
+while most constraint letters are passed through as-is by Clang, some get
+translated to other codes when converting from the C source to the LLVM
+assembly.
+
+An example inline assembler expression is:
.. code-block:: llvm
first, the '``alignstack``' keyword second and the '``inteldialect``'
keyword last.
+Inline Asm Constraint String
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The constraint list is a comma-separated string, each element containing one or
+more constraint codes.
+
+For each element in the constraint list an appropriate register or memory
+operand will be chosen, and it will be made available to assembly template
+string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
+second, etc.
+
+There are three different types of constraints, which are distinguished by a
+prefix symbol in front of the constraint code: Output, Input, and Clobber. The
+constraints must always be given in that order: outputs first, then inputs, then
+clobbers. They cannot be intermingled.
+
+There are also three different categories of constraint codes:
+
+- Register constraint. This is either a register class, or a fixed physical
+ register. This kind of constraint will allocate a register, and if necessary,
+ bitcast the argument or result to the appropriate type.
+- Memory constraint. This kind of constraint is for use with an instruction
+ taking a memory operand. Different constraints allow for different addressing
+ modes used by the target.
+- Immediate value constraint. This kind of constraint is for an integer or other
+ immediate value which can be rendered directly into an instruction. The
+ various target-specific constraints allow the selection of a value in the
+ proper range for the instruction you wish to use it with.
+
+Output constraints
+""""""""""""""""""
+
+Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
+indicates that the assembly will write to this operand, and the operand will
+then be made available as a return value of the ``asm`` expression. Output
+constraints do not consume an argument from the call instruction. (Except, see
+below about indirect outputs).
+
+Normally, it is expected that no output locations are written to by the assembly
+expression until *all* of the inputs have been read. As such, LLVM may assign
+the same register to an output and an input. If this is not safe (e.g. if the
+assembly contains two instructions, where the first writes to one output, and
+the second reads an input and writes to a second output), then the "``&``"
+modifier must be used (e.g. "``=&r``") to specify that the output is an
+"early-clobber" output. Marking an ouput as "early-clobber" ensures that LLVM
+will not use the same register for any inputs (other than an input tied to this
+output).
+
+Input constraints
+"""""""""""""""""
+
+Input constraints do not have a prefix -- just the constraint codes. Each input
+constraint will consume one argument from the call instruction. It is not
+permitted for the asm to write to any input register or memory location (unless
+that input is tied to an output). Note also that multiple inputs may all be
+assigned to the same register, if LLVM can determine that they necessarily all
+contain the same value.
+
+Instead of providing a Constraint Code, input constraints may also "tie"
+themselves to an output constraint, by providing an integer as the constraint
+string. Tied inputs still consume an argument from the call instruction, and
+take up a position in the asm template numbering as is usual -- they will simply
+be constrained to always use the same register as the output they've been tied
+to. For example, a constraint string of "``=r,0``" says to assign a register for
+output, and use that register as an input as well (it being the 0'th
+constraint).
+
+It is permitted to tie an input to an "early-clobber" output. In that case, no
+*other* input may share the same register as the input tied to the early-clobber
+(even when the other input has the same value).
+
+You may only tie an input to an output which has a register constraint, not a
+memory constraint. Only a single input may be tied to an output.
+
+There is also an "interesting" feature which deserves a bit of explanation: if a
+register class constraint allocates a register which is too small for the value
+type operand provided as input, the input value will be split into multiple
+registers, and all of them passed to the inline asm.
+
+However, this feature is often not as useful as you might think.
+
+Firstly, the registers are *not* guaranteed to be consecutive. So, on those
+architectures that have instructions which operate on multiple consecutive
+instructions, this is not an appropriate way to support them. (e.g. the 32-bit
+SparcV8 has a 64-bit load, which instruction takes a single 32-bit register. The
+hardware then loads into both the named register, and the next register. This
+feature of inline asm would not be useful to support that.)
+
+A few of the targets provide a template string modifier allowing explicit access
+to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
+``D``). On such an architecture, you can actually access the second allocated
+register (yet, still, not any subsequent ones). But, in that case, you're still
+probably better off simply splitting the value into two separate operands, for
+clarity. (e.g. see the description of the ``A`` constraint on X86, which,
+despite existing only for use with this feature, is not really a good idea to
+use)
+
+Indirect inputs and outputs
+"""""""""""""""""""""""""""
+
+Indirect output or input constraints can be specified by the "``*``" modifier
+(which goes after the "``=``" in case of an output). This indicates that the asm
+will write to or read from the contents of an *address* provided as an input
+argument. (Note that in this way, indirect outputs act more like an *input* than
+an output: just like an input, they consume an argument of the call expression,
+rather than producing a return value. An indirect output constraint is an
+"output" only in that the asm is expected to write to the contents of the input
+memory location, instead of just read from it).
+
+This is most typically used for memory constraint, e.g. "``=*m``", to pass the
+address of a variable as a value.
+
+It is also possible to use an indirect *register* constraint, but only on output
+(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
+value normally, and then, separately emit a store to the address provided as
+input, after the provided inline asm. (It's not clear what value this
+functionality provides, compared to writing the store explicitly after the asm
+statement, and it can only produce worse code, since it bypasses many
+optimization passes. I would recommend not using it.)
+
+
+Clobber constraints
+"""""""""""""""""""
+
+A clobber constraint is indicated by a "``~``" prefix. A clobber does not
+consume an input operand, nor generate an output. Clobbers cannot use any of the
+general constraint code letters -- they may use only explicit register
+constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
+"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
+memory locations -- not only the memory pointed to by a declared indirect
+output.
+
+
+Constraint Codes
+""""""""""""""""
+After a potential prefix comes constraint code, or codes.
+
+A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
+followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
+(e.g. "``{eax}``").
+
+The one and two letter constraint codes are typically chosen to be the same as
+GCC's constraint codes.
+
+A single constraint may include one or more than constraint code in it, leaving
+it up to LLVM to choose which one to use. This is included mainly for
+compatibility with the translation of GCC inline asm coming from clang.
+
+There are two ways to specify alternatives, and either or both may be used in an
+inline asm constraint list:
+
+1) Append the codes to each other, making a constraint code set. E.g. "``im``"
+ or "``{eax}m``". This means "choose any of the options in the set". The
+ choice of constraint is made independently for each constraint in the
+ constraint list.
+
+2) Use "``|``" between constraint code sets, creating alternatives. Every
+ constraint in the constraint list must have the same number of alternative
+ sets. With this syntax, the same alternative in *all* of the items in the
+ constraint list will be chosen together.
+
+Putting those together, you might have a two operand constraint string like
+``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
+operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
+may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.
+
+However, the use of either of the alternatives features is *NOT* recommended, as
+LLVM is not able to make an intelligent choice about which one to use. (At the
+point it currently needs to choose, not enough information is available to do so
+in a smart way.) Thus, it simply tries to make a choice that's most likely to
+compile, not one that will be optimal performance. (e.g., given "``rm``", it'll
+always choose to use memory, not registers). And, if given multiple registers,
+or multiple register classes, it will simply choose the first one. (In fact, it
+doesn't currently even ensure explicitly specified physical registers are
+unique, so specifying multiple physical registers as alternatives, like
+``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
+intended.)
+
+Supported Constraint Code List
+""""""""""""""""""""""""""""""
+
+The constraint codes are, in general, expected to behave the same way they do in
+GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
+inline asm code which was supported by GCC. A mismatch in behavior between LLVM
+and GCC likely indicates a bug in LLVM.
+
+Some constraint codes are typically supported by all targets:
+
+- ``r``: A register in the target's general purpose register class.
+- ``m``: A memory address operand. It is target-specific what addressing modes
+ are supported, typical examples are register, or register + register offset,
+ or register + immediate offset (of some target-specific size).
+- ``i``: An integer constant (of target-specific width). Allows either a simple
+ immediate, or a relocatable value.
+- ``n``: An integer constant -- *not* including relocatable values.
+- ``s``: An integer constant, but allowing *only* relocatable values.
+- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
+ useful to pass a label for an asm branch or call.
+
+ .. FIXME: but that surely isn't actually okay to jump out of an asm
+ block without telling llvm about the control transfer???)
+
+- ``{register-name}``: Requires exactly the named physical register.
+
+Other constraints are target-specific:
+
+AArch64:
+
+- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
+- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
+ i.e. 0 to 4095 with optional shift by 12.
+- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
+ ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
+- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
+ logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
+- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
+ logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
+- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
+ 32-bit register. This is a superset of ``K``: in addition to the bitmask
+ immediate, also allows immediate integers which can be loaded with a single
+ ``MOVZ`` or ``MOVL`` instruction.
+- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
+ 64-bit register. This is a superset of ``L``.
+- ``Q``: Memory address operand must be in a single register (no
+ offsets). (However, LLVM currently does this for the ``m`` constraint as
+ well.)
+- ``r``: A 32 or 64-bit integer register (W* or X*).
+- ``w``: A 32, 64, or 128-bit floating-point/SIMD register.
+- ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``).
+
+AMDGPU:
+
+- ``r``: A 32 or 64-bit integer register.
+- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
+- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
+
+
+All ARM modes:
+
+- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
+ operand. Treated the same as operand ``m``, at the moment.
+
+ARM and ARM's Thumb2 mode:
+
+- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
+- ``I``: An immediate integer valid for a data-processing instruction.
+- ``J``: An immediate integer between -4095 and 4095.
+- ``K``: An immediate integer whose bitwise inverse is valid for a
+ data-processing instruction. (Can be used with template modifier "``B``" to
+ print the inverted value).
+- ``L``: An immediate integer whose negation is valid for a data-processing
+ instruction. (Can be used with template modifier "``n``" to print the negated
+ value).
+- ``M``: A power of two or a integer between 0 and 32.
+- ``N``: Invalid immediate constraint.
+- ``O``: Invalid immediate constraint.
+- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
+- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
+ as ``r``.
+- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
+ invalid.
+- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
+ ``d0-d31``, or ``q0-q15``.
+- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
+ ``d0-d7``, or ``q0-q3``.
+- ``t``: A floating-point/SIMD register, only supports 32-bit values:
+ ``s0-s31``.
+
+ARM's Thumb1 mode:
+
+- ``I``: An immediate integer between 0 and 255.
+- ``J``: An immediate integer between -255 and -1.
+- ``K``: An immediate integer between 0 and 255, with optional left-shift by
+ some amount.
+- ``L``: An immediate integer between -7 and 7.
+- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
+- ``N``: An immediate integer between 0 and 31.
+- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
+- ``r``: A low 32-bit GPR register (``r0-r7``).
+- ``l``: A low 32-bit GPR register (``r0-r7``).
+- ``h``: A high GPR register (``r0-r7``).
+- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
+ ``d0-d31``, or ``q0-q15``.
+- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
+ ``d0-d7``, or ``q0-q3``.
+- ``t``: A floating-point/SIMD register, only supports 32-bit values:
+ ``s0-s31``.
+
+
+Hexagon:
+
+- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
+ at the moment.
+- ``r``: A 32 or 64-bit register.
+
+MSP430:
+
+- ``r``: An 8 or 16-bit register.
+
+MIPS:
+
+- ``I``: An immediate signed 16-bit integer.
+- ``J``: An immediate integer zero.
+- ``K``: An immediate unsigned 16-bit integer.
+- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
+- ``N``: An immediate integer between -65535 and -1.
+- ``O``: An immediate signed 15-bit integer.
+- ``P``: An immediate integer between 1 and 65535.
+- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
+ register plus 16-bit immediate offset. In MIPS mode, just a base register.
+- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
+ register plus a 9-bit signed offset. In MIPS mode, the same as constraint
+ ``m``.
+- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
+ ``sc`` instruction on the given subtarget (details vary).
+- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
+- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
+ (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
+ argument modifier for compatibility with GCC.
+- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
+ ``25``).
+- ``l``: The ``lo`` register, 32 or 64-bit.
+- ``x``: Invalid.
+
+NVPTX:
+
+- ``b``: A 1-bit integer register.
+- ``c`` or ``h``: A 16-bit integer register.
+- ``r``: A 32-bit integer register.
+- ``l`` or ``N``: A 64-bit integer register.
+- ``f``: A 32-bit float register.
+- ``d``: A 64-bit float register.
+
+
+PowerPC:
+
+- ``I``: An immediate signed 16-bit integer.
+- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
+- ``K``: An immediate unsigned 16-bit integer.
+- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
+- ``M``: An immediate integer greater than 31.
+- ``N``: An immediate integer that is an exact power of 2.
+- ``O``: The immediate integer constant 0.
+- ``P``: An immediate integer constant whose negation is a signed 16-bit
+ constant.
+- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
+ treated the same as ``m``.
+- ``r``: A 32 or 64-bit integer register.
+- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
+ ``R1-R31``).
+- ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a
+ 128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers).
+- ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a
+ 128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit
+ altivec vector register (``V0-V31``).
+
+ .. FIXME: is this a bug that v accepts QPX registers? I think this
+ is supposed to only use the altivec vector registers?
+
+- ``y``: Condition register (``CR0-CR7``).
+- ``wc``: An individual CR bit in a CR register.
+- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
+ register set (overlapping both the floating-point and vector register files).
+- ``ws``: A 32 or 64-bit floating point register, from the full VSX register
+ set.
+
+Sparc:
+
+- ``I``: An immediate 13-bit signed integer.
+- ``r``: A 32-bit integer register.
+
+SystemZ:
+
+- ``I``: An immediate unsigned 8-bit integer.
+- ``J``: An immediate unsigned 12-bit integer.
+- ``K``: An immediate signed 16-bit integer.
+- ``L``: An immediate signed 20-bit integer.
+- ``M``: An immediate integer 0x7fffffff.
+- ``Q``, ``R``, ``S``, ``T``: A memory address operand, treated the same as
+ ``m``, at the moment.
+- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
+- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
+ address context evaluates as zero).
+- ``h``: A 32-bit value in the high part of a 64bit data register
+ (LLVM-specific)
+- ``f``: A 32, 64, or 128-bit floating point register.
+
+X86:
+
+- ``I``: An immediate integer between 0 and 31.
+- ``J``: An immediate integer between 0 and 64.
+- ``K``: An immediate signed 8-bit integer.
+- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
+ 0xffffffff.
+- ``M``: An immediate integer between 0 and 3.
+- ``N``: An immediate unsigned 8-bit integer.
+- ``O``: An immediate integer between 0 and 127.
+- ``e``: An immediate 32-bit signed integer.
+- ``Z``: An immediate 32-bit unsigned integer.
+- ``o``, ``v``: Treated the same as ``m``, at the moment.
+- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
+ ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
+ registers, and on X86-64, it is all of the integer registers.
+- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
+ ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
+- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
+- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
+ existed since i386, and can be accessed without the REX prefix.
+- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
+- ``y``: A 64-bit MMX register, if MMX is enabled.
+- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
+ operand in a SSE register. If AVX is also enabled, can also be a 256-bit
+ vector operand in an AVX register. If AVX-512 is also enabled, can also be a
+ 512-bit vector operand in an AVX512 register, Otherwise, an error.
+- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
+- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
+ 32-bit mode, a 64-bit integer operand will get split into two registers). It
+ is not recommended to use this constraint, as in 64-bit mode, the 64-bit
+ operand will get allocated only to RAX -- if two 32-bit operands are needed,
+ you're better off splitting it yourself, before passing it to the asm
+ statement.
+
+XCore:
+
+- ``r``: A 32-bit integer register.
+
+
+.. _inline-asm-modifiers:
+
+Asm template argument modifiers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the asm template string, modifiers can be used on the operand reference, like
+"``${0:n}``".
+
+The modifiers are, in general, expected to behave the same way they do in
+GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
+inline asm code which was supported by GCC. A mismatch in behavior between LLVM
+and GCC likely indicates a bug in LLVM.
+
+Target-independent:
+
+- ``c``: Print an immediate integer constant unadorned, without
+ the target-specific immediate punctuation (e.g. no ``$`` prefix).
+- ``n``: Negate and print immediate integer constant unadorned, without the
+ target-specific immediate punctuation (e.g. no ``$`` prefix).
+- ``l``: Print as an unadorned label, without the target-specific label
+ punctuation (e.g. no ``$`` prefix).
+
+AArch64:
+
+- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
+ instead of ``x30``, print ``w30``.
+- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
+- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
+ ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
+ ``v*``.
+
+AMDGPU:
+
+- ``r``: No effect.
+
+ARM:
+
+- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
+ register).
+- ``P``: No effect.
+- ``q``: No effect.
+- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
+ as ``d4[1]`` instead of ``s9``)
+- ``B``: Bitwise invert and print an immediate integer constant without ``#``
+ prefix.
+- ``L``: Print the low 16-bits of an immediate integer constant.
+- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
+ register operands subsequent to the specified one (!), so use carefully.
+- ``Q``: Print the low-order register of a register-pair, or the low-order
+ register of a two-register operand.
+- ``R``: Print the high-order register of a register-pair, or the high-order
+ register of a two-register operand.
+- ``H``: Print the second register of a register-pair. (On a big-endian system,
+ ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
+ to ``R``.)
+
+ .. FIXME: H doesn't currently support printing the second register
+ of a two-register operand.
+
+- ``e``: Print the low doubleword register of a NEON quad register.
+- ``f``: Print the high doubleword register of a NEON quad register.
+- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
+ adornment.
+
+Hexagon:
+
+- ``L``: Print the second register of a two-register operand. Requires that it
+ has been allocated consecutively to the first.
+
+ .. FIXME: why is it restricted to consecutive ones? And there's
+ nothing that ensures that happens, is there?
+
+- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
+ nothing. Used to print 'addi' vs 'add' instructions.
+
+MSP430:
+
+No additional modifiers.
+
+MIPS:
+
+- ``X``: Print an immediate integer as hexadecimal
+- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
+- ``d``: Print an immediate integer as decimal.
+- ``m``: Subtract one and print an immediate integer as decimal.
+- ``z``: Print $0 if an immediate zero, otherwise print normally.
+- ``L``: Print the low-order register of a two-register operand, or prints the
+ address of the low-order word of a double-word memory operand.
+
+ .. FIXME: L seems to be missing memory operand support.
+
+- ``M``: Print the high-order register of a two-register operand, or prints the
+ address of the high-order word of a double-word memory operand.
+
+ .. FIXME: M seems to be missing memory operand support.
+
+- ``D``: Print the second register of a two-register operand, or prints the
+ second word of a double-word memory operand. (On a big-endian system, ``D`` is
+ equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
+ ``M``.)
+- ``w``: No effect. Provided for compatibility with GCC which requires this
+ modifier in order to print MSA registers (``W0-W31``) with the ``f``
+ constraint.
+
+NVPTX:
+
+- ``r``: No effect.
+
+PowerPC:
+
+- ``L``: Print the second register of a two-register operand. Requires that it
+ has been allocated consecutively to the first.
+
+ .. FIXME: why is it restricted to consecutive ones? And there's
+ nothing that ensures that happens, is there?
+
+- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
+ nothing. Used to print 'addi' vs 'add' instructions.
+- ``y``: For a memory operand, prints formatter for a two-register X-form
+ instruction. (Currently always prints ``r0,OPERAND``).
+- ``U``: Prints 'u' if the memory operand is an update form, and nothing
+ otherwise. (NOTE: LLVM does not support update form, so this will currently
+ always print nothing)
+- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
+ not support indexed form, so this will currently always print nothing)
+
+Sparc:
+
+- ``r``: No effect.
+
+SystemZ:
+
+SystemZ implements only ``n``, and does *not* support any of the other
+target-independent modifiers.
+
+X86:
+
+- ``c``: Print an unadorned integer or symbol name. (The latter is
+ target-specific behavior for this typically target-independent modifier).
+- ``A``: Print a register name with a '``*``' before it.
+- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
+ operand.
+- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
+ memory operand.
+- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
+ operand.
+- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
+ operand.
+- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
+ available, otherwise the 32-bit register name; do nothing on a memory operand.
+- ``n``: Negate and print an unadorned integer, or, for operands other than an
+ immediate integer (e.g. a relocatable symbol expression), print a '-' before
+ the operand. (The behavior for relocatable symbol expressions is a
+ target-specific behavior for this typically target-independent modifier)
+- ``H``: Print a memory reference with additional offset +8.
+- ``P``: Print a memory reference or operand for use as the argument of a call
+ instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
+
+XCore:
+
+No additional modifiers.
+
+
Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^
.. code-block:: llvm
- !0 = !DILocalVariable(tag: DW_TAG_arg_variable, name: "this", arg: 0,
+ !0 = !DILocalVariable(tag: DW_TAG_arg_variable, name: "this", arg: 1,
scope: !3, file: !2, line: 7, type: !3,
flags: DIFlagArtificial)
- !1 = !DILocalVariable(tag: DW_TAG_arg_variable, name: "x", arg: 1,
+ !1 = !DILocalVariable(tag: DW_TAG_arg_variable, name: "x", arg: 2,
scope: !4, file: !2, line: 7, type: !3)
- !1 = !DILocalVariable(tag: DW_TAG_auto_variable, name: "y",
+ !2 = !DILocalVariable(tag: DW_TAG_auto_variable, name: "y",
scope: !5, file: !2, line: 7, type: !3)
DIExpression
'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-This metadata either disables loop unrolling. The metadata has a single operand
+This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:
.. code-block:: llvm
'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-This metadata either disables runtime loop unrolling. The metadata has a single
+This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
.. code-block:: llvm
'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-This metadata either suggests that the loop should be unrolled fully. The
-metadata has a single operand which is the string ``llvm.loop.unroll.disable``.
+This metadata suggests that the loop should be unrolled fully. The
+metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:
.. code-block:: llvm
The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
-':ref:`resume <i_resume>`', and ':ref:`unreachable <i_unreachable>`'.
+':ref:`resume <i_resume>`', ':ref:`catchpad <i_catchpad>`',
+':ref:`catchendpad <i_catchendpad>`',
+':ref:`catchret <i_catchret>`',
+':ref:`cleanupret <i_cleanupret>`',
+':ref:`terminatepad <i_terminatepad>`',
+and ':ref:`unreachable <i_unreachable>`'.
.. _i_ret:
resume { i8*, i32 } %exn
+.. _i_catchpad:
+
+'``catchpad``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ <resultval> = catchpad <resultty> [<args>*]
+ to label <normal label> unwind label <exception label>
+
+Overview:
+"""""""""
+
+The '``catchpad``' instruction is used by `LLVM's exception handling
+system <ExceptionHandling.html#overview>`_ to specify that a basic block
+is a catch block --- one where a personality routine attempts to transfer
+control to catch an exception.
+The ``args`` correspond to whatever information the personality
+routine requires to know if this is an appropriate place to catch the
+exception. Control is tranfered to the ``exception`` label if the
+``catchpad`` is not an appropriate handler for the in-flight exception.
+The ``normal`` label should contain the code found in the ``catch``
+portion of a ``try``/``catch`` sequence. It defines values supplied by
+the :ref:`personality function <personalityfn>` upon re-entry to the
+function. The ``resultval`` has the type ``resultty``.
+
+Arguments:
+""""""""""
+
+The instruction takes a list of arbitrary values which are interpreted
+by the :ref:`personality function <personalityfn>`.
+
+The ``catchpad`` must be provided a ``normal`` label to transfer control
+to if the ``catchpad`` matches the exception and an ``exception``
+label to transfer control to if it doesn't.
+
+Semantics:
+""""""""""
+
+The '``catchpad``' instruction defines the values which are set by the
+:ref:`personality function <personalityfn>` upon re-entry to the function, and
+therefore the "result type" of the ``catchpad`` instruction. As with
+calling conventions, how the personality function results are
+represented in LLVM IR is target specific.
+
+When the call stack is being unwound due to an exception being thrown,
+the exception is compared against the ``args``. If it doesn't match,
+then control is transfered to the ``exception`` basic block.
+
+The ``catchpad`` instruction has several restrictions:
+
+- A catch block is a basic block which is the unwind destination of
+ an exceptional instruction.
+- A catch block must have a '``catchpad``' instruction as its
+ first non-PHI instruction.
+- A catch block's ``exception`` edge must refer to a catch block or a
+ catch-end block.
+- There can be only one '``catchpad``' instruction within the
+ catch block.
+- A basic block that is not a catch block may not include a
+ '``catchpad``' instruction.
+- It is undefined behavior for control to transfer from a ``catchpad`` to a
+ ``cleanupret`` without first executing a ``catchret`` and a subsequent
+ ``cleanuppad``.
+- It is undefined behavior for control to transfer from a ``catchpad`` to a
+ ``ret`` without first executing a ``catchret``.
+
+Example:
+""""""""
+
+.. code-block:: llvm
+
+ ;; A catch block which can catch an integer.
+ %res = catchpad { i8*, i32 } [i8** @_ZTIi]
+ to label %int.handler unwind label %terminate
+
+.. _i_catchendpad:
+
+'``catchendpad``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ catchendpad unwind label <nextaction>
+ catchendpad unwind to caller
+
+Overview:
+"""""""""
+
+The '``catchendpad``' instruction is used by `LLVM's exception handling
+system <ExceptionHandling.html#overview>`_ to communicate to the
+:ref:`personality function <personalityfn>` which invokes are associated
+with a chain of :ref:`catchpad <i_catchpad>` instructions.
+
+The ``nextaction`` label indicates where control should transfer to if
+none of the ``catchpad`` instructions are suitable for catching the
+in-flight exception.
+
+If a ``nextaction`` label is not present, the instruction unwinds out of
+its parent function. The
+:ref:`personality function <personalityfn>` will continue processing
+exception handling actions in the caller.
+
+Arguments:
+""""""""""
+
+The instruction optionally takes a label, ``nextaction``, indicating
+where control should transfer to if none of the preceding
+``catchpad`` instructions are suitable for the in-flight exception.
+
+Semantics:
+""""""""""
+
+When the call stack is being unwound due to an exception being thrown
+and none of the constituent ``catchpad`` instructions match, then
+control is transfered to ``nextaction`` if it is present. If it is not
+present, control is transfered to the caller.
+
+The ``catchendpad`` instruction has several restrictions:
+
+- A catch-end block is a basic block which is the unwind destination of
+ an exceptional instruction.
+- A catch-end block must have a '``catchendpad``' instruction as its
+ first non-PHI instruction.
+- There can be only one '``catchendpad``' instruction within the
+ catch block.
+- A basic block that is not a catch-end block may not include a
+ '``catchendpad``' instruction.
+- Exactly one catch block may unwind to a ``catchendpad``.
+- The unwind target of invokes between a ``catchpad`` and a
+ corresponding ``catchret`` must be its ``catchendpad``.
+
+Example:
+""""""""
+
+.. code-block:: llvm
+
+ catchendpad unwind label %terminate
+ catchendpad unwind to caller
+
+.. _i_catchret:
+
+'``catchret``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ catchret label <normal>
+
+Overview:
+"""""""""
+
+The '``catchret``' instruction is a terminator instruction that has a
+single successor.
+
+
+Arguments:
+""""""""""
+
+The '``catchret``' instruction requires one argument which specifies
+where control will transfer to next.
+
+Semantics:
+""""""""""
+
+The '``catchret``' instruction ends the existing (in-flight) exception
+whose unwinding was interrupted with a
+:ref:`catchpad <i_catchpad>` instruction.
+The :ref:`personality function <personalityfn>` gets a chance to execute
+arbitrary code to, for example, run a C++ destructor.
+Control then transfers to ``normal``.
+
+Example:
+""""""""
+
+.. code-block:: llvm
+
+ catchret label %continue
+
+.. _i_cleanupret:
+
+'``cleanupret``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ cleanupret <type> <value> unwind label <continue>
+ cleanupret <type> <value> unwind to caller
+
+Overview:
+"""""""""
+
+The '``cleanupret``' instruction is a terminator instruction that has
+an optional successor.
+
+
+Arguments:
+""""""""""
+
+The '``cleanupret``' instruction requires one argument, which must have the
+same type as the result of any '``cleanuppad``' instruction in the same
+function. It also has an optional successor, ``continue``.
+
+Semantics:
+""""""""""
+
+The '``cleanupret``' instruction indicates to the
+:ref:`personality function <personalityfn>` that one
+:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
+It transfers control to ``continue`` or unwinds out of the function.
+
+Example:
+""""""""
+
+.. code-block:: llvm
+
+ cleanupret void unwind to caller
+ cleanupret { i8*, i32 } %exn unwind label %continue
+
+.. _i_terminatepad:
+
+'``terminatepad``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ terminatepad [<args>*] unwind label <exception label>
+ terminatepad [<args>*] unwind to caller
+
+Overview:
+"""""""""
+
+The '``terminatepad``' instruction is used by `LLVM's exception handling
+system <ExceptionHandling.html#overview>`_ to specify that a basic block
+is a terminate block --- one where a personality routine may decide to
+terminate the program.
+The ``args`` correspond to whatever information the personality
+routine requires to know if this is an appropriate place to terminate the
+program. Control is transferred to the ``exception`` label if the
+personality routine decides not to terminate the program for the
+in-flight exception.
+
+Arguments:
+""""""""""
+
+The instruction takes a list of arbitrary values which are interpreted
+by the :ref:`personality function <personalityfn>`.
+
+The ``terminatepad`` may be given an ``exception`` label to
+transfer control to if the in-flight exception matches the ``args``.
+
+Semantics:
+""""""""""
+
+When the call stack is being unwound due to an exception being thrown,
+the exception is compared against the ``args``. If it matches,
+then control is transfered to the ``exception`` basic block. Otherwise,
+the program is terminated via personality-specific means. Typically,
+the first argument to ``terminatepad`` specifies what function the
+personality should defer to in order to terminate the program.
+
+The ``terminatepad`` instruction has several restrictions:
+
+- A terminate block is a basic block which is the unwind destination of
+ an exceptional instruction.
+- A terminate block must have a '``terminatepad``' instruction as its
+ first non-PHI instruction.
+- There can be only one '``terminatepad``' instruction within the
+ terminate block.
+- A basic block that is not a terminate block may not include a
+ '``terminatepad``' instruction.
+
+Example:
+""""""""
+
+.. code-block:: llvm
+
+ ;; A terminate block which only permits integers.
+ terminatepad [i8** @_ZTIi] unwind label %continue
+
.. _i_unreachable:
'``unreachable``' Instruction
The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
-address calculation only and does not access memory.
+address calculation only and does not access memory. The instruction can also
+be used to calculate a vector of such addresses.
Arguments:
""""""""""
; yields i32*:iptr
%iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0
-In cases where the pointer argument is a vector of pointers, each index
-must be a vector with the same number of elements. For example:
+Vector of pointers:
+"""""""""""""""""""
+
+The ``getelementptr`` returns a vector of pointers, instead of a single address,
+when one or more of its arguments is a vector. In such cases, all vector
+arguments should have the same number of elements, and every scalar argument
+will be effectively broadcast into a vector during address calculation.
+
+.. code-block:: llvm
+
+ ; All arguments are vectors:
+ ; A[i] = ptrs[i] + offsets[i]*sizeof(i8)
+ %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
+
+ ; Add the same scalar offset to each pointer of a vector:
+ ; A[i] = ptrs[i] + offset*sizeof(i8)
+ %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
+
+ ; Add distinct offsets to the same pointer:
+ ; A[i] = ptr + offsets[i]*sizeof(i8)
+ %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
+
+ ; In all cases described above the type of the result is <4 x i8*>
+
+The two following instructions are equivalent:
+
+.. code-block:: llvm
+
+ getelementptr %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
+ <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
+ <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
+ <4 x i32> %ind4,
+ <4 x i64> <i64 13, i64 13, i64 13, i64 13>
+
+ getelementptr %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
+ i32 2, i32 1, <4 x i32> %ind4, i64 13
+
+Let's look at the C code, where the vector version of ``getelementptr``
+makes sense:
+
+.. code-block:: c
+
+ // Let's assume that we vectorize the following loop:
+ double *A, B; int *C;
+ for (int i = 0; i < size; ++i) {
+ A[i] = B[C[i]];
+ }
.. code-block:: llvm
- %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets,
+ ; get pointers for 8 elements from array B
+ %ptrs = getelementptr double, double* %B, <8 x i32> %C
+ ; load 8 elements from array B into A
+ %A = call <8 x double> @llvm.masked.gather.v8f64(<8 x double*> %ptrs,
+ i32 8, <8 x i1> %mask, <8 x double> %passthru)
Conversion Operations
---------------------
::
- <result> = fcmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
+ <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
Overview:
"""""""""
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.
+The ``fcmp`` instruction can also optionally take any number of
+:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
+otherwise unsafe floating point optimizations.
+
+Any set of fast-math flags are legal on an ``fcmp`` instruction, but the
+only flags that have any effect on its semantics are those that allow
+assumptions to be made about the values of input arguments; namely
+``nnan``, ``ninf``, and ``nsz``. See :ref:`fastmath` for more information.
+
Example:
""""""""
catch i8** @_ZTIi
filter [1 x i8**] [@_ZTId]
+.. _i_cleanuppad:
+
+'``cleanuppad``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ <resultval> = cleanuppad <resultty> [<args>*]
+
+Overview:
+"""""""""
+
+The '``cleanuppad``' instruction is used by `LLVM's exception handling
+system <ExceptionHandling.html#overview>`_ to specify that a basic block
+is a cleanup block --- one where a personality routine attempts to
+transfer control to run cleanup actions.
+The ``args`` correspond to whatever additional
+information the :ref:`personality function <personalityfn>` requires to
+execute the cleanup.
+The ``resultval`` has the type ``resultty``.
+
+Arguments:
+""""""""""
+
+The instruction takes a list of arbitrary values which are interpreted
+by the :ref:`personality function <personalityfn>`.
+
+Semantics:
+""""""""""
+
+The '``cleanuppad``' instruction defines the values which are set by the
+:ref:`personality function <personalityfn>` upon re-entry to the function, and
+therefore the "result type" of the ``cleanuppad`` instruction. As with
+calling conventions, how the personality function results are
+represented in LLVM IR is target specific.
+
+When the call stack is being unwound due to an exception being thrown,
+the :ref:`personality function <personalityfn>` transfers control to the
+``cleanuppad`` with the aid of the personality-specific arguments.
+
+The ``cleanuppad`` instruction has several restrictions:
+
+- A cleanup block is a basic block which is the unwind destination of
+ an exceptional instruction.
+- A cleanup block must have a '``cleanuppad``' instruction as its
+ first non-PHI instruction.
+- There can be only one '``cleanuppad``' instruction within the
+ cleanup block.
+- A basic block that is not a cleanup block may not include a
+ '``cleanuppad``' instruction.
+- It is undefined behavior for control to transfer from a ``cleanuppad`` to a
+ ``catchret`` without first executing a ``cleanupret`` and a subsequent
+ ``catchpad``.
+- It is undefined behavior for control to transfer from a ``cleanuppad`` to a
+ ``ret`` without first executing a ``cleanupret``.
+
+Example:
+""""""""
+
+.. code-block:: llvm
+
+ %res = cleanuppad { i8*, i32 } [label %nextaction]
+
.. _intrinsics:
Intrinsic Functions
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
-'``llvm.frameescape``' and '``llvm.framerecover``' Intrinsics
+'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Syntax:
::
- declare void @llvm.frameescape(...)
- declare i8* @llvm.framerecover(i8* %func, i8* %fp, i32 %idx)
+ declare void @llvm.localescape(...)
+ declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
Overview:
"""""""""
-The '``llvm.frameescape``' intrinsic escapes offsets of a collection of static
-allocas, and the '``llvm.framerecover``' intrinsic applies those offsets to a
+The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
+allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
-computed during frame layout of the caller of ``llvm.frameescape``.
+computed during frame layout of the caller of ``llvm.localescape``.
Arguments:
""""""""""
-All arguments to '``llvm.frameescape``' must be pointers to static allocas or
-casts of static allocas. Each function can only call '``llvm.frameescape``'
+All arguments to '``llvm.localescape``' must be pointers to static allocas or
+casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.
-The ``func`` argument to '``llvm.framerecover``' must be a constant
+The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.
-The ``fp`` argument to '``llvm.framerecover``' must be a frame
-pointer of a call frame that is currently live. The return value of
-'``llvm.frameaddress``' is one way to produce such a value, but most platforms
-also expose the frame pointer through stack unwinding mechanisms.
+The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
+call frame that is currently live. The return value of '``llvm.localaddress``'
+is one way to produce such a value, but various runtimes also expose a suitable
+pointer in platform-specific ways.
-The ``idx`` argument to '``llvm.framerecover``' indicates which alloca passed to
-'``llvm.frameescape``' to recover. It is zero-indexed.
+The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
+'``llvm.localescape``' to recover. It is zero-indexed.
Semantics:
""""""""""
-These intrinsics allow a group of functions to access one stack memory
-allocation in an ancestor stack frame. The memory returned from
-'``llvm.frameallocate``' may be allocated prior to stack realignment, so the
-memory is only aligned to the ABI-required stack alignment. Each function may
-only call '``llvm.frameallocate``' one or zero times from the function entry
-block. The frame allocation intrinsic inhibits inlining, as any frame
-allocations in the inlined function frame are likely to be at a different
-offset from the one used by '``llvm.framerecover``' called with the
-uninlined function.
+These intrinsics allow a group of functions to share access to a set of local
+stack allocations of a one parent function. The parent function may call the
+'``llvm.localescape``' intrinsic once from the function entry block, and the
+child functions can use '``llvm.localrecover``' to access the escaped allocas.
+The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
+the escaped allocas are allocated, which would break attempts to use
+'``llvm.localrecover``'.
.. _int_read_register:
.. _int_write_register:
Specialised Arithmetic Intrinsics
---------------------------------
+'``llvm.canonicalize.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ declare float @llvm.canonicalize.f32(float %a)
+ declare double @llvm.canonicalize.f64(double %b)
+
+Overview:
+"""""""""
+
+The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
+encoding of a floating point number. This canonicalization is useful for
+implementing certain numeric primitives such as frexp. The canonical encoding is
+defined by IEEE-754-2008 to be:
+
+::
+
+ 2.1.8 canonical encoding: The preferred encoding of a floating-point
+ representation in a format. Applied to declets, significands of finite
+ numbers, infinities, and NaNs, especially in decimal formats.
+
+This operation can also be considered equivalent to the IEEE-754-2008
+conversion of a floating-point value to the same format. NaNs are handled
+according to section 6.2.
+
+Examples of non-canonical encodings:
+
+- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
+ converted to a canonical representation per hardware-specific protocol.
+- Many normal decimal floating point numbers have non-canonical alternative
+ encodings.
+- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
+ These are treated as non-canonical encodings of zero and with be flushed to
+ a zero of the same sign by this operation.
+
+Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
+default exception handling must signal an invalid exception, and produce a
+quiet NaN result.
+
+This function should always be implementable as multiplication by 1.0, provided
+that the compiler does not constant fold the operation. Likewise, division by
+1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
+-0.0 is also sufficient provided that the rounding mode is not -Infinity.
+
+``@llvm.canonicalize`` must preserve the equality relation. That is:
+
+- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
+- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
+ to ``(x == y)``
+
+Additionally, the sign of zero must be conserved:
+``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
+
+The payload bits of a NaN must be conserved, with two exceptions.
+First, environments which use only a single canonical representation of NaN
+must perform said canonicalization. Second, SNaNs must be quieted per the
+usual methods.
+
+The canonicalization operation may be optimized away if:
+
+- The input is known to be canonical. For example, it was produced by a
+ floating-point operation that is required by the standard to be canonical.
+- The result is consumed only by (or fused with) other floating-point
+ operations. That is, the bits of the floating point value are not examined.
+
'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
%r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
+
+'``llvm.uabsdiff.*``' and '``llvm.sabsdiff.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic. The loaded data is a vector of any integer bit width.
+
+.. code-block:: llvm
+
+ declare <4 x integer> @llvm.uabsdiff.v4i32(<4 x integer> %a, <4 x integer> %b)
+
+
+Overview:
+"""""""""
+
+The ``llvm.uabsdiff`` intrinsic returns a vector result of the absolute difference of the two operands,
+treating them both as unsigned integers.
+
+The ``llvm.sabsdiff`` intrinsic returns a vector result of the absolute difference of the two operands,
+treating them both as signed integers.
+
+.. note::
+
+ These intrinsics are primarily used during the code generation stage of compilation.
+ They are generated by compiler passes such as the Loop and SLP vectorizers.it is not
+ recommended for users to create them manually.
+
+Arguments:
+""""""""""
+
+Both intrinsics take two integer of the same bitwidth.
+
+Semantics:
+""""""""""
+
+The expression::
+
+ call <4 x i32> @llvm.uabsdiff.v4i32(<4 x i32> %a, <4 x i32> %b)
+
+is equivalent to::
+
+ %sub = sub <4 x i32> %a, %b
+ %ispos = icmp ugt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1>
+ %neg = sub <4 x i32> zeroinitializer, %sub
+ %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
+
+Similarly the expression::
+
+ call <4 x i32> @llvm.sabsdiff.v4i32(<4 x i32> %a, <4 x i32> %b)
+
+is equivalent to::
+
+ %sub = sub nsw <4 x i32> %a, %b
+ %ispos = icmp sgt <4 x i32> %sub, <i32 -1, i32 -1, i32 -1, i32 -1>
+ %neg = sub nsw <4 x i32> zeroinitializer, %sub
+ %1 = select <4 x i1> %ispos, <4 x i32> %sub, <4 x i32> %neg
+
+
Half Precision Floating Point Intrinsics
----------------------------------------