.. code-block:: llvm
- %0 = add i32 %X, %X ; yields {i32}:%0
- %1 = add i32 %0, %0 ; yields {i32}:%1
+ %0 = add i32 %X, %X ; yields i32:%0
+ %1 = add i32 %0, %0 ; yields i32:%1
%result = add i32 %1, %1
This last way of multiplying ``%X`` by 8 illustrates several important
A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
-.. _namedtypes:
+.. _dllstorageclass:
DLL Storage Classes
-------------------
exists for defining a dll interface, the compiler, assembler and linker know
it is externally referenced and must refrain from deleting the symbol.
+.. _tls_model:
+
+Thread Local Storage Models
+---------------------------
+
+A variable may be defined as ``thread_local``, which means that it will
+not be shared by threads (each thread will have a separated copy of the
+variable). Not all targets support thread-local variables. Optionally, a
+TLS model may be specified:
+
+``localdynamic``
+ For variables that are only used within the current shared library.
+``initialexec``
+ For variables in modules that will not be loaded dynamically.
+``localexec``
+ For variables defined in the executable and only used within it.
+
+If no explicit model is given, the "general dynamic" model is used.
+
+The models correspond to the ELF TLS models; see `ELF Handling For
+Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
+more information on under which circumstances the different models may
+be used. The target may choose a different TLS model if the specified
+model is not supported, or if a better choice of model can be made.
+
+A model can also be specified in a alias, but then it only governs how
+the alias is accessed. It will not have any effect in the aliasee.
+
+.. _namedtypes:
+
Structure Types
---------------
Global variables define regions of memory allocated at compilation time
instead of run-time.
-Global variables definitions must be initialized, may have an explicit section
-to be placed in, and may have an optional explicit alignment specified.
+Global variables definitions must be initialized.
Global variables in other translation units can also be declared, in which
case they don't have an initializer.
-A variable may be defined as ``thread_local``, which means that it will
-not be shared by threads (each thread will have a separated copy of the
-variable). Not all targets support thread-local variables. Optionally, a
-TLS model may be specified:
-
-``localdynamic``
- For variables that are only used within the current shared library.
-``initialexec``
- For variables in modules that will not be loaded dynamically.
-``localexec``
- For variables defined in the executable and only used within it.
-
-The models correspond to the ELF TLS models; see `ELF Handling For
-Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
-more information on under which circumstances the different models may
-be used. The target may choose a different TLS model if the specified
-model is not supported, or if a better choice of model can be made.
+Either global variable definitions or declarations may have an explicit section
+to be placed in and may have an optional explicit alignment specified.
A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
Globals can also have a :ref:`DLL storage class <dllstorageclass>`.
+Variables and aliasaes can have a
+:ref:`Thread Local Storage Model <tls_model>`.
+
Syntax::
[@<GlobalVarName> =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal]
- [AddrSpace] [unnamed_addr] [ExternallyInitialized]
- <global | constant> <Type>
+ [unnamed_addr] [AddrSpace] [ExternallyInitialized]
+ <global | constant> <Type> [<InitializerConstant>]
[, section "name"] [, align <Alignment>]
For example, the following defines a global in a numbered address space
Aliases
-------
-Aliases act as "second name" for the aliasee value (which can be either
-function, global variable, another alias or bitcast of global value).
+Aliases, unlike function or variables, don't create any new data. They
+are just a new symbol and metadata for an existing position.
+
+Aliases have a name and an aliasee that is either a global value or a
+constant expression.
+
Aliases may have an optional :ref:`linkage type <linkage>`, an optional
-:ref:`visibility style <visibility>`, and an optional :ref:`DLL storage class
-<dllstorageclass>`.
+:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
+<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
Syntax::
- @<Name> = [Visibility] [DLLStorageClass] alias [Linkage] <AliaseeTy> @<Aliasee>
+ @<Name> = [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias [Linkage] <AliaseeTy> @<Aliasee>
The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
-might not correctly handle dropping a weak symbol that is aliased by a non-weak
-alias.
+might not correctly handle dropping a weak symbol that is aliased.
Alias that are not ``unnamed_addr`` are guaranteed to have the same address as
-the aliasee.
+the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
+to the same content.
+
+Since aliases are only a second name, some restrictions apply, of which
+some can only be checked when producing an object file:
+
+* The expression defining the aliasee must be computable at assembly
+ time. Since it is just a name, no relocations can be used.
-The aliasee must be a definition.
+* No alias in the expression can be weak as the possibility of the
+ intermediate alias being overridden cannot be represented in an
+ object file.
-Aliases are not allowed to point to aliases with linkages that can be
-overridden. Since they are only a second name, the possibility of the
-intermediate alias being overridden cannot be represented in an object file.
+* No global value in the expression can be a declaration, since that
+ would require a relocation, which is not possible.
.. _namedmetadatastructure:
inlining this function is desirable (such as the "inline" keyword in
C/C++). It is just a hint; it imposes no requirements on the
inliner.
+``jumptable``
+ This attribute indicates that the function should be added to a
+ jump-instruction table at code-generation time, and that all address-taken
+ references to this function should be replaced with a reference to the
+ appropriate jump-instruction-table function pointer. Note that this creates
+ a new pointer for the original function, which means that code that depends
+ on function-pointer identity can break. So, any function annotated with
+ ``jumptable`` must also be ``unnamed_addr``.
``minsize``
This attribute suggests that optimization passes and code generator
passes make choices that keep the code size of this function as small
'``llvm.mem.parallel_loop_access``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-For a loop to be parallel, in addition to using
-the ``llvm.loop`` metadata to mark the loop latch branch instruction,
-also all of the memory accessing instructions in the loop body need to be
-marked with the ``llvm.mem.parallel_loop_access`` metadata. If there
-is at least one memory accessing instruction not marked with the metadata,
-the loop must be considered a sequential loop. This causes parallel loops to be
-converted to sequential loops due to optimization passes that are unaware of
-the parallel semantics and that insert new memory instructions to the loop
-body.
+The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
+or metadata containing a list of loop identifiers for nested loops.
+The metadata is attached to memory accessing instructions and denotes that
+no loop carried memory dependence exist between it and other instructions denoted
+with the same loop identifier.
+
+Precisely, given two instructions ``m1`` and ``m2`` that both have the
+``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
+set of loops associated with that metadata, respectively, then there is no loop
+carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
+``L2``.
+
+As a special case, if all memory accessing instructions in a loop have
+``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
+loop has no loop carried memory dependences and is considered to be a parallel
+loop.
+
+Note that if not all memory access instructions have such metadata referring to
+the loop, then the loop is considered not being trivially parallel. Additional
+memory dependence analysis is required to make that determination. As a fail
+safe mechanism, this causes loops that were originally parallel to be considered
+sequential (if optimization passes that are unaware of the parallel semantics
+insert new memory instructions into the loop body).
Example of a loop that is considered parallel due to its correct use of
both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in descending
-order of priority (i.e. highest first) when the module is loaded. The
+order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.
If the third field is present, non-null, and points to a global variable
.. code-block:: llvm
%retval = invoke i32 @Test(i32 15) to label %Continue
- unwind label %TestCleanup ; {i32}:retval set
+ unwind label %TestCleanup ; i32:retval set
%retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
- unwind label %TestCleanup ; {i32}:retval set
+ unwind label %TestCleanup ; i32:retval set
.. _i_resume:
::
- <result> = add <ty> <op1>, <op2> ; yields {ty}:result
- <result> = add nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = add nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = add nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = add <ty> <op1>, <op2> ; yields ty:result
+ <result> = add nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = add nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = add nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = add i32 4, %var ; yields {i32}:result = 4 + %var
+ <result> = add i32 4, %var ; yields i32:result = 4 + %var
.. _i_fadd:
::
- <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = fadd float 4.0, %var ; yields {float}:result = 4.0 + %var
+ <result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var
'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^
::
- <result> = sub <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sub nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sub nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sub nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = sub <ty> <op1>, <op2> ; yields ty:result
+ <result> = sub nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = sub nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = sub nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = sub i32 4, %var ; yields {i32}:result = 4 - %var
- <result> = sub i32 0, %val ; yields {i32}:result = -%var
+ <result> = sub i32 4, %var ; yields i32:result = 4 - %var
+ <result> = sub i32 0, %val ; yields i32:result = -%var
.. _i_fsub:
::
- <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = fsub float 4.0, %var ; yields {float}:result = 4.0 - %var
- <result> = fsub float -0.0, %val ; yields {float}:result = -%var
+ <result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
+ <result> = fsub float -0.0, %val ; yields float:result = -%var
'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^
::
- <result> = mul <ty> <op1>, <op2> ; yields {ty}:result
- <result> = mul nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = mul nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = mul nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = mul <ty> <op1>, <op2> ; yields ty:result
+ <result> = mul nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = mul nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = mul nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = mul i32 4, %var ; yields {i32}:result = 4 * %var
+ <result> = mul i32 4, %var ; yields i32:result = 4 * %var
.. _i_fmul:
::
- <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = fmul float 4.0, %var ; yields {float}:result = 4.0 * %var
+ <result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var
'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
::
- <result> = udiv <ty> <op1>, <op2> ; yields {ty}:result
- <result> = udiv exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = udiv <ty> <op1>, <op2> ; yields ty:result
+ <result> = udiv exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = udiv i32 4, %var ; yields {i32}:result = 4 / %var
+ <result> = udiv i32 4, %var ; yields i32:result = 4 / %var
'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
::
- <result> = sdiv <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sdiv exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = sdiv <ty> <op1>, <op2> ; yields ty:result
+ <result> = sdiv exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = sdiv i32 4, %var ; yields {i32}:result = 4 / %var
+ <result> = sdiv i32 4, %var ; yields i32:result = 4 / %var
.. _i_fdiv:
::
- <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = fdiv float 4.0, %var ; yields {float}:result = 4.0 / %var
+ <result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var
'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
::
- <result> = urem <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = urem <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = urem i32 4, %var ; yields {i32}:result = 4 % %var
+ <result> = urem i32 4, %var ; yields i32:result = 4 % %var
'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
::
- <result> = srem <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = srem <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = srem i32 4, %var ; yields {i32}:result = 4 % %var
+ <result> = srem i32 4, %var ; yields i32:result = 4 % %var
.. _i_frem:
::
- <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = frem float 4.0, %var ; yields {float}:result = 4.0 % %var
+ <result> = frem float 4.0, %var ; yields float:result = 4.0 % %var
.. _bitwiseops:
::
- <result> = shl <ty> <op1>, <op2> ; yields {ty}:result
- <result> = shl nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = shl nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = shl nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = shl <ty> <op1>, <op2> ; yields ty:result
+ <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = shl i32 4, %var ; yields {i32}: 4 << %var
- <result> = shl i32 4, 2 ; yields {i32}: 16
- <result> = shl i32 1, 10 ; yields {i32}: 1024
+ <result> = shl i32 4, %var ; yields i32: 4 << %var
+ <result> = shl i32 4, 2 ; yields i32: 16
+ <result> = shl i32 1, 10 ; yields i32: 1024
<result> = shl i32 1, 32 ; undefined
<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>
::
- <result> = lshr <ty> <op1>, <op2> ; yields {ty}:result
- <result> = lshr exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = lshr <ty> <op1>, <op2> ; yields ty:result
+ <result> = lshr exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = lshr i32 4, 1 ; yields {i32}:result = 2
- <result> = lshr i32 4, 2 ; yields {i32}:result = 1
- <result> = lshr i8 4, 3 ; yields {i8}:result = 0
- <result> = lshr i8 -2, 1 ; yields {i8}:result = 0x7F
+ <result> = lshr i32 4, 1 ; yields i32:result = 2
+ <result> = lshr i32 4, 2 ; yields i32:result = 1
+ <result> = lshr i8 4, 3 ; yields i8:result = 0
+ <result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
<result> = lshr i32 1, 32 ; undefined
<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
::
- <result> = ashr <ty> <op1>, <op2> ; yields {ty}:result
- <result> = ashr exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = ashr <ty> <op1>, <op2> ; yields ty:result
+ <result> = ashr exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = ashr i32 4, 1 ; yields {i32}:result = 2
- <result> = ashr i32 4, 2 ; yields {i32}:result = 1
- <result> = ashr i8 4, 3 ; yields {i8}:result = 0
- <result> = ashr i8 -2, 1 ; yields {i8}:result = -1
+ <result> = ashr i32 4, 1 ; yields i32:result = 2
+ <result> = ashr i32 4, 2 ; yields i32:result = 1
+ <result> = ashr i8 4, 3 ; yields i8:result = 0
+ <result> = ashr i8 -2, 1 ; yields i8:result = -1
<result> = ashr i32 1, 32 ; undefined
<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>
::
- <result> = and <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = and <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = and i32 4, %var ; yields {i32}:result = 4 & %var
- <result> = and i32 15, 40 ; yields {i32}:result = 8
- <result> = and i32 4, 8 ; yields {i32}:result = 0
+ <result> = and i32 4, %var ; yields i32:result = 4 & %var
+ <result> = and i32 15, 40 ; yields i32:result = 8
+ <result> = and i32 4, 8 ; yields i32:result = 0
'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^
::
- <result> = or <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = or <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
::
- <result> = or i32 4, %var ; yields {i32}:result = 4 | %var
- <result> = or i32 15, 40 ; yields {i32}:result = 47
- <result> = or i32 4, 8 ; yields {i32}:result = 12
+ <result> = or i32 4, %var ; yields i32:result = 4 | %var
+ <result> = or i32 15, 40 ; yields i32:result = 47
+ <result> = or i32 4, 8 ; yields i32:result = 12
'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^
::
- <result> = xor <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = xor <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
.. code-block:: llvm
- <result> = xor i32 4, %var ; yields {i32}:result = 4 ^ %var
- <result> = xor i32 15, 40 ; yields {i32}:result = 39
- <result> = xor i32 4, 8 ; yields {i32}:result = 12
- <result> = xor i32 %V, -1 ; yields {i32}:result = ~%V
+ <result> = xor i32 4, %var ; yields i32:result = 4 ^ %var
+ <result> = xor i32 15, 40 ; yields i32:result = 39
+ <result> = xor i32 4, 8 ; yields i32:result = 12
+ <result> = xor i32 %V, -1 ; yields i32:result = ~%V
Vector Operations
-----------------
::
- <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] ; yields {type*}:result
+ <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] ; yields type*:result
Overview:
"""""""""
.. code-block:: llvm
- %ptr = alloca i32 ; yields {i32*}:ptr
- %ptr = alloca i32, i32 4 ; yields {i32*}:ptr
- %ptr = alloca i32, i32 4, align 1024 ; yields {i32*}:ptr
- %ptr = alloca i32, align 1024 ; yields {i32*}:ptr
+ %ptr = alloca i32 ; yields i32*:ptr
+ %ptr = alloca i32, i32 4 ; yields i32*:ptr
+ %ptr = alloca i32, i32 4, align 1024 ; yields i32*:ptr
+ %ptr = alloca i32, align 1024 ; yields i32*:ptr
.. _i_load:
.. code-block:: llvm
- %ptr = alloca i32 ; yields {i32*}:ptr
- store i32 3, i32* %ptr ; yields {void}
- %val = load i32* %ptr ; yields {i32}:val = i32 3
+ %ptr = alloca i32 ; yields i32*:ptr
+ store i32 3, i32* %ptr ; yields void
+ %val = load i32* %ptr ; yields i32:val = i32 3
.. _i_store:
::
- store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields {void}
- store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields {void}
+ store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields void
+ store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields void
Overview:
"""""""""
.. code-block:: llvm
- %ptr = alloca i32 ; yields {i32*}:ptr
- store i32 3, i32* %ptr ; yields {void}
- %val = load i32* %ptr ; yields {i32}:val = i32 3
+ %ptr = alloca i32 ; yields i32*:ptr
+ store i32 3, i32* %ptr ; yields void
+ %val = load i32* %ptr ; yields i32:val = i32 3
.. _i_fence:
::
- fence [singlethread] <ordering> ; yields {void}
+ fence [singlethread] <ordering> ; yields void
Overview:
"""""""""
.. code-block:: llvm
- fence acquire ; yields {void}
- fence singlethread seq_cst ; yields {void}
+ fence acquire ; yields void
+ fence singlethread seq_cst ; yields void
.. _i_cmpxchg:
::
- cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields {ty}
+ cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields { ty, i1 }
Overview:
"""""""""
The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
-equal, it stores a new value into the memory.
+equal, it tries to store a new value into the memory.
Arguments:
""""""""""
other :ref:`volatile operations <volatile>`.
The success and failure :ref:`ordering <ordering>` arguments specify how this
-``cmpxchg`` synchronizes with other atomic operations. The both ordering
-parameters must be at least ``monotonic``, the ordering constraint on failure
-must be no stronger than that on success, and the failure ordering cannot be
-either ``release`` or ``acq_rel``.
+``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
+must be at least ``monotonic``, the ordering constraint on failure must be no
+stronger than that on success, and the failure ordering cannot be either
+``release`` or ``acq_rel``.
The optional "``singlethread``" argument declares that the ``cmpxchg``
is only atomic with respect to code (usually signal handlers) running in
Semantics:
""""""""""
-The contents of memory at the location specified by the '``<pointer>``'
-operand is read and compared to '``<cmp>``'; if the read value is the
-equal, '``<new>``' is written. The original value at the location is
-returned.
+The contents of memory at the location specified by the '``<pointer>``' operand
+is read and compared to '``<cmp>``'; if the read value is the equal, the
+'``<new>``' is written. The original value at the location is returned, together
+with a flag indicating success (true) or failure (false).
+
+If the cmpxchg operation is marked as ``weak`` then a spurious failure is
+permitted: the operation may not write ``<new>`` even if the comparison
+matched.
+
+If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
+if the value loaded equals ``cmp``.
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
.. code-block:: llvm
entry:
- %orig = atomic load i32* %ptr unordered ; yields {i32}
+ %orig = atomic load i32* %ptr unordered ; yields i32
br label %loop
loop:
%cmp = phi i32 [ %orig, %entry ], [%old, %loop]
%squared = mul i32 %cmp, %cmp
- %old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields {i32}
- %success = icmp eq i32 %cmp, %old
+ %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
+ %value_loaded = extractvalue { i32, i1 } %val_success, 0
+ %success = extractvalue { i32, i1 } %val_success, 1
br i1 %success, label %done, label %loop
done:
::
- atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields {ty}
+ atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields ty
Overview:
"""""""""
.. code-block:: llvm
- %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields {i32}
+ %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields i32
.. _i_getelementptr:
::
- <result> = icmp <cond> <ty> <op1>, <op2> ; yields {i1} or {<N x i1>}:result
+ <result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
Overview:
"""""""""
::
- <result> = fcmp <cond> <ty> <op1>, <op2> ; yields {i1} or {<N x i1>}:result
+ <result> = fcmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
Overview:
"""""""""
call void %foo(i8 97 signext)
%struct.A = type { i32, i8 }
- %r = call %struct.A @foo() ; yields { 32, i8 }
+ %r = call %struct.A @foo() ; yields { i32, i8 }
%gr = extractvalue %struct.A %r, 0 ; yields i32
%gr1 = extractvalue %struct.A %r, 1 ; yields i8
%Z = call void @foo() noreturn ; indicates that %foo never returns normally
allocatable registers are not supported.
Warning: So far it only works with the stack pointer on selected
-architectures (ARM, ARM64, AArch64, PowerPC and x86_64). Significant amount of
+architectures (ARM, AArch64, PowerPC and x86_64). Significant amount of
work is needed to support other registers and even more so, allocatable
registers.
.. code-block:: llvm
- %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields {float}:r2 = (a * b) + c
+ %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
Half Precision Floating Point Intrinsics
----------------------------------------