diff --git a/docs/LangRef.html b/docs/LangRef.html
index 7cfa05205ce..1b94ab5a9a7 100644
--- a/docs/LangRef.html
+++ b/docs/LangRef.html
@@ -255,6 +255,12 @@
+The call instructions that wrap inline asm nodes may have a "!srcloc" MDNode
+  attached to them that contains a constant integer. If present, the code
+  generator will use the integer as the location cookie value when reporting
+  errors through the LLVMContext error reporting mechanisms. This allows a
+  front-end to correlate backend errors that occur with inline asm back to the
+  source code that produced it. For example:
+
+  call void asm sideeffect "something bad", ""(), !srcloc !42
+  ...
+  !42 = !{ i32 1234567 }
+
+It is up to the front-end to make sense of the magic numbers it places in the
+  IR.
-There are six different terminator instructions: the
+There are seven different terminator instructions: the
 'ret' instruction, the
 'br' instruction, the
 'switch' instruction, the

@@ -5149,8 +5200,11 @@ Loop:       ; Infinite loop that counts from 0 on up...
   a ret instruction. If the "tail" marker is present, the
   function call is eligible for tail call optimization, but might not in fact be
-  optimized into a jump. As of this writing, the extra requirements for
-  a call to actually be optimized are:
+  optimized into a jump. The code generator may optimize calls marked
+  "tail" with either 1) automatic sibling call optimization when the
+  caller and callee have matching signatures, or 2) forced tail call
+  optimization when the following extra requirements are met:
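The sibling-call case described above can be sketched as follows. This is an illustrative fragment, not taken from the patch; the function names are hypothetical, and it assumes the caller and callee have matching signatures so the call marked "tail" is a candidate for automatic sibling call optimization:

```llvm
; Hypothetical example: @caller and @callee have matching signatures,
; and the call is in tail position, so the "tail" marker makes it
; eligible for sibling call optimization.
declare i32 @callee(i32)

define i32 @caller(i32 %x) {
entry:
  %r = tail call i32 @callee(i32 %x)
  ret i32 %r
}
```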
 This is an overloaded intrinsic. You can use llvm.memcpy on any
-  integer bit width. Not all targets support all bit widths however.
+  integer bit width and for different address spaces. Not all targets support
+  all bit widths however.

-  declare void @llvm.memcpy.i8(i8 * <dest>, i8 * <src>,
-                               i8 <len>, i32 <align>)
-  declare void @llvm.memcpy.i16(i8 * <dest>, i8 * <src>,
-                                i16 <len>, i32 <align>)
-  declare void @llvm.memcpy.i32(i8 * <dest>, i8 * <src>,
-                                i32 <len>, i32 <align>)
-  declare void @llvm.memcpy.i64(i8 * <dest>, i8 * <src>,
-                                i64 <len>, i32 <align>)
+  declare void @llvm.memcpy.p0i8.p0i8.i32(i8 * <dest>, i8 * <src>,
+                                          i32 <len>, i32 <align>, i1 <isvolatile>)
+  declare void @llvm.memcpy.p0i8.p0i8.i64(i8 * <dest>, i8 * <src>,
+                                          i64 <len>, i32 <align>, i1 <isvolatile>)
 Note that, unlike the standard libc function, the llvm.memcpy.*
-  intrinsics do not return a value, and takes an extra alignment argument.
+  intrinsics do not return a value, take extra alignment and isvolatile
+  arguments, and the pointers can be in specified address spaces.

 The first argument is a pointer to the destination, the second is a pointer to
   the source. The third argument is an integer argument specifying the
-  number of bytes to copy, and the fourth argument is the alignment of the
-  source and destination locations.
+  number of bytes to copy, the fourth argument is the alignment of the
+  source and destination locations, and the fifth is a boolean indicating a
+  volatile access.

 If the call to this intrinsic has an alignment value that is not 0 or 1,
   then the caller guarantees that both the source and destination pointers are
   aligned to that boundary.
+Volatile accesses should not be deleted if dead, but the access behavior is
+  not very cleanly specified and it is unwise to depend on it.

 The 'llvm.memcpy.*' intrinsics copy a block of memory from the
   source location to the destination location, which are not allowed to
   overlap. It copies "len" bytes of memory over. If the argument is known to

@@ -5884,17 +5942,14 @@ LLVM.
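As a usage sketch of the new overloaded llvm.memcpy form (illustrative only; the buffers and function name are hypothetical, and i1 false marks the access as non-volatile):

```llvm
; Copy 16 bytes between two hypothetical i8 buffers in address space 0.
; Argument order: dest, src, len, align, isvolatile.
@src = global [16 x i8] zeroinitializer
@dst = global [16 x i8] zeroinitializer

define void @do_copy() {
entry:
  %d = getelementptr [16 x i8]* @dst, i32 0, i32 0
  %s = getelementptr [16 x i8]* @src, i32 0, i32 0
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %d, i8* %s, i32 16, i32 4, i1 false)
  ret void
}

declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
```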
 This is an overloaded intrinsic. You can use llvm.memmove on any integer bit
-  width. Not all targets support all bit widths however.
+  width and for different address spaces. Not all targets support all bit
+  widths however.

-  declare void @llvm.memmove.i8(i8 * <dest>, i8 * <src>,
-                                i8 <len>, i32 <align>)
-  declare void @llvm.memmove.i16(i8 * <dest>, i8 * <src>,
-                                 i16 <len>, i32 <align>)
-  declare void @llvm.memmove.i32(i8 * <dest>, i8 * <src>,
-                                 i32 <len>, i32 <align>)
-  declare void @llvm.memmove.i64(i8 * <dest>, i8 * <src>,
-                                 i64 <len>, i32 <align>)
+  declare void @llvm.memmove.p0i8.p0i8.i32(i8 * <dest>, i8 * <src>,
+                                           i32 <len>, i32 <align>, i1 <isvolatile>)
+  declare void @llvm.memmove.p0i8.p0i8.i64(i8 * <dest>, i8 * <src>,
+                                           i64 <len>, i32 <align>, i1 <isvolatile>)
 Note that, unlike the standard libc function, the llvm.memmove.*
-  intrinsics do not return a value, and takes an extra alignment argument.
+  intrinsics do not return a value, take extra alignment and isvolatile
+  arguments, and the pointers can be in specified address spaces.

 The first argument is a pointer to the destination, the second is a pointer to
   the source. The third argument is an integer argument specifying the
-  number of bytes to copy, and the fourth argument is the alignment of the
-  source and destination locations.
+  number of bytes to copy, the fourth argument is the alignment of the
+  source and destination locations, and the fifth is a boolean indicating a
+  volatile access.

 If the call to this intrinsic has an alignment value that is not 0 or 1,
   then the caller guarantees that the source and destination pointers are
   aligned to that boundary.
+Volatile accesses should not be deleted if dead, but the access behavior is
+  not very cleanly specified and it is unwise to depend on it.

 The 'llvm.memmove.*' intrinsics copy a block of memory from the
   source location to the destination location, which may overlap. It copies
   "len" bytes of memory over. If the argument is known to be aligned to some

@@ -5934,17 +5996,14 @@ LLVM.
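A usage sketch of the new llvm.memmove overload (illustrative only; the buffer and function name are hypothetical). It shifts a range within one buffer, which is exactly the overlapping case where memmove rather than memcpy is required:

```llvm
; Move 8 bytes within one hypothetical buffer; source and destination
; ranges overlap, so llvm.memmove (not llvm.memcpy) must be used.
@buf = global [16 x i8] zeroinitializer

define void @shift_down() {
entry:
  %d = getelementptr [16 x i8]* @buf, i32 0, i32 0
  %s = getelementptr [16 x i8]* @buf, i32 0, i32 4
  call void @llvm.memmove.p0i8.p0i8.i32(i8* %d, i8* %s, i32 8, i32 1, i1 false)
  ret void
}

declare void @llvm.memmove.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
```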
 This is an overloaded intrinsic. You can use llvm.memset on any integer bit
-  width. Not all targets support all bit widths however.
+  width and for different address spaces. Not all targets support all bit
+  widths however.

-  declare void @llvm.memset.i8(i8 * <dest>, i8 <val>,
-                               i8 <len>, i32 <align>)
-  declare void @llvm.memset.i16(i8 * <dest>, i8 <val>,
-                                i16 <len>, i32 <align>)
-  declare void @llvm.memset.i32(i8 * <dest>, i8 <val>,
-                                i32 <len>, i32 <align>)
-  declare void @llvm.memset.i64(i8 * <dest>, i8 <val>,
-                                i64 <len>, i32 <align>)
+  declare void @llvm.memset.p0i8.i32(i8 * <dest>, i8 <val>,
+                                     i32 <len>, i32 <align>, i1 <isvolatile>)
+  declare void @llvm.memset.p0i8.i64(i8 * <dest>, i8 <val>,
+                                     i64 <len>, i32 <align>, i1 <isvolatile>)
 Note that, unlike the standard libc function, the llvm.memset
-  intrinsic does not return a value, and takes an extra alignment argument.
+  intrinsic does not return a value, takes extra alignment and volatile
+  arguments, and the destination can be in an arbitrary address space.

 The first argument is a pointer to the destination to fill, the second is the

@@ -5964,6 +6024,9 @@ LLVM.
 then the caller guarantees that the destination pointer is aligned to that
   boundary.

+Volatile accesses should not be deleted if dead, but the access behavior is
+  not very cleanly specified and it is unwise to depend on it.

 The 'llvm.memset.*' intrinsics fill "len" bytes of memory starting
   at the destination location. If the argument is known to be aligned to some

@@ -6583,6 +6646,97 @@ LLVM.
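A usage sketch of the new llvm.memset overload (illustrative only; the buffer and function name are hypothetical). Note that unlike memcpy/memmove the second argument is the i8 fill value rather than a source pointer:

```llvm
; Zero a hypothetical 32-byte buffer.
; Argument order: dest, val (i8), len, align, isvolatile.
@buf = global [32 x i8] zeroinitializer

define void @clear_buf() {
entry:
  %p = getelementptr [32 x i8]* @buf, i32 0, i32 0
  call void @llvm.memset.p0i8.i32(i8* %p, i8 0, i32 32, i32 4, i1 false)
  ret void
}

declare void @llvm.memset.p0i8.i32(i8*, i8, i32, i32, i1)
```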
+Half precision floating point is a storage-only format. This means that it is
+  a dense encoding (in memory) but does not support computation in the
+  format.
+This means that code must first load the half-precision floating point
+  value as an i16, then convert it to float with llvm.convert.from.fp16.
+  Computation can then be performed on the float value (including extending to
+  double, etc.). To store the value back to memory, it is first converted to
+  float if needed, then converted to i16 with llvm.convert.to.fp16, and then
+  stored as an i16 value.
+
+  declare i16 @llvm.convert.to.fp16(f32 %a)
+
+The 'llvm.convert.to.fp16' intrinsic function performs
+  a conversion from single precision floating point format to half precision
+  floating point format.
+The intrinsic function takes a single argument - the value to be
+  converted.
+The 'llvm.convert.to.fp16' intrinsic function performs
+  a conversion from single precision floating point format to half precision
+  floating point format. The return value is an i16 which
+  contains the converted number.
+
+  %res = call i16 @llvm.convert.to.fp16(f32 %a)
+  store i16 %res, i16* @x, align 2
+
+
+  declare f32 @llvm.convert.from.fp16(i16 %a)
+
+The 'llvm.convert.from.fp16' intrinsic function performs
+  a conversion from half precision floating point format to single precision
+  floating point format.
+The intrinsic function takes a single argument - the value to be
+  converted.
+The 'llvm.convert.from.fp16' intrinsic function performs a
+  conversion from half precision floating point format to single
+  precision floating point format. The input half-float value is represented by
+  an i16 value.
+
+  %a = load i16* @x, align 2
+  %res = call f32 @llvm.convert.from.fp16(i16 %a)
+
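Putting the two intrinsics together, the storage-only round trip described above might look like the following sketch. It reuses the document's hypothetical global @x and f32 type spelling; the computation step is elided, since only the load/convert/convert/store shape is the point:

```llvm
@x = global i16 0

define void @fp16_roundtrip() {
entry:
  ; Load the half value as a raw i16, then widen for computation.
  %h = load i16* @x, align 2
  %f = call f32 @llvm.convert.from.fp16(i16 %h)
  ; ... compute on %f as a single precision value ...
  ; Narrow back to half precision and store as an i16.
  %h2 = call i16 @llvm.convert.to.fp16(f32 %f)
  store i16 %h2, i16* @x, align 2
  ret void
}

declare f32 @llvm.convert.from.fp16(i16)
declare i16 @llvm.convert.to.fp16(f32)
```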