final updates

diff --git a/docs/TestingGuide.html b/docs/TestingGuide.html

index 09d08c3356018d110646b5dda5129dfd783b4eb4..f03adaaef64b0cdf326b27078e35114a473ceb88 100644
--- a/docs/TestingGuide.html
+++ b/docs/TestingGuide.html
@@ -216,19 +216,27 @@ module.</p>
  subtrees of the test suite directory tree are as follows:</p>
      
  <ul>
-<li><tt>llvm/test/Features</tt>
-<p>This directory contains sample codes that test various features of the
-LLVM language.  These pieces of sample code are run through various
-assembler, disassembler, and optimizer passes.</p>
-</li>
-
-<li><tt>llvm/test/Regression</tt>
-<p>This directory contains regression tests for LLVM.  When a bug is found
-in LLVM, a regression test containing just enough code to reproduce the
-problem should be written and placed somewhere underneath this directory.
-In most cases, this will be a small piece of LLVM assembly language code,
-often distilled from an actual application or benchmark.</p>
-</li>
+  <li><tt>llvm/test</tt>
+  <p>This directory contains a large array of small tests
+  that exercise various features of LLVM and ensure that regressions do not
+  occur. The directory is broken into several sub-directories, each focused on
+  a particular area of LLVM. A few of the important ones are:<ul>
+    <li><tt>Analysis</tt>: checks Analysis passes.</li>
+    <li><tt>Archive</tt>: checks the Archive library.</li>
+    <li><tt>Assembler</tt>: checks Assembly reader/writer functionality.</li>
+    <li><tt>Bytecode</tt>: checks Bytecode reader/writer functionality.</li>
+    <li><tt>CodeGen</tt>: checks code generation and each target.</li>
+    <li><tt>Features</tt>: checks various features of the LLVM language.</li>
+    <li><tt>Linker</tt>: tests bytecode linking.</li>
+    <li><tt>Transforms</tt>: tests each of the scalar, IPO, and utility
+    transforms to ensure they make the right transformations.</li>
+    <li><tt>Verifier</tt>: tests the IR verifier.</li>
+  </ul></p>
+  <p>Typically when a bug is found in LLVM, a regression test containing 
+  just enough code to reproduce the problem should be written and placed 
+  somewhere underneath this directory.  In most cases, this will be a small 
+  piece of LLVM assembly language code, often distilled from an actual 
+  application or benchmark.</p></li>
  
  <li><tt>llvm-test</tt>
  <p>The <tt>llvm-test</tt> CVS module contains programs that can be compiled 
@@ -267,81 +275,259 @@ location of these external programs is configured by the llvm-test
  <!--=========================================================================-->
  <div class="doc_section"><a name="dgstructure">DejaGNU Structure</a></div>
  <!--=========================================================================-->
-
  <div class="doc_text">
-<p>The LLVM test suite is partially driven by DejaGNU and partially
-driven by GNU Make. Specifically, the Features and Regression tests
-are all driven by DejaGNU. The <tt>llvm-test</tt>
-module is currently driven by a set of Makefiles.</p>
-
-<p>The DejaGNU structure is very simple, but does require some
-information to be set. This information is gathered via <tt>configure</tt> and
-is written to a file, <tt>site.exp</tt> in <tt>llvm/test</tt>. The
-<tt>llvm/test</tt>
-Makefile does this work for you.</p>
-
-<p>In order for DejaGNU to work, each directory of tests must have a
-<tt>dg.exp</tt> file. This file is a program written in tcl that calls
-the <tt>llvm-runtests</tt> procedure on each test file. The
-llvm-runtests procedure is defined in
-<tt>llvm/test/lib/llvm-dg.exp</tt>. Any directory that contains only
-directories does not need the <tt>dg.exp</tt> file.</p>
-
-<p>In order for a test to be run, it must contain information within
-the test file on how to run the test. These are called <tt>RUN</tt>
-lines. Run lines are specified in the comments of the test program
-using the keyword <tt>RUN</tt> followed by a colon, and lastly the
-commands to execute. These commands will be executed in a bash script,
-so any bash syntax is acceptable. You can specify as many RUN lines as
-necessary.  Each RUN line translates to one line in the resulting bash
-script. Below is an example of legal RUN lines in a <tt>.ll</tt>
-file:</p>
-<pre>
-; RUN: llvm-as < %s | llvm-dis > %t1
-; RUN: llvm-dis < %s.bc-13 > %t2
-; RUN: diff %t1 %t2
-</pre>
-<p>There are a couple patterns within a <tt>RUN</tt> line that the
-llvm-runtest procedure looks for and replaces with the appropriate
-syntax:</p>
-
-<dl style="margin-left: 25px">
-<dt>%p</dt> 
-<dd>The path to the source directory. This is for locating
-any supporting files that are not generated by the test, but used by
-the test.</dd> 
-<dt>%s</dt> 
-<dd>The test file.</dd> 
-
-<dt>%t</dt>
-<dd>Temporary filename: testscript.test_filename.tmp, where
-test_filename is the name of the test file. All temporary files are
-placed in the Output directory within the directory the test is
-located.</dd> 
-
-<dt>%prcontext</dt> 
-<dd>Path to a script that performs grep -C. Use this since not all
-platforms support grep -C.</dd>
-
-<dt>%llvmgcc</dt> <dd>Full path to the llvm-gcc executable.</dd>
-<dt>%llvmgxx</dt> <dd>Full path to the llvm-g++ executable.</dd>
-</dl>
+  <p>The LLVM test suite is partially driven by DejaGNU and partially driven by 
+  GNU Make. Specifically, the Features and Regression tests are all driven by 
+  DejaGNU. The <tt>llvm-test</tt> module is currently driven by a set of 
+  Makefiles.</p>
+
+  <p>The DejaGNU structure is very simple, but does require some information to 
+  be set. This information is gathered via <tt>configure</tt> and is written 
+  to a file, <tt>site.exp</tt> in <tt>llvm/test</tt>. The <tt>llvm/test</tt> 
+  Makefile does this work for you.</p>
+
+  <p>In order for DejaGNU to work, each directory of tests must have a 
+  <tt>dg.exp</tt> file. DejaGNU looks for this file to determine how to run the
+  tests. This file is just a Tcl script and it can do anything you want, but 
+  we've standardized it for the LLVM regression tests. It simply loads a Tcl 
+  library (<tt>test/lib/llvm.exp</tt>) and calls the <tt>llvm_runtests</tt> 
+  function defined in that library with a list of file names to run. The names 
+  are obtained by using Tcl's glob command.  Any directory that contains only
+  directories does not need the <tt>dg.exp</tt> file.</p>
+
+  <p>The <tt>llvm-runtests</tt> function looks at each file that is passed to
+  it and gathers any lines together that match "RUN:". These are the "RUN" lines
+  that specify how the test is to be run. So, each test script must contain
+  RUN lines if it is to do anything. If there are no RUN lines, the
+  <tt>llvm-runtests</tt> function will issue an error and the test will
+  fail.</p>
+
+  <p>RUN lines are specified in the comments of the test program using the 
+  keyword <tt>RUN</tt> followed by a colon, and lastly the command (pipeline) 
+  to execute.  Together, these lines form the "script" that 
+  <tt>llvm-runtests</tt> executes to run the test case.  The syntax of the
+  RUN lines is similar to a shell's syntax for pipelines including I/O
+  redirection and variable substitution.  However, even though these lines 
+  may <i>look</i> like a shell script, they are not. RUN lines are interpreted 
+  directly by the Tcl <tt>exec</tt> command. They are never executed by a 
+  shell. Consequently the syntax differs from normal shell script syntax in a 
+  few ways.  You can specify as many RUN lines as needed.</p>
+
+  <p>Each RUN line is executed on its own, distinct from other lines unless
+  its last character is <tt>\</tt>. This continuation character causes the RUN
+  line to be concatenated with the next one. In this way you can build up long
+  pipelines of commands without making huge line lengths. The lines ending in
+  <tt>\</tt> are concatenated until a RUN line that doesn't end in <tt>\</tt> is
+  found. This concatenated set of RUN lines then constitutes one execution. 
+  Tcl will substitute variables and arrange for the pipeline to be executed. If
+  any process in the pipeline fails, the entire line (and test case) fails too.
+  </p>
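The continuation rule above can be sketched in a few lines. This is a minimal illustration in Python, not the Tcl code in the test library; the function name `collect_run_commands` and the sample lines are assumptions made up for the example.

```python
import re

def collect_run_commands(lines):
    """Group RUN: lines into commands, joining lines that end in a backslash."""
    commands = []
    pending = ""
    for line in lines:
        match = re.search(r"RUN:\s*(.*)", line)
        if match is None:
            continue  # not a RUN line, so it is not part of the script
        text = match.group(1).rstrip()
        if text.endswith("\\"):
            # Continuation: drop the backslash and keep accumulating.
            pending += text[:-1].rstrip() + " "
        else:
            # A RUN line without a trailing backslash ends the command.
            commands.append(pending + text)
            pending = ""
    if pending:
        # A dangling continuation at end of input; keep what was gathered.
        commands.append(pending.rstrip())
    return commands

# Hypothetical test file contents (comment lines of a .ll file).
example = [
    "; RUN: llvm-as < %s | \\",
    "; RUN:   llvm-dis > %t1",
    "; RUN: diff %t1 %t2",
]
commands = collect_run_commands(example)
```

Here the first two RUN lines form one pipeline and the third is executed separately, so `commands` holds two entries.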
+
+  <p> Below is an example of legal RUN lines in a <tt>.ll</tt> file:</p>
+  <pre>
+  ; RUN: llvm-as &lt; %s | llvm-dis &gt; %t1
+  ; RUN: llvm-dis &lt; %s.bc-13 &gt; %t2
+  ; RUN: diff %t1 %t2
+  </pre>
+
+  <p>As with a Unix shell, the RUN: lines permit pipelines and I/O redirection
+  to be used. However, the usage is slightly different than for Bash. To check
+  what's legal, see the documentation for the 
+  <a href="http://www.tcl.tk/man/tcl8.5/TclCmd/exec.htm#M2">Tcl exec</a>
+  command and the 
+  <a href="http://www.tcl.tk/man/tcl8.5/tutorial/Tcl26.html">tutorial</a>. 
+  The major differences are:</p>
+  <ul>
+    <li>You can't do <tt>2&gt;&amp;1</tt>. That will cause Tcl to write to a
+    file named <tt>&amp;1</tt>. Usually this is done to get stderr to go through
+    a pipe. You can do that in Tcl with <tt>|&amp;</tt> so replace this idiom:
+    <tt>... 2&gt;&amp;1 | grep</tt> with <tt>... |&amp; grep</tt></li>
+    <li>You can only redirect to a file, not to another descriptor and not from
+    a here document.</li>
+    <li>Tcl supports redirecting to open files with the @ syntax but you
+    shouldn't use that here.</li>
+  </ul>
+
+  <p>There are some quoting rules that you must pay attention to when writing
+  your RUN lines. In general nothing needs to be quoted. Tcl won't strip off any
+  ' or " so they will get passed to the invoked program. For example:</p>
+  <pre>
+     ... | grep 'find this string'
+  </pre>
+  <p>This will fail because the ' characters are passed to grep. This would
+  instruct grep to look for <tt>'find</tt> in the files <tt>this</tt> and
+  <tt>string'</tt>. To avoid this use curly braces to tell Tcl that it should
+  treat everything enclosed as one value. So our example would become:</p>
+  <pre>
+     ... | grep {find this string}
+  </pre>
+  <p>Additionally, the characters <tt>[</tt> and <tt>]</tt> are treated 
+  specially by Tcl. They tell Tcl to interpret the content as a command to
+  execute. Since these characters are often used in regular expressions this can
+  have disastrous results and cause the entire test run in a directory to fail.
+  For example, a common idiom is to look for some basicblock number:</p>
+  <pre>
+     ... | grep bb[2-8]
+  </pre>
+  <p>This, however, will cause Tcl to fail because it's going to try to execute
+  a program named "2-8". Instead, what you want is this:</p>
+  <pre>
+     ... | grep {bb\[2-8\]}
+  </pre>
+  <p>Finally, if you need to pass the <tt>\</tt> character down to a program,
+  then it must be doubled. This is another Tcl special character. So, suppose
+  you had:</p>
+  <pre>
+     ... | grep 'i32\*'
+  </pre>
+  <p>This will fail to match what you want (a pointer to i32). First, the
+  <tt>'</tt> do not get stripped off. Second, the <tt>\</tt> gets stripped off
+  by Tcl so what grep sees is: <tt>'i32*'</tt>. That's not likely to match
+  anything. To resolve this you must use <tt>\\</tt> and the <tt>{}</tt>, like
+  this:</p>
+  <pre>
+     ... | grep {i32\\*}
+  </pre>
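The first quoting pitfall (quote characters are not stripped) can be illustrated by analogy. In the sketch below, Python's plain whitespace split stands in for how the words reach the invoked program, and `shlex.split` stands in for what a POSIX shell would do. This is only an analogy: it does not model Tcl's handling of braces, brackets, or backslashes.

```python
import shlex

command = "grep 'find this string'"

# Plain whitespace splitting leaves the quote characters in place,
# which is the effect the text describes for RUN lines.
words_without_quote_handling = command.split()

# A POSIX shell would strip the quotes and pass a single pattern argument.
words_with_shell_quoting = shlex.split(command)

print(words_without_quote_handling)
print(words_with_shell_quoting)
```

In the first result grep would receive `'find` as its pattern and `this` and `string'` as file names, which matches the failure described above.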
  
-<p>There are also several scripts in the llvm/test/Scripts directory
-that you might find useful when writing <tt>RUN</tt> lines.</p>
-
-<p>Lastly, you can easily mark a test that is expected to fail on a
-specific platform or with a specific version of llvmgcc by using the
- <tt>XFAIL</tt> keyword. Xfail lines are
-specified in the comments of the test program using <tt>XFAIL</tt>,
-followed by a colon, and one or more regular expressions (separated by
-a comma) that will match against the target triplet or llvmgcc version for the
-machine. You can use * to match all targets. You can specify the major or full
- version (i.e. 3.4) for llvmgcc. Here is an example of an
-<tt>XFAIL</tt> line:</p>
-<pre>
-; XFAIL: darwin,sun,llvmgcc4
-</pre>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsection"><a name="dgvars">Vars And Substitutions</a></div>
+<div class="doc_text">
+  <p>Within a RUN line there are a number of substitutions that are permitted. In
+  general, any Tcl variable that is available in the <tt>substitute</tt> 
+  function (in <tt>test/lib/llvm.exp</tt>) can be substituted into a RUN line.
+  To make a substitution just write the variable's name preceded by a $. 
+  Additionally, for compatibility reasons with previous versions of the test
+  library, certain names can be accessed with an alternate syntax: a % prefix.
+  These alternates are deprecated and may go away in a future version.
+  </p>
+  <p>Here are the available variable names. The alternate syntax is listed in
+  parentheses.</p>
+  <dl style="margin-left: 25px">
+    <dt><b>$test</b> (%s)</dt>
+    <dd>The full path to the test case's source. This is suitable for passing
+    on the command line as the input to an llvm tool.</dd>
+    <dt><b>$srcdir</b></dt>
+    <dd>The source directory from where the "<tt>make check</tt>" was run.</dd>
+    <dt><b>objdir</b></dt>
+    <dd>The object directory that corresponds to the <tt>$srcdir</tt>.</dd>
+    <dt><b>subdir</b></dt>
+    <dd>A partial path from the <tt>test</tt> directory that contains the 
+    sub-directory that contains the test source being executed.</dd>
+    <dt><b>srcroot</b></dt>
+    <dd>The root directory of the LLVM src tree.</dd>
+    <dt><b>objroot</b></dt>
+    <dd>The root directory of the LLVM object tree. This could be the same
+    as the srcroot.</dd>
+    <dt><b>path</b></dt>
+    <dd>The path to the directory that contains the test case source.  This is 
+    for locating any supporting files that are not generated by the test, but 
+    used by the test.</dd>
+    <dt><b>tmp</b></dt>
+    <dd>The path to a temporary file name that could be used for this test case.
+    The file name won't conflict with other test cases. You can append to it if
+    you need multiple temporaries. This is useful as the destination of some
+    redirected output.</dd>
+    <dt><b>llvmlibsdir</b> (%llvmlibsdir)</dt>
+    <dd>The directory where the LLVM libraries are located.</dd>
+    <dt><b>target_triplet</b> (%target_triplet)</dt>
+    <dd>The target triplet that corresponds to the current host machine (the one
+    running the test cases). This should probably be called "host".</dd>
+    <dt><b>prcontext</b> (%prcontext)</dt>
+    <dd>Path to the prcontext tcl script that prints some context around a 
+    line that matches a pattern. This isn't strictly necessary as the test suite
+    is run with its PATH altered to include the test/Scripts directory where
+    the prcontext script is located. Note that this script is similar to 
+    <tt>grep -C</tt> but you should use the <tt>prcontext</tt> script because
+    not all platforms support <tt>grep -C</tt>.</dd>
+    <dt><b>llvmgcc</b> (%llvmgcc)</dt>
+    <dd>The full path to the <tt>llvm-gcc</tt> executable as specified in the
+    configured LLVM environment.</dd>
+    <dt><b>llvmgxx</b> (%llvmgxx)</dt>
+    <dd>The full path to the <tt>llvm-g++</tt> executable as specified in the
+    configured LLVM environment.</dd>
+    <dt><b>llvmgcc_version</b> (%llvmgcc_version)</dt>
+    <dd>The full version number of the <tt>llvm-gcc</tt> executable.</dd>
+    <dt><b>llvmgccmajvers</b> (%llvmgccmajvers)</dt>
+    <dd>The major version number of the <tt>llvm-gcc</tt> executable.</dd>
+    <dt><b>gccpath</b></dt>
+    <dd>The full path to the C compiler used to <i>build </i> LLVM. Note that 
+    this might not be gcc.</dd>
+    <dt><b>gxxpath</b></dt>
+    <dd>The full path to the C++ compiler used to <i>build </i> LLVM. Note that 
+    this might not be g++.</dd>
+    <dt><b>compile_c</b> (%compile_c)</dt>
+    <dd>The full command line used to compile LLVM C source  code. This has all 
+    the configured -I, -D and optimization options.</dd>
+    <dt><b>compile_cxx</b> (%compile_cxx)</dt>
+    <dd>The full command used to compile LLVM C++ source  code. This has 
+    all the configured -I, -D and optimization options.</dd>
+    <dt><b>link</b> (%link)</dt> 
+    <dd>The full link command used to link LLVM executables. This has all the
+    configured -I, -L and -l options.</dd>
+    <dt><b>shlibext</b> (%shlibext)</dt>
+    <dd>The suffix for the host platform's shared library (dll) files. This
+    includes the period as the first character.</dd>
+  </dl>
+  <p>To add more variables, two things need to be changed. First, add a line in
+  the <tt>test/Makefile</tt> that creates the <tt>site.exp</tt> file. This will
+  "set" the variable as a global in the site.exp file. Second, in the
+  <tt>test/lib/llvm.exp</tt> file, in the substitute proc, add the variable name
+  to the list of "global" declarations at the beginning of the proc. That's it,
+  the variable can then be used in test scripts.</p>
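As a rough picture of what substitution does to a RUN line, here is a minimal sketch in Python. It is not the <tt>substitute</tt> proc from <tt>test/lib/llvm.exp</tt>; the function, the alias table, and all paths below are hypothetical and only mirror the `$name` and `%alias` forms listed above.

```python
import re

def substitute(run_line, variables, percent_aliases):
    """Expand $name references and legacy %alias references in a RUN line."""
    def expand_dollar(match):
        name = match.group(1)
        # Unknown names are left untouched in this sketch.
        return variables.get(name, match.group(0))

    line = re.sub(r"\$(\w+)", expand_dollar, run_line)
    # Longest alias first so %llvmgcc_version is not mistaken for %llvmgcc.
    for alias in sorted(percent_aliases, key=len, reverse=True):
        line = line.replace("%" + alias, variables[percent_aliases[alias]])
    return line

# Hypothetical values; a real run takes them from site.exp.
variables = {
    "test": "/src/test/Feature/example.ll",
    "tmp": "/obj/test/Feature/Output/example.ll.tmp",
    "llvmgcc": "/usr/local/bin/llvm-gcc",
}
# Deprecated % spellings mapped to the variable they stand for.
percent_aliases = {"s": "test", "llvmgcc": "llvmgcc"}

expanded = substitute("%llvmgcc -c %s -o $tmp.o", variables, percent_aliases)
print(expanded)
```

Appending a suffix to `$tmp`, as in `$tmp.o`, follows the note above that the temporary name can be extended when several temporaries are needed.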
+</div>
+  
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsection"><a name="dgfeatures">Other Features</a></div>
+<div class="doc_text">
+  <p>To make RUN line writing easier, there are several shell scripts located
+  in the <tt>llvm/test/Scripts</tt> directory. For example:</p>
+  <dl>
+    <dt><b>ignore</b></dt>
+    <dd>This script runs its arguments and then always returns 0. This is useful
+    in cases where the test needs to cause a tool to generate an error (e.g. to
+    check the error output). However, any program in a pipeline that returns a
+    non-zero result will cause the test to fail. This script overcomes that 
+    issue and nicely documents that the test case is purposefully ignoring the
+    result code of the tool.</dd>
+    <dt><b>not</b></dt>
+    <dd>This script runs its arguments and then inverts the result code from 
+    it. Zero result codes become 1. Non-zero result codes become 0. This is
+    useful to invert the result of a grep. For example "not grep X" means
+    succeed only if you don't find X in the input.</dd>
+  </dl>
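The behavior of the two helper scripts can be sketched as follows. This is an illustration in Python of the exit-code handling described above, not the scripts in <tt>llvm/test/Scripts</tt>; the function names and the sample commands are assumptions for the example.

```python
import subprocess
import sys

def run_not(argv):
    """Run argv and invert its result: 0 becomes 1, non-zero becomes 0."""
    result = subprocess.run(argv)
    return 1 if result.returncode == 0 else 0

def run_ignore(argv):
    """Run argv and report success regardless of its exit status."""
    subprocess.run(argv)
    return 0

# Stand-in commands: one that exits with status 3 and one that exits with 0.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
passing = [sys.executable, "-c", "pass"]

print(run_not(failing), run_not(passing), run_ignore(failing))
```

In a RUN line the same idea reads as `not grep X`, which succeeds only when grep finds nothing.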
+
+  <p>Sometimes it is necessary to mark a test case as "expected fail" or XFAIL.
+  You can easily mark a test as XFAIL just by including  <tt>XFAIL: </tt> on a
+  line near the top of the file. This signals that the test case should succeed
+  if the test fails. Such test cases are counted separately by DejaGnu. To
+  specify an expected fail, use the XFAIL keyword in the comments of the test
+  program followed by a colon and one or more regular expressions (separated by
+  a comma). The regular expressions allow you to XFAIL the test conditionally
+  by host platform. The regular expressions following the : are matched against
+  the target triplet or llvmgcc version number for the host machine. If there is
+  a match, the test is expected to fail. If not, the test is expected to
+  succeed. To XFAIL everywhere just specify <tt>XFAIL: *</tt>. When matching
+  the llvm-gcc version, you can specify the major (e.g. 3) or full version 
+  (e.g. 3.4) number. Here is an example of an <tt>XFAIL</tt> line:</p>
+  <pre>
+   ; XFAIL: darwin,sun,llvmgcc4
+  </pre>
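A sketch of how such a line might be evaluated is given below, in Python rather than the Tcl of the test library. Two points are assumptions inferred only from the example line: that a pattern beginning with `llvmgcc` is compared against the llvm-gcc version, and that every other pattern is a regular expression searched for in the target triplet. The function name and the sample triplets are hypothetical.

```python
import re

def is_expected_failure(test_lines, target_triplet, llvmgcc_version):
    """Return True if an XFAIL line matches this host (sketch, see assumptions)."""
    major = llvmgcc_version.split(".")[0]
    for line in test_lines:
        match = re.search(r"XFAIL:\s*(.*)", line)
        if match is None:
            continue
        for pattern in match.group(1).split(","):
            pattern = pattern.strip()
            if not pattern:
                continue
            if pattern == "*":
                return True  # XFAIL everywhere
            if pattern.startswith("llvmgcc"):
                # Assumed convention: "llvmgcc" followed by major or full version.
                wanted = pattern[len("llvmgcc"):]
                if wanted in (major, llvmgcc_version):
                    return True
            elif re.search(pattern, target_triplet):
                return True
    return False

xfail_example = ["; XFAIL: darwin,sun,llvmgcc4"]
print(is_expected_failure(xfail_example, "powerpc-apple-darwin8.8.0", "3.4"))
```

With the example line, a Darwin host is an expected failure because of the `darwin` pattern, while a Linux host is one only when llvm-gcc is a 4.x release.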
+
+  <p>To make the output more useful, the <tt>llvm_runtest</tt> function will
+  scan the lines of the test case for ones that contain a pattern that matches
+  PR[0-9]+. This is the syntax for specifying a PR (Problem Report) number that
+  is related to the test case. The number after "PR" specifies the LLVM bugzilla
+  number. When a PR number is specified, it will be used in the pass/fail
+  reporting. This is useful to quickly get some context when a test fails.</p>
+
+  <p>Finally, any line that contains "END." will cause the special
+  interpretation of lines to terminate. This is generally done right after the
+  last RUN: line. This has two side effects: (a) it prevents special
+  interpretation of lines that are part of the test program, not the
+  instructions to the test case, and (b) it speeds things up for really big test
+  cases by avoiding interpretation of the remainder of the file.</p>
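The PR scan and the "END." cutoff can be pictured together with a small sketch. It is written in Python as an illustration only; the function name, the sample lines, and the PR number are hypothetical and do not come from the test library.

```python
import re

def scan_test_file(lines):
    """Collect RUN commands and PR numbers, stopping at the first END. line."""
    run_lines = []
    pr_numbers = []
    for line in lines:
        if "END." in line:
            break  # nothing after this line is interpreted
        pr_numbers.extend(re.findall(r"PR[0-9]+", line))
        run = re.search(r"RUN:\s*(.*)", line)
        if run is not None:
            run_lines.append(run.group(1).rstrip())
    return run_lines, pr_numbers

# Hypothetical file contents; PR1234 is a made-up number.
sample = [
    "; PR1234",
    "; RUN: llvm-as < %s | llvm-dis",
    "; END.",
    "; RUN: this line is never interpreted",
]
run_lines, pr_numbers = scan_test_file(sample)
print(run_lines, pr_numbers)
```

Because the loop stops at "END.", text in the body of the test program that happens to contain "RUN:" or a PR-like string is not picked up.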
  
  </div>