X-Git-Url: http://plrg.eecs.uci.edu/git/?a=blobdiff_plain;f=docs%2FMIRLangRef.rst;h=a5f8c8c743ab2829b79f1b3fcd434dbb4a691190;hb=36eebaa409ab1bceb9ad0d4fb8ad7e47d61bd31c;hp=d67c15bcf90515d721f938b26dfd9016152afd96;hpb=ded00c79af1b1a493424b11006440de78e016d2c;p=oota-llvm.git

diff --git a/docs/MIRLangRef.rst b/docs/MIRLangRef.rst
index d67c15bcf90..a5f8c8c743a 100644
--- a/docs/MIRLangRef.rst
+++ b/docs/MIRLangRef.rst
@@ -33,9 +33,74 @@ contain the serialized machine functions.
 
 .. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132
 
+MIR Testing Guide
+=================
+
+You can use the MIR format for testing in two different ways:
+
+- You can write MIR tests that invoke a single code generation pass using the
+  ``run-pass`` option in llc.
+
+- You can use llc's ``stop-after`` option with existing or new LLVM assembly
+  tests and check the MIR output of a specific code generation pass.
+
+Testing Individual Code Generation Passes
+-----------------------------------------
+
+The ``run-pass`` option in llc allows you to create MIR tests that invoke
+just a single code generation pass. When this option is used, llc will parse
+an input MIR file, run the specified code generation pass, and print the
+resulting MIR to the standard output stream.
+
+You can generate an input MIR file for the test by using the ``stop-after``
+option in llc. For example, if you would like to write a test for the
+post register allocation pseudo instruction expansion pass, you can specify
+the machine copy propagation pass in the ``stop-after`` option, as it runs
+just before the pass that we are trying to test:
+
+  ``llc -stop-after machine-cp bug-trigger.ll > test.mir``
+
+After generating the input MIR file, you'll have to add to it a run line that
+uses the ``-run-pass`` option. In order to test the post register allocation
+pseudo instruction expansion pass on X86-64, a run line like the one shown
+below can be used:
+
+  ``# RUN: llc -run-pass postrapseudos -march=x86-64 %s -o /dev/null | FileCheck %s``
+
+The MIR files are target dependent, so they have to be placed in the
+target-specific test directories. They also need to specify a target triple or
+a target architecture either in the run line or in the embedded LLVM IR module.
+
+Limitations
+-----------
+
+Currently the MIR format has several limitations in terms of which state it
+can serialize:
+
+- The target-specific state in the target-specific ``MachineFunctionInfo``
+  subclasses isn't serialized at the moment.
+
+- The target-specific ``MachineConstantPoolValue`` subclasses (in the ARM and
+  SystemZ backends) aren't serialized at the moment.
+
+- The ``MCSymbol`` machine operands are only printed; they can't be parsed.
+
+- A lot of the state in ``MachineModuleInfo`` isn't serialized - only the CFI
+  instructions and the variable debug information from MMI are serialized right
+  now.
+
+These limitations impose restrictions on what you can test with the MIR format.
+For now, tests for behaviour that depends on the state of certain
+``MCSymbol`` operands or on the exception handling state in MMI can't use the
+MIR format. Likewise, tests for behaviour that depends on the state of the
+target-specific ``MachineFunctionInfo`` or ``MachineConstantPoolValue``
+subclasses can't use the MIR format at the moment.
+
 High Level Structure
 ====================
 
+.. _embedded-module:
+
 Embedded Module
 ---------------
 
@@ -167,6 +232,8 @@ of 32 and 16:
     bb.0.entry:
       successors: %bb.1.then(32), %bb.2.else(16)
 
+.. _bb-liveins:
+
 Live In Registers
 ^^^^^^^^^^^^^^^^^
 
@@ -203,7 +270,8 @@ specified in brackets after the block's definition:
 Machine Instructions
 --------------------
 
-A machine instruction is composed of a name, machine operands,
+A machine instruction is composed of a name,
+:ref:`machine operands <machine-operands>`,
 :ref:`instruction flags <instruction-flags>`, and machine memory operands.
 
 The instruction's name is usually specified before the operands. The example
@@ -239,17 +307,177 @@ The flag ``frame-setup`` can be specified before the instruction's name:
 
     %fp = frame-setup ADDXri %sp, 0, 0
 
+.. _registers:
+
+Registers
+---------
+
+Registers are one of the key primitives in the machine instructions
+serialization language. They are primarily used in the
+:ref:`register machine operands <register-operands>`,
+but they can also be used in a number of other places, like the
+:ref:`basic block's live in list <bb-liveins>`.
+
+The physical registers are identified by their name. They use the following
+syntax:
+
+.. code-block:: llvm
+
+    %<name>
+
+The example below shows three X86 physical registers:
+
+.. code-block:: llvm
+
+    %eax
+    %r15
+    %eflags
+
+The virtual registers are identified by their ID number. They use the following
+syntax:
+
+.. code-block:: llvm
+
+    %<id>
+
+Example:
+
+.. code-block:: llvm
+
+    %0
+
+The null registers are represented using an underscore ('``_``'). They can also be
+represented using a '``%noreg``' named register, although the former syntax
+is preferred.
+
+.. _machine-operands:
+
+Machine Operands
+----------------
+
+There are seventeen different kinds of machine operands, and all of them, except
+the ``MCSymbol`` operand, can be serialized. The ``MCSymbol`` operands are
+just printed out - they can't be parsed back yet.
+
+Immediate Operands
+^^^^^^^^^^^^^^^^^^
+
+The immediate machine operands are untyped, 64-bit signed integers. The
+example below shows an instance of the X86 ``MOV32ri`` instruction that has an
+immediate machine operand ``-42``:
+
+.. code-block:: llvm
+
+    %eax = MOV32ri -42
+
+.. TODO: Describe the CIMM (Rare) and FPIMM immediate operands.
+
+.. _register-operands:
+
+Register Operands
+^^^^^^^^^^^^^^^^^
+
+The :ref:`register <registers>` primitive is used to represent the register
+machine operands. The register operands can also have optional
+:ref:`register flags <register-flags>`,
+:ref:`a subregister index <subregister-indices>`,
+and a reference to the tied register operand.
+The full syntax of a register operand is shown below:
+
+.. code-block:: llvm
+
+    [<flags>] <register> [ :<subregister-idx-name> ] [ (tied-def <tied-op>) ]
+
+This example shows an instance of the X86 ``XOR32rr`` instruction that has
+5 register operands with different register flags:
+
+.. code-block:: llvm
+
+    dead %eax = XOR32rr undef %eax, undef %eax, implicit-def dead %eflags, implicit-def %al
+
+.. _register-flags:
+
+Register Flags
+~~~~~~~~~~~~~~
+
+The table below shows all of the possible register flags along with the
+corresponding internal ``llvm::RegState`` representation:
+
+.. list-table::
+   :header-rows: 1
+
+   * - Flag
+     - Internal Value
+
+   * - ``implicit``
+     - ``RegState::Implicit``
+
+   * - ``implicit-def``
+     - ``RegState::ImplicitDefine``
+
+   * - ``def``
+     - ``RegState::Define``
+
+   * - ``dead``
+     - ``RegState::Dead``
+
+   * - ``killed``
+     - ``RegState::Kill``
+
+   * - ``undef``
+     - ``RegState::Undef``
+
+   * - ``internal``
+     - ``RegState::InternalRead``
+
+   * - ``early-clobber``
+     - ``RegState::EarlyClobber``
+
+   * - ``debug-use``
+     - ``RegState::Debug``
+
+.. _subregister-indices:
+
+Subregister Indices
+~~~~~~~~~~~~~~~~~~~
+
+The register machine operands can reference a portion of a register by using
+the subregister indices. The example below shows an instance of the ``COPY``
+pseudo instruction that uses the X86 ``sub_8bit`` subregister index to copy the
+lower 8 bits from the 32-bit virtual register 0 to the 8-bit virtual register 1:
+
+.. code-block:: llvm
+
+    %1 = COPY %0:sub_8bit
+
+The names of the subregister indices are target specific, and are typically
+defined in the target's ``*RegisterInfo.td`` file.
+
+Global Value Operands
+^^^^^^^^^^^^^^^^^^^^^
+
+The global value machine operands reference the global values from the
+:ref:`embedded LLVM IR module <embedded-module>`.
+The example below shows an instance of the X86 ``MOV64rm`` instruction that has
+a global value operand named ``G``:
+
+.. code-block:: llvm
+
+    %rax = MOV64rm %rip, 1, _, @G, _
+
+The named global values are represented using an identifier with the '@' prefix.
+If the identifier doesn't match the regular expression
+``[-a-zA-Z$._][-a-zA-Z$._0-9]*``, then this identifier must be quoted.
+
+The unnamed global values are represented using an unsigned numeric value with
+the '@' prefix, as in the following examples: ``@0``, ``@989``.
 
 .. TODO: Describe the parsers default behaviour when optional YAML attributes
    are missing.
 .. TODO: Describe the syntax for the bundled instructions.
-.. TODO: Describe the syntax of the immediate machine operands.
-.. TODO: Describe the syntax of the register machine operands.
-.. TODO: Describe the syntax of the virtual register operands and their YAML
-   definitions.
-.. TODO: Describe the syntax of the register operand flags and the subregisters.
+.. TODO: Describe the syntax for virtual register YAML definitions.
 .. TODO: Describe the machine function's YAML flag attributes.
-.. TODO: Describe the syntax for the global value, external symbol and register
+.. TODO: Describe the syntax for the external symbol and register
    mask machine operands.
 .. TODO: Describe the frame information YAML mapping.
 .. TODO: Describe the syntax of the stack object machine operands and their