X-Git-Url: http://plrg.eecs.uci.edu/git/?a=blobdiff_plain;ds=sidebyside;f=docs%2FExceptionHandling.html;h=b378deb00711e8a6d662f33dfee9fad09689f266;hb=3e6157de576e349d33a9b08d103405b3a8fb9159;hp=57b0c4d84887555b480bed8cabfa2007842aafcb;hpb=fb0a64a172fa405f7d0bcdd11226b99b433b8522;p=oota-llvm.git diff --git a/docs/ExceptionHandling.html b/docs/ExceptionHandling.html index 57b0c4d8488..b378deb0071 100644 --- a/docs/ExceptionHandling.html +++ b/docs/ExceptionHandling.html @@ -3,11 +3,15 @@ Exception Handling in LLVM + + + -
Exception Handling in LLVM
+

Exception Handling in LLVM

@@ -16,28 +20,31 @@
  • Introduction
    1. Itanium ABI Zero-cost Exception Handling
    2. +
    3. Setjmp/Longjmp Exception Handling
    4. Overview
  • LLVM Code Generation
    1. Throw
    2. Try/Catch
    3. -
    4. Finallys
    5. +
    6. Cleanups
    7. Throw Filters
    8. +
    9. Restrictions
  • Exception Handling Intrinsics
      -
    1. llvm.eh.exception
    2. -
    3. llvm.eh.selector
    4. -
    5. llvm.eh.filter
    6. llvm.eh.typeid.for
    7. +
    8. llvm.eh.sjlj.setjmp
    9. +
    10. llvm.eh.sjlj.longjmp
    11. +
    12. llvm.eh.sjlj.lsda
    13. +
    14. llvm.eh.sjlj.callsite
    15. +
    16. llvm.eh.sjlj.dispatchsetup
  • Asm Table Formats
    1. Exception Handling Frame
    2. Exception Tables
  • -
  • ToDo
  • @@ -48,397 +55,512 @@ -
    Introduction
    +

    Introduction

    -
    +

    This document is the central repository for all information pertaining to -exception handling in LLVM. It describes the format that LLVM exception -handling information takes, which is useful for those interested in creating -front-ends or dealing directly with the information. Further, this document -provides specific examples of what exception handling information is used for -C/C++.

    - -
    + exception handling in LLVM. It describes the format that LLVM exception + handling information takes, which is useful for those interested in creating + front-ends or dealing directly with the information. Further, this document + provides specific examples of what exception handling information is used for + in C and C++.

    -
    +

    Itanium ABI Zero-cost Exception Handling -

    + -
    +

    Exception handling for most programming languages is designed to recover from -conditions that rarely occur during general use of an application. To that end, -exception handling should not interfere with the main flow of an -application's algorithm by performing checkpointing tasks such as saving -the current pc or register state.

    + conditions that rarely occur during general use of an application. To that + end, exception handling should not interfere with the main flow of an + application's algorithm by performing checkpointing tasks, such as saving the + current pc or register state.

    The Itanium ABI Exception Handling Specification defines a methodology for -providing outlying data in the form of exception tables without inlining -speculative exception handling code in the flow of an application's main -algorithm. Thus, the specification is said to add "zero-cost" to the normal -execution of an application.

    + providing outlying data in the form of exception tables without inlining + speculative exception handling code in the flow of an application's main + algorithm. Thus, the specification is said to add "zero-cost" to the normal + execution of an application.

    A more complete description of the Itanium ABI exception handling runtime -support of can be found at Itanium C++ ABI: -Exception Handling. A description of the exception frame format can be -found at Exception Frames, with details of the Dwarf -specification at Dwarf 3 -Standard. A description for the C++ exception table formats can be found at -Exception Handling -Tables.

    + support can be found at + Itanium C++ ABI: + Exception Handling. A description of the exception frame format can be + found at + Exception + Frames, with details of the DWARF 4 specification at + DWARF 4 Standard. + A description of the C++ exception table formats can be found at + Exception Handling + Tables.

    - +

    + Setjmp/Longjmp Exception Handling +

    -
    +
    -

    When an exception is thrown in llvm code, the runtime does a best effort to -find a handler suited to process the circumstance.

    +

    Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics + llvm.eh.sjlj.setjmp and + llvm.eh.sjlj.longjmp to + handle control flow for exception handling.

    -

    The runtime first attempts to find an exception frame corresponding to -the function where the exception was thrown. If the programming language (ex. -C++) supports exception handling, the exception frame contains a reference to an -exception table describing how to process the exception. If the language (ex. -C) does not support exception handling or if the exception needs to be forwarded -to a prior activation, the exception frame contains information about how to -unwind the current activation and restore the state of the prior activation. -This process is repeated until the exception is handled. If the exception is -not handled and no activations remain, then the application is terminated with -an appropriate error message.

    - -

    Since different programming languages have different behaviors when handling -exceptions, the exception handling ABI provides a mechanism for supplying -personalities. An exception handling personality is defined by way of a -personality function (ex. for C++ __gxx_personality_v0) which -receives the context of the exception, an exception structure containing -the exception object type and value, and a reference to the exception table for -the current function. The personality function for the current compile unit is -specified in a common exception frame.

    - -

    The organization of an exception table is language dependent. For C++, an -exception table is organized as a series of code ranges defining what to do if -an exception occurs in that range. Typically, the information associated with a -range defines which types of exception objects (using C++ type info) that -are handled in that range, and an associated action that should take place. -Actions typically pass control to a landing pad.

    - -

    A landing pad corresponds to the code found in the catch portion of a -try/catch sequence. When execution resumes at a landing pad, it receives the -exception structure and a selector corresponding to the type of exception -thrown. The selector is then used to determine which catch should actually -process the exception.

    +

    For each function which does exception processing — be + it try/catch blocks or cleanups — that function + registers itself on a global frame list. When exceptions are unwinding, the + runtime uses this list to identify which functions need processing.

    + +

    Landing pad selection is encoded in the call site entry of the function + context. The runtime returns to the function via + llvm.eh.sjlj.longjmp, where + a switch table transfers control to the appropriate landing pad based on + the index stored in the function context.

    + +

    In contrast to DWARF exception handling, which encodes exception regions + and frame information in out-of-line tables, SJLJ exception handling + builds and removes the unwind frame context at runtime. This results in + faster exception handling at the expense of slower execution when no + exceptions are thrown. As exceptions are, by their nature, intended for + uncommon code paths, DWARF exception handling is generally preferred to + SJLJ.
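    The frame-list idea can be illustrated with a small model in portable C++. This is only a sketch under stated assumptions: it uses the standard setjmp/longjmp from <csetjmp> rather than the LLVM intrinsics, and FunctionContext, frame_list_head, raise_to_nearest_frame and run_guarded are hypothetical names invented for the illustration, not part of any runtime.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical per-function context, loosely modelled on the description above.
struct FunctionContext {
    std::jmp_buf resume_point;   // where the "runtime" jumps back to
    volatile int call_site;      // index of the call currently in flight
    FunctionContext* previous;   // next entry on the global frame list
};

// Head of the global frame list (hypothetical name).
static FunctionContext* frame_list_head = nullptr;

// Stand-in for the runtime: return control to the most recent registered frame.
[[noreturn]] void raise_to_nearest_frame() {
    assert(frame_list_head != nullptr && "no frame registered");
    std::longjmp(frame_list_head->resume_point, 1);
}

// Returns 1 on the normal path and 2 when the landing pad for call site 1 runs.
int run_guarded(bool should_throw) {
    FunctionContext context;
    context.call_site = 0;
    context.previous = frame_list_head;
    frame_list_head = &context;           // register on entry

    volatile int outcome = 0;
    if (setjmp(context.resume_point) == 0) {
        context.call_site = 1;            // record the call site before the call
        if (should_throw) raise_to_nearest_frame();
        outcome = 1;                      // normal continuation
    } else {
        switch (context.call_site) {      // dispatch on the stored index
        case 1:  outcome = 2;  break;     // "landing pad" for call site 1
        default: outcome = -1; break;
        }
    }

    frame_list_head = context.previous;   // unregister on exit
    return outcome;
}
```

    The registration and unregistration happen on every call, whether or not anything is raised, which is the runtime cost the paragraph above refers to. The model only uses trivially destructible locals, because jumping over objects with non-trivial destructors via longjmp is undefined behaviour in C++.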

    - +

    + Overview +

    -
    +
    -

    At the time of this writing, only C++ exception handling support is available -in LLVM. So the remainder of this document will be somewhat C++-centric.

    +

    When an exception is thrown in LLVM code, the runtime does its best to find a + handler suited to processing the circumstance.

    -

    From the C++ developers perspective, exceptions are defined in terms of the -throw and try/catch statements. In this section we will -describe the implementation of llvm exception handling in terms of C++ -examples.

    +

    The runtime first attempts to find an exception frame corresponding to + the function where the exception was thrown. If the programming language + supports exception handling (e.g. C++), the exception frame contains a + reference to an exception table describing how to process the exception. If + the language does not support exception handling (e.g. C), or if the + exception needs to be forwarded to a prior activation, the exception frame + contains information about how to unwind the current activation and restore + the state of the prior activation. This process is repeated until the + exception is handled. If the exception is not handled and no activations + remain, then the application is terminated with an appropriate error + message.

    + +

    Because different programming languages have different behaviors when + handling exceptions, the exception handling ABI provides a mechanism for + supplying personalities. An exception handling personality is defined + by way of a personality function (e.g. __gxx_personality_v0 + in C++), which receives the context of the exception, an exception + structure containing the exception object type and value, and a reference + to the exception table for the current function. The personality function + for the current compile unit is specified in a common exception + frame.

    + +

    The organization of an exception table is language dependent. For C++, an + exception table is organized as a series of code ranges defining what to do + if an exception occurs in that range. Typically, the information associated + with a range defines which types of exception objects (using C++ type + info) are handled in that range, and an associated action that + should take place. Actions typically pass control to a landing + pad.

    + +

    A landing pad corresponds roughly to the code found in the catch + portion of a try/catch sequence. When execution resumes at + a landing pad, it receives an exception structure and a + selector value corresponding to the type of exception + thrown. The selector is then used to determine which catch should + actually process the exception.

    + +
    -
    +

    + LLVM Code Generation +

    + +
    + +

    From a C++ developer's perspective, exceptions are defined in terms of the + throw and try/catch statements. In this section + we will describe the implementation of LLVM exception handling in terms of + C++ examples.

    + + +

    Throw -

    + -
    +

    Languages that support exception handling typically provide a throw -operation to initiate the exception process. Internally, a throw operation -breaks down into two steps. First, a request is made to allocate exception -space for an exception structure. This structure needs to survive beyond the -current activation. This structure will contain the type and value of the -object being thrown. Second, a call is made to the runtime to raise the -exception, passing the exception structure as an argument.

    + operation to initiate the exception process. Internally, a throw + operation breaks down into two steps.

    + +
      +
    1. A request is made to allocate exception space for an exception structure. + This structure needs to survive beyond the current activation. This + structure will contain the type and value of the object being thrown.
    2. + +
    3. A call is made to the runtime to raise the exception, passing the + exception structure as an argument.
    4. +

    In C++, the allocation of the exception structure is done by the -__cxa_allocate_exception runtime function. The exception raising is -handled by __cxa_throw. The type of the exception is represented using -a C++ RTTI type info structure.

    + __cxa_allocate_exception runtime function. The exception raising is + handled by __cxa_throw. The type of the exception is represented + using a C++ RTTI structure.
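    The two steps can be written out by hand on a platform that follows the Itanium C++ ABI. The following is a sketch, assuming a GCC or Clang toolchain whose <cxxabi.h> declares __cxa_allocate_exception and __cxa_throw; throw_int_manually and catch_manual_throw are hypothetical helper names. It is roughly what a front end emits for a throw of an int, not a recommended way to write application code.

```cpp
#include <cassert>
#include <cxxabi.h>
#include <new>
#include <typeinfo>

// Hypothetical helper: raises an int exception in two explicit steps.
[[noreturn]] void throw_int_manually(int value) {
    // Step 1: allocate exception storage that outlives this activation,
    // then construct the thrown object in it.
    void* storage = abi::__cxa_allocate_exception(sizeof(int));
    ::new (storage) int(value);

    // Step 2: ask the runtime to raise the exception. The type is described
    // by the RTTI object for int; an int needs no destructor, hence nullptr.
    abi::__cxa_throw(storage,
                     const_cast<std::type_info*>(&typeid(int)),
                     nullptr);
}

// An ordinary handler catches the manually raised exception.
int catch_manual_throw(int value) {
    try {
        throw_int_manually(value);
    } catch (int caught) {
        return caught;
    }
}
```

    An ordinary catch clause is enough to receive the exception, since the runtime cannot tell this apart from a compiler-generated throw.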

    - + + +
    + +

    A call within the scope of a try statement can potentially raise an + exception. In those circumstances, the LLVM C++ front-end replaces the call + with an invoke instruction. Unlike a call, the invoke has + two potential continuation points:

    + +
      +
    1. where to continue when the call succeeds as per normal, and
    2. -
      - -

      A call within the scope of a try statement can potentially raise an exception. -In those circumstances, the LLVM C++ front-end replaces the call with an -invoke instruction. Unlike a call, the invoke has two potential -continuation points; where to continue when the call succeeds as per normal, and -where to continue if the call raises an exception, either by a throw or the -unwinding of a throw.

      - -

      The term used to define a the place where an invoke continues after an -exception is called a landing pad. LLVM landing pads are conceptually -alternative function entry points where a exception structure reference and a type -info index are passed in as arguments. The landing pad saves the exception -structure reference and then proceeds to select the catch block that corresponds -to the type info of the exception object.

      - -

      Two llvm intrinsic functions are used convey information about the landing -pad to the back end.

      - -

      llvm.eh.exception takes no -arguments and returns the exception structure reference. The backend replaces -this intrinsic with the code that accesses the first argument of a call. The -LLVM C++ front end generates code to save this value in an alloca location for -further use in the landing pad and catch code.

      - -

      llvm.eh.selector takes a minimum of -three arguments. The first argument is the reference to the exception -structure. The second argument is a reference to the personality function to be -used for this try catch sequence. The remaining arguments are references to the -type infos for each of the catch statements in the order they should be tested. -The catch all (...) is represented with a null i8*. The result -of the llvm.eh.selector is the index of -the type info in the corresponding exception table. The LLVM C++ front end -generates code to save this value in an alloca location for further use in the -landing pad and catch code.

      +
    3. where to continue if the call raises an exception, either by a throw or + the unwinding of a throw
    4. +
    + +

    The place where an invoke continues after + an exception is called a landing pad. LLVM landing pads are + conceptually alternative function entry points where an exception structure + reference and a type info index are passed in as arguments. The landing pad + saves the exception structure reference and then proceeds to select the catch + block that corresponds to the type info of the exception object.

    + +

    The LLVM landingpad + instruction is used to convey information about the landing pad to the + back end. For C++, the landingpad instruction returns a pointer and + integer pair corresponding to the pointer to the exception structure + and the selector value respectively.

    + +

    The landingpad instruction takes a reference to the personality + function to be used for this try/catch sequence. The + remainder of the instruction is a list of cleanup, catch, + and filter clauses. The exception is tested against the clauses + sequentially from first to last. The selector value is a positive number if + the exception matched a type info, a negative number if it matched a filter, + and zero if it matched a cleanup. If nothing is matched, the behavior of + the program is undefined. If a type info matched, + then the selector value is the index of the type info in the exception table, + which can be obtained using the + llvm.eh.typeid.for intrinsic.

    Once the landing pad has the type info selector, the code branches to the -code for the first catch. The catch then checks the value of the type info -selector against the index of type info for that catch. Since the type info -index is not known until all the type info have been gathered in the backend, -the catch code will call the llvm.eh.typeid.for intrinsic to -determine the index for a given type info. If the catch fails to match the -selector then control is passed on to the next catch. Note: Since the landing -pad will not be used if there is no match in the list of type info on the call -to llvm.eh.selector, then neither the -last catch nor catch all need to perform the the check against the -selector.

    + code for the first catch. The catch then checks the value of the type info + selector against the index of type info for that catch. Since the type info + index is not known until all the type infos have been gathered in the + backend, the catch code must call the + llvm.eh.typeid.for intrinsic to + determine the index for a given type info. If the catch fails to match the + selector then control is passed on to the next catch.
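    As a source-level illustration, consider a try block with two handlers. This is a sketch with a hypothetical function name, classify; the comments describe the lowering explained above rather than anything visible in the C++ itself.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical example: reports which handler received the exception.
std::string classify(int which) {
    try {
        // Calls in this scope that may throw become invoke instructions
        // sharing a landing pad.
        if (which == 0) throw 7;
        if (which == 1) throw std::runtime_error("boom");
        return "no exception";
    } catch (int) {
        // Tested first: selector compared with the type id for int.
        return "int";
    } catch (const std::exception&) {
        // Tested second, only if the first comparison did not match.
        return "std::exception";
    }
}
```

    The handlers are tested in source order, which mirrors the chain of selector comparisons generated in the landing pad.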

    Finally, the entry and exit of catch code is bracketed with calls to -__cxa_begin_catch and __cxa_end_catch. -__cxa_begin_catch takes a exception structure reference as an argument -and returns the value of the exception object. __cxa_end_catch -takes a exception structure reference as an argument. This function clears the -exception from the exception space. Note: a rethrow from within the catch may -replace this call with a __cxa_rethrow.

    + __cxa_begin_catch and __cxa_end_catch.

    + +
      +
    • __cxa_begin_catch takes an exception structure reference as an + argument and returns the value of the exception object.
    • + +
    • __cxa_end_catch takes no arguments. This function:

      +
        +
      1. Locates the most recently caught exception and decrements its handler + count,
      2. +
      3. Removes the exception from the caught stack if the handler + count goes to zero, and
      4. +
      5. Destroys the exception if the handler count goes to zero and the + exception was not re-thrown by throw.
      6. +
      +

      Note: a rethrow from within the catch may replace this call with + a __cxa_rethrow.

    • +
    - +

    + Cleanups +

    -
    +
    -

    To handle destructors and cleanups in try code, control may not run directly -from a landing pad to the first catch. Control may actually flow from the -landing pad to clean up code and then to the first catch. Since the required -clean up for each invoke in a try may be different (ex., intervening -constructor), there may be several landing pads for a given try.

    +

    A cleanup is extra code which needs to be run as part of unwinding a scope. + C++ destructors are a typical example, but other languages and language + extensions provide a variety of different kinds of cleanups. In general, a + landing pad may need to run arbitrary amounts of cleanup code before actually + entering a catch block. To indicate the presence of cleanups, a + landingpad instruction + should have a cleanup clause. Otherwise, the unwinder will not stop at + the landing pad if there are no catches or filters that require it to.

    + +

    Note: Do not allow a new exception to propagate out of the execution + of a cleanup. This can corrupt the internal state of the unwinder. + Different languages describe different high-level semantics for these + situations: for example, C++ requires that the process be terminated, whereas + Ada cancels both exceptions and throws a third.

    + +

    When all cleanups are finished, if the exception is not handled by the + current function, resume unwinding by executing the + resume instruction, passing in + the result of the landingpad instruction for the original landing + pad.
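    C++ destructors make the ordering easy to observe. In the sketch below, Tracer and run_with_cleanups are hypothetical names; the destructors play the role of cleanup code that runs while the scope is unwound, before control reaches the handler.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical type whose destructor records that a cleanup ran.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    ~Tracer() { log.push_back("cleanup " + name); }
};

// Returns the order in which cleanups and the handler executed.
std::vector<std::string> run_with_cleanups() {
    std::vector<std::string> log;
    try {
        Tracer outer{log, "outer"};
        Tracer inner{log, "inner"};
        throw std::runtime_error("unwind");
    } catch (const std::runtime_error&) {
        // Both destructors have already run by the time the handler starts.
        log.push_back("catch");
    }
    return log;
}
```

    The objects are destroyed in reverse order of construction, so the log reads "cleanup inner", "cleanup outer", "catch".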

    - + -
    +
    -

    C++ allows the specification of which exception types that can be thrown from -a function. To represent this a top level landing pad may exist to filter out -invalid types. To express this in LLVM code the landing pad will call llvm.eh.filter instead of llvm.eh.selector. The arguments are the -same, but what gets created in the exception table is different. llvm.eh.filter will return a negative value -if it doesn't find a match. If no match is found then a call to -__cxa_call_unexpected should be made, otherwise -_Unwind_Resume. Each of these functions require a reference to the -exception structure.

    +

    C++ allows the specification of which exception types may be thrown from a + function. To represent this, a top level landing pad may exist to filter out + invalid types. To express this in LLVM code the + landingpad instruction will + have a filter clause. The clause consists of an array of type infos. + landingpad will return a negative value if the exception does not + match any of the type infos. If no match is found then a call + to __cxa_call_unexpected should be made, otherwise + _Unwind_Resume. Each of these functions requires a reference to the + exception structure. Note that the most general form of a + landingpad instruction can + have any number of catch, cleanup, and filter clauses (though having more + than one cleanup is pointless). The LLVM C++ front-end can generate such + landingpad instructions due + to inlining creating nested exception handling scopes.

    - +

    + Restrictions +

    + +
    -
    +

    The unwinder delegates the decision of whether to stop in a call frame to + that call frame's language-specific personality function. Not all unwinders + guarantee that they will stop to perform cleanups. For example, the GNU C++ + unwinder doesn't do so unless the exception is actually caught somewhere + further up the stack.

    -

    LLVM uses several intrinsic functions (name prefixed with "llvm.eh") to -provide exception handling information at various points in generated code.

    +

    In order for inlining to behave correctly, landing pads must be prepared to + handle selector results that they did not originally advertise. Suppose that + a function catches exceptions of type A, and it's inlined into a + function that catches exceptions of type B. The inliner will update + the landingpad instruction for the inlined landing pad to include + the fact that B is also caught. If that landing pad assumes that it + will only be entered to catch an A, it's in for a rude awakening. + Consequently, landing pads must test for the selector results they understand + and then resume exception propagation with the + resume instruction if none of + the conditions match.

    - - -
    + +

    + Exception Handling Intrinsics +

    + +
    + +

    In addition to the + landingpad and + resume instructions, LLVM uses + several intrinsic functions (name prefixed with llvm.eh) to + provide exception handling information at various points in generated + code.

    + + +

    + llvm.eh.typeid.for +

    + +
    +
    -  i8* %llvm.eh.exception( )
    +  i32 @llvm.eh.typeid.for(i8* %type_info)
     
    -

    This intrinsic indicates that the exception structure is available at this -point in the code. The backend will replace this intrinsic with code to fetch -the first argument of a call. The effect is that the intrinsic result is the -exception structure reference.

    +

    This intrinsic returns the type info index in the exception table of the + current function. This value can be used to compare against the result + of the landingpad instruction. + The single argument is a reference to a type info.

    - +

    + llvm.eh.sjlj.setjmp +

    + +
    -
    -  i32 %llvm.eh.selector(i8*, i8*, i8*, ...)
    +  i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf)
     
    -

    This intrinsic indicates that the exception selector is available at this -point in the code. The backend will replace this intrinsic with code to fetch -the second argument of a call. The effect is that the intrinsic result is the -exception selector.

    +

    For SJLJ based exception handling, this intrinsic forces register saving for + the current function and stores the address of the following instruction for + use as a destination address + by llvm.eh.sjlj.longjmp. The + buffer format and the overall functioning of this intrinsic are compatible + with the GCC __builtin_setjmp implementation, allowing code built + with clang and GCC to interoperate.

    -

    llvm.eh.selector takes a minimum of -three arguments. The first argument is the reference to the exception -structure. The second argument is a reference to the personality function to be -used for this try catch sequence. The remaining arguments are references to the -type infos for each of the catch statements in the order they should be tested. -The catch all (...) is represented with a null i8*.

    +

    The single parameter is a pointer to a five-word buffer in which the calling + context is saved. The front end places the frame pointer in the first word, + and the target implementation of this intrinsic should place the destination + address for a + llvm.eh.sjlj.longjmp in the + second word. The following three words are available for use in a + target-specific manner.
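    The buffer layout described above can be pictured as a plain struct. This is only an illustrative sketch: SjLjJumpBuffer and its field names are hypothetical, and it assumes that a "word" is the size of a pointer on the target, which the text does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view of the five-word buffer (assumes word == pointer size).
struct SjLjJumpBuffer {
    void* frame_pointer;       // word 0: written by the front end
    void* destination;         // word 1: written by the intrinsic's lowering
    void* target_specific[3];  // words 2-4: free for target-specific use
};

// The struct should occupy exactly five pointer-sized words.
static_assert(sizeof(SjLjJumpBuffer) == 5 * sizeof(void*),
              "buffer is expected to be five words");

// Number of words in the buffer, derived from the layout above.
constexpr std::size_t kBufferWords = sizeof(SjLjJumpBuffer) / sizeof(void*);
```

    In LLVM IR the buffer is passed as a plain i8*, so a struct like this only documents the convention and is not a type the intrinsic knows about.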

    - +

    + llvm.eh.sjlj.longjmp +

    + +
    -
    -  i32 %llvm.eh.filter(i8*, i8*, i8*, ...)
    +  void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf)
     
    -

    This intrinsic indicates that the exception selector is available at this -point in the code. The backend will replace this intrinsic with code to fetch -the second argument of a call. The effect is that the intrinsic result is the -exception selector.

    - -

    llvm.eh.filter takes a minimum of -three arguments. The first argument is the reference to the exception -structure. The second argument is a reference to the personality function to be -used for this function. The remaining arguments are references to the type infos -for each type that can be thrown by the current function.

    +

    For SJLJ based exception handling, the llvm.eh.sjlj.longjmp + intrinsic is used to implement __builtin_longjmp(). The single + parameter is a pointer to a buffer populated + by llvm.eh.sjlj.setjmp. The frame + pointer and stack pointer are restored from the buffer, then control is + transferred to the destination address.

    - - +

    + llvm.eh.sjlj.lsda +

    + +
    -
    -  i32 %llvm.eh.typeid.for(i8*)
    +  i8* @llvm.eh.sjlj.lsda()
     
    -

    This intrinsic returns the type info index in the exception table of the -current function. This value can be used to compare against the result of llvm.eh.selector. The single argument is -a reference to a type info.

    +

    For SJLJ based exception handling, the llvm.eh.sjlj.lsda intrinsic + returns the address of the Language Specific Data Area (LSDA) for the current + function. The SJLJ front-end code stores this address in the exception + handling function context for use by the runtime.

    - +

    + llvm.eh.sjlj.callsite +

    -
    +
    -

    There are two tables that are used by the exception handling runtime to -determine which actions should take place when an exception is thrown.

    +
    +  void @llvm.eh.sjlj.callsite(i32 %call_site_num)
    +
    + +

    For SJLJ based exception handling, the llvm.eh.sjlj.callsite + intrinsic identifies the callsite value associated with the + following invoke instruction. This is used to ensure that landing + pad entries in the LSDA are generated in matching order.

    - +

    + llvm.eh.sjlj.dispatchsetup +

    -
    +
    -

    An exception handling frame eh_frame is very similar to the unwind -frame used by dwarf debug info. The frame contains all the information -necessary to tear down the current frame and restore the state of the prior -frame. There is an exception handling frame for each function in a compile -unit, plus a common exception handling frame that defines information common to -all functions in the unit.

    +
    +  void @llvm.eh.sjlj.dispatchsetup(i32 %dispatch_value)
    +
    -

    Todo - Table details here.

    +

    For SJLJ based exception handling, the llvm.eh.sjlj.dispatchsetup + intrinsic is used by targets to do any unwind edge setup they need. By + default, no action is taken.

    - - -
    - -

    An exception table contains information about what actions to take when an -exception is thrown in a particular part of a function's code. There is -one exception table per function except leaf routines and functions that have -only calls to non-throwing functions will not need an exception table.

    + +

    + Asm Table Formats +

    -

    Todo - Table details here.

    +
    -
    +

    There are two tables that are used by the exception handling runtime to + determine which actions should be taken when an exception is thrown.

    -
    - ToDo -
    +

    + Exception Handling Frame +

    -
    +
    -
      +

      An exception handling frame eh_frame is very similar to the unwind + frame used by DWARF debug info. The frame contains all the information + necessary to tear down the current frame and restore the state of the prior + frame. There is an exception handling frame for each function in a compile + unit, plus a common exception handling frame that defines information common + to all functions in the unit.

      -
    1. Need to create landing pads for code in between explicit landing pads. -The landing pads will have a zero action and a NULL landing pad address and are -used to inform the runtime that the exception should be rethrown.

    2. + -
    3. Actions for a given function should be folded to save space.

    4. +
    -
  • Filters for inlined functions need to be handled more extensively. -Currently it's hardwired for one filter per function.

  • + +

    + Exception Tables +

    -
  • Testing/Testing/Testing.

  • +
    - +

    An exception table contains information about what actions to take when an + exception is thrown in a particular part of a function's code. There is one + exception table per function, except leaf functions and functions that have + calls only to non-throwing functions. They do not need an exception + table.

    + + + +
    @@ -447,12 +569,12 @@ Currently it's hardwired for one filter per function.


    Chris Lattner
    - LLVM Compiler Infrastructure
    + LLVM Compiler Infrastructure
    Last modified: $Date$