Source Level Debugging with LLVM

This document is the central repository for all information pertaining to -debug information in LLVM. It describes the actual format -that the LLVM debug information takes, which is useful for those interested -in creating front-ends or dealing directly with the information. Further, this -document provides specifc examples of what debug information for C/C++.


+ debug information in LLVM. It describes the actual format + that the LLVM debug information takes, which is useful for those + interested in creating front-ends or dealing directly with the information. + Further, this document provides specific examples of what debug information + for C/C++ looks like.

Philosophy behind LLVM debugging information -


The idea of the LLVM debugging information is to capture how the important -pieces of the source-language's Abstract Syntax Tree map onto LLVM code. -Several design aspects have shaped the solution that appears here. The -important ones are:

+ pieces of the source-language's Abstract Syntax Tree map onto LLVM code. + Several design aspects have shaped the solution that appears here. The + important ones are:

Debugging information should have very little impact on the rest of the -compiler. No transformations, analyses, or code generators should need to be -modified because of debugging information.
Debugging information should have very little impact on the rest of the + compiler. No transformations, analyses, or code generators should need to + be modified because of debugging information.
LLVM optimizations should interact in well-defined and -easily described ways with the debugging information.
LLVM optimizations should interact in well-defined and + easily described ways with the debugging information.
Because LLVM is designed to support arbitrary programming languages, -LLVM-to-LLVM tools should not need to know anything about the semantics of the -source-level-language.
Because LLVM is designed to support arbitrary programming languages, + LLVM-to-LLVM tools should not need to know anything about the semantics of + the source-level-language.
Source-level languages are often widely different from one another. -LLVM should not put any restrictions of the flavor of the source-language, and -the debugging information should work with any language.
With code generator support, it should be possible to use an LLVM compiler -to compile a program to native machine code and standard debugging formats. -This allows compatibility with traditional machine-code level debuggers, like -GDB or DBX.
Source-level languages are often widely different from one another. + LLVM should not put any restrictions on the flavor of the source-language, + and the debugging information should work with any language.
With code generator support, it should be possible to use an LLVM compiler + to compile a program to native machine code and standard debugging + formats. This allows compatibility with traditional machine-code level + debuggers, like GDB or DBX.

The approach used by the LLVM implementation is to use a small set of intrinsic functions to define a mapping -between LLVM program objects and the source-level objects. The description of -the source-level program is maintained in LLVM global variables in an implementation-defined format (the C/C++ front-end -currently uses working draft 7 of the Dwarf 3 standard).

The approach used by the LLVM implementation is to use a small set + of intrinsic functions to define a + mapping between LLVM program objects and the source-level objects. The + description of the source-level program is maintained in LLVM metadata + in an implementation-defined format + (the C/C++ front-end currently uses working draft 7 of + the DWARF 3 + standard).

When a program is being debugged, a debugger interacts with the user and -turns the stored debug information into source-language specific information. -As such, a debugger must be aware of the source-language, and is thus tied to -a specific language or family of languages.

+ turns the stored debug information into source-language specific information. + As such, a debugger must be aware of the source-language, and is thus tied to + a specific language or family of languages.

Debug information consumers -


The role of debug information is to provide meta information normally -stripped away during the compilation process. This meta information provides an -LLVM user a relationship between generated code and the original program source -code.

+ stripped away during the compilation process. This meta information provides + an LLVM user a relationship between generated code and the original program + source code.

Currently, debug information is consumed by the DwarfWriter to produce dwarf -information used by the gdb debugger. Other targets could use the same -information to produce stabs or other debug forms.

Currently, debug information is consumed by DwarfDebug to produce DWARF + information used by the GDB debugger. Other targets could use the same + information to produce stabs or other debug forms.

It would also be reasonable to use debug information to feed profiling tools -for analysis of generated code, or, tools for reconstructing the original source -from generated code.

+ for analysis of generated code, or, tools for reconstructing the original + source from generated code.

TODO - expound a bit more.

Debugging optimized code -


An extremely high priority of LLVM debugging information is to make it -interact well with optimizations and analysis. In particular, the LLVM debug -information provides the following guarantees:

+ interact well with optimizations and analysis. In particular, the LLVM debug + information provides the following guarantees:

LLVM debug information always provides information to accurately read the -source-level state of the program, regardless of which LLVM optimizations -have been run, and without any modification to the optimizations themselves. -However, some optimizations may impact the ability to modify the current state -of the program with a debugger, such as setting program variables, or calling -functions that have been deleted.
LLVM optimizations gracefully interact with debugging information. If they -are not aware of debug information, they are automatically disabled as necessary -in the cases that would invalidate the debug info. This retains the LLVM -features, making it easy to write new transformations.
As desired, LLVM optimizations can be upgraded to be aware of the LLVM -debugging information, allowing them to update the debugging information as they -perform aggressive optimizations. This means that, with effort, the LLVM -optimizers could optimize debug code just as well as non-debug code.
LLVM debug information does not prevent many important optimizations from -happening (for example inlining, basic block reordering/merging/cleanup, tail -duplication, etc), further reducing the amount of the compiler that eventually -is "aware" of debugging information.
LLVM debug information is automatically optimized along with the rest of the -program, using existing facilities. For example, duplicate information is -automatically merged by the linker, and unused information is automatically -removed.
LLVM debug information always provides information to accurately read + the source-level state of the program, regardless of which LLVM + optimizations have been run, and without any modification to the + optimizations themselves. However, some optimizations may impact the + ability to modify the current state of the program with a debugger, such + as setting program variables, or calling functions that have been + deleted.
As desired, LLVM optimizations can be upgraded to be aware of the LLVM + debugging information, allowing them to update the debugging information + as they perform aggressive optimizations. This means that, with effort, + the LLVM optimizers could optimize debug code just as well as non-debug + code.
LLVM debug information does not prevent optimizations from + happening (for example inlining, basic block reordering/merging/cleanup, + tail duplication, etc).
LLVM debug information is automatically optimized along with the rest of + the program, using existing facilities. For example, duplicate + information is automatically merged by the linker, and unused information + is automatically removed.

Basically, the debug information allows you to compile a program with -"-O0 -g" and get full debug information, allowing you to arbitrarily -modify the program as it executes from a debugger. Compiling a program with -"-O3 -g" gives you full debug information that is always available and -accurate for reading (e.g., you get accurate stack traces despite tail call -elimination and inlining), but you might lose the ability to modify the program -and call functions where were optimized out of the program, or inlined away -completely.


-LLVM test suite provides a framework to test optimizer's handling of -debugging information. It can be run like this:

+ "-O0 -g" and get full debug information, allowing you to arbitrarily + modify the program as it executes from a debugger. Compiling a program with + "-O3 -g" gives you full debug information that is always available + and accurate for reading (e.g., you get accurate stack traces despite tail + call elimination and inlining), but you might lose the ability to modify the + program and call functions which were optimized out of the program, or + inlined away completely.


The LLVM test suite provides a + framework to test the optimizer's handling of debugging information. It can be + run like this:

@@ -219,843 +210,843 @@ debugging information. It can be run like this:

-This will test impact of debugging information on optimization passes. If -debugging information influences optimization passes then it will be reported -as a failure. See TestingGuide -for more information on LLVM test infratsture and how to run various tests. -

This will test the impact of debugging information on optimization passes. If + debugging information influences optimization passes then it will be reported + as a failure. See TestingGuide for more + information on the LLVM test infrastructure and how to run various tests.


Debugging information format -


LLVM debugging information has been carefully designed to make it possible -for the optimizer to optimize the program and debugging information without -necessarily having to know anything about debugging information. In particular, -the global constant merging pass automatically eliminates duplicated debugging -information (often caused by header files), the global dead code elimination -pass automatically deletes debugging information for a function if it decides to -delete the function, and the linker eliminates debug information when it merges -linkonce functions.

+ for the optimizer to optimize the program and debugging information without + necessarily having to know anything about debugging information. In + particular, the use of metadata avoids duplicated debugging information from + the beginning, and the global dead code elimination pass automatically + deletes debugging information for a function if it decides to delete the + function.

To do this, most of the debugging information (descriptors for types, -variables, functions, source files, etc) is inserted by the language front-end -in the form of LLVM global variables. These LLVM global variables are no -different from any other global variables, except that they have a web of LLVM -intrinsic functions that point to them. If the last references to a particular -piece of debugging information are deleted (for example, by the --globaldce pass), the extraneous debug information will automatically -become dead and be removed by the optimizer.

+ variables, functions, source files, etc) is inserted by the language + front-end in the form of LLVM metadata.

Debug information is designed to be agnostic about the target debugger and -debugging information representation (e.g. DWARF/Stabs/etc). It uses a generic -machine debug information pass to decode the information that represents -variables, types, functions, namespaces, etc: this allows for arbitrary -source-language semantics and type-systems to be used, as long as there is a -module written for the target debugger to interpret the information. In -addition, debug global variables are declared in the "llvm.metadata" -section. All values declared in this section are stripped away after target -debug information is constructed and before the program object is emitted.

+ debugging information representation (e.g. DWARF/Stabs/etc). It uses a + generic pass to decode the information that represents variables, types, + functions, namespaces, etc: this allows for arbitrary source-language + semantics and type-systems to be used, as long as there is a module + written for the target debugger to interpret the information.

To provide basic functionality, the LLVM debugger does have to make some -assumptions about the source-level language being debugged, though it keeps -these to a minimum. The only common features that the LLVM debugger assumes -exist are source files, and program objects. These abstract objects are -used by a debugger to form stack traces, show information about local -variables, etc.

+ assumptions about the source-level language being debugged, though it keeps + these to a minimum. The only common features that the LLVM debugger assumes + exist are source files, + and program objects. These abstract + objects are used by a debugger to form stack traces, show information about + local variables, etc.

This section of the documentation first describes the representation aspects -common to any source-language. The next section -describes the data layout conventions used by the C and C++ front-ends.


+ common to any source-language. The next section + describes the data layout conventions used by the C and C++ front-ends.

Debug information descriptors -


In consideration of the complexity and volume of debug information, LLVM -provides a specification for well formed debug global variables. The constant -value of each of these globals is one of a limited set of structures, known as -debug descriptors.

+ provides a specification for well formed debug descriptors.

Consumers of LLVM debug information expect the descriptors for program -objects to start in a canonical format, but the descriptors can include -additional information appended at the end that is source-language specific. All -LLVM debugging information is versioned, allowing backwards compatibility in the -case that the core structures need to change in some way. Also, all debugging -information objects start with a tag to indicate what type of object it is. The -source-language is allowed to define its own objects, by using unreserved tag -numbers. We recommend using with tags in the range 0x1000 thru 0x2000 (there is -a defined enum DW_TAG_user_base = 0x1000.)


The fields of debug descriptors used internally by LLVM (MachineModuleInfo) -are restricted to only the simple data types int, uint, -bool, float, double, sbyte* and { }* -. References to arbitrary values are handled using a { }* and a -cast to { }* expression; typically references to other field -descriptors, arrays of descriptors or global variables.

+ objects to start in a canonical format, but the descriptors can include + additional information appended at the end that is source-language + specific. All LLVM debugging information is versioned, allowing backwards + compatibility in the case that the core structures need to change in some + way. Also, all debugging information objects start with a tag to indicate + what type of object it is. The source-language is allowed to define its own + objects, by using unreserved tag numbers. We recommend using tags in + the range 0x1000 through 0x2000 (there is a defined enum DW_TAG_user_base = + 0x1000).


The fields of debug descriptors used internally by LLVM + are restricted to only the simple data types i32, i1, + float, double, mdstring and mdnode.

-  %llvm.dbg.object.type = type {
-    uint,   ;; A tag
-    ...
-  }
+!1 = metadata !{
+  i32,   ;; A tag
+  ...
+}

The first field of a descriptor is always an -uint containing a tag value identifying the content of the descriptor. -The remaining fields are specific to the descriptor. The values of tags are -loosely bound to the tag values of Dwarf information entries. However, that -does not restrict the use of the information supplied to Dwarf targets. To -facilitate versioning of debug information, the tag is augmented with the -current debug version (LLVMDebugVersion = 4 << 16 or 0x40000 or 262144.)

+ i32 containing a tag value identifying the content of the + descriptor. The remaining fields are specific to the descriptor. The values + of tags are loosely bound to the tag values of DWARF information entries. + However, that does not restrict the use of the information supplied to DWARF + targets. To facilitate versioning of debug information, the tag is augmented + with the current debug version (LLVMDebugVersion = 8 << 16 or + 0x80000 or 524288.)
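The versioned tag described above can be worked through numerically. The following is a minimal Python sketch, not part of LLVM: the helper names `encode_tag` and `decode_tag` are hypothetical, and the mask constant assumes that the version occupies the upper 16 bits of the field, which is consistent with `LLVMDebugVersion = 8 << 16` but is not stated in the text.

```python
# Values taken from the text above.
LLVM_DEBUG_VERSION = 8 << 16          # 0x80000, i.e. 524288

# Assumption: the version lives in the high 16 bits and the DWARF tag in the low 16.
LLVM_DEBUG_VERSION_MASK = 0xFFFF0000

DW_TAG_compile_unit = 17
DW_TAG_subprogram = 46


def encode_tag(dwarf_tag):
    """Return the first field of a descriptor: DWARF tag plus debug version."""
    return dwarf_tag + LLVM_DEBUG_VERSION


def decode_tag(field):
    """Split a descriptor's first field into (dwarf_tag, version)."""
    version = (field & LLVM_DEBUG_VERSION_MASK) >> 16
    dwarf_tag = field & 0xFFFF
    return dwarf_tag, version


if __name__ == "__main__":
    field = encode_tag(DW_TAG_compile_unit)
    print(field, hex(field), decode_tag(field))
```

Under these assumptions, a compile unit descriptor starts with the value 17 + 524288 = 524305, and a consumer can recover both the tag and the version from that single `i32`.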

The details of the various descriptors follow.


- Anchor descriptors -

+ Compile unit descriptors +

-  %llvm.dbg.anchor.type = type {
-    uint,   ;; Tag = 0 + LLVMDebugVersion
-    uint    ;; Tag of descriptors grouped by the anchor
-  }
+!0 = metadata !{
+  i32,       ;; Tag = 17 + LLVMDebugVersion 
+             ;; (DW_TAG_compile_unit)
+  i32,       ;; Unused field. 
+  i32,       ;; DWARF language identifier (ex. DW_LANG_C89) 
+  metadata,  ;; Source file name
+  metadata,  ;; Source file directory (includes trailing slash)
+  metadata,  ;; Producer (ex. "4.0.1 LLVM (LLVM research group)")
+  i1,        ;; True if this is a main compile unit. 
+  i1,        ;; True if this is optimized.
+  metadata,  ;; Flags
+  i32,       ;; Runtime version
+  metadata,  ;; List of enum types
+  metadata,  ;; List of retained types
+  metadata,  ;; List of subprograms
+  metadata   ;; List of global variables
+}

One important aspect of the LLVM debug representation is that it allows the -LLVM debugger to efficiently index all of the global objects without having the -scan the program. To do this, all of the global objects use "anchor" -descriptors with designated names. All of the global objects of a particular -type (e.g., compile units) contain a pointer to the anchor. This pointer allows -a debugger to use def-use chains to find all global objects of that type.


The following names are recognized as anchors by LLVM:

These descriptors contain a source language ID for the file (we use the DWARF + 3.0 ID numbers, such as DW_LANG_C89, DW_LANG_C_plus_plus, + DW_LANG_Cobol74, etc) and three strings: the filename, the + working directory of the compiler, and an identifier string for the compiler + that produced it.

-  %llvm.dbg.compile_units       = linkonce constant %llvm.dbg.anchor.type  { uint 0, uint 17 } ;; DW_TAG_compile_unit
-  %llvm.dbg.global_variables    = linkonce constant %llvm.dbg.anchor.type  { uint 0, uint 52 } ;; DW_TAG_variable
-  %llvm.dbg.subprograms         = linkonce constant %llvm.dbg.anchor.type  { uint 0, uint 46 } ;; DW_TAG_subprogram
-


Using anchors in this way (where the compile unit descriptor points to the -anchors, as opposed to having a list of compile unit descriptors) allows for the -standard dead global elimination and merging passes to automatically remove -unused debugging information. If the globals were kept track of through lists, -there would always be an object pointing to the descriptors, thus would never be -deleted.

Compile unit descriptors provide the root context for objects declared in a + specific compilation unit. File descriptors are defined using this context. + These descriptors are collected by the named metadata + !llvm.dbg.cu. A compile unit descriptor keeps track of subprograms, + global variables and type information.

- Compile unit descriptors -

+ File descriptors +

-  %llvm.dbg.compile_unit.type = type {
-    uint,   ;; Tag = 17 + LLVMDebugVersion (DW_TAG_compile_unit)
-    {  }*,  ;; Compile unit anchor = cast = (%llvm.dbg.anchor.type* %llvm.dbg.compile_units to {  }*)
-    uint,   ;; Dwarf language identifier (ex. DW_LANG_C89) 
-    sbyte*, ;; Source file name
-    sbyte*, ;; Source file directory (includes trailing slash)
-    sbyte*  ;; Producer (ex. "4.0.1 LLVM (LLVM research group)")
-  }
+!0 = metadata !{
+  i32,       ;; Tag = 41 + LLVMDebugVersion 
+             ;; (DW_TAG_file_type)
+  metadata,  ;; Source file name
+  metadata,  ;; Source file directory (includes trailing slash)
+  metadata   ;; Unused
+}

These descriptors contain a source language ID for the file (we use the Dwarf -3.0 ID numbers, such as DW_LANG_C89, DW_LANG_C_plus_plus, -DW_LANG_Cobol74, etc), three strings describing the filename, working -directory of the compiler, and an identifier string for the compiler that -produced it.

These descriptors contain information for a file. Global variables and top + level functions would be defined using this context. File descriptors also + provide context for source line correspondence.

Compile unit descriptors provide the root context for objects declared in a -specific source file. Global variables and top level functions would be defined -using this context. Compile unit descriptors also provide context for source -line correspondence.

Each input file is encoded as a separate file descriptor in LLVM debugging + information output.

Global variable descriptors -


-  %llvm.dbg.global_variable.type = type {
-    uint,   ;; Tag = 52 + LLVMDebugVersion (DW_TAG_variable)
-    {  }*,  ;; Global variable anchor = cast (%llvm.dbg.anchor.type* %llvm.dbg.global_variables to {  }*),  
-    {  }*,  ;; Reference to context descriptor
-    sbyte*, ;; Name
-    sbyte*, ;; Display name (fully qualified C++ name)
-    sbyte*, ;; MIPS linkage name (for C++)
-    {  }*,  ;; Reference to compile unit where defined
-    uint,   ;; Line number where defined
-    {  }*,  ;; Reference to type descriptor
-    bool,   ;; True if the global is local to compile unit (static)
-    bool,   ;; True if the global is defined in the compile unit (not extern)
-    {  }*   ;; Reference to the global variable
-  }
+!1 = metadata !{
+  i32,      ;; Tag = 52 + LLVMDebugVersion 
+            ;; (DW_TAG_variable)
+  i32,      ;; Unused field.
+  metadata, ;; Reference to context descriptor
+  metadata, ;; Name
+  metadata, ;; Display name (fully qualified C++ name)
+  metadata, ;; MIPS linkage name (for C++)
+  metadata, ;; Reference to file where defined
+  i32,      ;; Line number where defined
+  metadata, ;; Reference to type descriptor
+  i1,       ;; True if the global is local to compile unit (static)
+  i1,       ;; True if the global is defined in the compile unit (not extern)
+  {}*       ;; Reference to the global variable
+}

These descriptors provide debug information about global variables. They -provide details such as name, type and where the variable is defined.

+provide details such as name, type and where the variable is defined. All +global variables are collected by the named metadata !llvm.dbg.gv.

Subprogram descriptors -


-  %llvm.dbg.subprogram.type = type {
-    uint,   ;; Tag = 46 + LLVMDebugVersion (DW_TAG_subprogram)
-    {  }*,  ;; Subprogram anchor = cast (%llvm.dbg.anchor.type* %llvm.dbg.subprograms to {  }*),  
-    {  }*,  ;; Reference to context descriptor
-    sbyte*, ;; Name
-    sbyte*, ;; Display name (fully qualified C++ name)
-    sbyte*, ;; MIPS linkage name (for C++)
-    {  }*,  ;; Reference to compile unit where defined
-    uint,   ;; Line number where defined
-    {  }*,  ;; Reference to type descriptor
-    bool,   ;; True if the global is local to compile unit (static)
-    bool    ;; True if the global is defined in the compile unit (not extern)
-  }
+!2 = metadata !{
+  i32,      ;; Tag = 46 + LLVMDebugVersion
+            ;; (DW_TAG_subprogram)
+  i32,      ;; Unused field.
+  metadata, ;; Reference to context descriptor
+  metadata, ;; Name
+  metadata, ;; Display name (fully qualified C++ name)
+  metadata, ;; MIPS linkage name (for C++)
+  metadata, ;; Reference to file where defined
+  i32,      ;; Line number where defined
+  metadata, ;; Reference to type descriptor
+  i1,       ;; True if the global is local to compile unit (static)
+  i1,       ;; True if the global is defined in the compile unit (not extern)
+  i32,      ;; Virtuality, e.g. dwarf::DW_VIRTUALITY__virtual
+  i32,      ;; Index into a virtual function
+  metadata, ;; indicates which base type contains the vtable pointer for the 
+            ;; derived class
+  i1,       ;; isArtificial
+  i1,       ;; isOptimized
+  Function *, ;; Pointer to LLVM function
+  metadata, ;; Lists function template parameters
+  metadata, ;; Function declaration descriptor
+  metadata  ;; List of function variables
+}

These descriptors provide debug information about functions, methods and -subprograms. They provide details such as name, return types and the source -location where the subprogram is defined.

+ subprograms. They provide details such as name, return types and the source + location where the subprogram is defined. + All subprogram descriptors are collected by a named metadata + !llvm.dbg.sp. +


Block descriptors +


+!3 = metadata !{
+  i32,     ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block)
+  metadata,;; Reference to context descriptor
+  i32,     ;; Line number
+  i32,     ;; Column number
+  metadata,;; Reference to source file
+  i32      ;; Unique ID to identify blocks from a template function
+}
+

This descriptor provides debug information about nested blocks within a + subprogram. The line number and column numbers are used to distinguish + two lexical blocks at the same depth.

-  %llvm.dbg.block = type {
-    uint,   ;; Tag = 13 + LLVMDebugVersion (DW_TAG_lexical_block)
-    {  }*   ;; Reference to context descriptor
-  }
+!3 = metadata !{
+  i32,     ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block)
+  metadata,;; Reference to the scope we're annotating with a file change
+  metadata ;; Reference to the file the scope is enclosed in.
+}

These descriptors provide debug information about nested blocks within a -subprogram. The array of member descriptors is used to define local variables -and deeper nested blocks.

This descriptor provides a wrapper around a lexical scope to handle file + changes in the middle of a lexical block.

Basic type descriptors -


-  %llvm.dbg.basictype.type = type {
-    uint,   ;; Tag = 36 + LLVMDebugVersion (DW_TAG_base_type)
-    {  }*,  ;; Reference to context (typically a compile unit)
-    sbyte*, ;; Name (may be "" for anonymous types)
-    {  }*,  ;; Reference to compile unit where defined (may be NULL)
-    uint,   ;; Line number where defined (may be 0)
-    uint,   ;; Size in bits
-    uint,   ;; Alignment in bits
-    uint,   ;; Offset in bits
-    uint    ;; Dwarf type encoding
-  }
+!4 = metadata !{
+  i32,      ;; Tag = 36 + LLVMDebugVersion 
+            ;; (DW_TAG_base_type)
+  metadata, ;; Reference to context 
+  metadata, ;; Name (may be "" for anonymous types)
+  metadata, ;; Reference to file where defined (may be NULL)
+  i32,      ;; Line number where defined (may be 0)
+  i64,      ;; Size in bits
+  i64,      ;; Alignment in bits
+  i64,      ;; Offset in bits
+  i32,      ;; Flags
+  i32       ;; DWARF type encoding
+}

These descriptors define primitive types used in the code. Example int, bool -and float. The context provides the scope of the type, which is usually the top -level. Since basic types are not usually user defined the compile unit and line -number can be left as NULL and 0. The size, alignment and offset are expressed -in bits and can be 64 bit values. The alignment is used to round the offset -when embedded in a composite type -(example to keep float doubles on 64 bit boundaries.) The offset is the bit -offset if embedded in a composite -type.

+ and float. The context provides the scope of the type, which is usually the + top level. Since basic types are not usually user defined the context + and line number can be left as NULL and 0. The size, alignment and offset + are expressed in bits and can be 64 bit values. The alignment is used to + round the offset when embedded in a + composite type (example to keep float + doubles on 64 bit boundaries.) The offset is the bit offset if embedded in + a composite type.
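The rounding role of the alignment field can be illustrated with a short Python sketch. It is not the algorithm used by any front-end: the real layout is decided by the front-end and the target ABI, and the helpers `align_to` and `member_offsets` are hypothetical names introduced here only to show how a bit offset is rounded up to a bit alignment.

```python
def align_to(offset_bits, align_bits):
    """Round offset_bits up to the next multiple of align_bits."""
    if align_bits <= 1:
        return offset_bits
    return (offset_bits + align_bits - 1) // align_bits * align_bits


def member_offsets(members):
    """Compute bit offsets for members laid out one after another.

    members is a list of (name, size_bits, align_bits) tuples.
    Returns a list of (name, offset_bits) tuples.
    """
    offsets = []
    cursor = 0
    for name, size_bits, align_bits in members:
        cursor = align_to(cursor, align_bits)  # round up before placing
        offsets.append((name, cursor))
        cursor += size_bits
    return offsets


if __name__ == "__main__":
    # Roughly: struct { char c; double d; } with a 64-bit aligned double.
    print(member_offsets([("c", 8, 8), ("d", 64, 64)]))
```

In this sketch the `double` member is placed at bit offset 64 rather than 8, which is the kind of value that would appear in the "Offset in bits" field of a member embedded in a composite type.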

The type encoding provides the details of the type. The values are typically -one of the following;

+ one of the following:

-  DW_ATE_address = 1
-  DW_ATE_boolean = 2
-  DW_ATE_float = 4
-  DW_ATE_signed = 5
-  DW_ATE_signed_char = 6
-  DW_ATE_unsigned = 7
-  DW_ATE_unsigned_char = 8
+DW_ATE_address       = 1
+DW_ATE_boolean       = 2
+DW_ATE_float         = 4
+DW_ATE_signed        = 5
+DW_ATE_signed_char   = 6
+DW_ATE_unsigned      = 7
+DW_ATE_unsigned_char = 8
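As a rough illustration of how the encoding fits into a basic type descriptor, the Python sketch below assembles the ten fields in the order listed in the descriptor layout above. The function `base_type_fields` and the class `TypeEncoding` are hypothetical names, `None` stands in for a NULL reference, and the 32-bit size and alignment for `int` are assumptions about a typical target rather than something the format prescribes.

```python
from enum import IntEnum

LLVM_DEBUG_VERSION = 8 << 16  # from the descriptor overview
DW_TAG_base_type = 36


class TypeEncoding(IntEnum):
    """DWARF base type encodings listed above."""
    DW_ATE_address = 1
    DW_ATE_boolean = 2
    DW_ATE_float = 4
    DW_ATE_signed = 5
    DW_ATE_signed_char = 6
    DW_ATE_unsigned = 7
    DW_ATE_unsigned_char = 8


def base_type_fields(name, size_bits, align_bits, encoding):
    """Return the fields of a basic type descriptor, in layout order."""
    return [
        DW_TAG_base_type + LLVM_DEBUG_VERSION,  # tag
        None,                                   # context (may be NULL)
        name,                                   # name
        None,                                   # file where defined (may be NULL)
        0,                                      # line number (may be 0)
        size_bits,                              # size in bits
        align_bits,                             # alignment in bits
        0,                                      # offset in bits
        0,                                      # flags
        int(encoding),                          # DWARF type encoding
    ]


if __name__ == "__main__":
    print(base_type_fields("int", 32, 32, TypeEncoding.DW_ATE_signed))
```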

Derived type descriptors -


-  %llvm.dbg.derivedtype.type = type {
-    uint,   ;; Tag (see below)
-    {  }*,  ;; Reference to context
-    sbyte*, ;; Name (may be "" for anonymous types)
-    {  }*,  ;; Reference to compile unit where defined (may be NULL)
-    uint,   ;; Line number where defined (may be 0)
-    uint,   ;; Size in bits
-    uint,   ;; Alignment in bits
-    uint,   ;; Offset in bits
-    {  }*   ;; Reference to type derived from
-  }
+!5 = metadata !{
+  i32,      ;; Tag (see below)
+  metadata, ;; Reference to context
+  metadata, ;; Name (may be "" for anonymous types)
+  metadata, ;; Reference to file where defined (may be NULL)
+  i32,      ;; Line number where defined (may be 0)
+  i64,      ;; Size in bits
+  i64,      ;; Alignment in bits
+  i64,      ;; Offset in bits
+  metadata, ;; Reference to type derived from
+  metadata, ;; (optional) Name of the Objective C property associated with 
+            ;; an Objective-C ivar 
+  metadata, ;; (optional) Name of the Objective C property getter selector.
+  metadata, ;; (optional) Name of the Objective C property setter selector.
+  i32       ;; (optional) Objective C property attributes.
+}

These descriptors are used to define types derived from other types. The value of the tag varies depending on the meaning. The following are possible -tag values;

+tag values:

-  DW_TAG_formal_parameter = 5
-  DW_TAG_member = 13
-  DW_TAG_pointer_type = 15
-  DW_TAG_reference_type = 16
-  DW_TAG_typedef = 22
-  DW_TAG_const_type = 38
-  DW_TAG_volatile_type = 53
-  DW_TAG_restrict_type = 55
+DW_TAG_formal_parameter = 5
+DW_TAG_member           = 13
+DW_TAG_pointer_type     = 15
+DW_TAG_reference_type   = 16
+DW_TAG_typedef          = 22
+DW_TAG_const_type       = 38
+DW_TAG_volatile_type    = 53
+DW_TAG_restrict_type    = 55

DW_TAG_member is used to define a member of a composite type or subprogram. The type of the member is the derived type. DW_TAG_formal_parameter -is used to define a member which is a formal argument of a subprogram.

DW_TAG_member is used to define a member of + a composite type + or subprogram. The type of the member is + the derived + type. DW_TAG_formal_parameter is used to define a member which + is a formal argument of a subprogram.

DW_TAG_typedef is used to -provide a name for the derived type.

DW_TAG_typedef is used to provide a name for the derived type.

DW_TAG_pointer_type, -DW_TAG_reference_type, DW_TAG_const_type, -DW_TAG_volatile_type and DW_TAG_restrict_type are used to -qualify the derived type.

DW_TAG_pointer_type, DW_TAG_reference_type, + DW_TAG_const_type, DW_TAG_volatile_type + and DW_TAG_restrict_type are used to qualify + the derived type.

Derived type location can be determined -from the compile unit and line number. The size, alignment and offset are -expressed in bits and can be 64 bit values. The alignment is used to round the -offset when embedded in a composite type -(example to keep float doubles on 64 bit boundaries.) The offset is the bit -offset if embedded in a composite -type.

- -

Note that the void * type is expressed as a -llvm.dbg.derivedtype.type with tag of DW_TAG_pointer_type and -NULL derived type.

+ from the context and line number. The size, alignment and offset are + expressed in bits and can be 64 bit values. The alignment is used to round + the offset when embedded in a composite + type (for example, to keep doubles on 64 bit boundaries). The offset is + the bit offset if embedded in a composite + type.

+ +

Note that the void * type is expressed as a type derived from NULL. +

Composite type descriptors -

+ -

-  %llvm.dbg.compositetype.type = type {
-    uint,   ;; Tag (see below)
-    {  }*,  ;; Reference to context
-    sbyte*, ;; Name (may be "" for anonymous types)
-    {  }*,  ;; Reference to compile unit where defined (may be NULL)
-    uint,   ;; Line number where defined (may be 0)
-    uint,   ;; Size in bits
-    uint,   ;; Alignment in bits
-    uint,   ;; Offset in bits
-    {  }*   ;; Reference to array of member descriptors
-  }
+!6 = metadata !{
+  i32,      ;; Tag (see below)
+  metadata, ;; Reference to context
+  metadata, ;; Name (may be "" for anonymous types)
+  metadata, ;; Reference to file where defined (may be NULL)
+  i32,      ;; Line number where defined (may be 0)
+  i64,      ;; Size in bits
+  i64,      ;; Alignment in bits
+  i64,      ;; Offset in bits
+  i32,      ;; Flags
+  metadata, ;; Reference to type derived from
+  metadata, ;; Reference to array of member descriptors
+  i32       ;; Runtime languages
+}

These descriptors are used to define types that are composed of 0 or more elements. The value of the tag varies depending on the meaning. The following -are possible tag values;

+are possible tag values:

-  DW_TAG_array_type = 1
-  DW_TAG_enumeration_type = 4
-  DW_TAG_structure_type = 19
-  DW_TAG_union_type = 23
-  DW_TAG_vector_type = 259
-  DW_TAG_subroutine_type = 46
-  DW_TAG_inheritance = 26
+DW_TAG_array_type       = 1
+DW_TAG_enumeration_type = 4
+DW_TAG_structure_type   = 19
+DW_TAG_union_type       = 23
+DW_TAG_vector_type      = 259
+DW_TAG_subroutine_type  = 21
+DW_TAG_inheritance      = 28

The vector flag indicates that an array type is a native packed vector.

The members of array types (tag = DW_TAG_array_type) or vector types -(tag = DW_TAG_vector_type) are subrange -descriptors, each representing the range of subscripts at that level of -indexing.

+ (tag = DW_TAG_vector_type) are subrange + descriptors, each representing the range of subscripts at that level of + indexing.

The members of enumeration types (tag = DW_TAG_enumeration_type) are -enumerator descriptors, each representing the -definition of enumeration value -for the set.

+ enumerator descriptors, each representing + the definition of an enumeration value for the set. All enumeration type + descriptors are collected by named metadata !llvm.dbg.enum.

The members of structure (tag = DW_TAG_structure_type) or union (tag -= DW_TAG_union_type) types are any one of the basic, derived -or composite type descriptors, each -representing a field member of the structure or union.

+ = DW_TAG_union_type) types are any one of + the basic, + derived + or composite type descriptors, each + representing a field member of the structure or union.

For C++ classes (tag = DW_TAG_structure_type), member descriptors -provide information about base classes, static members and member functions. If -a member is a derived type descriptor and has -a tag of DW_TAG_inheritance, then the type represents a base class. If -the member of is a global variable -descriptor then it represents a static member. And, if the member is a subprogram descriptor then it represents a member -function. For static members and member functions, getName() returns -the members link or the C++ mangled name. getDisplayName() the -simplied version of the name.

- -

The first member of subroutine (tag = DW_TAG_subroutine_type) -type elements is the return type for the subroutine. The remaining -elements are the formal arguments to the subroutine.

+ provide information about base classes, static members and member + functions. If a member is a derived type + descriptor and has a tag of DW_TAG_inheritance, then the type + represents a base class. If the member is + a global variable descriptor then it + represents a static member. And, if the member is + a subprogram descriptor then it represents + a member function. For static members and member + functions, getName() returns the member's linkage name or the C++ mangled + name. getDisplayName() returns the simplified version of the name.

+ +

The first member of subroutine (tag = DW_TAG_subroutine_type) type + elements is the return type for the subroutine. The remaining elements are + the formal arguments to the subroutine.

Composite type location can be -determined from the compile unit and line number. The size, alignment and -offset are expressed in bits and can be 64 bit values. The alignment is used to -round the offset when embedded in a composite -type (as an example, to keep float doubles on 64 bit boundaries.) The offset -is the bit offset if embedded in a composite -type.

+ determined from the context and line number. The size, alignment and + offset are expressed in bits and can be 64 bit values. The alignment is used + to round the offset when embedded in + a composite type (for example, to keep + doubles on 64 bit boundaries). The offset is the bit offset if embedded + in a composite type.
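The rounding rule described above can be sketched in a few lines of C++. This is only an illustration: alignOffsetInBits is a hypothetical helper, not an LLVM API, and treating an alignment of 0 as "no requirement" is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): round a bit offset up to the next
// multiple of the alignment, both expressed in bits as in the descriptors.
// An alignment of 0 is treated as "no requirement" (an assumption).
uint64_t alignOffsetInBits(uint64_t OffsetInBits, uint64_t AlignInBits) {
  if (AlignInBits == 0)
    return OffsetInBits;
  uint64_t Remainder = OffsetInBits % AlignInBits;
  return Remainder == 0 ? OffsetInBits
                        : OffsetInBits + (AlignInBits - Remainder);
}

// Example: struct { int I; double D; } where the double is 64 bit aligned.
void checkAlignmentExample() {
  uint64_t OffsetOfI = alignOffsetInBits(0, 32);       // I starts at bit 0
  uint64_t EndOfI = OffsetOfI + 32;                    // I occupies 32 bits
  uint64_t OffsetOfD = alignOffsetInBits(EndOfI, 64);  // D is pushed to bit 64
  assert(OffsetOfI == 0);
  assert(OffsetOfD == 64);
}
```

In this sketch the member descriptor for D would carry an offset of 64 bits, leaving 32 bits of padding after I.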

Subrange descriptors -

+ -

-  %llvm.dbg.subrange.type = type {
-    uint,   ;; Tag = 33 + LLVMDebugVersion (DW_TAG_subrange_type)
-    uint,   ;; Low value
-    uint    ;; High value
-  }
+!42 = metadata !{
+  i32,    ;; Tag = 33 + LLVMDebugVersion (DW_TAG_subrange_type)
+  i64,    ;; Low value
+  i64     ;; High value
+}

These descriptors are used to define ranges of array subscripts for an array -composite type. The low value defines the -lower bounds typically zero for C/C++. The high value is the upper bounds. -Values are 64 bit. High - low + 1 is the size of the array. If -low == high the array will be unbounded.

+ composite type. The low value defines + the lower bound, typically zero for C/C++. The high value is the upper + bound. Values are 64 bit. High - low + 1 is the size of the array. If low > high, the array bounds are not included in the generated debugging information. +
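As a rough illustration of the rule above, the following C++ sketch computes the element count of one array dimension from its low and high values. The Subrange struct and getElementCount are hypothetical names introduced for this example and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the two value fields of a subrange descriptor.
struct Subrange {
  int64_t Low;
  int64_t High;
};

// Returns true and stores the element count when bounds are present
// (Low <= High). Returns false when Low > High, the case in which no
// bounds are included in the debugging information.
bool getElementCount(const Subrange &R, uint64_t &Count) {
  if (R.Low > R.High)
    return false;
  // Subtract in unsigned arithmetic to avoid signed overflow.
  Count = static_cast<uint64_t>(R.High) - static_cast<uint64_t>(R.Low) + 1;
  return true;
}

void checkSubrangeExamples() {
  uint64_t Count = 0;
  Subrange TenInts = {0, 9};  // e.g. int A[10] in C/C++
  assert(getElementCount(TenInts, Count) && Count == 10);
  Subrange NoBounds = {1, 0};  // Low > High: bounds omitted
  assert(!getElementCount(NoBounds, Count));
}
```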

Enumerator descriptors -

+ -

-  %llvm.dbg.enumerator.type = type {
-    uint,   ;; Tag = 40 + LLVMDebugVersion (DW_TAG_enumerator)
-    sbyte*, ;; Name
-    uint    ;; Value
-  }
+!6 = metadata !{
+  i32,      ;; Tag = 40 + LLVMDebugVersion 
+            ;; (DW_TAG_enumerator)
+  metadata, ;; Name
+  i64       ;; Value
+}

These descriptors are used to define members of an enumeration composite type, it associates the name to the -value.

These descriptors are used to define members of an + enumeration composite type; each + associates a name with a value.

Local variables -

+ + +

-  %llvm.dbg.variable.type = type {
-    uint,    ;; Tag (see below)
-    {  }*,   ;; Context
-    sbyte*,  ;; Name
-    {  }*,   ;; Reference to compile unit where defined
-    uint,    ;; Line number where defined
-    {  }*    ;; Type descriptor
-  }
+!7 = metadata !{
+  i32,      ;; Tag (see below)
+  metadata, ;; Context
+  metadata, ;; Name
+  metadata, ;; Reference to file where defined
+  i32,      ;; 24 bit - Line number where defined
+            ;; 8 bit - Argument number. 1 indicates 1st argument.
+  metadata, ;; Type descriptor
+  i32,      ;; flags
+  metadata  ;; (optional) Reference to inline location
+}

These descriptors are used to define variables local to a sub program. The -value of the tag depends on the usage of the variable;

+ value of the tag depends on the usage of the variable:

-  DW_TAG_auto_variable = 256
-  DW_TAG_arg_variable = 257
-  DW_TAG_return_variable = 258
+DW_TAG_auto_variable   = 256
+DW_TAG_arg_variable    = 257
+DW_TAG_return_variable = 258

An auto variable is any variable declared in the body of the function. An -argument variable is any variable that appears as a formal argument to the -function. A return variable is used to track the result of a function and has -no source correspondent.

+ argument variable is any variable that appears as a formal argument to the + function. A return variable is used to track the result of a function and + has no source correspondent.

The context is either the subprogram or block where the variable is defined. -Name the source variable name. Compile unit and line indicate where the -variable was defined. Type descriptor defines the declared type of the -variable.

+ Name is the source variable name. Context and line indicate where the + variable was defined. Type descriptor defines the declared type of the + variable.
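The descriptor layout above packs the line number and the argument number into one i32 field. The sketch below shows one way such a field could be packed and unpacked. It assumes the line number occupies the low 24 bits and the argument number the high 8 bits; that bit order is an assumption, not something the descriptor listing states, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (not confirmed by the text above): bits 0..23 hold the
// line number, bits 24..31 hold the argument number.
uint32_t packLineAndArg(uint32_t Line, uint32_t ArgNo) {
  assert(Line < (1u << 24) && "line number must fit in 24 bits");
  assert(ArgNo < (1u << 8) && "argument number must fit in 8 bits");
  return (ArgNo << 24) | Line;
}

uint32_t getLineNumber(uint32_t Packed) { return Packed & 0x00ffffffu; }

uint32_t getArgNumber(uint32_t Packed) { return Packed >> 24; }

void checkPackingExample() {
  // First formal argument, declared on line 12.
  uint32_t Field = packLineAndArg(12, 1);
  assert(getLineNumber(Field) == 12);
  assert(getArgNumber(Field) == 1);
}
```

Under this assumption, a variable with argument number 0 has a field value equal to its plain line number.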

+ +

Debugger intrinsic functions -

+ -

LLVM uses several intrinsic functions (name prefixed with "llvm.dbg") to -provide debug information at various points in generated code.

- -

+ provide debug information at various points in generated code.

- llvm.dbg.stoppoint -

+ llvm.dbg.declare +

-  void %llvm.dbg.stoppoint( uint, uint, { }* )
+  void %llvm.dbg.declare(metadata, metadata)

This intrinsic is used to provide correspondence between the source file and -the generated code. The first argument is the line number (base 1), second -argument is the column number (0 if unknown) and the third argument the source -%llvm.dbg.compile_unit* cast to a -{ }*. Code following a call to this intrinsic will have been defined -in close proximity of the line, column and file. This information holds until -the next call to %lvm.dbg.stoppoint.

- +

This intrinsic provides information about a local element (e.g., a variable). The + first argument is metadata holding the alloca for the variable. The + second argument is metadata containing a description of the variable.

- llvm.dbg.func.start -

+ llvm.dbg.value +

-  void %llvm.dbg.func.start( { }* )
+  void %llvm.dbg.value(metadata, i64, metadata)

This intrinsic is used to link the debug information in %llvm.dbg.subprogram to the function. It -defines the beginning of the function's declarative region (scope). It also -implies a call to %llvm.dbg.stoppoint which defines a -source line "stop point". The intrinsic should be called early in the function -after the all the alloca instructions. It should be paired off with a closing -%llvm.dbg.region.end. The function's -single argument is the %llvm.dbg.subprogram.type.

This intrinsic provides information when a user source variable is set to a + new value. The first argument is the new value (wrapped as metadata). The + second argument is the offset in the user source variable where the new value + is written. The third argument is metadata containing a description of the + user source variable.

- llvm.dbg.region.start -

+ Object lifetimes and scoping +

+ +

In many languages, the local variables in functions can have their lifetimes + or scopes limited to a subset of a function. In the C family of languages, + for example, variables are only live (readable and writable) within the + source block that they are defined in. In functional languages, values are + only readable after they have been defined. Though this is a very obvious + concept, it is non-trivial to model in LLVM, because it has no notion of + scoping in this sense, and does not want to be tied to a language's scoping + rules.

+ +

In order to handle this, the LLVM debug format uses the metadata attached to + LLVM instructions to encode line number and scoping information. Consider + the following C fragment, for example:

-  void %llvm.dbg.region.start( { }* )
+1.  void foo() {
+2.    int X = 21;
+3.    int Y = 22;
+4.    {
+5.      int Z = 23;
+6.      Z = X;
+7.    }
+8.    X = Y;
+9.  }

- -

This intrinsic is used to define the beginning of a declarative scope (ex. -block) for local language elements. It should be paired off with a closing -%llvm.dbg.region.end. The -function's single argument is the %llvm.dbg.block which is starting.

- -

- llvm.dbg.region.end -

Compiled to LLVM, this function would be represented like this:

-  void %llvm.dbg.region.end( { }* )
-

- -

This intrinsic is used to define the end of a declarative scope (ex. block) -for local language elements. It should be paired off with an opening %llvm.dbg.region.start or %llvm.dbg.func.start. The function's -single argument is either the %llvm.dbg.block or the %llvm.dbg.subprogram.type which is -ending.

+define void @foo() nounwind ssp {
+entry:
+  %X = alloca i32, align 4                        ; <i32*> [#uses=4]
+  %Y = alloca i32, align 4                        ; <i32*> [#uses=4]
+  %Z = alloca i32, align 4                        ; <i32*> [#uses=3]
+  %0 = bitcast i32* %X to {}*                     ; <{}*> [#uses=1]
+  call void @llvm.dbg.declare(metadata !{i32 * %X}, metadata !0), !dbg !7
+  store i32 21, i32* %X, !dbg !8
+  %1 = bitcast i32* %Y to {}*                     ; <{}*> [#uses=1]
+  call void @llvm.dbg.declare(metadata !{i32 * %Y}, metadata !9), !dbg !10
+  store i32 22, i32* %Y, !dbg !11
+  %2 = bitcast i32* %Z to {}*                     ; <{}*> [#uses=1]
+  call void @llvm.dbg.declare(metadata !{i32 * %Z}, metadata !12), !dbg !14
+  store i32 23, i32* %Z, !dbg !15
+  %tmp = load i32* %X, !dbg !16                   ; <i32> [#uses=1]
+  %tmp1 = load i32* %Y, !dbg !16                  ; <i32> [#uses=1]
+  %add = add nsw i32 %tmp, %tmp1, !dbg !16        ; <i32> [#uses=1]
+  store i32 %add, i32* %Z, !dbg !16
+  %tmp2 = load i32* %Y, !dbg !17                  ; <i32> [#uses=1]
+  store i32 %tmp2, i32* %X, !dbg !17
+  ret void, !dbg !18
+}
+
+declare void @llvm.dbg.declare(metadata, metadata) nounwind readnone
+
+!0 = metadata !{i32 459008, metadata !1, metadata !"X",
+                metadata !3, i32 2, metadata !6}; [ DW_TAG_auto_variable ]
+!1 = metadata !{i32 458763, metadata !2}; [DW_TAG_lexical_block ]
+!2 = metadata !{i32 458798, i32 0, metadata !3, metadata !"foo", metadata !"foo",
+                metadata !"foo", metadata !3, i32 1, metadata !4,
+                i1 false, i1 true}; [DW_TAG_subprogram ]
+!3 = metadata !{i32 458769, i32 0, i32 12, metadata !"foo.c",
+                metadata !"/private/tmp", metadata !"clang 1.1", i1 true,
+                i1 false, metadata !"", i32 0}; [DW_TAG_compile_unit ]
+!4 = metadata !{i32 458773, metadata !3, metadata !"", null, i32 0, i64 0, i64 0,
+                i64 0, i32 0, null, metadata !5, i32 0}; [DW_TAG_subroutine_type ]
+!5 = metadata !{null}
+!6 = metadata !{i32 458788, metadata !3, metadata !"int", metadata !3, i32 0,
+                i64 32, i64 32, i64 0, i32 0, i32 5}; [DW_TAG_base_type ]
+!7 = metadata !{i32 2, i32 7, metadata !1, null}
+!8 = metadata !{i32 2, i32 3, metadata !1, null}
+!9 = metadata !{i32 459008, metadata !1, metadata !"Y", metadata !3, i32 3,
+                metadata !6}; [ DW_TAG_auto_variable ]
+!10 = metadata !{i32 3, i32 7, metadata !1, null}
+!11 = metadata !{i32 3, i32 3, metadata !1, null}
+!12 = metadata !{i32 459008, metadata !13, metadata !"Z", metadata !3, i32 5,
+                 metadata !6}; [ DW_TAG_auto_variable ]
+!13 = metadata !{i32 458763, metadata !1}; [DW_TAG_lexical_block ]
+!14 = metadata !{i32 5, i32 9, metadata !13, null}
+!15 = metadata !{i32 5, i32 5, metadata !13, null}
+!16 = metadata !{i32 6, i32 5, metadata !13, null}
+!17 = metadata !{i32 8, i32 3, metadata !1, null}
+!18 = metadata !{i32 9, i32 1, metadata !2, null}

- -

- llvm.dbg.declare -

This example illustrates a few important details about LLVM debugging + information. In particular, it shows how the llvm.dbg.declare + intrinsic and location information, which are attached to an instruction, + are applied together to allow a debugger to analyze the relationship between + statements, variable definitions, and the code used to implement the + function.

-  void %llvm.dbg.declare( { } *, { }* )
+call void @llvm.dbg.declare(metadata, metadata !0), !dbg !7

- -

This intrinsic provides information about a local element (ex. variable.) The -first argument is the alloca for the variable, cast to a { }*. The -second argument is the %llvm.dbg.variable containing the description -of the variable, also cast to a { }*.

- -

- - -

- - Representing stopping points in the source program - -

- -

LLVM debugger "stop points" are a key part of the debugging representation -that allows the LLVM to maintain simple semantics for debugging optimized code. The basic idea is that the -front-end inserts calls to the %llvm.dbg.stoppoint intrinsic -function at every point in the program where a debugger should be able to -inspect the program (these correspond to places a debugger stops when you -"step" through it). The front-end can choose to place these as -fine-grained as it would like (for example, before every subexpression -evaluated), but it is recommended to only put them after every source statement -that includes executable code.

- -

Using calls to this intrinsic function to demark legal points for the -debugger to inspect the program automatically disables any optimizations that -could potentially confuse debugging information. To non-debug-information-aware -transformations, these calls simply look like calls to an external function, -which they must assume to do anything (including reading or writing to any part -of reachable memory). On the other hand, it does not impact many optimizations, -such as code motion of non-trapping instructions, nor does it impact -optimization of subexpressions, code duplication transformations, or basic-block -reordering transformations.

The first intrinsic + %llvm.dbg.declare + encodes debugging information for the variable X. The metadata + !dbg !7 attached to the intrinsic provides scope information for the + variable X.

- -

- Object lifetimes and scoping +

+!7 = metadata !{i32 2, i32 7, metadata !1, null}
+!1 = metadata !{i32 458763, metadata !2}; [DW_TAG_lexical_block ]
+!2 = metadata !{i32 458798, i32 0, metadata !3, metadata !"foo", 
+                metadata !"foo", metadata !"foo", metadata !3, i32 1, 
+                metadata !4, i1 false, i1 true}; [DW_TAG_subprogram ]   
+

In many languages, the local variables in functions can have their lifetime -or scope limited to a subset of a function. In the C family of languages, for -example, variables are only live (readable and writable) within the source block -that they are defined in. In functional languages, values are only readable -after they have been defined. Though this is a very obvious concept, it is also -non-trivial to model in LLVM, because it has no notion of scoping in this sense, -and does not want to be tied to a language's scoping rules.

Here !7 is metadata providing location information. It has four + fields: line number, column number, scope, and original scope. The original + scope represents inline location if this instruction is inlined inside a + caller, and is null otherwise. In this example, scope is encoded by + !1. !1 represents a lexical block inside the scope + !2, where !2 is a + subprogram descriptor. This way the + location information attached to the intrinsics indicates that the + variable X is declared at line number 2 at a function level scope in + function foo.
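The first field of descriptors such as !1 and !2 combines a DWARF tag with LLVMDebugVersion. The following C++ sketch splits such a field apart. It assumes the version sits in the upper 16 bits and the tag in the lower 16 bits, which is consistent with the values in this example but is stated here as an assumption; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: upper 16 bits carry the debug version, lower 16 bits carry
// the DWARF tag.
const uint32_t VersionShift = 16;
const uint32_t TagMask = 0xffffu;

uint32_t getDwarfTag(uint32_t TagField) { return TagField & TagMask; }

uint32_t getVersionNumber(uint32_t TagField) { return TagField >> VersionShift; }

void checkTagExamples() {
  assert(getDwarfTag(458763) == 11);  // DW_TAG_lexical_block (0x0b), as in !1
  assert(getDwarfTag(458798) == 46);  // DW_TAG_subprogram (0x2e), as in !2
  assert(getVersionNumber(458763) == 7);
}
```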

In order to handle this, the LLVM debug format uses the notion of "regions" -of a function, delineated by calls to intrinsic functions. These intrinsic -functions define new regions of the program and indicate when the region -lifetime expires. Consider the following C fragment, for example:

Now let's take another example.

-1.  void foo() {
-2.    int X = ...;
-3.    int Y = ...;
-4.    {
-5.      int Z = ...;
-6.      ...
-7.    }
-8.    ...
-9.  }
+call void @llvm.dbg.declare(metadata, metadata !12), !dbg !14

Compiled to LLVM, this function would be represented like this:

The second intrinsic + %llvm.dbg.declare + encodes debugging information for variable Z. The metadata + !dbg !14 attached to the intrinsic provides scope information for + the variable Z.

-void %foo() {
-entry:
-    %X = alloca int
-    %Y = alloca int
-    %Z = alloca int
-    
-    ...
-    
-    call void %llvm.dbg.func.start( %llvm.dbg.subprogram.type* %llvm.dbg.subprogram )
-    
-    call void %llvm.dbg.stoppoint( uint 2, uint 2, %llvm.dbg.compile_unit* %llvm.dbg.compile_unit )
-    
-    call void %llvm.dbg.declare({}* %X, ...)
-    call void %llvm.dbg.declare({}* %Y, ...)
-    
-    ;; Evaluate expression on line 2, assigning to X.
-    
-    call void %llvm.dbg.stoppoint( uint 3, uint 2, %llvm.dbg.compile_unit* %llvm.dbg.compile_unit )
-    
-    ;; Evaluate expression on line 3, assigning to Y.
-    
-    call void %llvm.region.start()
-    call void %llvm.dbg.stoppoint( uint 5, uint 4, %llvm.dbg.compile_unit* %llvm.dbg.compile_unit )
-    call void %llvm.dbg.declare({}* %X, ...)
-    
-    ;; Evaluate expression on line 5, assigning to Z.
-    
-    call void %llvm.dbg.stoppoint( uint 7, uint 2, %llvm.dbg.compile_unit* %llvm.dbg.compile_unit )
-    call void %llvm.region.end()
-    
-    call void %llvm.dbg.stoppoint( uint 9, uint 2, %llvm.dbg.compile_unit* %llvm.dbg.compile_unit )
-    
-    call void %llvm.region.end()
-    
-    ret void
-}
+!13 = metadata !{i32 458763, metadata !1}; [DW_TAG_lexical_block ]
+!14 = metadata !{i32 5, i32 9, metadata !13, null}

This example illustrates a few important details about the LLVM debugging -information. In particular, it shows how the various intrinsics are applied -together to allow a debugger to analyze the relationship between statements, -variable definitions, and the code used to implement the function.

- -

The first intrinsic %llvm.dbg.func.start provides -a link with the subprogram descriptor -containing the details of this function. This call also defines the beginning -of the function region, bounded by the %llvm.region.end at the end of -the function. This region is used to bracket the lifetime of variables declared -within. For a function, this outer region defines a new stack frame whose -lifetime ends when the region is ended.

- -

It is possible to define inner regions for short term variables by using the -%llvm.region.start and %llvm.region.end to bound a -region. The inner region in this example would be for the block containing the -declaration of Z.

- -

Using regions to represent the boundaries of source-level functions allow -LLVM interprocedural optimizations to arbitrarily modify LLVM functions without -having to worry about breaking mapping information between the LLVM code and the -and source-level program. In particular, the inliner requires no modification -to support inlining with debugging information: there is no explicit correlation -drawn between LLVM functions and their source-level counterparts (note however, -that if the inliner inlines all instances of a non-strong-linkage function into -its caller that it will not be possible for the user to manually invoke the -inlined function from a debugger).

- -

Once the function has been defined, the stopping point corresponding to -line #2 (column #2) of the function is encountered. At this point in the -function, no local variables are live. As lines 2 and 3 of the example -are executed, their variable definitions are introduced into the program using -%llvm.dbg.declare, without the -need to specify a new region. These variables do not require new regions to be -introduced because they go out of scope at the same point in the program: line -9.

- -

In contrast, the Z variable goes out of scope at a different time, -on line 7. For this reason, it is defined within the inner region, which kills -the availability of Z before the code for line 8 is executed. In this -way, regions can support arbitrary source-language scoping rules, as long as -they can only be nested (ie, one scope cannot partially overlap with a part of -another scope).

- -

It is worth noting that this scoping mechanism is used to control scoping of -all declarations, not just variable declarations. For example, the scope of a -C++ using declaration is controlled with this and could change how name lookup is -performed.

Here !14 indicates that Z is declared at line number 5 and + column number 9 inside of lexical scope !13. The lexical scope + itself resides inside of lexical scope !1 described above.

The scope information attached to each instruction provides a + straightforward way to find instructions covered by a scope.
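To make the idea of "instructions covered by a scope" concrete, the sketch below models the scopes and locations of the foo example with plain C++ structs and walks the parent chain of each location. Scope, Location, isCoveredBy and countCovered are illustrative stand-ins invented for this example, not LLVM classes or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a subprogram or lexical block descriptor.
struct Scope {
  const Scope *Parent;  // enclosing scope, or nullptr for the subprogram
};

// Simplified stand-in for the location metadata attached via !dbg.
struct Location {
  unsigned Line;
  unsigned Column;
  const Scope *S;  // innermost scope of the instruction
};

// True if Outer is the location's scope or one of its enclosing scopes.
bool isCoveredBy(const Location &Loc, const Scope &Outer) {
  for (const Scope *S = Loc.S; S != nullptr; S = S->Parent)
    if (S == &Outer)
      return true;
  return false;
}

std::size_t countCovered(const std::vector<Location> &Locs,
                         const Scope &Outer) {
  std::size_t N = 0;
  for (const Location &Loc : Locs)
    if (isCoveredBy(Loc, Outer))
      ++N;
  return N;
}

void checkFooScopes() {
  Scope Foo = {nullptr};   // !2, the subprogram
  Scope Outer = {&Foo};    // !1, lexical block inside !2
  Scope Inner = {&Outer};  // !13, lexical block inside !1
  std::vector<Location> Locs = {
      {2, 7, &Outer}, {2, 3, &Outer},                  // !7, !8
      {3, 7, &Outer}, {3, 3, &Outer},                  // !10, !11
      {5, 9, &Inner}, {5, 5, &Inner}, {6, 5, &Inner},  // !14, !15, !16
      {8, 3, &Outer}, {9, 1, &Foo}};                   // !17, !18
  assert(countCovered(Locs, Inner) == 3);
  assert(countCovered(Locs, Outer) == 8);
  assert(countCovered(Locs, Foo) == 9);
}
```

The walk relies only on the scope field of each location, which is why no separate region markers are needed in this model.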

C/C++ front-end specific debug information -

+ -

The C and C++ front-ends represent information about the program in a format -that is effectively identical to Dwarf 3.0 in terms of -information content. This allows code generators to trivially support native -debuggers by generating standard dwarf information, and contains enough -information for non-dwarf targets to translate it as needed.

+ that is effectively identical + to DWARF 3.0 in + terms of information content. This allows code generators to trivially + support native debuggers by generating standard DWARF information, and + contains enough information for non-DWARF targets to translate it as + needed.

This section describes the forms used to represent C and C++ programs. Other -languages could pattern themselves after this (which itself is tuned to -representing programs in the same way that Dwarf 3 does), or they could choose -to provide completely different forms if they don't fit into the Dwarf model. -As support for debugging information gets added to the various LLVM -source-language front-ends, the information used should be documented here.

+ languages could pattern themselves after this (which itself is tuned to + representing programs in the same way that DWARF 3 does), or they could + choose to provide completely different forms if they don't fit into the DWARF + model. As support for debugging information gets added to the various LLVM + source-language front-ends, the information used should be documented + here.

The following sections provide examples of various C/C++ constructs and the -debug information that would best describe those constructs.

- -

+ debug information that would best describe those constructs.

C/C++ source file information -

+ -

Given the source files "MySource.cpp" and "MyHeader.h" located in the -directory "/Users/mine/sources", the following code;

Given the source files MySource.cpp and MyHeader.h located + in the directory /Users/mine/sources, the following code:

 #include "MyHeader.h"
 
@@ -1063,556 +1054,584 @@ int main(int argc, char *argv[]) {
   return 0;
 }

a C/C++ front-end would generate the following descriptors;

a C/C++ front-end would generate the following descriptors:

 ...
 ;;
-;; Define types used.  In this case we need one for compile unit anchors and one
-;; for compile units.
+;; Define the compile unit for the main source file "/Users/mine/sources/MySource.cpp".
 ;;
-%llvm.dbg.anchor.type = type { uint, uint }
-%llvm.dbg.compile_unit.type = type { uint, {  }*, uint, uint, sbyte*, sbyte*, sbyte* }
-...
-;;
-;; Define the anchor for compile units.  Note that the second field of the
-;; anchor is 17, which is the same as the tag for compile units
-;; (17 = DW_TAG_compile_unit.)
-;;
-%llvm.dbg.compile_units = linkonce constant %llvm.dbg.anchor.type { uint 0, uint 17 }, section "llvm.metadata"
+!2 = metadata !{
+  i32 524305,    ;; Tag
+  i32 0,         ;; Unused
+  i32 4,         ;; Language Id
+  metadata !"MySource.cpp", 
+  metadata !"/Users/mine/sources", 
+  metadata !"4.2.1 (Based on Apple Inc. build 5649) (LLVM build 00)", 
+  i1 true,       ;; Main Compile Unit
+  i1 false,      ;; Optimized compile unit
+  metadata !"",  ;; Compiler flags
+  i32 0}         ;; Runtime version
 
 ;;
-;; Define the compile unit for the source file "/Users/mine/sources/MySource.cpp".
-;;
-%llvm.dbg.compile_unit1 = internal constant %llvm.dbg.compile_unit.type {
-    uint add(uint 17, uint 262144), 
-    {  }* cast (%llvm.dbg.anchor.type* %llvm.dbg.compile_units to {  }*), 
-    uint 1, 
-    uint 1, 
-    sbyte* getelementptr ([13 x sbyte]* %str1, int 0, int 0), 
-    sbyte* getelementptr ([21 x sbyte]* %str2, int 0, int 0), 
-    sbyte* getelementptr ([33 x sbyte]* %str3, int 0, int 0) }, section "llvm.metadata"
-    
-;;
-;; Define the compile unit for the header file "/Users/mine/sources/MyHeader.h".
+;; Define the file descriptor for "/Users/mine/sources/MySource.cpp".
 ;;
-%llvm.dbg.compile_unit2 = internal constant %llvm.dbg.compile_unit.type {
-    uint add(uint 17, uint 262144), 
-    {  }* cast (%llvm.dbg.anchor.type* %llvm.dbg.compile_units to {  }*), 
-    uint 1, 
-    uint 1, 
-    sbyte* getelementptr ([11 x sbyte]* %str4, int 0, int 0), 
-    sbyte* getelementptr ([21 x sbyte]* %str2, int 0, int 0), 
-    sbyte* getelementptr ([33 x sbyte]* %str3, int 0, int 0) }, section "llvm.metadata"
+!1 = metadata !{
+  i32 524329,    ;; Tag
+  metadata !"MySource.cpp", 
+  metadata !"/Users/mine/sources", 
+  metadata !2    ;; Compile unit
+}
 
 ;;
-;; Define each of the strings used in the compile units.
+;; Define the file descriptor for "/Users/mine/sources/MyHeader.h".
 ;;
-%str1 = internal constant [13 x sbyte] c"MySource.cpp\00", section "llvm.metadata";
-%str2 = internal constant [21 x sbyte] c"/Users/mine/sources/\00", section "llvm.metadata";
-%str3 = internal constant [33 x sbyte] c"4.0.1 LLVM (LLVM research group)\00", section "llvm.metadata";
-%str4 = internal constant [11 x sbyte] c"MyHeader.h\00", section "llvm.metadata";
+!3 = metadata !{
+  i32 524329,    ;; Tag
+  metadata !"MyHeader.h", 
+  metadata !"/Users/mine/sources", 
+  metadata !2    ;; Compile unit
+}
+
 ...

llvm::Instruction provides easy access to metadata attached to an +instruction. One can extract line number information encoded in LLVM IR +using Instruction::getMetadata() and +DILocation::getLineNumber(). +

+ if (MDNode *N = I->getMetadata("dbg")) {  // Here I is an LLVM instruction
+   DILocation Loc(N);                      // DILocation is in DebugInfo.h
+   unsigned Line = Loc.getLineNumber();
+   StringRef File = Loc.getFilename();
+   StringRef Dir = Loc.getDirectory();
+ }
+

C/C++ global variable information -

+ -

Given an integer global variable declared as follows;

Given an integer global variable declared as follows:

 int MyGlobal = 100;

a C/C++ front-end would generate the following descriptors;

a C/C++ front-end would generate the following descriptors:

 ;;
-;; Define types used. One for global variable anchors, one for the global
-;; variable descriptor, one for the global's basic type and one for the global's
-;; compile unit.
-;;
-%llvm.dbg.anchor.type = type { uint, uint }
-%llvm.dbg.global_variable.type = type { uint, {  }*, {  }*, sbyte*, {  }*, uint, {  }*, bool, bool, {  }*, uint }
-%llvm.dbg.basictype.type = type { uint, {  }*, sbyte*, {  }*, int, uint, uint, uint, uint }
-%llvm.dbg.compile_unit.type = ...
-...
-;;
 ;; Define the global itself.
 ;;
 %MyGlobal = global int 100
 ...
 ;;
-;; Define the anchor for global variables.  Note that the second field of the
-;; anchor is 52, which is the same as the tag for global variables
-;; (52 = DW_TAG_variable.)
+;; List of debug info of globals
 ;;
-%llvm.dbg.global_variables = linkonce constant %llvm.dbg.anchor.type { uint 0, uint 52 }, section "llvm.metadata"
+!llvm.dbg.gv = !{!0}
 
 ;;
 ;; Define the global variable descriptor.  Note the reference to the global
 ;; variable anchor and the global variable itself.
 ;;
-%llvm.dbg.global_variable = internal constant %llvm.dbg.global_variable.type {
-    uint add(uint 52, uint 262144), 
-    {  }* cast (%llvm.dbg.anchor.type* %llvm.dbg.global_variables to {  }*), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([9 x sbyte]* %str1, int 0, int 0), 
-    sbyte* getelementptr ([1 x sbyte]* %str2, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    uint 1,
-    {  }* cast (%llvm.dbg.basictype.type* %llvm.dbg.basictype to {  }*), 
-    bool false, 
-    bool true, 
-    {  }* cast (int* %MyGlobal to {  }*) }, section "llvm.metadata"
-    
+!0 = metadata !{
+  i32 524340,              ;; Tag
+  i32 0,                   ;; Unused
+  metadata !1,             ;; Context
+  metadata !"MyGlobal",    ;; Name
+  metadata !"MyGlobal",    ;; Display Name
+  metadata !"MyGlobal",    ;; Linkage Name
+  metadata !3,             ;; Compile Unit
+  i32 1,                   ;; Line Number
+  metadata !4,             ;; Type
+  i1 false,                ;; Is a local variable
+  i1 true,                 ;; Is this a definition
+  i32* @MyGlobal           ;; The global variable
+}
+
 ;;
 ;; Define the basic type of 32 bit signed integer.  Note that since int is an
 ;; intrinsic type the source file is NULL and line 0.
 ;;    
-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([4 x sbyte]* %str3, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 5 }, section "llvm.metadata"
+!4 = metadata !{
+  i32 524324,              ;; Tag
+  metadata !1,             ;; Context
+  metadata !"int",         ;; Name
+  metadata !1,             ;; File
+  i32 0,                   ;; Line number
+  i64 32,                  ;; Size in Bits
+  i64 32,                  ;; Align in Bits
+  i64 0,                   ;; Offset in Bits
+  i32 0,                   ;; Flags
+  i32 5                    ;; Encoding
+}
 
-;;
-;; Define the names of the global variable and basic type.
-;;
-%str1 = internal constant [9 x sbyte] c"MyGlobal\00", section "llvm.metadata"
-%str2 = internal constant [1 x sbyte] c"\00", section "llvm.metadata"
-%str3 = internal constant [4 x sbyte] c"int\00", section "llvm.metadata"
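The Tag field in these descriptors can be read as a DWARF tag with a debug-version value added above the low 16 bits: the removed lines spell it out as `add(uint 52, uint 262144)`, and the replacement value 524340 equals 52 + 524288. The Python sketch below decodes such a value. The helper name `split_tag` and the 16-bit split are illustrative assumptions drawn from the numbers in this diff, not part of any LLVM API.

```python
# DWARF tag numbers that appear in the examples of this document.
DWARF_TAGS = {
    4: "DW_TAG_enumeration_type",
    13: "DW_TAG_member",
    15: "DW_TAG_pointer_type",
    19: "DW_TAG_structure_type",
    22: "DW_TAG_typedef",
    36: "DW_TAG_base_type",
    38: "DW_TAG_const_type",
    40: "DW_TAG_enumerator",
    46: "DW_TAG_subprogram",
    52: "DW_TAG_variable",
}


def split_tag(value):
    """Split a descriptor tag into (version bits, DWARF tag name)."""
    version = value & ~0xFFFF   # bits above the low 16 carry the version
    dwarf_tag = value & 0xFFFF  # low 16 bits carry the DWARF tag number
    return version, DWARF_TAGS.get(dwarf_tag, "unknown")


for tag in (524340, 524324):
    print(tag, split_tag(tag))
```

Under this reading, the old and new encodings of the global variable tag differ only in the version part, while the DWARF tag stays 52.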

C/C++ function information

-Given a function declared as follows;
+Given a function declared as follows:

 int main(int argc, char *argv[]) {
   return 0;
 }

-a C/C++ front-end would generate the following descriptors;
+a C/C++ front-end would generate the following descriptors:

-;;
-;; Define types used. One for subprogram anchors, one for the subprogram
-;; descriptor, one for the global's basic type and one for the subprogram's
-;; compile unit.
-;;
-%llvm.dbg.subprogram.type = type { uint, {  }*, {  }*, sbyte*, {  }*, bool, bool }
-%llvm.dbg.anchor.type = type { uint, uint }
-%llvm.dbg.compile_unit.type = ...
-	
 ;;
 ;; Define the anchor for subprograms.  Note that the second field of the
 ;; anchor is 46, which is the same as the tag for subprograms
 ;; (46 = DW_TAG_subprogram.)
 ;;
-%llvm.dbg.subprograms = linkonce constant %llvm.dbg.anchor.type { uint 0, uint 46 }, section "llvm.metadata"
-
-;;
-;; Define the descriptor for the subprogram.  TODO - more details.
-;;
-%llvm.dbg.subprogram = internal constant %llvm.dbg.subprogram.type {
-    uint add(uint 46, uint 262144), 
-    {  }* cast (%llvm.dbg.anchor.type* %llvm.dbg.subprograms to {  }*), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([5 x sbyte]* %str1, int 0, int 0), 
-    sbyte* getelementptr ([1 x sbyte]* %str2, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*),
-    uint 1,
-    {  }* null, 
-    bool false, 
-    bool true }, section "llvm.metadata"
-
-;;
-;; Define the name of the subprogram.
-;;
-%str1 = internal constant [5 x sbyte] c"main\00", section "llvm.metadata"
-%str2 = internal constant [1 x sbyte] c"\00", section "llvm.metadata"
-
+!6 = metadata !{
+  i32 524334,        ;; Tag
+  i32 0,             ;; Unused
+  metadata !1,       ;; Context
+  metadata !"main",  ;; Name
+  metadata !"main",  ;; Display name
+  metadata !"main",  ;; Linkage name
+  metadata !1,       ;; File
+  i32 1,             ;; Line number
+  metadata !4,       ;; Type
+  i1 false,          ;; Is local 
+  i1 true,           ;; Is definition
+  i32 0,             ;; Virtuality attribute, e.g. pure virtual function
+  i32 0,             ;; Index into virtual table for C++ methods
+  i32 0,             ;; Type that holds virtual table.
+  i32 0,             ;; Flags
+  i1 false,          ;; True if this function is optimized
+  i32 (i32, i8**)* @main, ;; Pointer to llvm::Function
+  null               ;; Function template parameters
+}
 ;;
 ;; Define the subprogram itself.
 ;;
-int %main(int %argc, sbyte** %argv) {
+define i32 @main(i32 %argc, i8** %argv) {
 ...
 }

C/C++ basic types

-The following are the basic type descriptors for C/C++ core types;
+The following are the basic type descriptors for C/C++ core types:

bool

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([5 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 2 }, section "llvm.metadata"
-%str1 = internal constant [5 x sbyte] c"bool\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"bool",  ;; Name
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 8,             ;; Size in Bits
+  i64 8,             ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 2              ;; Encoding
+}

char

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([5 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 8, 
-    uint 8, 
-    uint 0, 
-    uint 6 }, section "llvm.metadata"
-%str1 = internal constant [5 x sbyte] c"char\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"char",  ;; Name
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 8,             ;; Size in Bits
+  i64 8,             ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 6              ;; Encoding
+}

unsigned char

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([14 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 8, 
-    uint 8, 
-    uint 0, 
-    uint 8 }, section "llvm.metadata"
-%str1 = internal constant [14 x sbyte] c"unsigned char\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"unsigned char", 
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 8,             ;; Size in Bits
+  i64 8,             ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 8              ;; Encoding
+}

short

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([10 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 16, 
-    uint 16, 
-    uint 0, 
-    uint 5 }, section "llvm.metadata"
-%str1 = internal constant [10 x sbyte] c"short int\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"short int",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 16,            ;; Size in Bits
+  i64 16,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 5              ;; Encoding
+}

unsigned short

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([19 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 16, 
-    uint 16, 
-    uint 0, 
-    uint 7 }, section "llvm.metadata"
-%str1 = internal constant [19 x sbyte] c"short unsigned int\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"short unsigned int",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 16,            ;; Size in Bits
+  i64 16,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 7              ;; Encoding
+}

int

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([4 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 5 }, section "llvm.metadata"
-%str1 = internal constant [4 x sbyte] c"int\00", section "llvm.metadata"
-

+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"int",   ;; Name
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 32,            ;; Size in Bits
+  i64 32,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 5              ;; Encoding
+}

unsigned int

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([13 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 7 }, section "llvm.metadata"
-%str1 = internal constant [13 x sbyte] c"unsigned int\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"unsigned int",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 32,            ;; Size in Bits
+  i64 32,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 7              ;; Encoding
+}

long long

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([14 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 64, 
-    uint 64, 
-    uint 0, 
-    uint 5 }, section "llvm.metadata"
-%str1 = internal constant [14 x sbyte] c"long long int\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"long long int",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 64,            ;; Size in Bits
+  i64 64,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 5              ;; Encoding
+}

unsigned long long

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([23 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 64, 
-    uint 64, 
-    uint 0, 
-    uint 7 }, section "llvm.metadata"
-%str1 = internal constant [23 x sbyte] c"long long unsigned int\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"long long unsigned int",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 64,            ;; Size in Bits
+  i64 64,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 7              ;; Encoding
+}

float

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([6 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 4 }, section "llvm.metadata"
-%str1 = internal constant [6 x sbyte] c"float\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"float",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 32,            ;; Size in Bits
+  i64 32,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 4              ;; Encoding
+}

double

-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([7 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 64, 
-    uint 64, 
-    uint 0, 
-    uint 4 }, section "llvm.metadata"
-%str1 = internal constant [7 x sbyte] c"double\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"double", ;; Name
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 64,            ;; Size in Bits
+  i64 64,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 4              ;; Encoding
+}
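The basic type descriptors differ only in name, size, alignment and the final Encoding field, whose values are DWARF base-type encodings (2 = boolean, 4 = float, 5 = signed, 6 = signed_char, 7 = unsigned, 8 = unsigned_char). The Python sketch below tabulates the values listed in this section and compares the sizes against the host's C types through `ctypes`. The table layout and the helper `host_mismatches` are illustrative, and C type sizes are platform dependent, so a mismatch on an unusual host would not mean the descriptors are wrong.

```python
import ctypes

# DWARF base-type encodings that appear in the descriptors above.
DW_ATE = {2: "boolean", 4: "float", 5: "signed", 6: "signed_char",
          7: "unsigned", 8: "unsigned_char"}

# (name, ctypes stand-in, size in bits as listed, encoding as listed)
BASIC_TYPES = [
    ("bool", ctypes.c_bool, 8, 2),
    ("char", ctypes.c_char, 8, 6),
    ("unsigned char", ctypes.c_ubyte, 8, 8),
    ("short int", ctypes.c_short, 16, 5),
    ("short unsigned int", ctypes.c_ushort, 16, 7),
    ("int", ctypes.c_int, 32, 5),
    ("unsigned int", ctypes.c_uint, 32, 7),
    ("long long int", ctypes.c_longlong, 64, 5),
    ("long long unsigned int", ctypes.c_ulonglong, 64, 7),
    ("float", ctypes.c_float, 32, 4),
    ("double", ctypes.c_double, 64, 4),
]


def host_mismatches():
    """Return the names whose host size differs from the listed size."""
    return [name for name, ctype, bits, _ in BASIC_TYPES
            if ctypes.sizeof(ctype) * 8 != bits]


for name, _, bits, enc in BASIC_TYPES:
    print(f"{name:24} {bits:3} bits  DW_ATE_{DW_ATE[enc]}")
print("host mismatches:", host_mismatches())
```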

C/C++ derived types

-Given the following as an example of C/C++ derived type;
+Given the following as an example of C/C++ derived type:

 typedef const int *IntPtr;

-a C/C++ front-end would generate the following descriptors;
+a C/C++ front-end would generate the following descriptors:

 ;;
 ;; Define the typedef "IntPtr".
 ;;
-%llvm.dbg.derivedtype1 = internal constant %llvm.dbg.derivedtype.type {
-    uint add(uint 22, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([7 x sbyte]* %str1, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    int 1, 
-    uint 0, 
-    uint 0, 
-    uint 0, 
-    {  }* cast (%llvm.dbg.derivedtype.type* %llvm.dbg.derivedtype2 to {  }*) }, section "llvm.metadata"
-%str1 = internal constant [7 x sbyte] c"IntPtr\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524310,          ;; Tag
+  metadata !1,         ;; Context
+  metadata !"IntPtr",  ;; Name
+  metadata !3,         ;; File
+  i32 0,               ;; Line number
+  i64 0,               ;; Size in bits
+  i64 0,               ;; Align in bits
+  i64 0,               ;; Offset in bits
+  i32 0,               ;; Flags
+  metadata !4          ;; Derived From type
+}
 
 ;;
 ;; Define the pointer type.
 ;;
-%llvm.dbg.derivedtype2 = internal constant %llvm.dbg.derivedtype.type {
-    uint add(uint 15, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* null, 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    {  }* cast (%llvm.dbg.derivedtype.type* %llvm.dbg.derivedtype3 to {  }*) }, section "llvm.metadata"
-
+!4 = metadata !{
+  i32 524303,          ;; Tag
+  metadata !1,         ;; Context
+  metadata !"",        ;; Name
+  metadata !1,         ;; File
+  i32 0,               ;; Line number
+  i64 64,              ;; Size in bits
+  i64 64,              ;; Align in bits
+  i64 0,               ;; Offset in bits
+  i32 0,               ;; Flags
+  metadata !5          ;; Derived From type
+}
 ;;
 ;; Define the const type.
 ;;
-%llvm.dbg.derivedtype3 = internal constant %llvm.dbg.derivedtype.type {
-    uint add(uint 38, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* null, 
-    {  }* null, 
-    int 0, 
-    uint 0, 
-    uint 0, 
-    uint 0, 
-    {  }* cast (%llvm.dbg.basictype.type* %llvm.dbg.basictype1 to {  }*) }, section "llvm.metadata"	
-
+!5 = metadata !{
+  i32 524326,          ;; Tag
+  metadata !1,         ;; Context
+  metadata !"",        ;; Name
+  metadata !1,         ;; File
+  i32 0,               ;; Line number
+  i64 32,              ;; Size in bits
+  i64 32,              ;; Align in bits
+  i64 0,               ;; Offset in bits
+  i32 0,               ;; Flags
+  metadata !6          ;; Derived From type
+}
 ;;
 ;; Define the int type.
 ;;
-%llvm.dbg.basictype1 = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([4 x sbyte]* %str2, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 5 }, section "llvm.metadata"
-%str2 = internal constant [4 x sbyte] c"int\00", section "llvm.metadata"
+!6 = metadata !{
+  i32 524324,          ;; Tag
+  metadata !1,         ;; Context
+  metadata !"int",     ;; Name
+  metadata !1,         ;; File
+  i32 0,               ;; Line number
+  i64 32,              ;; Size in bits
+  i64 32,              ;; Align in bits
+  i64 0,               ;; Offset in bits
+  i32 0,               ;; Flags
+  i32 5                ;; Encoding
+}
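The four descriptors for `IntPtr` form a chain through their "Derived From type" fields: typedef, then pointer, then const, then the basic type. The Python sketch below follows that chain using the descriptor numbers and tags from the example. The dictionary representation and the helper `describe` are hypothetical stand-ins for real metadata nodes, used only to make the linkage explicit.

```python
# Descriptor number -> (tag, name, number of the "derived from" descriptor).
# Values are copied from the example above; None marks the end of the chain.
DESCRIPTORS = {
    2: (524310, "IntPtr", 4),  # typedef
    4: (524303, "", 5),        # pointer
    5: (524326, "", 6),        # const
    6: (524324, "int", None),  # basic type
}

# Low 16 bits of the tag -> short description of the type kind.
KIND = {22: "typedef", 15: "pointer to", 38: "const", 36: "base type"}


def describe(node):
    """Follow the "derived from" links and return one phrase per step."""
    steps = []
    while node is not None:
        tag, name, derived_from = DESCRIPTORS[node]
        kind = KIND[tag & 0xFFFF]
        steps.append(f"{kind} {name}".strip())
        node = derived_from
    return steps


print(" -> ".join(describe(2)))
```

Reading the result left to right gives the C declaration inside out: `IntPtr` is a typedef for a pointer to a const `int`.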

C/C++ struct/union types

-Given the following as an example of C/C++ struct type;
+Given the following as an example of C/C++ struct type:

 struct Color {
   unsigned Red;
@@ -1620,106 +1639,112 @@ struct Color {
   unsigned Blue;
 };

-a C/C++ front-end would generate the following descriptors;
+a C/C++ front-end would generate the following descriptors:

 ;;
 ;; Define basic type for unsigned int.
 ;;
-%llvm.dbg.basictype = internal constant %llvm.dbg.basictype.type {
-    uint add(uint 36, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([13 x sbyte]* %str1, int 0, int 0), 
-    {  }* null, 
-    int 0, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    uint 7 }, section "llvm.metadata"
-%str1 = internal constant [13 x sbyte] c"unsigned int\00", section "llvm.metadata"
-
+!5 = metadata !{
+  i32 524324,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"unsigned int",
+  metadata !1,       ;; File
+  i32 0,             ;; Line number
+  i64 32,            ;; Size in Bits
+  i64 32,            ;; Align in Bits
+  i64 0,             ;; Offset in Bits
+  i32 0,             ;; Flags
+  i32 7              ;; Encoding
+}
 ;;
 ;; Define composite type for struct Color.
 ;;
-%llvm.dbg.compositetype = internal constant %llvm.dbg.compositetype.type {
-    uint add(uint 19, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([6 x sbyte]* %str2, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    int 1, 
-    uint 96, 
-    uint 32, 
-    uint 0, 
-    {  }* null,
-    {  }* cast ([3 x {  }*]* %llvm.dbg.array to {  }*) }, section "llvm.metadata"
-%str2 = internal constant [6 x sbyte] c"Color\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524307,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"Color", ;; Name
+  metadata !1,       ;; Compile unit
+  i32 1,             ;; Line number
+  i64 96,            ;; Size in bits
+  i64 32,            ;; Align in bits
+  i64 0,             ;; Offset in bits
+  i32 0,             ;; Flags
+  null,              ;; Derived From
+  metadata !3,       ;; Elements
+  i32 0              ;; Runtime Language
+}
 
 ;;
 ;; Define the Red field.
 ;;
-%llvm.dbg.derivedtype1 = internal constant %llvm.dbg.derivedtype.type {
-    uint add(uint 13, uint 262144), 
-    {  }* null, 
-    sbyte* getelementptr ([4 x sbyte]* %str3, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    int 2, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    {  }* cast (%llvm.dbg.basictype.type* %llvm.dbg.basictype to {  }*) }, section "llvm.metadata"
-%str3 = internal constant [4 x sbyte] c"Red\00", section "llvm.metadata"
+!4 = metadata !{
+  i32 524301,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"Red",   ;; Name
+  metadata !1,       ;; File
+  i32 2,             ;; Line number
+  i64 32,            ;; Size in bits
+  i64 32,            ;; Align in bits
+  i64 0,             ;; Offset in bits
+  i32 0,             ;; Flags
+  metadata !5        ;; Derived From type
+}
 
 ;;
 ;; Define the Green field.
 ;;
-%llvm.dbg.derivedtype2 = internal constant %llvm.dbg.derivedtype.type {
-    uint add(uint 13, uint 262144), 
-    {  }* null, 
-    sbyte* getelementptr ([6 x sbyte]* %str4, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    int 3, 
-    uint 32, 
-    uint 32, 
-    uint 32, 
-    {  }* cast (%llvm.dbg.basictype.type* %llvm.dbg.basictype to {  }*) }, section "llvm.metadata"
-%str4 = internal constant [6 x sbyte] c"Green\00", section "llvm.metadata"
+!6 = metadata !{
+  i32 524301,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"Green", ;; Name
+  metadata !1,       ;; File
+  i32 3,             ;; Line number
+  i64 32,            ;; Size in bits
+  i64 32,            ;; Align in bits
+  i64 32,            ;; Offset in bits
+  i32 0,             ;; Flags
+  metadata !5        ;; Derived From type
+}
 
 ;;
 ;; Define the Blue field.
 ;;
-%llvm.dbg.derivedtype3 = internal constant %llvm.dbg.derivedtype.type {
-    uint add(uint 13, uint 262144), 
-    {  }* null, 
-    sbyte* getelementptr ([5 x sbyte]* %str5, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    int 4, 
-    uint 32, 
-    uint 32, 
-    uint 64, 
-    {  }* cast (%llvm.dbg.basictype.type* %llvm.dbg.basictype to {  }*) }, section "llvm.metadata"
-%str5 = internal constant [5 x sbyte] c"Blue\00", section "llvm.metadata"
+!7 = metadata !{
+  i32 524301,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"Blue",  ;; Name
+  metadata !1,       ;; File
+  i32 4,             ;; Line number
+  i64 32,            ;; Size in bits
+  i64 32,            ;; Align in bits
+  i64 64,            ;; Offset in bits
+  i32 0,             ;; Flags
+  metadata !5        ;; Derived From type
+}
 
 ;;
 ;; Define the array of fields used by the composite type Color.
 ;;
-%llvm.dbg.array = internal constant [3 x {  }*] [
-      {  }* cast (%llvm.dbg.derivedtype.type* %llvm.dbg.derivedtype1 to {  }*),
-      {  }* cast (%llvm.dbg.derivedtype.type* %llvm.dbg.derivedtype2 to {  }*),
-      {  }* cast (%llvm.dbg.derivedtype.type* %llvm.dbg.derivedtype3 to {  }*) ], section "llvm.metadata"
+!3 = metadata !{metadata !4, metadata !6, metadata !7}
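The size, alignment and offset fields of the composite and member descriptors come from the struct's layout: 96 bits for the struct, and 32-bit members at bit offsets 0, 32 and 64. As a rough cross-check, the Python sketch below rebuilds `struct Color` with `ctypes` and reads the same numbers back. It assumes a host where `unsigned` is 32 bits wide, and the helper `member_layout` is illustrative only.

```python
import ctypes


class Color(ctypes.Structure):
    # Mirrors "struct Color" from the example above.
    _fields_ = [("Red", ctypes.c_uint),
                ("Green", ctypes.c_uint),
                ("Blue", ctypes.c_uint)]


def member_layout(struct):
    """Return (name, size in bits, offset in bits) for each field."""
    return [(name,
             getattr(struct, name).size * 8,
             getattr(struct, name).offset * 8)
            for name, _ in struct._fields_]


print("size in bits :", ctypes.sizeof(Color) * 8)
print("align in bits:", ctypes.alignment(Color) * 8)
for row in member_layout(Color):
    print(row)
```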

C/C++ enumeration types

-Given the following as an example of C/C++ enumeration type;
+Given the following as an example of C/C++ enumeration type:

 enum Trees {
   Spruce = 100,
@@ -1727,61 +1752,54 @@ enum Trees {
   Maple = 300
 };

-a C/C++ front-end would generate the following descriptors;
+a C/C++ front-end would generate the following descriptors:

 ;;
 ;; Define composite type for enum Trees
 ;;
-%llvm.dbg.compositetype = internal constant %llvm.dbg.compositetype.type {
-    uint add(uint 4, uint 262144), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    sbyte* getelementptr ([6 x sbyte]* %str1, int 0, int 0), 
-    {  }* cast (%llvm.dbg.compile_unit.type* %llvm.dbg.compile_unit to {  }*), 
-    int 1, 
-    uint 32, 
-    uint 32, 
-    uint 0, 
-    {  }* null, 
-    {  }* cast ([3 x {  }*]* %llvm.dbg.array to {  }*) }, section "llvm.metadata"
-%str1 = internal constant [6 x sbyte] c"Trees\00", section "llvm.metadata"
+!2 = metadata !{
+  i32 524292,        ;; Tag
+  metadata !1,       ;; Context
+  metadata !"Trees", ;; Name
+  metadata !1,       ;; File
+  i32 1,             ;; Line number
+  i64 32,            ;; Size in bits
+  i64 32,            ;; Align in bits
+  i64 0,             ;; Offset in bits
+  i32 0,             ;; Flags
+  null,              ;; Derived From type
+  metadata !3,       ;; Elements
+  i32 0              ;; Runtime language
+}
+
+;;
+;; Define the array of enumerators used by composite type Trees.
+;;
+!3 = metadata !{metadata !4, metadata !5, metadata !6}
 
 ;;
 ;; Define Spruce enumerator.
 ;;
-%llvm.dbg.enumerator1 = internal constant %llvm.dbg.enumerator.type {
-    uint add(uint 40, uint 262144), 
-    sbyte* getelementptr ([7 x sbyte]* %str2, int 0, int 0), 
-    int 100 }, section "llvm.metadata"
-%str2 = internal constant [7 x sbyte] c"Spruce\00", section "llvm.metadata"
+!4 = metadata !{i32 524328, metadata !"Spruce", i64 100}
 
 ;;
 ;; Define Oak enumerator.
 ;;
-%llvm.dbg.enumerator2 = internal constant %llvm.dbg.enumerator.type {
-    uint add(uint 40, uint 262144), 
-    sbyte* getelementptr ([4 x sbyte]* %str3, int 0, int 0), 
-    int 200 }, section "llvm.metadata"
-%str3 = internal constant [4 x sbyte] c"Oak\00", section "llvm.metadata"
+!5 = metadata !{i32 524328, metadata !"Oak", i64 200}
 
 ;;
 ;; Define Maple enumerator.
 ;;
-%llvm.dbg.enumerator3 = internal constant %llvm.dbg.enumerator.type {
-    uint add(uint 40, uint 262144), 
-    sbyte* getelementptr ([6 x sbyte]* %str4, int 0, int 0), 
-    int 300 }, section "llvm.metadata"
-%str4 = internal constant [6 x sbyte] c"Maple\00", section "llvm.metadata"
+!6 = metadata !{i32 524328, metadata !"Maple", i64 300}
 
-;;
-;; Define the array of enumerators used by composite type Trees.
-;;
-%llvm.dbg.array = internal constant [3 x {  }*] [
-  {  }* cast (%llvm.dbg.enumerator.type* %llvm.dbg.enumerator1 to {  }*),
-  {  }* cast (%llvm.dbg.enumerator.type* %llvm.dbg.enumerator2 to {  }*),
-  {  }* cast (%llvm.dbg.enumerator.type* %llvm.dbg.enumerator3 to {  }*) ], section "llvm.metadata"
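Each enumerator descriptor is a three-element node: the tag (524328, which is 524288 + 40 for DW_TAG_enumerator), the name, and the value as an `i64`. The Python sketch below derives the same triples from an enum definition. The names `VERSION_BITS` and `enumerator_descriptors` are hypothetical, and the tuples stand in for metadata nodes rather than reproducing any LLVM API.

```python
from enum import IntEnum


class Trees(IntEnum):
    # Values match the enumerator descriptors above.
    Spruce = 100
    Oak = 200
    Maple = 300


DW_TAG_ENUMERATOR = 40
VERSION_BITS = 524288  # 524328 - 40, as used by the tags above


def enumerator_descriptors(enum_cls):
    """Build (tag, name, value) triples in declaration order."""
    return [(VERSION_BITS + DW_TAG_ENUMERATOR, m.name, int(m.value))
            for m in enum_cls]


for triple in enumerator_descriptors(Trees):
    print(triple)
```

The list order matches the Elements array of the composite descriptor, which references the enumerators in declaration order.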

@@ -1795,7 +1813,7 @@ enum Trees {
 Chris Lattner
 LLVM Compiler Infrastructure
 Last modified: $Date$
