.. contents::
:local:
-.. sectionauthor:: Chris Lattner <sabre@nondot.org> and
- Gordon Henriksen
-
Introduction
============
This document describes the mechanisms and interfaces provided by LLVM to
support accurate garbage collection.
-.. _feature:
-
Goals and non-goals
-------------------
started quickly and scale up to a more sophisticated implementation as your
compiler matures.
-.. _quickstart:
-
Getting started
===============
highly portable, built-in ShadowStack code generator. It is compiled into
``llc`` and works even with the interpreter and C backends.
-.. _quickstart-compiler:
-
In your compiler
----------------
``load`` and ``store`` for now. You will need them when switching to a more
advanced GC.
-.. _quickstart-runtime:
-
In your runtime
---------------
}
}
-.. _shadow-stack:
-
About the shadow stack
----------------------
* Not thread-safe.
Still, it's an easy way to get started. After your compiler and runtime are up
-and running, writing a plugin_ will allow you to take advantage of :ref:`more
-advanced GC features <collector-algos>` of LLVM in order to improve performance.
+and running, writing a :ref:`plugin <plugin>` will allow you to take advantage
+of :ref:`more advanced GC features <collector-algos>` of LLVM in order to
+improve performance.
.. _gc_intrinsics:
to be a complete interface to any garbage collector. A program will need to
interface with the GC library using the facilities provided by that program.
-.. _gcattr:
-
Specifying GC code generation: ``gc "..."``
-------------------------------------------
store %Object* null, %Object** %X
...
-.. _barriers:
-
Reading and writing references in the heap
------------------------------------------
%derived = getelementptr %object, i32 0, i32 2, i32 %n
LLVM does not enforce this relationship between the object and derived pointer
-(although a plugin_ might). However, it would be an unusual collector that
-violated it.
+(although a :ref:`plugin <plugin>` might). However, it would be an unusual
+collector that violated it.
The use of these intrinsics is naturally optional if the target GC does require
the corresponding barrier. Such a GC plugin will replace the intrinsic calls
with the corresponding ``load`` or ``store`` instruction if they are used.
-.. _gcwrite:
-
Write barrier: ``llvm.gcwrite``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For write barriers, LLVM provides the ``llvm.gcwrite`` intrinsic function. It
has exactly the same semantics as a non-volatile ``store`` to the derived
pointer (the third argument). The exact code generated is specified by a
-compiler plugin_.
+compiler :ref:`plugin <plugin>`.
Many important algorithms require write barriers, including generational and
concurrent collectors. Additionally, write barriers could be used to implement
reference counting.
-.. _gcread:
-
Read barrier: ``llvm.gcread``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For read barriers, LLVM provides the ``llvm.gcread`` intrinsic function. It has
exactly the same semantics as a non-volatile ``load`` from the derived pointer
-(the second argument). The exact code generated is specified by a compiler
-plugin_.
+(the second argument). The exact code generated is specified by a
+:ref:`compiler plugin <plugin>`.
Read barriers are needed by fewer algorithms than write barriers, and may have a
greater performance impact since pointer reads are more frequent than writes.
* The stack map is not compiled into the executable.
-Using the LLVM makefiles (like the `sample project
-<http://llvm.org/viewvc/llvm-project/llvm/trunk/projects/sample/>`__), this code
+Using the LLVM makefiles, this code
can be compiled as a plugin using a simple makefile:
.. code-block:: make
$ cat sample.ll
define void @f() gc "mygc" {
entry:
- ret void
+ ret void
}
$ llvm-as < sample.ll | llc -load=MyGC.so
Denotes a multithreaded mutator; the collector must still stop the mutator
("stop the world") before beginning reachability analysis. Stopping a
multithreaded mutator is a complicated problem. It generally requires highly
- platform specific code in the runtime, and the production of carefully
+ platform-specific code in the runtime, and the production of carefully
designed machine code at safe points.
Concurrent
distinguishing an uninitialized stack root from an initialized one. Therefore,
this feature should be used by all GC plugins. It is enabled by default.
-.. _custom:
-
Custom lowering of intrinsics: ``CustomRoots``, ``CustomReadBarriers``, and ``CustomWriteBarriers``
---------------------------------------------------------------------------------------------------
``performCustomLowering`` **must** eliminate the corresponding barriers.
``performCustomLowering`` must comply with the same restrictions as
-`FunctionPass::runOnFunction <WritingAnLLVMPass.html#runOnFunction>`__
+:ref:`FunctionPass::runOnFunction <writing-an-llvm-pass-runOnFunction>`
Likewise, ``initializeCustomLowering`` has the same semantics as
-`Pass::doInitialization(Module&)
-<WritingAnLLVMPass.html#doInitialization_mod>`__
+:ref:`Pass::doInitialization(Module&)
+<writing-an-llvm-pass-doInitialization-mod>`
The following can be used as a template:
.. code-block:: c++
- #include "llvm/Module.h"
- #include "llvm/IntrinsicInst.h"
+ #include "llvm/IR/Module.h"
+ #include "llvm/IR/IntrinsicInst.h"
bool MyGC::initializeCustomLowering(Module &M) {
return false;
namespace {
class LLVM_LIBRARY_VISIBILITY MyGCPrinter : public GCMetadataPrinter {
public:
- virtual void beginAssembly(std::ostream &OS, AsmPrinter &AP,
- const TargetAsmInfo &TAI);
+ virtual void beginAssembly(AsmPrinter &AP);
- virtual void finishAssembly(std::ostream &OS, AsmPrinter &AP,
- const TargetAsmInfo &TAI);
+ virtual void finishAssembly(AsmPrinter &AP);
};
GCMetadataPrinterRegistry::Add<MyGCPrinter>
X("mygc", "My bespoke garbage collector.");
}
-The collector should use ``AsmPrinter`` and ``TargetAsmInfo`` to print portable
-assembly code to the ``std::ostream``. The collector itself contains the stack
-map for the entire module, and may access the ``GCFunctionInfo`` using its own
-``begin()`` and ``end()`` methods. Here's a realistic example:
+The collector should use ``AsmPrinter`` to print portable assembly code. The
+collector itself contains the stack map for the entire module, and may access
+the ``GCFunctionInfo`` using its own ``begin()`` and ``end()`` methods. Here's
+a realistic example:
.. code-block:: c++
#include "llvm/CodeGen/AsmPrinter.h"
- #include "llvm/Function.h"
- #include "llvm/Target/TargetMachine.h"
- #include "llvm/DataLayout.h"
+ #include "llvm/IR/Function.h"
+ #include "llvm/IR/DataLayout.h"
#include "llvm/Target/TargetAsmInfo.h"
+ #include "llvm/Target/TargetMachine.h"
- void MyGCPrinter::beginAssembly(std::ostream &OS, AsmPrinter &AP,
- const TargetAsmInfo &TAI) {
+ void MyGCPrinter::beginAssembly(AsmPrinter &AP) {
// Nothing to do.
}
- void MyGCPrinter::finishAssembly(std::ostream &OS, AsmPrinter &AP,
- const TargetAsmInfo &TAI) {
- // Set up for emitting addresses.
- const char *AddressDirective;
- int AddressAlignLog;
- if (AP.TM.getDataLayout()->getPointerSize() == sizeof(int32_t)) {
- AddressDirective = TAI.getData32bitsDirective();
- AddressAlignLog = 2;
- } else {
- AddressDirective = TAI.getData64bitsDirective();
- AddressAlignLog = 3;
- }
+ void MyGCPrinter::finishAssembly(AsmPrinter &AP) {
+ MCStreamer &OS = AP.OutStreamer;
+ unsigned IntPtrSize = AP.TM.getSubtargetImpl()->getDataLayout()->getPointerSize();
// Put this in the data section.
- AP.SwitchToDataSection(TAI.getDataSection());
+ OS.SwitchSection(AP.getObjFileLowering().getDataSection());
// For each function...
for (iterator FI = begin(), FE = end(); FI != FE; ++FI) {
GCFunctionInfo &MD = **FI;
- // Emit this data structure:
+ // A compact GC layout. Emit this data structure:
//
// struct {
// int32_t PointCount;
- // struct {
- // void *SafePointAddress;
- // int32_t LiveCount;
- // int32_t LiveOffsets[LiveCount];
- // } Points[PointCount];
+ // void *SafePointAddress[PointCount];
+ // int32_t StackFrameSize; // in words
+ // int32_t StackArity;
+ // int32_t LiveCount;
+ // int32_t LiveOffsets[LiveCount];
// } __gcmap_<FUNCTIONNAME>;
// Align to address width.
- AP.EmitAlignment(AddressAlignLog);
-
- // Emit the symbol by which the stack map entry can be found.
- std::string Symbol;
- Symbol += TAI.getGlobalPrefix();
- Symbol += "__gcmap_";
- Symbol += MD.getFunction().getName();
- if (const char *GlobalDirective = TAI.getGlobalDirective())
- OS << GlobalDirective << Symbol << "\n";
- OS << TAI.getGlobalPrefix() << Symbol << ":\n";
+ AP.EmitAlignment(IntPtrSize == 4 ? 2 : 3);
// Emit PointCount.
+ OS.AddComment("safe point count");
AP.EmitInt32(MD.size());
- AP.EOL("safe point count");
// And each safe point...
for (GCFunctionInfo::iterator PI = MD.begin(),
- PE = MD.end(); PI != PE; ++PI) {
- // Align to address width.
- AP.EmitAlignment(AddressAlignLog);
-
+ PE = MD.end(); PI != PE; ++PI) {
// Emit the address of the safe point.
- OS << AddressDirective
- << TAI.getPrivateGlobalPrefix() << "label" << PI->Num;
- AP.EOL("safe point address");
-
- // Emit the stack frame size.
- AP.EmitInt32(MD.getFrameSize());
- AP.EOL("stack frame size");
-
- // Emit the number of live roots in the function.
- AP.EmitInt32(MD.live_size(PI));
- AP.EOL("live root count");
-
- // And for each live root...
- for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
- LE = MD.live_end(PI);
- LI != LE; ++LI) {
- // Print its offset within the stack frame.
- AP.EmitInt32(LI->StackOffset);
- AP.EOL("stack offset");
- }
+ OS.AddComment("safe point address");
+ MCSymbol *Label = PI->Label;
+ AP.EmitLabelPlusOffset(Label/*Hi*/, 0/*Offset*/, 4/*Size*/);
+ }
+
+ // Stack information never change in safe points! Only print info from the
+ // first call-site.
+ GCFunctionInfo::iterator PI = MD.begin();
+
+ // Emit the stack frame size.
+ OS.AddComment("stack frame size (in words)");
+ AP.EmitInt32(MD.getFrameSize() / IntPtrSize);
+
+ // Emit stack arity, i.e. the number of stacked arguments.
+ unsigned RegisteredArgs = IntPtrSize == 4 ? 5 : 6;
+ unsigned StackArity = MD.getFunction().arg_size() > RegisteredArgs ?
+ MD.getFunction().arg_size() - RegisteredArgs : 0;
+ OS.AddComment("stack arity");
+ AP.EmitInt32(StackArity);
+
+ // Emit the number of live roots in the function.
+ OS.AddComment("live root count");
+ AP.EmitInt32(MD.live_size(PI));
+
+ // And for each live root...
+ for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
+ LE = MD.live_end(PI);
+ LI != LE; ++LI) {
+ // Emit live root's offset within the stack frame.
+ OS.AddComment("stack index (offset / wordsize)");
+ AP.EmitInt32(LI->StackOffset);
}
}
}
[Henderson2002] `Accurate Garbage Collection in an Uncooperative Environment
<http://citeseer.ist.psu.edu/henderson02accurate.html>`__
-