LLVM Bitcode File Format
  1. Abstract
  2. Overview
  3. Bitstream Format
    1. Magic Numbers
  4. LLVM IR Encoding

Written by Chris Lattner.

Abstract

This document describes the LLVM bitstream file format and the encoding of the LLVM IR into it.

Overview

What is commonly known as the LLVM bitcode file format (also sometimes, anachronistically, called bytecode) is actually two things: a bitstream container format and an encoding of LLVM IR into the container format.

The bitstream format is an abstract encoding of structured data, similar to XML in some ways. Like XML, bitstream files contain tags and nested structures, and you can parse the file without having to understand the tags. Unlike XML, the bitstream format is a binary encoding, and it provides a mechanism for the file to self-describe "abbreviations", which are effectively size optimizations for the content.

This document first describes the LLVM bitstream format, then describes the record structure used by LLVM IR files.

Bitstream Format

The bitstream format is literally a stream of bits, with a very simple structure. This structure consists of the following concepts:

Note that the llvm-bcanalyzer tool can be used to dump and inspect arbitrary bitstreams, which is very useful for understanding the encoding.
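
To make the "stream of bits" idea concrete, here is a minimal sketch of a reader that pulls fixed-width unsigned fields out of a byte buffer, ignoring byte boundaries. The class name BitReader, its method names, and the choice to consume the least-significant bit of each byte first are assumptions made for illustration; this section does not specify them, and the sketch is not the LLVM reader.

```python
class BitReader:
    """Reads fixed-width unsigned fields from a byte buffer.

    Assumption for this sketch: within each byte, the least-significant
    bit is consumed first.
    """

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, measured in bits

    def read(self, width: int) -> int:
        """Return the next `width` bits as an unsigned integer."""
        if self.pos + width > len(self.data) * 8:
            raise EOFError("read past end of bitstream")
        value = 0
        for i in range(width):
            byte = self.data[self.pos >> 3]          # byte holding the bit
            bit = (byte >> (self.pos & 7)) & 1       # bit within that byte
            value |= bit << i                        # first bit read is lowest
            self.pos += 1
        return value


reader = BitReader(bytes([0b10110100, 0x01]))
first = reader.read(3)   # low three bits of the first byte
second = reader.read(7)  # spans the boundary between the two bytes
```

Fields need not align to bytes: the second read above takes five bits from the first byte and two from the next. A real reader would likely fetch whole words rather than loop bit by bit; the loop is kept here only for clarity.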

Magic Numbers

LLVM

Well-Formedness

blah

LLVM IR Encoding


Chris Lattner
The LLVM Compiler Infrastructure