-.. _gep:
-
=======================================
The Often Misunderstood GEP Instruction
=======================================
When people are first confronted with the GEP instruction, they tend to relate
it to known concepts from other programming paradigms, most notably C array
indexing and field selection. GEP closely resembles C array indexing and field
-selection, however it's is a little different and this leads to the following
+selection; however, it is a little different, and this leads to the following
questions.
What is the first index of the GEP instruction?
...
munge(Array);
-In this "C" example, the front end compiler (llvm-gcc) will generate three GEP
+In this "C" example, the front end compiler (Clang) will generate three GEP
instructions for the three indices through "P" in the assignment statement. The
function argument ``P`` will be the first operand of each of these GEP
instructions. The second operand indexes through that pointer. The third
void %munge(%struct.munger_struct* %P) {
entry:
- %tmp = getelementptr %struct.munger_struct* %P, i32 1, i32 0
+ %tmp = getelementptr %struct.munger_struct, %struct.munger_struct* %P, i32 1, i32 0
%tmp5 = load i32* %tmp
- %tmp6 = getelementptr %struct.munger_struct* %P, i32 2, i32 1
+ %tmp6 = getelementptr %struct.munger_struct, %struct.munger_struct* %P, i32 2, i32 1
%tmp7 = load i32* %tmp6
%tmp8 = add i32 %tmp7, %tmp5
- %tmp9 = getelementptr %struct.munger_struct* %P, i32 0, i32 0
+ %tmp9 = getelementptr %struct.munger_struct, %struct.munger_struct* %P, i32 0, i32 0
store i32 %tmp8, i32* %tmp9
ret void
}
%MyVar = uninitialized global i32
...
- %idx1 = getelementptr i32* %MyVar, i64 0
- %idx2 = getelementptr i32* %MyVar, i64 1
- %idx3 = getelementptr i32* %MyVar, i64 2
+ %idx1 = getelementptr i32, i32* %MyVar, i64 0
+ %idx2 = getelementptr i32, i32* %MyVar, i64 1
+ %idx3 = getelementptr i32, i32* %MyVar, i64 2
These GEP instructions are simply making address computations from the base
address of ``MyVar``. They compute, as follows (using C syntax):
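The offset arithmetic behind these three GEPs can be sketched in C. This is a minimal illustration, assuming ``i32`` occupies 4 bytes (modeled here as ``int32_t``); ``gep_byte_offset`` and ``check_i32_offsets`` are hypothetical names introduced for this sketch, not LLVM APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an LLVM API: byte offset added to the base
   address by a single-index GEP over elements of elem_size bytes. */
static int64_t gep_byte_offset(size_t elem_size, int64_t index) {
    return index * (int64_t)elem_size;
}

/* Mirrors %idx1, %idx2 and %idx3, assuming i32 occupies 4 bytes. */
static void check_i32_offsets(void) {
    assert(gep_byte_offset(sizeof(int32_t), 0) == 0);
    assert(gep_byte_offset(sizeof(int32_t), 1) == 4);
    assert(gep_byte_offset(sizeof(int32_t), 2) == 8);
}
```

The sketch only multiplies sizes; it never forms or dereferences a pointer, which matches the point that these GEPs compute addresses without touching memory.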
%MyStruct = uninitialized global { float*, i32 }
...
- %idx = getelementptr { float*, i32 }* %MyStruct, i64 0, i32 1
+ %idx = getelementptr { float*, i32 }, { float*, i32 }* %MyStruct, i64 0, i32 1
The GEP above yields an ``i32*`` by indexing the ``i32`` typed field of the
structure ``%MyStruct``. When people first look at it, they wonder why the ``i64
%MyVar = uninitialized global { [40 x i32 ]* }
...
- %idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17
+ %idx = getelementptr { [40 x i32]* }, { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17
In this example, we have a global variable, ``%MyVar``, that is a pointer to a
structure containing a pointer to an array of 40 ints. The GEP instruction seems
to be accessing the 18th integer of the structure's array of ints. However, this
is actually an illegal GEP instruction. It won't compile. The reason is that the
-pointer in the structure <i>must</i> be dereferenced in order to index into the
+pointer in the structure *must* be dereferenced in order to index into the
array of 40 ints. Since the GEP instruction never accesses memory, it is
illegal.
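The difference between a structure that holds a pointer to an array and one that holds the array itself can be sketched in C. This is a rough analogy, assuming a 4-byte ``int32_t`` stands in for ``i32``; the type and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogues of the two IR struct types discussed here. */
struct Indirect { int32_t (*arr)[40]; };  /* like { [40 x i32]* } */
struct Inline   { int32_t arr[40]; };     /* like { [40 x i32] }  */

/* Inline array: the element address is the struct address plus a
   constant, so pure address arithmetic is enough. */
static size_t inline_elem_offset(size_t i) {
    return offsetof(struct Inline, arr) + i * sizeof(int32_t);
}

/* Indirect array: the stored pointer has to be read from memory before
   the element address can be formed. */
static int32_t *indirect_elem_addr(const struct Indirect *s, size_t i) {
    int32_t (*loaded)[40] = s->arr;  /* this is the memory read */
    return &(*loaded)[i];
}

static void check_struct_examples(void) {
    static int32_t backing[40];
    struct Indirect ind = { &backing };
    struct Inline inl = { {0} };
    assert(inline_elem_offset(17) == 68);
    assert((char *)&inl.arr[17] - (char *)&inl == 68);
    assert(indirect_elem_addr(&ind, 17) == &backing[17]);
}
```

In the indirect case the result depends on the value stored in the structure, not on the structure's own address, which is why a single address computation cannot reach the array element.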
.. code-block:: llvm
- %idx = getelementptr { [40 x i32]* }* %, i64 0, i32 0
+ %idx = getelementptr { [40 x i32]* }, { [40 x i32]* }* %MyVar, i64 0, i32 0
%arr = load [40 x i32]** %idx
- %idx = getelementptr [40 x i32]* %arr, i64 0, i64 17
+ %idx = getelementptr [40 x i32], [40 x i32]* %arr, i64 0, i64 17
In this case, we have to load the pointer in the structure with a load
instruction before we can index into the array. If the example were changed to:
%MyVar = uninitialized global { [40 x i32 ] }
...
- %idx = getelementptr { [40 x i32] }*, i64 0, i32 0, i64 17
+ %idx = getelementptr { [40 x i32] }, { [40 x i32] }* %MyVar, i64 0, i32 0, i64 17
then everything works fine. In this case, the structure does not contain a
pointer and the GEP instruction can index through the global variable, into the
.. code-block:: llvm
- %MyVar = global { [10 x i32 ] }
- %idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 0, i32 0, i64 1
- %idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1
+ %MyVar = global { [10 x i32] }
+ %idx1 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 0, i32 0, i64 1
+ %idx2 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 1
In this example, ``idx1`` computes the address of the second integer in the
array that is in the structure in ``%MyVar``, that is ``MyVar+4``. The type of
.. code-block:: llvm
- %MyVar = global { [10 x i32 ] }
- %idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 1, i32 0, i64 0
- %idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1
+ %MyVar = global { [10 x i32] }
+ %idx1 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 1, i32 0, i64 0
+ %idx2 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 1
In this example, the value of ``%idx1`` is ``%MyVar+40`` and its type is
``i32*``. The value of ``%idx2`` is also ``MyVar+40`` but its type is ``{ [10 x
Can I compute the distance between two objects, and add that value to one address to compute the other address?
---------------------------------------------------------------------------------------------------------------
-As with arithmetic on null, You can use GEP to compute an address that way, but
+As with arithmetic on null, you can use GEP to compute an address that way, but
you can't use that pointer to actually access the object if you do, unless the
object is managed outside of LLVM.