Position Independent Executables (PIEs)
=======================================

About
=====

The PIE framework is designed to allow normal C code from the kernel to be
embedded into the kernel, loaded at arbitrary addresses, and executed.

A PIE (position independent executable) is a piece of self-contained code
that can be relocated to any address. Before the code is run, a simple list
of offset based relocations has to be performed.

Copyright 2013 Texas Instruments, Inc
Russ Dill

Motivation
==========

Without the PIE framework, the only way to support platforms that require
code loaded to and run from arbitrary addresses was to write the code in
assembly. For example, a platform may have suspend/resume steps that
disable/enable SDRAM and must be run from on chip SRAM.

In addition to the SRAM virtual address not being known at compile time for
device tree platforms, the code must often run with the MMU enabled or
disabled (physical vs virtual address).

Design
======

The PIE code is separated into two main pieces. libpie satisfies various
function calls emitted by gcc. The kernel contains only one copy of libpie
but whenever a PIE is loaded, a copy of libpie is copied along with the PIE
code. The second piece is the PIE code and data marked with special PIE
sections.
At build time, libpie and the PIE sections are collected together into a
single PIE executable:

+---------------------------------------+
| __pie_common_start                    |
|                                       |
| __pie_common_end                      |
+---------------------------------------+
| __pie_overlay_start                   |
| +-----------------------------+       |
| | __pie_groupxyz_start        |       |
| |                             |       |
| | __pie_groupxyz_end          |       |
| +-----------------------------+       |
| | __pie_groupabc_start        |       |
| |                             |       |
| | __pie_groupabc_end          |       |
| +-----------------------------+       |
| | __pie_groupijk_start        |       |
| |                             |       |
| | __pie_groupijk_end          |       |
| +-----------------------------+       |
| __pie_overlay_end                     |
+---------------------------------------+
|                                       |
+---------------------------------------+

The PIE executable is then embedded into the kernel. Symbols are exported
from the PIE executable and passed back into the kernel at link time. When
the PIE is loaded, the memory layout then looks like the following:

+---------------------------------------+
|                                       |
+---------------------------------------+
|                                       |
+---------------------------------------+
| Tail (Arch specific data/relocations) |
+---------------------------------------+

The architecture specific code is responsible for reading the relocations
and performing the necessary fixups.

Marking code/data
=================

Marking code and data for inclusion into a PIE group is done with the PIE
section markers, __pie() and __pie_data(). Any symbols that will be used
outside of the PIE must be exported with EXPORT_PIE_SYMBOL:

    static struct ddr_timings xyz_timings __pie_data(platformxyz) = {
        [...]
    };

    void __pie(platformxyz) xyz_ddr_on(void *addr)
    {
        [...]
    }
    EXPORT_PIE_SYMBOL(xyz_ddr_on);

Loading PIEs
============

PIEs can be loaded into a genalloc pool (such as one backed by SRAM). The
following functions are provided:

 - pie_load_sections(pool, )
 - pie_load_sections_phys(pool, )
 - pie_free(chunk)

pie_load_sections/pie_load_sections_phys load a PIE section group into the
given pool.
Any necessary fixups are performed and a chunk identifier is returned. The
first variant performs fixups such that the code can be run with the current
address layout. The second (phys) variant performs fixups such that the code
can be executed with the MMU disabled.

The pie_free function unloads a PIE from a pool.

Utilizing PIEs
==============

In order to translate between symbols and addresses within a loaded PIE, the
following macros/functions are provided:

 - kern_to_pie(chunk, sym)
 - fn_to_pie(chunk, fn)
 - pie_to_phys(chunk, addr)

All three take as the first argument the chunk returned by pie_load_sections.

Data symbols can be translated with kern_to_pie. The macro is written so that
the type returned is the type passed:

    kern_to_pie(chunk, xyz_struct_ptr)->foo = 15;
    *kern_to_pie(chunk, &xyz_flags) = XYZ_DO_THE_THING;

Because certain architectures require special handling of function pointers,
a special variant is provided:

    ret = fn_to_pie(chunk, &xyz_ddr_on)(addr);
    fnptr = fn_to_pie(chunk, &abc_fn);

In the case that a PIE has been configured to run with the MMU disabled,
physical addresses can be translated with pie_to_phys. For instance, if the
resume ROM jumps to a given physical address:

    trampoline = fn_to_pie(chunk, resume_trampoline);
    writel(pie_to_phys(chunk, trampoline), XYZ_RESUME_ADDR_REG);

On the Fly Fixup
================

The tail portion of the PIE can be used to store data necessary to perform
on the fly fixups. This is necessary for code that needs to run from
different address spaces at different times. Any on the fly fixup support is
architecture specific.

Architecture Requirements
=========================

Individual architectures must implement two functions:

pie_arch_fill_tail - This function examines the architecture specific
relocation entries and copies the ones necessary for the given PIE.

pie_arch_fixup - This function performs fixups of the PIE code based on the
tail data generated above.

pie.lds - A linker script for the PIE executable must be provided.
include/asm-generic/pie.lds.S provides a template.

libpie.o - The architecture must also provide a library of functions that gcc
may expect as a built-in, such as memcpy, memmove, etc. The list of functions
is architecture specific.
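
The offset based fixups that pie_arch_fixup is described as performing can be
illustrated outside the kernel. The following user space sketch is not the
PIE implementation: the demo_* names, the DEMO_LINK_BASE value, and the
relocation list format (a plain array of byte offsets) are all assumptions
made for the example, and real relocation formats are architecture specific.
It copies a small image to a new address and adds the load delta to every
word named in the relocation list:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical link-time base address of the demo image (assumption). */
#define DEMO_LINK_BASE 0x1000u
#define DEMO_WORDS 4

/* A toy "PIE": words 1 and 3 hold absolute link-time addresses. */
struct demo_image {
    uintptr_t words[DEMO_WORDS];
};

/* Hypothetical relocation list: byte offsets of the words to patch. */
static const size_t demo_relocs[] = {
    1 * sizeof(uintptr_t),
    3 * sizeof(uintptr_t),
};

/* Add (run_base - link_base) to each word named in relocs. */
static void demo_fixup(void *image, size_t size, const size_t *relocs,
                       size_t nrelocs, uintptr_t link_base,
                       uintptr_t run_base)
{
    uintptr_t delta = run_base - link_base; /* unsigned wraparound is defined */
    size_t i;

    for (i = 0; i < nrelocs; i++) {
        uintptr_t word;

        /* Refuse to patch outside the image. */
        assert(relocs[i] + sizeof(word) <= size);
        memcpy(&word, (char *)image + relocs[i], sizeof(word));
        word += delta;
        memcpy((char *)image + relocs[i], &word, sizeof(word));
    }
}

/* Copy the image to a new location, fix it up, then follow the pointers. */
static int demo_load_and_check(void)
{
    struct demo_image linked = { {
        42,                                     /* plain data */
        DEMO_LINK_BASE,                         /* address of words[0] */
        7,                                      /* plain data */
        DEMO_LINK_BASE + 2 * sizeof(uintptr_t), /* address of words[2] */
    } };
    struct demo_image loaded;
    const uintptr_t *p0, *p2;

    memcpy(&loaded, &linked, sizeof(loaded));
    demo_fixup(&loaded, sizeof(loaded), demo_relocs,
               sizeof(demo_relocs) / sizeof(demo_relocs[0]),
               DEMO_LINK_BASE, (uintptr_t)&loaded);

    /* Integer to pointer conversion; assumes a flat address space. */
    p0 = (const uintptr_t *)loaded.words[1];
    p2 = (const uintptr_t *)loaded.words[3];
    return (*p0 == 42 && *p2 == 7) ? 0 : -1;
}
```

In this sketch demo_load_and_check() returns 0 when both relocated pointers
resolve to data inside the loaded copy. Passing a different run_base to
demo_fixup, such as a physical address, is roughly the distinction that the
pie_load_sections and pie_load_sections_phys variants draw.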