GOT/PLT: The Dark Art of Dynamic Linking on Linux
How your program actually calls shared library functions at runtime — the Global Offset Table, Procedure Linkage Table, lazy binding, and why understanding this machinery is critical for security and performance.
When you call printf("hello") in a C program linked against libc.so, what actually happens at the machine level? The answer involves two of the most elegant — and most exploited — data structures in systems programming: the Global Offset Table (GOT) and the Procedure Linkage Table (PLT).
Understanding GOT/PLT is not optional if you work on security, reverse engineering, performance-sensitive code, or anything that touches the ELF loader. This post tears the whole mechanism apart.
Why Dynamic Linking Needs Indirection
Static linking is simple: the linker resolves every function address at build time and patches it directly into the call instruction. But a shared library's load address is only chosen when the program starts (and ASLR randomizes it on every run), so the addresses aren't known until runtime.
The naive solution would be: at load time, scan the entire binary for every reference to a shared library symbol and patch them all. This eager patching of the code itself (text relocations, in ELF terms) is slow: a large binary might have thousands of such references. It also means every code page containing a relocation gets dirtied (written to), so that page can no longer be shared between processes.
The real solution: indirection through tables. The code never references library addresses directly. Instead, it goes through the GOT and PLT, which are small tables that the dynamic linker patches.
The Global Offset Table (GOT)
The GOT is an array of pointers in the program's writable data segment. Each entry corresponds to one external symbol (function or global variable). At runtime, the dynamic linker fills each slot with the symbol's actual address.
// Conceptual view of the GOT
// Located in .got and .got.plt sections
//
// .got.plt layout:
// GOT[0] → address of _DYNAMIC section
// GOT[1] → link_map pointer (for the dynamic linker)
// GOT[2] → address of _dl_runtime_resolve
// GOT[3] → resolved address of first external function
// GOT[4] → resolved address of second external function
// ...
You can inspect the GOT of any binary:
# Show GOT entries
$ objdump -R /usr/bin/ls
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
000000000021ef08 R_X86_64_GLOB_DAT __ctype_toupper_loc@GLIBC_2.3
000000000021ef10 R_X86_64_GLOB_DAT __ctype_b_loc@GLIBC_2.3
000000000021ef60 R_X86_64_JUMP_SLOT printf@GLIBC_2.2.5
000000000021ef68 R_X86_64_JUMP_SLOT fwrite@GLIBC_2.2.5
R_X86_64_GLOB_DAT entries are GOT slots filled in at load time, typically for global data (or for functions whose address is taken). R_X86_64_JUMP_SLOT entries are for functions called through the PLT.
The Procedure Linkage Table (PLT)
The PLT is a table of small code stubs in the .plt section, which is mapped as part of the read-only, executable text segment. Each PLT entry is a trampoline that jumps through the corresponding GOT entry.
Here's what a PLT stub looks like in x86-64:
; PLT entry for printf (PLT[1])
; Located in the .plt section
printf@plt:
jmp QWORD PTR [rip + 0x200a02] ; jump through GOT[3]
push 0x0 ; push relocation index
jmp PLT[0] ; jump to resolver stub
; PLT[0] — the resolver stub (common to all PLT entries)
PLT[0]:
push QWORD PTR [rip + 0x200a02] ; push link_map (GOT[1])
jmp QWORD PTR [rip + 0x200a04] ; jump to _dl_runtime_resolve (GOT[2])
Lazy Binding: The First Call
This is where it gets clever. With lazy binding (the default), the dynamic linker does NOT resolve all function addresses at load time. Instead:
First call to printf:
- Code calls printf@plt
- PLT stub does jmp [GOT[3]]
- But GOT[3] initially points back to the PLT stub's push 0x0 instruction (the next instruction after the jmp)
- The stub pushes the relocation index (0x0 = first function) and the link_map
- It jumps to _dl_runtime_resolve
- The resolver looks up printf in libc.so, finds its address
- It overwrites GOT[3] with the real address of printf
- It jumps to printf to complete the call
Second call to printf:
- Code calls printf@plt
- PLT stub does jmp [GOT[3]]
- GOT[3] now contains the real address → jumps directly to printf
Let's watch this happen in real time:
# Compile a simple program
$ cat test.c
#include <stdio.h>
int main(void) {
printf("first\n");
printf("second\n");
return 0;
}
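# -fno-builtin keeps GCC from rewriting printf("...\n") as puts;
# -Wl,-z,lazy forces lazy binding even if the toolchain defaults to -z now
$ gcc -o test test.c -no-pie -fno-builtin -Wl,-z,lazy
# Inspect printf@plt, break at printf, and watch the GOT entry change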
$ gdb ./test
(gdb) disas 'printf@plt'
0x401030 <printf@plt>: jmp *0x2fe2(%rip) # 0x404018 <printf@GOT>
0x401036 <printf@plt+6>: push $0x0
0x40103b <printf@plt+11>: jmp 0x401020
(gdb) x/gx 0x404018
0x404018 <printf@GOT>: 0x0000000000401036 ← points back into PLT!
(gdb) break printf
(gdb) run
Breakpoint hit...
(gdb) x/gx 0x404018
0x404018 <printf@GOT>: 0x00007ffff7e12e10 ← now points to real printf!
Why GOT/PLT Matters for Security
The GOT is writable. This is by design — the dynamic linker needs to write resolved addresses into it. But it also means an attacker who can write to arbitrary memory can overwrite a GOT entry to redirect function calls.
GOT Overwrite Attack
// Simplified GOT overwrite exploit concept
//
// Suppose a memory-corruption bug (for example an
// out-of-bounds write) lets us write a chosen value
// to a chosen address.
//
// Step 1: Find GOT address of a frequently-called function
// $ objdump -R target | grep puts
// 0x404018 R_X86_64_JUMP_SLOT puts@GLIBC_2.2.5
//
// Step 2: Find address of system()
// (leaked or calculated from libc base + offset)
//
// Step 3: Overwrite GOT[puts] with address of system()
//
// Step 4: Next time the program calls puts(user_input),
// it actually calls system(user_input)
//
// If user_input = "/bin/sh", you get a shell.
Mitigations
Several defences exist against GOT overwrites:
- RELRO (Relocation Read-Only):
  - Partial RELRO (the default on many toolchains): the .got section (for data) is read-only after relocation, but .got.plt (for functions) stays writable.
  - Full RELRO (-Wl,-z,relro,-z,now): ALL GOT entries are resolved at load time and the entire GOT is made read-only via mprotect. There is no lazy binding, so a plain memory write can no longer redirect a GOT entry.
# Check RELRO status of a binary
$ checksec --file=/usr/bin/ls
RELRO STACK CANARY NX
Full RELRO Canary found NX enabled
# Compile with full RELRO
$ gcc -o hardened test.c -Wl,-z,relro,-z,now
# Compile with no RELRO (for testing/CTF)
$ gcc -o vulnerable test.c -Wl,-z,norelro
- PIE (Position-Independent Executable): Combined with ASLR, makes GOT addresses unpredictable.
- Stack canaries, NX, CFI: Defence-in-depth layers that make exploitation harder even if GOT is writable.
Performance: GOT and the Data Cache
Every PLT call does an indirect jump through a GOT pointer. This means the GOT entry must be in the data cache (L1d) for the call to be fast. For hot functions called millions of times per second, this matters.
// The cost of a PLT call vs a direct call
// (rough, order-of-magnitude figures; real numbers depend on the
// CPU and on how well the indirect branch is predicted):
//
// Direct call (static linking):
// call printf ; ~1 cycle (direct, predicted)
//
// PLT call (dynamic linking):
// call printf@plt ; → jmp [GOT entry]
// ; indirect branch through GOT pointer
// ; ~2-3 cycles if GOT entry is in L1 cache
// ; 10+ cycles on L1 miss (L2 hit)
// ; 50+ cycles on L2 miss (L3/RAM)
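# Compare generated code with and without PLT
# (-fno-builtin stops GCC from turning printf("...\n") into puts;
#  -fPIE makes the @PLT form explicit and is the default on many distros)
# Default (with PLT):
$ gcc -S -O2 -fPIE -fno-builtin test.c -o with_plt.s
# Generates: call printf@PLT
# Without PLT:
$ gcc -S -O2 -fPIE -fno-builtin -fno-plt test.c -o no_plt.s
# Generates: call *printf@GOTPCREL(%rip)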
Writing a GOT/PLT Inspector
Here's a C program that reads its own GOT entries at runtime using dl_iterate_phdr:
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>
#include <elf.h>
#include <string.h>
#include <unistd.h>   // getpid

// Callback for dl_iterate_phdr: called once for each loaded object
static int callback(struct dl_phdr_info *info, size_t size, void *data) {
    (void)size;
    (void)data;
    const char *name = info->dlpi_name;
    if (name[0] == '\0') name = "[main executable]";
    printf("\n%s (base: %p)\n", name, (void *)info->dlpi_addr);

    // Walk program headers looking for PT_DYNAMIC
    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type != PT_DYNAMIC)
            continue;
        ElfW(Dyn) *dyn = (ElfW(Dyn) *)(
            info->dlpi_addr + info->dlpi_phdr[i].p_vaddr
        );
        ElfW(Addr) jmprel = 0, pltgot = 0;
        size_t pltrelsz = 0;
        char *strtab = NULL;
        ElfW(Sym) *symtab = NULL;

        // Parse dynamic section entries. This relies on glibc having
        // already relocated these d_ptr values to absolute addresses.
        for (; dyn->d_tag != DT_NULL; dyn++) {
            switch (dyn->d_tag) {
            case DT_JMPREL:   jmprel = dyn->d_un.d_ptr; break;
            case DT_PLTGOT:   pltgot = dyn->d_un.d_ptr; break;
            case DT_PLTRELSZ: pltrelsz = dyn->d_un.d_val; break;
            case DT_STRTAB:   strtab = (char *)dyn->d_un.d_ptr; break;
            case DT_SYMTAB:   symtab = (ElfW(Sym) *)dyn->d_un.d_ptr; break;
            }
        }
        if (!jmprel || !strtab || !symtab) break;
        printf(" GOT.PLT at %p\n", (void *)pltgot);

        // Walk JMPREL relocations (Rela entries on x86-64)
        size_t count = pltrelsz / sizeof(ElfW(Rela));
        ElfW(Rela) *rela = (ElfW(Rela) *)jmprel;
        for (size_t j = 0; j < count; j++) {
            unsigned long sym_idx = ELF64_R_SYM(rela[j].r_info);
            const char *sym_name = strtab + symtab[sym_idx].st_name;
            // r_offset is relative to the load base, so add dlpi_addr
            void **got_entry = (void **)(info->dlpi_addr + rela[j].r_offset);
            printf(" [%zu] %-30s GOT@%p → %p\n",
                   j, sym_name, (void *)got_entry, *got_entry);
        }
        break;
    }
    return 0;
}

int main(void) {
    // This first printf call triggers lazy resolution,
    // so printf's GOT entry points to the real printf from here on
    printf("=== GOT/PLT Inspector ===\n");
    dl_iterate_phdr(callback, NULL);
    // getpid is called only after the walk, so its entry is still unresolved above
    printf("\nPID: %d\n", getpid());
    return 0;
}
$ gcc -o gotinspect gotinspect.c -fno-builtin -Wl,-z,lazy
$ ./gotinspect
=== GOT/PLT Inspector ===
[main executable] (base: 0x555555554000)
GOT.PLT at 0x555555558000
[0] printf GOT@0x555555558018 → 0x7ffff7e12e10
[1] dl_iterate_phdr GOT@0x555555558020 → 0x7ffff7fce230
[2] getpid GOT@0x555555558028 → 0x555555555036
[... entries for libc.so.6 and the other loaded objects omitted ...]
PID: 12345
Notice that getpid still points back into the PLT (0x5555...) because main has not called it yet when the callback runs. After the call, the GOT entry is updated with the real address. The -Wl,-z,lazy flag matters here: if your toolchain defaults to -z now, every entry is already resolved at load time. Likewise, -fno-builtin keeps GCC from turning the banner printf into puts.
LD_BIND_NOW and LD_DEBUG
Two environment variables give you deep visibility into the dynamic linker:
# Force eager binding (resolve everything at load time)
$ LD_BIND_NOW=1 ./test
# Watch the dynamic linker resolve every symbol
$ LD_DEBUG=bindings ./test 2>&1 | head -20
12345: binding file ./test [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]:
normal symbol `printf' [GLIBC_2.2.5]
12345: binding file ./test [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]:
normal symbol `__libc_start_main' [GLIBC_2.34]
# See ALL dynamic linker activity
$ LD_DEBUG=all ./test 2>&1 | wc -l
4217 ← a simple "hello world" triggers thousands of linker events
Conclusion
The GOT/PLT mechanism is a masterwork of engineering trade-offs. It gives us shared libraries with lazy binding, ASLR compatibility, and memory savings from shared code pages — at the cost of one level of indirection per external call and a writable data section that attackers love to target.
When you type gcc -o hello hello.c, the linker generates PLT stubs, allocates GOT entries, emits relocation records, and sets up the dynamic section — all so that at runtime, the first call to printf triggers a chain reaction of symbol lookup, GOT patching, and transparent redirection that makes it look like the function was always there.
Know your GOT. Know your PLT. They are the seams of every dynamically-linked program on Linux, and understanding them gives you power over both performance and security.