Reverse Engineering 101: Cipher

On June 12th, I posted a simple challenge on the 0x00sec forum. This challenge is aimed at beginners who are just starting to explore programming and reverse engineering. You can find the challenge in the ReverseMe section above.

In this writeup, I’ll show you how to work through this simple challenge using Ghidra and basic reverse engineering techniques to uncover the key and decrypt the content. The challenge title gave the first hint: some form of cipher is involved. With that clue in mind, let’s start.

Need To Know

Basic programming principles and assembly are your first step. You don’t need to be an expert, just familiar enough with registers, the stack, the heap, and pointers to follow along. I’ll break things down as we tackle this challenge, aiming for clarity.

To return the binary to its original executable state, grab the base64-encoded dump and paste it into a file named foo.txt. Then, run this command:

$ cat foo.txt | base64 -d | gunzip > foo.elf && chmod +x foo.elf
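If you prefer to stay in Python, the same two steps (base64 decoding, then gzip decompression) can be sketched as follows. The function names restore_binary and write_executable are hypothetical helpers introduced for this example; only the file names foo.txt and foo.elf come from the command above.

```python
import base64
import gzip
import os
import stat


def restore_binary(encoded_text: bytes) -> bytes:
    """Undo the two layers applied to the dump: base64 first, then gzip."""
    compressed = base64.b64decode(encoded_text)
    return gzip.decompress(compressed)


def write_executable(path: str, data: bytes) -> None:
    """Write the bytes to disk and set the executable bits (like chmod +x)."""
    with open(path, "wb") as handle:
        handle.write(data)
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

A typical call would read foo.txt in binary mode, pass its contents to restore_binary, and hand the result to write_executable with the path foo.elf.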

Now that you have your binary, let’s gather some information on it. While this step may not be strictly necessary, it’s always helpful to collect as much data as possible about the binary.

You’ll want to extract the hash and run it through multi-engine scanners; tools like VirusTotal (VT) are your friends here. Also take a snapshot of your analysis environment beforehand, especially if you’re diving into dynamic analysis.
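As a minimal sketch of the hashing step, the snippet below computes a SHA-256 digest that you could paste into a scanner’s search box. The helper name sha256_of_file is an assumption made for this example, and SHA-256 is just one common choice of hash.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file("foo.elf") would give the value to look up.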

Now, let’s check out what we’re working with:

$ file foo.elf
foo.elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b42d557c4fb661b1a1ded313a1075f73c99f9aa1, for GNU/Linux 3.2.0, stripped

From this, we can tell it’s an ELF (Executable and Linkable Format) binary designed for 64-bit systems. It relies on external libraries and functions that load at runtime, and the “stripped” label indicates that the binary lacks debugging information and symbols.

$ strings foo.elf

[]A\A]A^A_
Welcome to the challenge!
Enter the password:
# you may also want to use `readelf -a` 

For those unfamiliar with the concept, strings are sequences of human-readable characters embedded in a file, and the strings utility prints every such sequence it finds. Much of the output is noise, but among it are lines showing that the challenge does ask for a password, which we’ll need to discover.
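To make the idea concrete, here is a rough Python approximation of what the strings utility does: it scans the raw bytes for runs of printable ASCII characters of a minimum length. The function name printable_strings is hypothetical, and the real tool has more options and handles more encodings than this sketch.

```python
import re


def printable_strings(data: bytes, min_length: int = 4) -> list:
    """Return every run of at least min_length printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


sample = b"\x00\x01Welcome to the challenge!\x00ab\x00Enter the password: \x00"
for text in printable_strings(sample):
    print(text)
```

The two-character run "ab" in the sample is skipped because it is shorter than the minimum length.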

Reversing with Ghidra

Ghidra is an open-source reverse engineering tool designed for binary analysis. We chose it for its popularity and robust feature set. To import foo.elf, follow the standard procedure: create a new project, import the binary file, and click the dragon icon to open it in the CodeBrowser. After the “Import Results Summary” dialog, Ghidra asks whether to analyze the file; make sure to click Yes. The default analysis options are all we need.

We navigated to the functions folder, where functions are typically labeled as FUN_00101170 due to the absence of symbol information caused by the stripped binary. While we can adjust these labels later, our analysis began by examining the entry function. This function marks the starting point of program execution. It contains initial code responsible for setting up the program’s environment, initializing variables, and executing any necessary setup tasks before the main logic of the program unfolds. Additionally, it can feature important control flow mechanisms such as branching instructions, function calls, and conditionals, guiding the program’s behavior throughout execution.


Furthermore, clicking on a function displays a C-like rendering of its code in the Decompile window, a feature known as the decompiler view, which greatly enhances readability.

In the middle section of Ghidra’s interface, you’ll find the assembly code of the binary. Clicking on a line of code displays the address of the current line. To locate the base address of the program, navigate to the top address within the “Program Tree” window.

ELF

Additionally, Ghidra shows the ELF header details, including key metadata such as the ELF magic number (7F 45 4C 46, that is, the byte 0x7F followed by “ELF”) and architecture information. For instance, the e_machine field confirms that this binary is built for the x86-64 architecture (3Eh), and the e_entry field provides the entry point of the program, located at address 0x1140. This is the address where the program’s execution begins.
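The same header fields can be read without Ghidra. The sketch below unpacks the start of a 64-bit little-endian ELF header with Python’s struct module; parse_elf64_header is a hypothetical helper, and the sample header is synthetic, built only from the values mentioned above (magic, 3Eh, 0x1140) rather than read from foo.elf.

```python
import struct


def parse_elf64_header(header: bytes) -> dict:
    """Extract a few fields from the start of a 64-bit little-endian ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # After the 16-byte e_ident come e_type (2 bytes), e_machine (2 bytes),
    # e_version (4 bytes) and e_entry (8 bytes).
    e_type, e_machine, e_version, e_entry = struct.unpack_from("<HHIQ", header, 16)
    return {"e_type": e_type, "e_machine": e_machine, "e_entry": e_entry}


# Synthetic header for illustration only.
fake_header = (
    b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    + struct.pack("<HHIQ", 3, 0x3E, 1, 0x1140)
)
fields = parse_elf64_header(fake_header)
print(hex(fields["e_machine"]), hex(fields["e_entry"]))
```

On the real binary you would pass the first 64 bytes of foo.elf instead of fake_header.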

In the listings that follow, the column next to each address shows the raw bytes in hexadecimal. Multi-byte values are stored in little-endian order, so a 16-bit value such as 3h appears as 03 00.
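A quick way to convince yourself of the byte order is to let Python do the conversion in both directions. This is only an illustration of little-endian storage, not something taken from the binary:

```python
value = 0x3

# Store the value as a 16-bit little-endian quantity: low byte first.
stored = value.to_bytes(2, byteorder="little")
print(stored.hex(" "))  # 03 00

# Reading the two bytes back in the same byte order recovers the value.
recovered = int.from_bytes(bytes.fromhex("0300"), byteorder="little")
print(hex(recovered))  # 0x3
```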

        0010122d 53              PUSH       RBX
        0010122e 48 83 ec 40     SUB        RSP,0x40
        00101232 48 c7 44        MOV        qword ptr [RSP + local_10],0x14

Adjacent to the bytes, you’ll find the corresponding assembly instructions, such as PUSH, along with their operands. Some lines may also reference functions and subfunctions. Keeping track of these details and references is essential during the analysis process.

As we go through the disassembly, we notice a few key functions: FUN_00101229, FUN_00101790, and FUN_00101800. Here’s the processEntry function that kicks off our exploration:

void processEntry(undefined8 param_1, undefined8 param_2)
{
  undefined auStack_8 [8];
  
  __libc_start_main(FUN_00101229, param_2, &stack0x00000008, FUN_00101790, FUN_00101800, param_1, auStack_8);
  do {
  } while (true);
}

Alright, let’s zero in on FUN_00101229 by clicking on it. It is passed as the first argument to __libc_start_main, which makes it the program’s main function. Here’s what’s inside, with comments added for better understanding:

undefined8 FUN_00101229(void)
{
  char *pcVar1;
  int iVar2;
  __ssize_t _Var3;
  undefined8 uVar4;
  size_t sVar5;
  undefined8 local_48;
  undefined8 local_40;
  undefined8 local_38;
  undefined8 local_30;
  undefined8 local_28;
  undefined2 local_20;
  char *local_18;
  size_t local_10;
  
  // Allocate memory for a character array of size 0x14 (20 bytes)
  local_10 = 0x14;
  local_18 = (char *)malloc(0x14);

  // Display a welcome message
  puts("Welcome to the challenge!");
  printf("Enter the password: ");

  // Read user input from stdin into local_18
  _Var3 = getline(&local_18, &local_10, stdin);
  pcVar1 = local_18;

  // Check if getline encountered an error
  if (_Var3 == -1) {
    free(local_18);
    uVar4 = 1;
  }
  else {
    // Remove the newline character from the user input
    sVar5 = strcspn(local_18, "\n");
    pcVar1[sVar5] = '\0';

    // Call a function (FUN_001013ba) to perform some checks on the user input
    iVar2 = FUN_001013ba(local_18);

    // If the function returns 0, display a message
    if (iVar2 == 0) {
      local_48 = 0x65646f6320656874;
      local_40 = 0x6f636e6920736920;
      local_38 = 0x50202e7463657272;
      local_30 = 0x727420657361656c;
      local_28 = 0x2e6e696167612079;
      local_20 = 10;
      printf("%s", &local_48);
    }
    else {
      // Otherwise, call another function (FUN_001014d4) to handle the password
      FUN_001014d4();
    }

    // Free the allocated memory for local_18
    free(local_18);
    uVar4 = 0;
  }
  return uVar4;
}

This function begins by allocating memory for user input as a character array. After displaying a welcome message and prompting the user for a password, it reads the input and processes it. If an error occurs during input handling, the function frees the allocated memory and returns an error code.

If the input is valid, it calls another function for further validation. If validation returns 0, an error message is displayed; otherwise, another function is invoked, likely to handle the correct password scenario. Finally, the function cleans up and returns an appropriate code.

This is simple: we can just follow where the validation logic leads us. The function FUN_00101229 performs a basic check on the user input, and depending on the result, it either shows a message or calls another function. Specifically, the logic branches at the call to FUN_001013ba, which is where the input validation occurs. This function is most likely checking if the entered password matches a pre-defined value or some specific condition.

If FUN_001013ba returns 0, the function proceeds to print what appears to be a coded message using variables like local_48, local_40, etc. These variables hold the ASCII bytes of a string packed into 64-bit integers, and when printed, they display a specific message. From the values, we can infer it’s a failure or error message shown when validation fails.
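We can check that inference by decoding the constants ourselves. The sketch below assumes the locals sit next to each other on the stack in the order local_48 through local_20 (their names, which Ghidra derives from stack offsets, suggest they are 8 bytes apart) and that each value is stored little-endian:

```python
import struct

# Values copied from the decompiled FUN_00101229, in stack order.
qwords = [
    0x65646f6320656874,  # local_48
    0x6f636e6920736920,  # local_40
    0x50202e7463657272,  # local_38
    0x727420657361656c,  # local_30
    0x2e6e696167612079,  # local_28
]
trailing_word = 10  # local_20: a newline followed by a zero byte

raw = b"".join(struct.pack("<Q", value) for value in qwords)
raw += struct.pack("<H", trailing_word)

# printf("%s", ...) stops at the first zero byte, so we do the same.
message = raw.split(b"\x00", 1)[0].decode("ascii")
print(repr(message))
```

The result is the text "the code is incorrect. Please try again." followed by a newline, which confirms that this branch is the failure path.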

On the other hand, if FUN_001013ba does not return 0, FUN_001014d4 is called, which might handle successful input (i.e., when the password is correct). So, let’s check FUN_001014d4. This function constructs strings on the stack at runtime, cleverly obscuring string data within the program. That’s why the strings weren’t visible when we used the strings utility earlier: they are dynamically built when the function executes.

This suggests that FUN_001014d4 likely handles the successful password scenario by constructing a message or flag that is shown to the user if the correct password is entered. Since the strings are dynamically built on the stack, key parts of the message are only revealed during the function’s execution.

This is what the pattern of constructing strings on the stack at runtime looks like in the disassembly:

        001014dc c7 44 24        MOV        dword ptr [RSP + local_c],0x0
        001014e4 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]
        001014eb 89 54 24 2c     MOV        dword ptr [RSP + local_c],EDX
        001014f1 c6 04 04 47     MOV        byte ptr [RSP + RAX*0x1],0x47
        001014f5 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]
        001014fc 89 54 24 2c     MOV        dword ptr [RSP + local_c],EDX
        00101502 c6 04 04 6f     MOV        byte ptr [RSP + RAX*0x1],0x6f
        00101506 8b 44 24 2c     MOV        EAX,dword ptr [RSP + local_c]

In this snippet, the program constructs a string byte by byte on the stack, writing one character at a time (0x47 is 'G', 0x6f is 'o'). This is why a plain scan such as strings cannot find them: the characters are embedded in instructions as immediates, and the string only exists in memory during execution.
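Because every character is written by an instruction with the same encoding (c6 04 04 followed by the character, as in the listing), a naive recovery is possible by scanning the machine code for that byte pattern. The sketch below is an assumption-laden shortcut: recover_stack_string is a hypothetical helper, the sample bytes are assembled by hand from the instruction forms in the listing, and a blind byte scan like this can produce false positives where a real disassembler would not.

```python
def recover_stack_string(code: bytes) -> str:
    """Collect the immediates of 'mov byte ptr [rsp + rax*1], imm8' instructions."""
    marker = b"\xc6\x04\x04"  # opcode, ModRM and SIB bytes of the instruction
    chars = []
    index = code.find(marker)
    while index != -1 and index + 3 < len(code):
        chars.append(chr(code[index + 3]))  # the byte after the marker is imm8
        index = code.find(marker, index + 4)
    return "".join(chars)


# Hand-built sample: the loads (8b 44 24 2c) are interleaved as in the listing.
sample_code = bytes.fromhex(
    "8b44242c c6040447 8b44242c c604046f 8b44242c c604046f 8b44242c c6040464"
)
print(recover_stack_string(sample_code))
```

For the sample bytes this prints the first word of the message, "Good".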

To make sense of these stack strings, you can update Ghidra’s stack frame description by changing the variable type from undefined to char[n], where n is the length of the constructed string. For example, using char[38] here gives:

void FUN_001014d4(void)
{
  // Define a character array to store the decoded stack string
  char decodedString[38];
  
  decodedString[0] = 'G';
  decodedString[1] = 'o';
  decodedString[2] = 'o';
  decodedString[3] = 'd';
  decodedString[4] = ' ';
  decodedString[5] = 'j';
  decodedString[6] = 'o';
  decodedString[7] = 'b';
  decodedString[8] = ' ';
...
  // Print the decoded stack string
  printf("%s", decodedString); // Good job on decrypting the password!
  return;
}

This function reveals that the message "Good job on decrypting the password!" is displayed when the correct password is entered.

The function FUN_001014d4 constructs this success message on the stack and prints it, confirming that the challenge revolves around finding and entering the correct password. The wording of the message also tells us that the challenge involves some kind of decryption or cipher mechanism: its core is recovering a specific key or password, which the challenge expects as input.

Alright, let’s jump into FUN_001013ba and see what’s going on. At first glance, it performs the password check: it makes a copy of a predefined string stored in local_31, passes that copy to FUN_0010135f, and then loops over the input buffer at param_1, comparing it character by character with the result. Judging by the local_11 flag, it appears to return 1 on a complete match (success) and 0 otherwise (failure).

The encrypted password is stored as the 64-bit constant 0xd1a0c0d1a091a0d:

undefined FUN_001013ba(long param_1)

{
  local_31 = 0xd1a0c0d1a091a0d;
  local_29 = 0;
  local_12 = 0x7f;
  local_20 = strlen((char *)&local_31);
  local_28 = (char *)malloc(local_20 + 1);
  strcpy(local_28,(char *)&local_31);
  FUN_0010135f(local_28,local_20,local_12);
  local_11 = 1;
  local_c = 0;
  local_10 = 0;
  while ((*(char *)(local_c + param_1) != '\0' && ((ulong)(long)local_10 < local_20))) {
    if (*(char *)(local_c + param_1) == local_28[local_10]) {
      local_10 = local_10 + 1;
    }
    else {
      local_11 = 0;
      local_10 = 0;
    }
    local_c = local_c + 1;
  }
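To get a feel for the comparison loop, here is a Python model of just the portion visible above. The names visible_loop, flag, i, and j are hypothetical stand-ins for the function, local_11, local_c, and local_10. The listing stops before the function’s return statement, so how these values turn into the return code is not shown and this sketch does not guess at it.

```python
def visible_loop(user_input: bytes, decoded: bytes):
    """Model of the while loop in FUN_001013ba; returns (flag, matched_count)."""
    flag = 1  # local_11
    i = 0     # local_c: index into the user input
    j = 0     # local_10: index into the decoded reference string
    while i < len(user_input) and user_input[i] != 0 and j < len(decoded):
        if user_input[i] == decoded[j]:
            j += 1
        else:
            flag = 0  # a single mismatch clears the flag for good
            j = 0     # and restarts matching from the first character
        i += 1
    return flag, j


print(visible_loop(b"reverser", b"reverser"))
print(visible_loop(b"rexerser", b"reverser"))
```

The first call keeps the flag set and matches all eight characters; the second clears the flag at the first mismatch.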

In simpler terms, the function handles the password check, and the decryption itself happens in FUN_0010135f, where we can observe an XOR pattern:

void FUN_0010135f(long param_1,ulong param_2,byte param_3)

{
  ulong local_8;
  
  for (local_8 = 0; local_8 < param_2; local_8 = local_8 + 1) {
    *(byte *)(local_8 + param_1) = *(byte *)(local_8 + param_1) ^ param_3;
  }
  return;
}

This function iterates over the length of the string and performs a bitwise XOR operation with a provided key.
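A useful property of XOR is that applying the same key twice gives back the original data, so the routine that encrypts also decrypts. A minimal Python equivalent of the loop, with xor_bytes as a hypothetical name, shows this:

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(byte ^ key for byte in data)


plain = b"hello"
scrambled = xor_bytes(plain, 0x7F)
restored = xor_bytes(scrambled, 0x7F)
print(scrambled.hex(" "), restored)
```

This is why we do not need a separate decryption routine later: running the same XOR with the same key is enough.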

The remaining task is to decrypt the password using the key provided by the author (me). Here’s how the program operates: it reads the user’s input, XOR-decodes the stored password, compares the input against the decoded string, and displays a congratulatory message on a successful match.

The objective of the challenge is to uncover the correct password, which can be decrypted using a specific XOR key. Once the password is successfully decrypted, it will match the predefined reference string. When this occurs, the program acknowledges the user’s success by congratulating them on decrypting the password.

Now, the next step is to decrypt the password using the XOR key set to 0x7F. The encrypted password, referred to as ec, is in hexadecimal format:

  local_31 = 0xd1a0c0d1a091a0d;
  local_29 = 0;
  local_12 = 0x7f; // This is the key used for XOR encryption 

In this context, ec represents the encrypted password. The XOR key is simply a value used to transform the original data into an unreadable format. By looking at the assembly code, we can see how ec is stored in memory:

        001013c7 48 b8 0d        MOV        RAX,0xd1a0c0d1a091a0d
                 1a 09 1a 
                 0d 0c 1a 0d

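Breaking down ec, we can see it consists of eight bytes: 0x0d, 0x1a, 0x09, 0x1a, 0x0d, 0x0c, 0x1a, and 0x0d. This is the order in which they sit in memory (x86-64 is little-endian), and it matches the instruction bytes shown above. Each of these bytes needs to be decrypted using the XOR operation with the key 0x7F.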
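Rather than splitting the constant by hand, you can let Python produce the in-memory byte order. This small sketch only uses the constant from the decompiled code; the variable names are made up for the example:

```python
constant = 0x0d1a0c0d1a091a0d  # value assigned to local_31

# Little-endian storage puts the least significant byte first.
in_memory = constant.to_bytes(8, byteorder="little")
print([f"0x{byte:02x}" for byte in in_memory])
```

The printed list is the same byte sequence as the one read off the instruction encoding.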

To make this process easier, we can write a simple Python script. This script will create a list called decrypted_code to hold the decrypted bytes as we go through each byte in the ec list. For each byte, we will apply the XOR operation with the key and store the result in our list.

ec = [0x0d, 0x1a, 0x09, 0x1a, 0x0d, 0x0c, 0x1a, 0x0d]  # Encrypted bytes
key = 0x7F  # XOR key

decrypted_code = []  # This will hold our decrypted bytes
for byte in ec:  # Go through each byte in the encrypted list
    decrypted_code.append(byte ^ key)  # XOR each byte with the key

# Convert the decrypted bytes back to a string
special_code = ''.join(chr(byte) for byte in decrypted_code)
print(special_code)  # Output the decrypted password

In this script, after we iterate through all the bytes, we convert the decrypted bytes back into a string using chr(byte). We join these characters together to form the final string.

The essence of this script is that it reverses the XOR operation that was applied during encryption, using the specified key to reveal the original password.

When we run this script:

$ python3 DEC.py
reverser  # This is the password

$ ./foo.elf
Welcome to the challenge!
Enter the password: reverser
Good job on decrypting the password!
$

Another approach, which skips the time spent going through each function, is to simply disassemble the binary and analyze it directly. Using a tool like objdump, we can disassemble the executable and examine the raw assembly code.

Since we know the challenge is about a cipher, we can focus on areas like the password comparison logic and the XOR operation. By filtering the disassembled output, we can quickly pinpoint sections of the code that handle the encryption and password validation processes.

objdump -d foo.elf | grep -E -A 10 -B 10 'xor|mov|call|cmp|j[ne]|push|pop'

This command helps us extract key instructions related to data movement, comparisons, and conditional jumps, which are pivotal for understanding how the program flows and makes decisions. Alternatively, just head to the main function and follow the call to the check:

13c2: 48 89 7c 24 08               	movq	%rdi, 8(%rsp)
13c7: 48 b8 0d 1a 09 1a 0d 0c 1a 0d	movabsq	$944080322298452493, %rax
13d1: 48 89 44 24 17               	movq	%rax, 23(%rsp)
13d6: c6 44 24 1f 00               	movb	$0, 31(%rsp)
13db: c6 44 24 36 7f               	movb	$127, 54(%rsp)

Here, we can see how the encrypted value and the XOR key are set up in memory. The encrypted password is loaded into a register, while the key (0x7F) is stored at a specific stack location.

The first instruction moves the value in the %rdi register, which holds the function’s first argument (the user input), to a location on the stack. The movabsq instruction then loads the 64-bit immediate 944080322298452493, which is the encrypted password 0xd1a0c0d1a091a0d written in decimal, into the %rax register.

Following that, movq %rax, 23(%rsp) stores the encrypted password on the stack for use in subsequent operations, and movb $0, 31(%rsp) writes the terminating null byte right after it. Finally, movb $127, 54(%rsp) stores the XOR key (127 is 0x7F) at its own stack location.

With this understanding, we can proceed to write a decryption script. We extract the encrypted password, derived from the hexadecimal value 0xd1a0c0d1a091a0d, and break it down into bytes: 0x0d, 0x1a, 0x09, 0x1a, 0x0d, 0x0c, 0x1a, and 0x0d. Next, we perform the XOR operation using the stored key 0x7F.
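As a sketch of that script working straight from the disassembly, the snippet below pulls the two decimal immediates out of the relevant lines with regular expressions and applies the XOR. The listing text is a trimmed copy of the two lines that matter (the movabsq and the movb that stores 127), and the regular expressions are assumptions that only fit this AT&T-style output.

```python
import re

# Two lines copied from the disassembly: the encrypted value and the key.
listing = (
    "13c7: 48 b8 0d 1a 09 1a 0d 0c 1a 0d  movabsq $944080322298452493, %rax\n"
    "13db: c6 44 24 36 7f                 movb    $127, 54(%rsp)\n"
)

encrypted_value = int(re.search(r"movabsq\s+\$(\d+)", listing).group(1))
key = int(re.search(r"movb\s+\$(\d+)", listing).group(1))

# The 64-bit immediate lands in memory least significant byte first.
encrypted = encrypted_value.to_bytes(8, byteorder="little")
password = bytes(byte ^ key for byte in encrypted).decode("ascii")
print(password)
```

Running it prints reverser, the same password recovered with the Ghidra approach.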

And just like that, you can see the difference. You could solve this challenge in a minute; it’s a simple one, so it’s somewhat obvious. However, it’s always important to take your time to understand the binary at hand. The reason I followed the first approach, using Ghidra and jumping between functions, was to familiarize myself with the tool while also introducing you to some techniques that will help you feel comfortable and make it easier to follow along. Until next time!