Getting Your Hands Dirty: Exploiting Buffer Overflow Vulnerability In C

by Sabin

Buffer overflow vulnerabilities have existed since the early days of computing and still exist today. Various internet worms use buffer overflow vulnerabilities to propagate. Whether a program is vulnerable depends largely on how carefully the programmer handles memory, so you need to stay alert while allocating memory in C. In this blog, I briefly go through memory allocation, which plays a key role in exploiting this vulnerability.

Buffer Overflow in C

“C” is the language of UNIX. We all know C for its speed, but with great power comes great responsibility. Managing memory is vital while writing a leakage-proof program.

C is usually described as a high-level programming language, but it assumes that programmers are responsible for data integrity. If this heavy lifting were shifted over to the compiler, the resulting binary would be noticeably slower. Programmers would also have to give up much of their direct control over memory.

A buffer overflow occurs when a program writes more data into a buffer than the buffer can hold, for example twenty bytes into a buffer of just ten bytes. C does not prevent this, even though it may corrupt neighbouring data or crash the program. This is known as a buffer overrun or buffer overflow.

How it works

To see how this exploitation works, we will look at a small C program that helps us visualize the vulnerability.

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    int value = 5;
    char buffer_one[8], buffer_two[8];

    /* argv[1] is used below, so make sure it was supplied */
    if (argc < 2) {
        printf("Usage: %s <string>\n", argv[0]);
        return 1;
    }

    strcpy(buffer_one, "one");
    strcpy(buffer_two, "two");

    printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", (void *)buffer_two, buffer_two);
    printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", (void *)buffer_one, buffer_one);
    printf("[BEFORE] value is at %p and is %d (0x%08x)\n", (void *)&value, value, (unsigned int)value);

    /* strlen returns size_t, which is printed with %zu */
    printf("\n [STRCPY] copying %zu bytes into buffer_two \n\n", strlen(argv[1]));
    strcpy(buffer_two, argv[1]); /* no bounds check: this is the vulnerable copy */

    printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", (void *)buffer_two, buffer_two);
    printf("[AFTER] buffer_one is at %p and contains \'%s\'\n", (void *)buffer_one, buffer_one);
    printf("[AFTER] value is at %p and is %d (0x%08x)\n", (void *)&value, value, (unsigned int)value);
    return 0;
}

I assume you have a general understanding of pointers and strings in C. To put this into perspective, we have three variables: two 8-byte character arrays and one integer.

int value = 5; 
char buffer_one[8], buffer_two[8];

Then, we copy the strings “one” and “two” into the buffers we just declared.

strcpy(buffer_one, "one");
strcpy(buffer_two, "two");

Then, there are a bunch of print statements that print the memory address and current contents of these three variables.

printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", (void *)buffer_two, buffer_two);
printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", (void *)buffer_one, buffer_one);
printf("[BEFORE] value is at %p and is %d (0x%08x)\n", (void *)&value, value, (unsigned int)value);

The real vulnerability lies in this line:

strcpy(buffer_two, argv[1]);

Here, we copy everything in argv[1] into buffer_two regardless of its capacity. Everything works fine as long as argv[1] is shorter than eight characters (seven characters plus the terminating null byte fill the buffer exactly), but anything longer overflows the buffer. Let's compile and run this program to see the overflow in action.
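The size condition can be spelled out in code. The sketch below is only an illustration: `bytes_written_by_strcpy` and `fits_in_buffer` are hypothetical helper names that do not appear in the program above. It shows that strcpy needs room for the characters plus one terminating null byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not part of the original program.
 * strcpy writes strlen(src) characters plus one terminating '\0',
 * so the destination needs strlen(src) + 1 bytes. */
size_t bytes_written_by_strcpy(const char *src)
{
    return strlen(src) + 1;
}

/* Returns 1 if src (including its '\0') fits into a buffer of
 * `capacity` bytes, 0 otherwise. */
int fits_in_buffer(const char *src, size_t capacity)
{
    return bytes_written_by_strcpy(src) <= capacity;
}
```

For the 8-byte buffer_two, fits_in_buffer("1234567", 8) returns 1, while fits_in_buffer("12345678", 8) returns 0 because the ninth byte would be the terminator.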

Compiling

gcc -o overflow <your .c file with the code>

Running (without Overflow)

./overflow 1234

Without Overflow

The program works as expected. It copies whatever we passed as the command-line argument, i.e. 1234, to buffer_two. The interesting part occurs when we pass 9 characters. In C, every character takes one byte, and strcpy also writes a terminating null byte, so 9 characters need 10 bytes and the 8-byte buffer overflows.

Running (with Overflow)

./overflow 123456789

With Overflow

Congratulations! You are a hacker (lol).

You can see that 1, 2, 3, 4, 5, 6, 7 and 8 stay where they are, but 9 has overflowed into another memory location, namely buffer_one.

A variable can hold only its allocated size; anything beyond that spills over into the adjacent memory location.
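The spill-over can be reproduced without relying on how a compiler arranges local variables. The sketch below is a simulation under an assumed layout: one 16-byte block is split by hand into two neighbouring 8-byte regions, so every write stays inside memory the program owns. The function name `overflow_demo` is hypothetical.

```c
#include <string.h>

/* One 16-byte block stands in for two neighbouring 8-byte buffers.
 * The layout is chosen by hand for this demo (an assumption), not by
 * the compiler. The caller must pass at most 15 characters so that
 * every write stays inside `memory`. */
const char *overflow_demo(char memory[16], const char *input)
{
    char *buffer_two = memory;      /* bytes 0..7  */
    char *buffer_one = memory + 8;  /* bytes 8..15 */

    strcpy(buffer_one, "one");
    strcpy(buffer_two, input);  /* may run past byte 7 into buffer_one */

    /* Return the second region so the caller can see what spilled over */
    return buffer_one;
}
```

With the input 1234 the returned region still holds "one". With 123456789 it holds "9", and with exactly eight characters it holds an empty string, because the terminating null byte alone already lands in the first byte of buffer_one.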

This effect comes from the FILO (First In, Last Out) structure of the stack, where local variables are allocated. To dig deeper into it, we can look at memory segmentation.

Generally, a compiled program's memory is divided into five segments.

  1. Text
  2. Data
  3. BSS
  4. Heap
  5. Stack

As the program runs, the EIP (the instruction pointer, a register on 32-bit x86) is set to the first instruction in the text segment. This segment is protected, and writing to it is not permitted.

The data and BSS segments are used to store global and static variables. The data segment contains all the initialized global and static variables, whereas BSS contains all the uninitialized ones.

The heap segment is a segment of memory that the programmer can control directly. This segment is dynamic, i.e. it does not have a fixed size.

The stack segment is another segment of memory of variable size. It acts as a temporary scratch pad that stores local variables and context during function calls. The stack is made up of stack frames, and a stack segment may contain many of them. In general, the stack is an abstract data structure that follows FILO ordering.
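To get a feel for these segments, you can print one address from each of them. This is a minimal sketch with hypothetical names (`initialized_global`, `uninitialized_global`, `print_segment_addresses`). The text segment is left out because converting a function pointer to void * is not strictly portable, and the actual addresses differ between machines and runs.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* initialized: typically in the data segment */
int uninitialized_global;      /* uninitialized: typically in BSS */

/* Prints one address per segment.
 * Returns 0 on success, -1 if malloc fails. */
int print_segment_addresses(void)
{
    int local = 0;                           /* lives on the stack */
    int *dynamic = malloc(sizeof *dynamic);  /* lives on the heap */

    if (dynamic == NULL)
        return -1;

    printf("data : %p\n", (void *)&initialized_global);
    printf("bss  : %p\n", (void *)&uninitialized_global);
    printf("heap : %p\n", (void *)dynamic);
    printf("stack: %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

On many systems the stack address is far away from the other three, but that is an observation about typical layouts, not a guarantee.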

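In our previous example we saw an overflow into buffer_one. Since buffer_two was declared after buffer_one, it was placed directly before buffer_one in memory, so any overflow out of buffer_two lands in buffer_one. Keep in mind that this layout is typical rather than guaranteed: compilers are free to arrange local variables differently.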
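Whether buffer_two really sits before buffer_one depends on the compiler, its flags and the platform, so it is worth checking on your own machine. The helper below, `buffer_two_is_lower`, is a hypothetical name used only for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if buffer_two sits at a lower address than buffer_one,
 * 0 otherwise. The result depends on compiler, flags and platform. */
int buffer_two_is_lower(void)
{
    char buffer_one[8], buffer_two[8];

    /* Compare the addresses as integers rather than as pointers,
     * because the two arrays are unrelated objects. */
    uintptr_t one = (uintptr_t)(void *)buffer_one;
    uintptr_t two = (uintptr_t)(void *)buffer_two;

    printf("buffer_one at %p\n", (void *)buffer_one);
    printf("buffer_two at %p\n", (void *)buffer_two);
    return two < one;
}
```

If it returns 1, an overflow out of buffer_two moves toward buffer_one, as in the example above. If it returns 0, the overflow would hit something else.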


What is the use case?

After all this discussion about buffer overflows, you might wonder what a real use case looks like. To demonstrate one, we will write a small C program that authenticates a user using string comparison.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the password matches, 0 otherwise */
int check_authentication(char *password) {
    int auth_flag = 0;
    char password_buffer[16];

    strcpy(password_buffer, password); /* no bounds check: vulnerable copy */

    if (strcmp(password_buffer, "sabin") == 0)
        auth_flag = 1;
    if (strcmp(password_buffer, "sharma") == 0)
        auth_flag = 1;

    return auth_flag;
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <password>\n", argv[0]);
        exit(0);
    }
    if (check_authentication(argv[1])) {
        printf("\n-======================-\n");
        printf(" Access Granted.\n");
    } else {
        printf("\n Access Denied. \n");
    }
    return 0;
}

So, here we have a simple program that authenticates a user based on the string passed. The only way to get access is to know the password, i.e. “sabin” or “sharma”.

int check_authentication(char *password) {
    int auth_flag = 0;
    char password_buffer[16];

    strcpy(password_buffer, password);

    if (strcmp(password_buffer, "sabin") == 0)
        auth_flag = 1;
    if (strcmp(password_buffer, "sharma") == 0)
        auth_flag = 1;

    return auth_flag;
}

This function takes a character pointer as its parameter. First, we initialize a variable named auth_flag and set it to 0. Next, we create a variable named password_buffer with a capacity of 16 bytes. After that, we copy the string the pointer refers to into the local password_buffer. Lastly, we compare the copied string with the two valid passwords. If either matches, we set auth_flag to 1. Finally, we return auth_flag.

if (check_authentication(argv[1])) {
    printf("\n-======================-\n");
    printf(" Access Granted.\n");
} else {
    printf("\n Access Denied. \n");
}

In the main function we just call check_authentication to decide whether to grant access. If check_authentication returns any nonzero value, access is granted, meaning the entered password is treated as correct.
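It helps to remember that a C if condition is true for any nonzero value, not only for 1. The sketch below mirrors the check in main with a hypothetical helper, `access_message`, that is not part of the program above.

```c
/* Mirrors the if (...) test in main: any nonzero value counts as true,
 * including negative numbers and large "garbage" values. */
const char *access_message(int auth_flag)
{
    if (auth_flag)
        return "Access Granted.";
    return "Access Denied.";
}
```

Only access_message(0) yields "Access Denied."; values such as 1, -1 or 1000000 all yield "Access Granted.".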

This program works fine, but there is a catch. What if we pass a string much longer than the 16-byte buffer, just to create an overflow?

Compiling

gcc -o auth_overflow <your .c file with the code>

Running (without Overflow)

./auth_overflow sabin

Correct Password

Running (without Overflow)

./auth_overflow randomstring

Incorrect Password

Running (without Overflow)

./auth_overflow sharma

Correct Password

So let the hacker inside you kick in. What happens if we pass 29 characters, well beyond the 16-byte buffer?

Running (with Overflow)

./auth_overflow 1234567890ABCDEFGHIJKLMNOPQRS

As you can see, access is granted without the correct password. You might wonder how on earth this happened. It is the same FILO logic of the stack. We declared the string buffer after the integer variable, so in memory the string buffer sits before the integer variable. Hence, the value of the integer variable is overwritten by the overflowing characters. The resulting value is nonzero because the overwriting bytes are the ASCII codes of those characters.

Note: [In C, an if condition is true for any nonzero integer.]

So the overflowing value makes the if condition true and our access is granted.
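The overwrite can be simulated safely by fixing the layout by hand. In the sketch below, the struct `auth_frame` and the function `simulated_auth_flag` are hypothetical: the struct places the flag directly after the 16-byte buffer, which is an assumption made for the demo rather than something a compiler must do for local variables. The copy is capped at the size of the struct so that nothing outside it is touched.

```c
#include <string.h>

/* Hand-made layout for the demo (an assumption): the flag sits
 * directly after the 16-byte buffer. */
struct auth_frame {
    char password_buffer[16];
    int  auth_flag;
};

/* Copies the input over the struct's raw bytes, never past its end,
 * and returns the resulting auth_flag. */
int simulated_auth_flag(const char *password)
{
    struct auth_frame frame;
    size_t n = strlen(password) + 1;   /* characters plus '\0' */

    memset(&frame, 0, sizeof frame);   /* auth_flag starts at 0 */
    if (n > sizeof frame)
        n = sizeof frame;              /* keep the demo within bounds */
    memcpy(&frame, password, n);       /* bytes past index 15 land in auth_flag */

    return frame.auth_flag;
}
```

For "sabin" the flag stays 0, since only six bytes are copied. For the 29-character string used above, the characters at positions 16 onward (G, H, I, J) are written over the flag, so it becomes nonzero. The exact number depends on the byte order of the machine.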

Wait, wait, here comes the interesting part. If you understand memory segmentation, this particular bypass can be avoided by just reversing the variable declarations.

From this

int auth_flag = 0;
char password_buffer[16];

To this

char password_buffer[16];
int auth_flag = 0;

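Reversing those two lines places the integer before the character buffer in memory, assuming the compiler keeps the same ordering scheme. The overflowing characters then run away from the integer instead of over it. Note that the overflow itself still happens and can corrupt other data on the stack, so the input length should still be checked.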
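Reordering only moves the flag out of the way; the copy itself is still unbounded. As a complementary sketch (an illustration, not the original code), the variant below rejects inputs that do not fit the buffer before copying. The name `check_authentication_checked` is hypothetical.

```c
#include <string.h>

/* Length-checked variant of check_authentication: inputs that do not
 * fit the buffer are rejected before any copy takes place. */
int check_authentication_checked(const char *password)
{
    char password_buffer[16];
    int auth_flag = 0;

    /* strlen + 1 bytes are needed because of the terminating '\0' */
    if (strlen(password) >= sizeof password_buffer)
        return 0;   /* too long: deny instead of overflowing */

    strcpy(password_buffer, password);   /* safe: length verified above */

    if (strcmp(password_buffer, "sabin") == 0 ||
        strcmp(password_buffer, "sharma") == 0)
        auth_flag = 1;

    return auth_flag;
}
```

With this check, the 29-character input is denied no matter how the compiler arranges the local variables.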

Final Words

In this blog we explored what a buffer overflow is and how it can be exploited.

Gurzu is a software development company passionate about building software solutions for real business problems.