Buffer Overflow Basics: A Simple Guide to Understanding Vulnerabilities
Introduction
Have you ever wondered how seemingly harmless input can compromise a program's security? Buffer overflows have become a notorious weapon in the hands of attackers, allowing them to manipulate memory and execute malicious code. In this blog post, we will explore what a buffer overflow is, how it occurs, and why it poses such a significant threat in the world of cybersecurity.
Definition: A buffer overflow occurs when a program writes data into a buffer beyond the buffer’s allocated size, overwriting adjacent memory.
Importance: Understanding buffer overflows helps you prevent people with malicious intent from exploiting your code, and it makes you a better programmer in general.
What is Buffer Overflow?
Technical Definition: A buffer overflow happens when, because of inadequate bounds checking, more data is written to a buffer than it can hold, and the excess alters the values at memory addresses next to the destination buffer.
Example: Consider a buffer of size 5 bytes. Now imagine you copy the string “USER12AB” into it. The string is 8 bytes long, but the buffer's capacity is only 5, so the first 5 bytes (“USER1”) land in the buffer and the remaining 3 bytes (“2AB”) are written into the memory adjacent to it. If the string is copied as a C string, its terminating null byte is written after them as well. (See figure below)
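Memory Layout: A buffer can live on the stack or on the heap. The stack holds each function's local variables along with bookkeeping data such as saved return addresses, and it grows and shrinks automatically as functions are called and return. The heap holds memory that the program allocates at run time (for example with malloc). An overflow can corrupt neighbouring data in either region.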
Common causes of Buffer Overflow
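Unsafe Functions: Common C/C++ functions such as strcpy and sprintf write as much data as they are given without checking the size of the destination, which leads to overflows.
Lack of Bounds Checking: If a program does not validate the length of its input before copying it, nothing stops oversized input from running past the end of the buffer.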
Example: A Simple Vulnerable Program
// FILE NAME: buffer.c
#include <stdio.h>
#include <string.h>

void vulnerable_function(char* input) {
    char buffer[10];       // Create buffer of 10 characters
    strcpy(buffer, input); // Vulnerable to overflow
}

int main(int argc, char* argv[]) {
    if (argc > 1) {
        vulnerable_function(argv[1]);
    }
    return 0;
}
The output of the above code (compiled w/ GCC):
The flaw in this program lies in the use of strcpy, which does not check the length of the input. The buffer holds at most 9 characters plus the terminating null byte, so an input of 10 or more characters overflows it, leading to unexpected behavior or crashes.
If an attacker provides a long input string, it can overwrite adjacent memory, potentially crashing the program or allowing malicious actions to occur.
Fortunately, many compilers, including GCC, provide a security feature called the Stack Smashing Protector (SSP), which GCC enables with options such as -fstack-protector. It adds extra checks to the compiled code that detect when a buffer on the stack has overflowed and then abort the program, which protects against attacks that try to exploit these vulnerabilities.
How Does Buffer Overflow Work? An Example
Declaring a buffer
In programming languages like C, you might declare an array with a fixed size. For instance:
char buffer[10]; // A char array with size 10
Writing Data to the buffer
When you write data to the buffer using an unsafe function such as strcpy, the function assumes that there is enough allocated space for the data given. Example:
strcpy(buffer, "Hello!"); // 6 characters + null terminator = 7 bytes, so this fits into the buffer
Writing Data that exceeds the buffer’s fixed size
If you try to write more data than the buffer can hold, like this:
strcpy(buffer, "Hello this is a long string"); // The string being copied is larger than the allocated size of the buffer
strcpy continues writing data beyond the end of the buffer, leading to a buffer overflow. The extra data that doesn’t fit into the buffer overwrites adjacent memory locations. These can include other variables, function return addresses, or even critical data used by the program.
What happens next?
Overwriting critical data can lead to:
Unexpected behaviour: The program may crash or give incorrect results.
Security vulnerabilities: An attacker may exploit the overflow to execute arbitrary code by overwriting the return address of a function.
Example: Consider a vulnerable program that contains a buffer overflow. An attacker might input a payload that looks like this:
AAAAAAAA...AAAABBBBCCCC
Here, "AAAAAAAA" fills the buffer, "BBBB" overwrites the return address, and "CCCC" represents the shellcode that will be executed.
NOTE: This is a simplified explanation, not an accurate representation of how a return address overwrite actually works.
Defenses Against Buffer Overflow Attacks
There are many defenses you can use. Some of them are:
Stack Canaries: Compilers insert special values, called canaries, into stack frames. If the value of a canary has been modified, it suggests that a buffer overflow has taken place.
ASLR (Address Space Layout Randomization): It's a memory protection process that makes buffer overflows harder to exploit. It works by randomizing the memory locations at which executables are loaded, so an attacker cannot rely on fixed addresses.
Data Execution Prevention (DEP): It's a policy enforced by the operating system that prevents execution of code in memory areas marked as non-executable, which stops attackers from running a payload they have written into those areas.
Further Reading and Resources
Some online resources that you can refer to for more details on buffer overflows and exploitation:
CTF Handbook: https://ctf101.org/binary-exploitation/buffer-overflow/
GeeksforGeeks: https://www.geeksforgeeks.org/buffer-overflow-attack-with-example/
Books:
The Shellcoder's Handbook by Chris Anley et al.
Hacking: The Art of Exploitation by Jon Erickson.