Robust programming to bomb-proof your code
Introduction
Secure programming is a type of coding that focuses on preventing known vulnerabilities. Robust programming takes it a step further: it emphasizes building well-structured programs that anticipate and handle potential problems, not just avoid the common pitfalls. In that sense, robust programming is the foundation for secure coding.
Robust programming is a style of programming that focuses on handling unexpected termination and unexpected actions.
It demands that invalid inputs, program terminations, and user actions be handled gracefully. This means providing clear and unambiguous error messages that are easy for the user to understand, so that users can identify and fix problems more easily.
A system's security is defined by its security policy, which outlines the authorized states the system can be in. Any deviation from these permitted states becomes a security breach. Secure programming therefore emphasizes writing code that adheres to the established security policies. These policies define the desired behavior of the system and the acceptable states it can occupy. Consider a program vulnerable to a stack-based buffer overflow. An attacker could exploit this flaw to overwrite the return address on the stack, forcing the program to execute malicious code. If this code grants the attacker elevated privileges (e.g., setuid-to-root on Linux), it violates the system's security policy by allowing unauthorized actions. In essence, secure programming practices aim to prevent such vulnerabilities and uphold the security policies that safeguard the system.
In the same program, even if the attacker's exploited code doesn't gain elevated privileges, the buffer overflow vulnerability itself remains a security concern. While it might not directly violate a policy focused solely on privilege escalation, it still creates an exploitable weakness. Secure programming principles aim to eliminate such vulnerabilities altogether, preventing attackers from gaining any unauthorized access or control, regardless of privilege levels. This highlights the broader scope of secure programming: it's not just about preventing specific policy violations, but about building robust systems resistant to various attack vectors.
Exploiting the buffer overflow without privilege escalation might not directly violate a security policy, but it certainly exposes a robustness issue. Robust code anticipates and handles unexpected inputs, including those that could trigger buffer overflows. In this scenario, a robust program would gracefully terminate execution, informing the user about the invalid input that caused the problem. This controlled termination prevents unpredictable behavior and potential system crashes, even if the attacker's goals weren't focused on privilege escalation.
Basic Principles
Robust Code: Code that focuses on preventing program crashes and unexpected behavior.
Fragile Code: Code that is susceptible to crashes and unexpected behavior caused by unhandled errors and bad inputs. It lacks error handling mechanisms and clear error messages, which makes troubleshooting difficult.
4 Basic Principles
Robust programming follows four basic principles:
Be Paranoid
Assume Stupidity
Don't Hand Out Dangerous Implements
Be Prepared For Can Happen
Be Paranoid
The core principle of defensive programming can be summarized as:
If you didn't generate it, don't trust it.
This approach acknowledges the potential for errors and unexpected behavior, both within your own code and from external sources. Defensive programmers write code with the assumption that their own work might have flaws or bugs. They employ techniques to proactively identify and mitigate these issues as early as possible.
This cautious approach extends to user input. Defensive programming treats all incoming data with suspicion, assuming it could be invalid or malicious. Code written with this philosophy includes robust checks on function calls to ensure successful execution.
Assume Stupidity
This means the code shouldn't rely on users having in-depth knowledge of the system or having read the manuals. In practice, this involves validating input to ensure that data conforms to the expected format. Instead of relying on cryptic error codes that necessitate manual lookups, error messages should be user-friendly and self-contained, explaining the problem in a way that's easy to understand. The code should detect errors as early as possible during execution and, on encountering one, take appropriate action to keep it from propagating and causing further problems. This might involve logging the error for analysis, informing the user with a clear message, or performing a controlled termination to maintain system stability. Early detection and clear reporting make debugging and system recovery easier.
Don't Hand Out Dangerous Implements
This principle emphasizes the concept of encapsulation. It encourages the isolation of a code module's internal state. Data structures, libraries, and pointers to data should be hidden from external entities, including the user. By hiding internal details, the code becomes less susceptible to accidental modifications from external sources (like user interaction). Additionally, segregating internal details promotes modularity, making the code more organized and easier to maintain.
Be Prepared For Can Happen
While certain conditions might seem highly unlikely, good practices dictate considering and handling them nonetheless. This principle goes beyond merely anticipating user errors. Code modifications and additions over time can introduce inconsistencies that trigger previously "impossible" scenarios. By incorporating checks for these unlikely yet potential conditions, the code becomes more robust. Even if such checks simply return an error indicator, they serve a valuable purpose.
In essence, defensive programming promotes a culture of anticipating the unexpected. It's not about dwelling on worst-case scenarios, but rather about incorporating safeguards to catch potential problems before they cause critical failures.
Fragile Code
Let's explore some common fragile code examples and their robust counterparts.
Fragile Code Example 1 - Program to calculate average from a list of numbers -
#include <stdio.h>
int main() {
    int nums[5] = {5, 3, 6, 2, 8};
    int sum = 0;
    int avg;
    int i;
    for (i = 0; i < 5; i++) {
        sum += nums[i];
    }
    avg = sum / 5;
    printf("Average: %d\n", avg);
    return 0;
}
This code snippet exemplifies fragile programming. It calculates the average of a list by summing the elements and dividing by a fixed value of 5. This approach assumes the list always contains precisely five numbers. Any deviation from this assumption, such as an empty list or a list with a different size, would lead to incorrect results or even program crashes.
Robust Counterpart
#include <stdio.h>
int main() {
    int nums[] = {5, 3, 6, 2, 8}; // Array size inferred from initializer
    int num_elements = sizeof(nums) / sizeof(nums[0]); // Calculating number of elements
    int sum = 0;
    int i;
    if (num_elements == 0) {
        printf("Error: Empty list. Cannot calculate average.\n");
        return 1; // Indicates error
    }
    for (i = 0; i < num_elements; i++) {
        sum += nums[i];
    }
    float avg = (float)sum / num_elements; // Use float for non-integer results
    printf("Average: %.2f\n", avg); // Print with 2 decimal places
    return 0;
}
Let's see why the above code is more robust. The code infers the array size from the initializer, making it more flexible in case the number of elements changes. It calculates the actual number of elements by applying sizeof to the entire array and dividing by the size of a single element. This works because the array size is known at compile time. It checks whether the list is empty and prints an error message if so; the return value of 1 indicates an error condition. It uses a float variable for the average to handle non-integer results, and the %.2f format specifier prints the average with two decimal places.
Fragile Code Example 2 - Program for modifying variables using pointers -
#include <stdio.h>
int main() {
    int num = 10;
    printf("Num: %d\n", num); // Printing the value of num
    char *ptr = (char *)&num; // Changing the value of num by mistake
    ptr[0] = 0; // Overwrites num byte by byte (assumes a 4-byte int)
    ptr[1] = 0;
    ptr[2] = 0;
    ptr[3] = 0;
    printf("Num: %d\n", num); // Print the value of num again
    return 0;
}
This code demonstrates a potential pitfall when modifying variables through pointers. Initially, the variable num is assigned a value of 10, and the code prints this value using printf(). A later section then mistakenly modifies num indirectly through a char pointer to its memory address, zeroing its bytes one at a time, so on a platform with a 4-byte int the second printf() prints 0 instead of 10.
Robust Counterpart
#include <stdio.h>
#define DEFAULT_NUM 10 // Named constant for the initial value of num
int main() {
    int num = DEFAULT_NUM;
    printf("Num: %d\n", num); // Print the value of num
    num = 20; // Change num deliberately, through its own name
    printf("Num: %d\n", num); // Print the value of num again (now 20)
    return 0;
}
Instead of hard-coding a literal, this code defines a constant DEFAULT_NUM to represent the initial state of num and uses it for initialization, which makes the intent clear. More importantly, the raw char pointer is gone: num now changes only through an explicit assignment to the variable itself, so every modification is visible and type-checked. Note that a plain int can be reassigned freely, so the second printf() prints 20. If num must never change after initialization, declare it as const int; the compiler will then reject any direct assignment to it. Keeping byte-level pointers away from a variable, and marking it const where appropriate, safeguards its value and contributes to a more robust program.
Fragile Code Example 3 - Program for dividing a number by 0 -
#include <stdio.h>
int main() {
    int x = 10;
    int y = 5;
    int z = x / y; // Dividing x by y and store the result in z
    printf("Result: %d\n", z); // Printing the result
    return 0;
}
This code snippet seems basic, but it harbors a fragility. It hinges on the assumption that the variable y holds a non-zero value, and this assumption becomes a critical point of failure if y is zero. In C, integer division by zero is undefined behavior; on many platforms it raises a runtime error that crashes the program. This scenario could easily arise if y receives its value from an external source like user input or a file.
Robust Counterpart
#include <stdio.h>
int main() {
    int x = 10;
    int y = 5;
    int z;
    if (y == 0) {
        printf("Error: Cannot divide by zero\n");
        return 1; // Return an error code
    }
    z = x / y; // Divide x by y and store the result in z
    printf("Result: %d\n", z); // Print the result
    return 0;
}
The revised code addresses the critical issue of division by zero. It checks the value of y before attempting the division. If y is zero, the code handles the scenario by printing an informative error message and returning a specific error code (1).
Fragile Code Example 4 - Program for accepting values from user -
#include <stdio.h>
int main() {
    int num1, num2, sum;
    printf("Enter two numbers separated by a space: "); // Prompt the user to enter two numbers
    scanf("%d %d", &num1, &num2);
    sum = num1 + num2; // Calculate the sum of the two numbers
    printf("The sum of %d and %d is %d\n", num1, num2, sum); // Print the result
    return 0;
}
The above program exemplifies a fragility in user input handling. It calculates the sum of two integers retrieved from user input. However, the code lacks validation mechanisms to safeguard against unexpected or erroneous user entries. For instance, if the user enters non-numeric characters instead of integers, scanf stops at the first character it cannot convert and leaves one or both variables unset, so the program goes on to add and print indeterminate values instead of reporting the problem.
Robust Counterpart
#include <stdio.h>
int main() {
    int num1, num2, sum;
    printf("Enter two numbers separated by a space: ");
    if (scanf("%d %d", &num1, &num2) != 2) { // Read user input and check for errors
        printf("Invalid input. Please enter two numbers separated by a space.\n");
        return 1;
    }
    sum = num1 + num2;
    printf("The sum of %d and %d is %d\n", num1, num2, sum);
    return 0;
}
Here, we still use the scanf function to read two integers from the user, but the code goes beyond simply reading the input: it checks the return value of scanf. scanf returns the number of items it successfully converted and assigned, or EOF if the input ends or a read error occurs before the first conversion. The code expects two integers, so it checks that the return value is exactly 2. If scanf returns anything else, the program displays an error message and exits with a non-zero status.
Lessons To Learn
Lesson 1: Prioritize Parameter Clarity
This lesson highlights the importance of designing function parameters for clarity to reduce the risk of errors. One example is the 'flag argument' often used to select an action such as 'create' or 'delete'. Imagine a flag where 1 represents 'create' and 0 represents 'delete'. Programmers can easily misremember which value means what, leading to unintended behavior (e.g., deleting a queue when they meant to create one). Instead of a flag, use separate, descriptively named functions that explicitly convey the intended action. For instance, use create_queue and delete_queue instead of a single function that takes a flag value.
Lesson 2: Validate Function Inputs
Check function parameters to avoid crashes caused by invalid values (null pointers, non-positive sizes). If parameters are invalid, handle the error appropriately (e.g., with an error message or a return code). For the queue example, validate the pointer (qptr) and the size (size) during queue creation and deletion to prevent memory allocation issues.
Lesson 3: Avoid Double Free with Pointers
Passing pointers by reference can lead to errors if the function doesn't track allocation history. Consider a function qmanage that manages a queue through a pointer (qptr). If qmanage allocates memory for the queue in the first call (e.g., qmanage(&qptr, 1, 100)), then two deallocation requests in a row (e.g., qmanage(&qptr, 0, 1) called twice) try to free the same memory, which can crash the program. The function must track allocations or rely on a mechanism that prevents double frees, such as ownership flags or, in languages that provide them, smart pointers.
Lesson 4: Don't Ignore Return Values
Always check the return values of functions, especially those that allocate memory or perform operations that can fail. For example, check the return value of malloc to confirm that the allocation succeeded before using the pointer. The same care applies to the results of arithmetic that might overflow, such as a multiplication used to compute an allocation size.
Lesson 5: Guard Against Arithmetic Overflow/Underflow
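Signed integer overflow occurs when a result exceeds the largest value the type can hold, which typically happens with large positive operands; the corresponding underflow happens when a result falls below the smallest representable value, typically with negative operands. Check the operands before performing the operation, or use a larger data type (if applicable) to accommodate a wider range of values.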
End Note
It's kinda weird to notice in retrospect how much more engaging some concepts become once the external pressure from professors is not there. While studying for exams, I really struggled with this particular concept from my coursework. But, I'll be honest, it's not that bad to learn. The other thing I noticed is that when I stop writing for too long, there's definitely an outpour of content, like today. Sorry not sorry.
Written by
Pranav Bawgikar
Hiya, I'm Pranav. I'm a recent computer science grad who loves punching keys, napping while coding and lifting weights. This space is a collection of my journey of active learning from blogs, books and papers.