Algorithms and Data Structures [00]: Get Started!


Ever wondered how Google Maps finds the fastest route in seconds, or how your social media feed loads so quickly? The magic behind these everyday marvels often lies in something called Data Structures and Algorithms, or DSA. If you're new to the world of programming or computer science, these terms might sound intimidating. But fear not! This article will break down the fundamentals of DSA in a way that's easy to grasp, even if you've never written a line of code.
What Exactly is DSA?
Data Structures: Imagine you have a messy room full of clothes. It's hard to find anything, right? A data structure is like organizing your closet – it's a way to store and organize data in a computer's memory (RAM) so that it can be accessed and used efficiently. Just like you might have shelves for sweaters and hangers for shirts, there are different data structures for different types of data and tasks.
Algorithms: An algorithm is simply a sequence of well-defined steps or instructions to solve a specific problem. Think of it as a recipe: follow the steps, and you'll get your desired dish. In programming, algorithms tell the computer exactly what to do, and how to do it, to achieve a certain outcome.
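To make these two ideas concrete, here is a tiny sketch of my own (the scores array and the find-the-largest steps are purely illustrative, not something used later in this article): the array is the data structure, and the loop that finds the largest value is the algorithm.

#include <stdio.h>

int main(void) {
    int scores[5] = {42, 7, 99, 23, 65};   // a data structure: an array holding five values

    // An algorithm: scan the array once and keep the largest value seen so far.
    int largest = scores[0];
    for (int i = 1; i < 5; i++) {
        if (scores[i] > largest) {
            largest = scores[i];
        }
    }

    printf("Largest score: %d\n", largest); // prints 99
    return 0;
}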
Why Learn DSA?
Become a Better Developer: Understanding DSA helps you write code that is not only correct but also efficient. It allows you to choose the right tools (data structures) and the best methods (algorithms) for the job.
Efficient Problem Solving: Often, there are multiple ways to solve a programming problem. DSA equips you to analyze these different approaches and pick the one that performs best, saving time and computational resources.
Save Costs and Improve User Experience: Efficient programs require less processing power and memory, which can translate to lower hardware costs. More importantly, faster and smoother applications lead to a much better experience for the end-users.
Boost Your Career: DSA is a cornerstone of computer science and a key topic in technical interviews for software development roles.
Diving Deeper: Analysis of Algorithms
Imagine you have a problem, and you've come up with a few different algorithms to solve it. How do you decide which one is the "best" or most efficient? This is where the analysis of algorithms comes in.
An Example Problem: Sum of Natural Numbers
Let's say we want to write a program to calculate the sum of the first 'n' natural numbers.
If input n = 3, output should be 1 + 2 + 3 = 6.
If input n = 5, output should be 1 + 2 + 3 + 4 + 5 = 15.
Here are three different solutions (algorithms) to this problem:
Solution 1: The Mathematical Formula
int fun1(int n) {
    return n * (n + 1) / 2;
}
This solution directly uses the mathematical formula for the sum of the first 'n' natural numbers.
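For example, with n = 5 the formula gives 5 * (5 + 1) / 2 = 15, matching the expected output above.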
Solution 2: The Iterative Approach
int fun2(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum = sum + i;
    }
    return sum;
}
This solution uses a loop to add each number from 1 to 'n'.
Solution 3: The Nested Loop Approach (Less Obvious for this Problem)
int fun3(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= i; j++) { // inner loop runs 'i' times on the i-th outer iteration
            sum++;
        }
    }
    return sum;
}
This solution uses nested loops. The inner loop runs 'i' times for each iteration of the outer loop, so the sum++ statement executes 1 + 2 + ... + n times in total. It still produces n*(n+1)/2, but with far more operations than the other two solutions.
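All three functions should produce the same answer. Here's a minimal test driver, added purely for illustration (the main function is my addition, not part of the original solutions):

#include <stdio.h>

// The three solutions from above, collected in one file so the driver is self-contained.
int fun1(int n) { return n * (n + 1) / 2; }

int fun2(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) sum = sum + i;
    return sum;
}

int fun3(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= i; j++) sum++;
    return sum;
}

int main(void) {
    int n = 5;
    printf("fun1(%d) = %d\n", n, fun1(n));  // 15
    printf("fun2(%d) = %d\n", n, fun2(n));  // 15
    printf("fun3(%d) = %d\n", n, fun3(n));  // 15
    return 0;
}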
Which is Most Efficient? Initial Thoughts
Looking at these, fun1 seems very direct. fun2 involves a loop, and fun3 involves nested loops, which intuitively feels like more work.
However, simply timing how long a program takes to run isn't a reliable way to compare algorithms. Why?
Machine Dependence: A program might run faster on a super-fast computer and slower on an older one, even if it's an inefficient program.
Programming Language: Compiled languages (like C++) often run faster than languages that are interpreted or run on a virtual machine (like Python or Java), because they are translated directly into machine code. The same algorithm can therefore look faster or slower depending only on the language used.
System Load: Other programs running on your computer can affect the time your program takes.
So, we need a more standardized, theoretical way to analyze algorithms.
Introducing: Asymptotic Analysis
To overcome the limitations of time-based comparisons, we use Asymptotic Analysis. It's a mathematical way to judge an algorithm's efficiency, independent of the specific computer, programming language, or even test cases (we usually consider the worst-case scenario).
The main idea of asymptotic analysis is to measure how an algorithm's time (or space) requirement grows as the input size ('n') increases. We don't need to implement and run the code; we can analyze it theoretically by looking at the algorithm's steps.
How it Works: Counting Operations
We assume that basic operations (like arithmetic operations, comparisons, assignments) take a constant amount of time. Let's analyze our example functions:
fun1(int n):
return n * (n + 1) / 2;
This involves a multiplication, an addition, and a division – roughly 3 operations. The time taken is constant, say c1. It doesn't depend on how big 'n' is.
fun2(int n):
int sum = 0;                    // 1 operation (assignment)
for (int i = 1; i <= n; i++) {  // loop runs n times
    sum = sum + i;              // 2 operations (addition, assignment) inside the loop
}
return sum;                     // 1 operation
The line int sum = 0; runs once, and return sum; runs once. The loop runs 'n' times, so sum = sum + i; runs 'n' times. The total time is therefore of the form c2*n + c3, where c2 represents the time for the operations inside the loop and c3 the time for the operations outside it (initializing sum and returning it).

fun3(int n):

int sum = 0;                        // 1 operation
for (int i = 1; i <= n; i++) {      // outer loop runs 'n' times
    for (int j = 1; j <= i; j++) {  // inner loop runs 1, then 2, ..., then 'n' times
        sum++;                      // 1 operation inside the inner loop
    }
}
return sum;                         // 1 operation
The sum++ operation runs 1 + 2 + 3 + ... + n times, which equals n*(n+1)/2 (the sum of the first 'n' natural numbers). That expression expands to (1/2)*n^2 + (1/2)*n. So the total time is of the form (c4 * n^2) + (c5 * n) + c6.
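If you want to see this count for yourself, one rough way is to instrument the nested loops with an extra counter, as in the sketch below (the operations variable is my addition, for illustration only); the count matches n*(n+1)/2 for every 'n' tried:

#include <stdio.h>

int main(void) {
    for (int n = 1; n <= 5; n++) {
        long operations = 0;   // counts how many times sum++ runs
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= i; j++) {
                sum++;
                operations++;
            }
        }
        // The measured count matches the formula n*(n+1)/2 from the analysis above.
        printf("n = %d: sum++ ran %ld times, n*(n+1)/2 = %d\n",
               n, operations, n * (n + 1) / 2);
    }
    return 0;
}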
Understanding Order of Growth
The expressions we derived (a constant time like c1, a linear time like c2*n + c3, and a quadratic time like c4*n^2 + c5*n + c6) tell us about the order of growth of the time taken by each function as 'n' (the input size) increases.

fun1: Time is c1 (a constant). This is Constant Time. The time taken doesn't change with 'n'.

fun2: Time is c2*n + c3. As 'n' gets very large, the c2*n term dominates. We say this is Linear Time (it grows linearly with 'n').

fun3: Time is c4*n^2 + c5*n + c6. As 'n' gets very large, the c4*n^2 term dominates. We say this is Quadratic Time (it grows with the square of 'n').
Key Idea: Focus on the Dominant Term
In asymptotic analysis, we are interested in how the runtime behaves for large input sizes. So, we can:
Ignore lower-order terms: In an expression like c4*n^2 + c5*n + c6, the c5*n and c6 terms become insignificant compared to c4*n^2 as 'n' grows very large.

Ignore constant coefficients: The specific values of c1, c2, c3, c4, c5, c6 depend on the compiler and machine, but the rate of growth (constant, linear, quadratic) is inherent to the algorithm.
So, we can say:
fun1 has a constant order of growth.
fun2 has a linear order of growth.
fun3 has a quadratic order of growth.
Why Order of Growth Matters
Let's compare:
Constant: 13 (e.g., fun1 takes 13 units of time, regardless of 'n').
Linear: 2*n + 5 (e.g., fun2 on some machine).
Quadratic: n^2 (imagine a fun3 that simplifies to n^2).
If n = 4:
Constant: 13
Linear: 2*4 + 5 = 13
At n=4, the constant and linear functions might take similar time.
But what if n = 100?
Constant: 13
Linear: 2*100 + 5 = 205
Quadratic: 100^2 = 10,000
And if n = 1000?
Constant: 13
Linear: 2*1000 + 5 = 2005
Quadratic: 1000^2 = 1,000,000
As 'n' grows, an algorithm with a higher order of growth will always eventually take more time than an algorithm with a lower order of growth, regardless of constant factors or how fast the machine is. Quadratic functions grow much faster than linear functions, which in turn grow faster than constant time functions. This is why we want algorithms with the lowest possible order of growth for large inputs.
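A tiny sketch of my own that tabulates the three expressions side by side makes the gap visible as 'n' grows:

#include <stdio.h>

int main(void) {
    long ns[] = {4, 100, 1000};
    printf("%8s %10s %12s %14s\n", "n", "constant", "linear", "quadratic");
    for (int k = 0; k < 3; k++) {
        long n = ns[k];
        // constant: 13, linear: 2n + 5, quadratic: n^2 (the same expressions as above)
        printf("%8ld %10d %12ld %14ld\n", n, 13, 2 * n + 5, n * n);
    }
    return 0;
}

By n = 1000, the quadratic column is already roughly 500 times larger than the linear one.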
Best, Average, and Worst Cases
Sometimes, an algorithm's performance doesn't just depend on the size of the input ('n'), but also on the nature of the input itself.
Example 1: Consider an algorithm to sum all elements in an array:
int getSum(int arr[], int n) {
    int sum = 0;                   // constant time
    for (int i = 0; i < n; i++) {  // loop runs 'n' times
        sum = sum + arr[i];        // constant-time operation
    }
    return sum;                    // constant time
}
This algorithm always performs a fixed number of operations for initialization and return, and the loop always runs 'n' times. The time taken is of the form c1*n + c2, which is a linear order of growth, regardless of what numbers are in the array or how they are arranged.
Example 2: Variable Performance
Now consider a slightly modified function, getSumModified, whose running time depends not only on the size of the input but also on the input itself:
int getSumModified(int arr[], int n) {
    if (n % 2 == 0) {  // if 'n' is even
        return 0;      // constant time
    } else {           // if 'n' is odd
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum = sum + arr[i];
        }
        return sum;    // linear-time operations
    }
}
Here, the performance depends on whether 'n' is even or odd:
If 'n' is even: The function returns 0 immediately. This takes constant time.
If 'n' is odd: The function calculates the sum using a loop. This takes linear time.
For this function, the best case is an even 'n': it returns immediately, in constant time. The worst case is an odd 'n': the loop runs through the whole array, taking linear time. The average case lies somewhere in between and depends on how often even and odd values of 'n' occur. In practice, we usually analyze the worst case, because it gives a guarantee that holds for every possible input.
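As a quick illustration (the main function and the sample array are my own additions), calling getSumModified with an even and then an odd 'n' exercises the constant-time and linear-time paths respectively:

#include <stdio.h>

int getSumModified(int arr[], int n) {
    if (n % 2 == 0) {
        return 0;       // best case: even n, constant time
    } else {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum = sum + arr[i];
        }
        return sum;     // worst case: odd n, linear time
    }
}

int main(void) {
    int arr[] = {1, 2, 3, 4, 5};
    printf("%d\n", getSumModified(arr, 4));  // even n: returns 0 immediately
    printf("%d\n", getSumModified(arr, 5));  // odd n: loops and returns 15
    return 0;
}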
Wrapping Up
Understanding Data Structures and Algorithms, along with how to analyze their efficiency using Asymptotic Analysis (focusing on order of growth and worst-case scenarios), is fundamental to writing good, scalable, and efficient software. It's not just about making things run fast; it's about making smart choices in how you solve problems.
This was just a gentle introduction. The world of DSA is vast and fascinating. As you continue your journey, remember these core concepts, keep practicing, and you'll become a more proficient and thoughtful programmer!
In the next articles, we'll continue with order of growth: a direct way to find and compare the growth of two algorithms, comparing terms, and the asymptotic notations (Big O, Theta, Omega).