How Teaching Data Structures Helped Me with System Design. Part I: Indexing and Big O

Nicole GathanyNicole Gathany
13 min read

I know a lot of us learn data structures and algorithms (DSA) because we think it will help us land a software engineering role at FAANG or a FAANG-like company. Some, like me, learn them because it’s fun or because our friends are doing it! But many software engineers feel that Leetcode-style questions are not the best way to assess engineering skill. I’m not here to argue against that, but rather to talk about my journey teaching and learning DSA and how it gave me the foundation to learn system design.

Note: I started writing this blog post on system design and data structures but it ended up being pretty long, so I’m breaking it up into several posts. In this first post, we discuss time and space complexity as it relates to clustered and non-clustered indexing.

A Teachable Moment: The Value of O(n²)

I teach two data structures and algorithms (DSA) courses. Sorry. They are not recorded (yet).

One is with #100Devs. The other with #BrilliantBlackMinds.

With Brilliant Black Minds, there’s already a DSA curriculum to use that the students studied independently, so they know their stuff! And we’re moving pretty efficiently. So far, we’ve gotten through arrays, hash tables, stacks, queues, sorting, linked lists, and strings.

And to join, you have to sign up ahead of time and get a Zoom link so everyone who’s there is motivated to learn.

With #100Devs Leetcode Lunch, we’re still trying to figure out the curriculum as we go. We started out with Neetcode roadmap, which felt like a big jump for us. We learned up to two-pointers but had to start over, and so we went back to the beginning: arrays and hash tables. Recently went over recursion, memoization, and closures using the Frontend Masters Course!

So both groups are great!

However, honestly, because Leetcode Lunch is just in a Discord voice channel, at one point, we had a people with a myriad of intentions join us.

Some would come into the voice channel asking, “Does anyone here know Django?” Respectfully: not the time or the place. Go to the help channel. Some came to scam people into joining their Web3 startup that only paid in crypto and equity. The mods handled that.

Some people are just there to hang out.

Some people who attended hadn’t spent that much time studying data structures and algorithms or didn’t see it as important at the time, but wanted to participate anyway at least once.

I have great students and study partners in 100Devs now, but it took us a while to figure out our stride.

One time, I had a student who joined us for the first time, excited! At the time, we were trying a pair programming method where we went through the problem together, I typed, and the students directed me.

The new participant came up with some great ideas for the problem, but needed some help from me to figure out the problem (which is great!). I don’t remember what the problem was, but they wanted to use a loop in a way that needed another loop inside, and wanted to put the accumulator outside the outer loop. This didn’t work, so I suggested the accumulator go inside the outer loop, which allowed all our tests to pass.

I asked them what the time complexity of the algorithm was. They said “O(log(n))!” When I asked why O(log(n), they said, “Because it’s the best one!” I appreciated the enthusiasm, but I showed them the Big-) Complexity chart which says that, while O(log(n)) is great and better than O(n), O(1) was the most efficient. I also said that this particular algorithm had O(n²) time complexity because we were using two loops, one nested inside the other, and that both loops traversed through the array in its entirety.

They told me, “I’d rather work on real applications with real data!” and left.

They never came back.

Wow. Ouch!

While I was hurt, I honestly thought they had a point at least to some extent. While I enjoyed doing Leetcode and CodeWars problems, I just found them fun. I didn’t really find them useful for my first two jobs.

In my first job, I built an API for our survey, and probably not optimally if I’m being honest, but that was ok at least during my time there. On my second job, I spent most of my time improving accessibility, improving CI/CD pipelines, responsive web design, A/B testing features (some of these features were as simple as a copy change).

While I hadn’t found a use for data structures and algorithms in industry yet, my instincts just told me that just because I couldn’t see a practical use for DSA now didn’t mean this would be the case forever.

I realize now that I only agreed (partially) because all the systems I had worked on had already been designed, and I had not studied system design. I mean, I was learning too.

Trusting the Process and Not Rejecting Learning

I have to admit that I’m a person who tends to trust the process (although I am really resistant to learning hashing with an array; why not use an object?). It’s not innate or anything, but my mom told me to never scoff at a lesson or class with the attitude that you’re never going to use what you learned. See, my mom took a typing class in the 1950s or 1960s. She almost rejected the idea of learning typing, asking, “Why do I need to know this?” and stating, “I’m not going to be a secretary!”

And guess what! My mom became a clinical social worker, a therapist if you will, and in the 2000s, she had to type notes DURING HER SESSIONS with patients. Without time in between patients to write notes.

This is probably a poor reflection of the system, but because she learned to type at a high WPM (word per minute), she was able to survive it.

Even outside of my mom’s context, typing is a valuable skill. If she (or anyone else in her typing class) had become a writer, a journalist, or a software engineer, the typing skills would have helped her here, too.

Leon Noel of 100Devs often says that on tech teams, the engineer who gets asked for help the most isn’t necessarily the best engineer but rather the fastest typer.

My mom’s typing teacher was really onto something.

Gone are the days when you wrote your papers by hand and then hired a typist to type it for you. Albeit when I lived at the University of Ibadan in Nigeria for 2 months in 2010, the older doctoral students were doing hiring typists to type up their dissertations, and they asked me a if I was a typist because I “typed so fast.” Not sure if it’s that way now, though. I do know that around the same time, I saw a really elderly man at the Duke library typing with two fingers. I hope this doesn’t come across as ageist, but I found it amusing. I also think it was a reflection of the time in which he grew up. Imagine if your therapist was writing your notes with two fingers.

All that to say that things change, so don’t block the lessons.

As my mom used to say, “You never know what skills you’re going to need.”

Back to System Design and DSA

My mom’s story reminded me: just because something doesn’t seem useful right now doesn’t mean it won’t become essential later. And that’s exactly how I came to see DSA in a new light through system design.

Now that I’m learning system design through the Learn System Design Podcast, I really do see the value in data structures and algorithms. So like my mom’s typing instructor, I was onto something too.

You can learn system design without DSA (I know some engineers who are good at the former but not the latter), but some of the core concepts of DSA built the foundation for me to learn system design.

I will explain how, but first, I want to explain why system design is important in the first place.

Why System Design is Important

System design skills are important for several reasons. Not only can they help you to grow in your career, but they can help your company save money and/or make more money.

System Design for Your Career

A while back, I had a coffee chat with a senior backend engineer at a large travel discount company.

She encouraged me to study system design (even though I personally found it intimidating) because while I could get a junior or mid-level position without it, it would be harder to get a senior position at some companies and while I could enter as a junior or mid-level engineer, it might be harder to get promoted. This is not true for every company, so I don’t want to discourage people who don’t know system design, but I really wanted to work here and I do encourage you to be curious.

I’ve also had interviews with companies that basically expected me to be a product owner, so I missed out on some opportunities by not knowing system design–again I’m not trying to discourage you from going for certain roles; I’m only trying to encourage curiosity.

System Design for Your Application

Outside of the individual monetary and career benefits, system design is just better for applications, especially if you’re building a distributed system—a system where you’re expected to have a lot of users or a system where your user base grows at significant rates.

Without system design, your application can slow down or crash as the number of users and/or requests increases. With this decreased speed in your system, you could lose users and therefore revenue.

In 2006, Amazon found that for every 1 ms of latency, they lost 1% of users. Today, that could translate into billions of dollars.

In other cases, you might be able to keep your users, but your AWS bill (or your Google Cloud bill, or your Azure bill, or your Supabase bill, or your Firebase bill) could be higher the next month with increased API calls.

All that being said, you might not need system design skills for every software engineering role at every company.

You might be at a company where the system has already been built, and you’re only expected to build new features. That was me for years. You might be building a project where you’re not expected to have millions of users.

However, in some cases, learning system design can open doors for you! It can help you earn more money for yourself as an individual or the company/startup you’re working for. At the very least, it can prevent you from losing money.

DSA: A foundation for System Design Learning

Databases

Databases are the bottlenecks of applications and can slow systems down as users are added to a system or the calls to the database increase.

Different types of indexing can improve the speed of database fetching.

Time complexity and Optimizing Database thru Indexing

One concept we go over in DSA courses a lot is time complexity. Time complexity could be defined as the number of steps required to complete an algorithm with respect to the size of the dataset.

When we say that databases are the bottlenecks of applications and that they slow systems down, we mean that the time it takes to complete a task can grow with the size of the database.

When we say that indexing can improve the speed of database fetching, we mean that with respect to size.

One type of indexing is called primary key indexing.

This type of indexing is based on primary keys and “reduces the need to full table scans or sequential lookups.”

Primary-key indexing reduces time complexity for searching from O(b) or O(log(b), where b is the number of blocks.

In most of the DSA courses I’ve taken, taught, or led, we spent a good amount of time discussing, learning, and reviewing time complexity.

So I know first-hand how much better O(log(b)) is than O(b). Like my drive-by student said O(log(b)) is the best…well it’s not the best, but it’s definitely better than O(b).

Because I studied DSA for so long, it’s pretty obvious to me that O(b) means that to find the data point, we need to look at every single block in the database. Whereas a process with O(logb) significantly reduces the need to loop at every block.

If you’re familiar with DSA and time complexity, you might compare this to a iteration versus divide and conquer. The run time for searching for an element in an array using a standard loop is O(n) while searching using the divide and conquer method would be O(long).

While primary indexing—a type of clustered indexing—improves time complexity for searching, removing or adding to a database with clustered indexing alters the order of the database every time.

Therefore, removing or adding to a clustered index like a primary key index can cause performance issues.

This makes me think of how with arrays in JavaScript, the methods .shift(), .unshift(), .slice(), or .splice()—which also involve adding or removing data from the beginning or middle of an array—all have a time complexity of O(n) where n is the size of the array.

We get O(n) time complexity here because removing or adding requires us to shift the order of all remaining elements by 1. Meaning we have to make n shifts after adding or removing from an array in JavaScript.

With non-clustered indices, removing or adding to the beginning does not alter the order of the table. It creates a separate data structure with only the index columns being ordered and then a reference to each corresponding row. So, removing or adding to a table with non-clustered indices will have a time complexity of O(1).

Space Complexity, Indexing, and trade-offs

Another concept discussed in DSA courses is space complexity. Space complexity is the amount of memory an algorithm requires with respect to size of the dataset.

For example, I’m creating an algorithm. In my algorithm, do I need to add an array or a object? How large would that object or array get with respect to the size of the original data set? How many times would I need to create a new array or object?

As stated before, non-clustered indices do no not alter the order of a table so the time complexity of removal from databases with non-clustered indices is constant or O(1).

However, because it does create a separate data structure, it does take of more physical space.

This reminds me of how hashmaps, hashtables, objects, or caches can improve lookup speed and therefore the time complexity of an algorithm. However, they do make space complexity worse.

Take the classic two-sum problem, for example. In this problem, you’re given an array of integers and a target integer. You want to return an array with the indices of the two numbers whose sum equal the target. We can’t use the same number twice

So if we have an array of [2, 10, 5, 8] and a target of 10, we want our function to return [0, 3] because 2+8=10. 5+5 equals 10 and 10 equals 10, but we need two numbers separate numbers and can’t use the same element twice.

We can use a brute force solution.

function twoSumBrute(array, target){
    for(let i=0; i<array.length; i++){
        for(let j=1; j<array.length; j++){
            const sum = array[i] + array[j];

            if(i !=== j && sum === target) return [i, j];
        }
    }
}

This solution should work, and its space complexity is O(1), but it has a time complexity of O(n²).

We could also try to get a solution with a better time complexity.

function twoSumConstantTime(array, target){
    const hashMap = {};
    for(let i=0; i<array.length; i++){
        const difference = target-array[i];
        if(array[i] in hashMap) return [i, hashMap[array[i]]
        else {
            hashMap[difference] = i
        }
    }
}

This algorithm has a time complexity of O(n) and a space complexity of O(n).

This algorithm improves our time complexity but takes up more space compared to the brute force solution.

Like non-clustered indexing in relation to clustered indexing, the time complexity improves, but the space complexity gets worse. Because I’ve studied time and space complexity at length, the importance of indexing and the trade-offs of each method come easily to me.

Conclusion

I’ve seen many engineers sleep on data structures and algorithms. Even those of us who study them may not see the importance of them for “real applications,” and to an extent that is fair. You can get away with not knowing them in some jobs where you’re only building new features or doing email development but if you reject them forever, you could hit a wall in your career.

If you scoff at DSA and system design forever, you might miss out, like my mom almost did with typing.

Data structures and algorithms have built a strong foundation for me and have helped me in my system design learning journey. Time and space complexity, as they relate to clustered indexing and non-clustered indexing, just are just two examples. In subsequent blog posts, I also plan to discuss hashing and caching.

So please stay tuned. I’m still learning, so please provide feedback!

0
Subscribe to my newsletter

Read articles from Nicole Gathany directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Nicole Gathany
Nicole Gathany

I am a people-centered software engineer with a past life in public health and reproductive justice. I'm using this blog to combine my love for tech and communication.