What Happened in the Recent CrowdStrike Outage?

In this article, I will discuss what I understood about the recent CrowdStrike outage.


The outage occurred on July 19, 2024, and affected computers running Windows 10 and later that had CrowdStrike’s Falcon software installed. Before diving into the details of the outage, let’s first review some technical terms.

What is CrowdStrike and CrowdStrike Falcon?

CrowdStrike is a cybersecurity company that provides cloud-based security software to organizations and is a leading provider of endpoint protection. It was co-founded in 2011 by George Kurtz, formerly of McAfee, and is based in Austin, Texas.

CrowdStrike Falcon is the company's endpoint protection product. It installs a small agent on each computer (an "endpoint") to detect security threats and alert the organization about them.

What are Operating Systems and the Kernel?

Think of the operating system as a bridge between you and the computer's hardware. It allows you to use your computer by handling tasks like opening programs, saving files, and managing hardware such as printers and monitors.

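The kernel is the core component of an operating system. It manages the CPU, memory, and devices, and it sits between applications and the hardware: when a program wants to read a file or use the network, it asks the kernel to do it.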

What are Windows Pipes?

In Windows, pipes allow different processes to communicate with each other. A named pipe is a pipe that has a name, so that unrelated processes can find it and connect to it.

Example: Imagine a system where a logging application records activities, like user logins or file accesses, and writes this information into a pipe. At the same time, a monitoring application reads from the other end of the pipe to receive updates in real time. If the monitoring application detects any unusual activity, it immediately raises an alert. The pipe is what lets the two programs work together.
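The logging and monitoring example above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses an anonymous pipe from `os.pipe()` and a thread inside one program rather than a Windows named pipe between two separate applications, and the function names and the "failed login" rule are made up for the example.

```python
import os
import threading


def run_logger(write_fd, events):
    """Logging side: write one event per line into the pipe, then close it."""
    with os.fdopen(write_fd, "w") as pipe_out:
        for event in events:
            pipe_out.write(event + "\n")


def run_monitor(read_fd, alerts):
    """Monitoring side: read events from the pipe and collect suspicious ones."""
    with os.fdopen(read_fd) as pipe_in:
        for line in pipe_in:
            event = line.strip()
            if "failed login" in event:  # hypothetical rule for "unusual activity"
                alerts.append(event)


def demo(events):
    """Connect a logger and a monitor through a pipe and return the alerts."""
    read_fd, write_fd = os.pipe()
    alerts = []
    monitor = threading.Thread(target=run_monitor, args=(read_fd, alerts))
    monitor.start()
    run_logger(write_fd, events)  # closing the write end tells the monitor to stop
    monitor.join()
    return alerts


print(demo([
    "user alice logged in",
    "failed login for root",
    "file report.docx opened",
]))
```

The monitor only sees what the logger writes into the pipe, which is the whole idea: the two sides do not share anything else.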

What is BSOD?

BSOD stands for Blue Screen of Death. It is the error screen that Microsoft Windows shows when the operating system hits a fatal error it cannot safely recover from, so it stops everything. Common causes include faulty or incompatible drivers, failing hardware, and malware. Depending on the cause, it can be resolved by restarting the system, updating or removing the problematic driver, and so on.

Regarding the Outage

The issue occurred on July 19, 2024, a Friday. For some corporate employees, it might have been a relaxed workday, while those responsible for resolving the outage had a challenging start to the weekend. The good news is that CrowdStrike withdrew the faulty update within about 80 minutes. However, machines that had already crashed could not pick up the fix on their own and had to be repaired manually, so many services stayed down for much longer.

This is considered one of the biggest IT outages. The issue happened due to a CrowdStrike Falcon update. CrowdStrike Falcon is installed on systems to monitor for security issues and raise alerts about them. It is essentially an advanced antivirus program. CrowdStrike wanted to update Falcon to better detect potential threats that abuse Windows pipes. However, this update caused Windows systems that were running Falcon and received the update to display a BSOD. Microsoft estimated that about 8.5 million devices were affected.

As mentioned, a BSOD occurs when there is a fatal problem in the operating system. When code running inside the kernel hits an error it cannot recover from, such as an invalid memory access, Windows deliberately stops the whole system and shows the BSOD rather than risk corrupting data.

CrowdStrike Falcon monitors what happens in the system by embedding itself within the operating system. In Windows, there are user-mode and kernel-mode drivers. User-mode drivers run in a separate memory space from the operating system kernel and have limited access to hardware resources. Kernel-mode drivers have full access to the entire system memory, and Falcon uses one of these.

Software that runs at the user level is relatively safe: if it crashes, the operating system simply ends that one program, and it can be updated without significantly affecting the rest of the system. For example, a browser can update itself from within the browser.

On the other hand, software at the kernel level operates at the core of the operating system and has direct control over hardware and system resources. A bug there has no safety net, because there is nothing underneath the kernel to catch it. This is also why updating something like the Linux kernel is more involved than updating an ordinary application.
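To make the user-level case concrete, here is a small Python sketch. It starts a separate user-level program that fails on purpose and shows that the program which started it keeps running. The function name and the error message are made up for this illustration.

```python
import subprocess
import sys


def run_crashing_child():
    """Start a separate user-level process that fails, and return its exit code."""
    child_code = "raise RuntimeError('simulated crash in a user-level program')"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,  # keep the child's error output off our screen
        text=True,
    )
    return result.returncode


exit_code = run_crashing_child()
print("child exit code:", exit_code)  # non-zero, because the child failed
print("parent process is still running")
```

The operating system cleans up the failed program and everything else carries on. Kernel-level code does not get this kind of isolation.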

In this case, the update triggered an error in the kernel-level code, making the operating systems inoperable and bringing them down. Note that the update to CrowdStrike Falcon did not modify the kernel driver code itself; it updated Falcon's definitions. Definitions are rules or patterns that help the software detect malicious activities. The update included new definitions to detect a specific type of threat involving Windows named pipes. This meant CrowdStrike had identified a new attack method and updated their detection rules accordingly. Once the update was delivered and the Falcon driver read the new definitions, the driver hit an error while processing them. Because that driver runs in the kernel, the error crashed the entire operating system, causing the outage.

When I first saw the outage news on Instagram, I wondered if I would also see the BSOD on my device. I asked my friend if she had encountered it, but neither she nor I experienced any issues. This made me curious to understand what exactly went wrong.
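Here is a simplified Python sketch of how new definitions alone can break code that has not changed. It is not CrowdStrike's actual code or file format; the rule layout and every name below are assumptions made for illustration. Each rule is a list of (field number, expected value) pairs, and the "sensor" supplies a fixed list of input fields.

```python
def evaluate_rule(rule, inputs):
    """Return True when every (field_index, expected) pair in the rule matches."""
    return all(inputs[index] == expected for index, expected in rule)


def validate_rule(rule, input_count):
    """Check that the rule only refers to fields that are actually supplied."""
    return all(0 <= index < input_count for index, _ in rule)


def safe_evaluate(rule, inputs):
    """Skip a rule that fails validation instead of evaluating it."""
    if not validate_rule(rule, len(inputs)):
        return False
    return evaluate_rule(rule, inputs)


inputs = ["named_pipe", "example_pipe"]   # the sensor supplies two fields
good_rule = [(0, "named_pipe")]           # looks at field 0, which exists
bad_rule = [(2, "anything")]              # looks at field 2, which does not

print(safe_evaluate(good_rule, inputs))
print(safe_evaluate(bad_rule, inputs))

try:
    evaluate_rule(bad_rule, inputs)       # no validation: reads past the end
except IndexError as error:
    print("unchecked evaluation failed:", error)
```

In Python, reading past the end of a list raises an error that the program can catch. In kernel-level code, which is typically written in C or C++, a read past the end of a buffer is an invalid memory access, and there is nothing to catch it before the system stops.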

Why Didn't It Affect Home PCs?

CrowdStrike Falcon is designed for and sold to large enterprises, universities, banks, and similar organizations whose IT teams need to monitor many systems. Home computers typically rely on simpler antivirus solutions like Microsoft Defender and do not have Falcon installed, so they never received the faulty update.

Is This Specific to CrowdStrike?

This issue is not unique to CrowdStrike. Any software running at the kernel level carries this risk, because a single bug there can take down the whole system. The problem arose because CrowdStrike Falcon operates at the kernel level.

Why Was Microsoft Blamed?

This issue was not caused by Microsoft’s software. Microsoft was blamed mainly because the crashes happened on Windows machines. The underlying reason is that security vendors want their software to run at the kernel level, where it can see more and is harder for malware to disable. Microsoft had suggested having vendors use APIs instead to limit such crashes, but vendors opposed this. This isn’t a Microsoft-specific issue; any operating system that allows third-party software to run at the kernel level could face similar consequences if that software fails.

Why Not on macOS and Linux?

macOS doesn’t let third-party security software work in the same way as Windows does. Apple restricts what third-party code can do inside the kernel, so security tools mostly work through controlled interfaces outside it. Similar CrowdStrike issues occurred on Linux but didn’t cause nearly as much trouble. Linux systems vary widely in distribution, kernel version, and configuration, so a single bug is less likely to hit every machine at once.

Conclusion

Microsoft might consider adopting a system like the one Chromebooks use, which can roll back to the previous working version if an update fails. Windows avoids this approach to maintain compatibility with older software, but it could help prevent similar issues in the future.
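The rollback idea can be sketched in a few lines of Python. This is only an illustration of the general "keep the last working version" pattern, not how ChromeOS or Windows actually implement updates; the function name, the version labels, and the health check are all hypothetical.

```python
def apply_update(active, candidate, health_check):
    """Return the version that should stay active after attempting an update.

    The candidate replaces the active version only if the health check passes.
    A failing or crashing health check means we stay on the old version.
    """
    try:
        healthy = health_check(candidate)
    except Exception:
        healthy = False
    return candidate if healthy else active


def boots_ok(version):
    """Hypothetical health check: pretend that version 'v2' fails to boot."""
    return version != "v2"


print(apply_update("v1", "v2", boots_ok))  # stays on v1
print(apply_update("v1", "v3", boots_ok))  # moves to v3
```

The important design choice is that the old version is never thrown away until the new one has proven that it works.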



Written by

Sharon Jonallagadda

Pursuing my Master’s in IT. Documenting my journey as I work towards securing an internship and a full-time role. Sharing the knowledge and experiences I gain along the way.