Linux Kernel Scheduler
The point of this article is to give you a brief on how the Linux scheduler works.
The Linux kernel scheduler may seem complex initially, partly due to its modular nature. It allows multiple scheduling algorithms to coexist to manage different classes of processes, known as scheduler classes. In this article, we'll focus on two key schedulers: the Completely Fair Scheduler (CFS) and the Real-Time Scheduler. We'll also explore a potential replacement for the CFS scheduler after its 15-year reign.
Completely Fair Scheduler (CFS)
The CFS aims to ensure that every process in the system receives a fair share of CPU time. It applies to processes in the SCHED_NORMAL, SCHED_BATCH, and SCHED_IDLE classes. Here's an overview of how CFS works:
CFS uses a Red-Black (RB) tree where each node represents a task ready to run.
The leftmost node in the RB tree represents the task that has received the least CPU time.
The kernel selects the leftmost task and runs it for a predefined time slice.
If the task isn't completed within this time slice, it's reinserted into the RB tree; otherwise, it's discarded or exited.
The RB tree rebalances itself to find the next task that has gotten the least CPU time.
The complexity of this algorithm is O(log n) due to the nature of RB trees.
Real-Time Scheduler
Tasks in the Real-Time Scheduler have potential weights that determine their priority. The scheduler operates in two classes: SCHED_RR and SCHED_FIFO.
SCHED_FIFO tasks run until they complete or yield.
SCHED_RR tasks are run in a round-robin fashion for a maximum slice time.
SCHED_RR and SCHED_FIFO tasks have higher priority than SCHED_NORMAL and SCHED_BATCH tasks, and they can preempt lower-priority tasks.
Global Scheduling Policy and Multi-core Systems
Almost all chips today are multicore. In a multi-core system, we also have a global scheduling policy that determines how tasks are distributed among the available CPUs to make the most efficient use of the system's resources.
Tasks are assigned to CPUs based on load balancing algorithms that aim to distribute tasks evenly across all available CPUs. The CFS constantly monitors the CPU usage of each task and the overall system load to make decisions about task placement. When a new task is created or an existing task needs to be rescheduled, the CFS uses its load balancing algorithm to determine which CPU should run the task. This algorithm considers factors such as the current CPU load, the CPU affinity of the task (if specified), and the historical CPU usage of the task.
By distributing tasks evenly across all CPUs, the global scheduling policy helps to maximize the overall throughput of the system and ensure that no CPU is over or under-utilized. This approach improves the overall performance and responsiveness of the system, especially in multi-core systems where efficient task distribution is critical.
Scheduler Implementation
These schedulers operate per CPU, with each CPU managing processes under various algorithms. The “structure rq” structure contains data structures for managing these, such as struct rt_rq and struct cfs_rq. You can find these data structures in the linux/sched.h header file.
Potential Replacement: Earliest Eligible Virtual Deadline First (EEVDF) Scheduler
After 15 years, there's a potential replacement for the CFS scheduler in the form of the EEVDF scheduler. This new scheduler aims to further improve upon the fairness and efficiency of task scheduling in the Linux kernel. It promises better performance and fairness by leveraging virtual deadlines to manage task scheduling more effectively.
This revised version includes the necessary corrections and additional details for clarity and completeness.
Subscribe to my newsletter
Read articles from Harsh Agarwal directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Harsh Agarwal
Harsh Agarwal
Embarking on a tech odyssey! 🚀 Currently charting the realms of networking at the platform team in Arista, after navigating the dynamic universe at Qualcomm. My journey commenced in the Linux Kernel APPS stability for Snapdragon IOT chipsets—think wearables, smartwatches, and smart cameras. Crafted two slick Windows apps, automating parsing and earning a Qualstar along the way. Accelerating into the fast lane, I joined the Automotive development team, conquering global Linux USB Subsystem dev work, mastering the latest board bring-ups, and unleashing features like a tech superhero. Then? At the helm of the AR/VR Reference R&D team, I orchestrate Linux drivers for Power and sculpt features for the AR/VR glasses experience. Noteworthy mention? I've written the IPD (Interpupillary Distance) driver from scratch for XR chips at Qualcomm, proudly leading as the sole Software lead from India. The tech adventure continues! 🕶️