Demystifying the .NET Garbage Collector: How Memory Management Works


Memory management is critical for building efficient and reliable applications in .NET. The garbage collector (GC) of the Common Language Runtime (CLR) automates this process, freeing developers from manual memory allocation and deallocation and preventing common issues such as leaks caused by forgotten frees. In this post, we'll dive into how the GC works and the phases of a collection.
What is the Garbage Collector?
The Garbage Collector (GC) in .NET is responsible for automatic memory management. It allocates memory for objects, tracks object usage, and frees up memory when objects are no longer needed. This ensures efficient memory usage and eliminates the risk of memory leaks caused by manual allocation errors.
Key Concepts:
Object References: Objects are referenced by other objects or by the runtime (e.g., static fields, method parameters). The GC uses these references to determine which objects are reachable and should not be collected.
Unmanaged Code: In unmanaged code, the programmer is responsible for memory management, resource allocation, and security. This can lead to issues like memory leaks and vulnerabilities if not handled carefully.
Managed Code: Managed code is executed under the CLR, which handles compiling IL code into machine code, memory management, exception handling, and security.
Managed Heap: The managed heap is a contiguous region of address space that the runtime reserves for a process. It has two main areas: the Small Object Heap (SOH) and the Large Object Heap (LOH). The SOH is divided into generations to optimize garbage collection:
Generation 0: Stores short-lived objects. Most collections occur here.
Generation 1: Acts as a buffer between short-lived and long-lived objects.
Generation 2: Stores long-lived objects.
Large objects are placed in the LOH. By default, objects of 85,000 bytes or more are considered large.
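To see the size threshold in action, here is a minimal sketch that asks the runtime which generation it reports for a small and a large array. GC.GetGeneration is a real API; the class and method names (LohDemo, GenerationOfNewArray) are made up for this illustration. It assumes the default LOH threshold has not been changed through configuration.

```csharp
using System;

public static class LohDemo
{
    // Allocates a byte array of the given length and returns the generation
    // the GC reports for it right after allocation.
    public static int GenerationOfNewArray(int length)
    {
        byte[] buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        // A small array is allocated on the SOH and normally starts in generation 0.
        Console.WriteLine($"1,000-byte array:  gen {GenerationOfNewArray(1_000)}");

        // An 85,000-element byte array is at least 85,000 bytes in total,
        // so with default settings it goes to the LOH, which the runtime
        // reports as the highest generation (GC.MaxGeneration).
        Console.WriteLine($"85,000-byte array: gen {GenerationOfNewArray(85_000)}");
    }
}
```

With default settings the large array should report generation 2, because LOH objects are collected together with Gen 2.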
Garbage Collection Process
As the application creates objects, the runtime allocates memory for them contiguously in the managed heap's address space. As long as address space is available, the runtime continues to allocate space for new objects in this manner.
The garbage collector's optimizing engine determines the best time to perform a collection based on the allocations being made. When the garbage collector performs a collection, it releases the memory for objects that are no longer being used by the application. It determines which objects are no longer being used by examining the application's roots.
Roots: The GC must understand which objects are still in use to know what it can safely collect. It does this by examining the root set — the set of objects that can be directly accessed by the program.
Here are some examples of these roots:
Static Variables: These are values that belong to a class and stay available for the whole time your app is running.
Active Method Variables: When your app is running a method, any variables in that method are considered active and are kept.
Special References Used by the System: Sometimes, the system or outside code holds onto certain objects. These are also treated as roots to make sure they aren’t deleted too soon.
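The sketch below illustrates two of these root kinds: a static field and a handle held on behalf of outside code (GCHandle). WeakReference is used only to observe whether the target is still alive without keeping it alive. The class and method names (RootsDemo, RootInStaticField, and so on) are hypothetical, and the result for the unrooted object is not guaranteed by the runtime, so treat that line as an observation rather than a promise.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class RootsDemo
{
    // Root kind 1: a static field keeps its target reachable while the app runs.
    private static object s_staticRoot = new object();

    // NoInlining keeps the temporary locals inside these helpers, so the
    // caller's stack frame does not accidentally act as a root.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference RootInStaticField()
    {
        s_staticRoot = new object();
        return new WeakReference(s_staticRoot);
    }

    // Root kind 2: a GCHandle, the mechanism used when outside code holds an object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference RootInHandle(out GCHandle handle)
    {
        object target = new object();
        handle = GCHandle.Alloc(target, GCHandleType.Normal);
        return new WeakReference(target);
    }

    // No root at all: only a weak reference points at the object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference NoRoot()
    {
        return new WeakReference(new object());
    }

    public static void Main()
    {
        WeakReference viaStatic = RootInStaticField();
        WeakReference viaHandle = RootInHandle(out GCHandle handle);
        WeakReference unrooted = NoRoot();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine($"static field root alive: {viaStatic.IsAlive}");
        Console.WriteLine($"GCHandle root alive:     {viaHandle.IsAlive}");
        Console.WriteLine($"unrooted object alive:   {unrooted.IsAlive}");

        handle.Free(); // releasing the handle removes that root
    }
}
```

The two rooted objects survive the forced collections. The unrooted one is usually reported as no longer alive, although the exact timing is up to the runtime.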
A garbage collection happens in three phases:
Marking Phase:
The GC starts by marking all root objects.
It then traverses the object graph starting from these roots, marking all reachable objects (objects that can be accessed directly or indirectly from the roots).
Suppose there is a stack variable pA referencing an object ParentA, which in turn references two objects: ChildA and ChildB. Similarly, another stack variable pB references ParentB, which also references ChildB. Now, suppose the stack variable pB is removed before the next garbage collection cycle begins.
Then, in the marking phase, the Garbage Collector begins with the active roots, in this case pA. It traverses the object graph starting from pA, marking ParentA as live, and then marking both ChildA and ChildB, since they are directly referenced by ParentA. Although pB has been removed and ParentB is no longer reachable from any root, ChildB remains reachable due to its reference from ParentA. As a result, ParentB is considered unreachable and becomes a candidate for garbage collection, while ChildB is preserved.
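The scenario above can be replayed with a small toy model. This is only an illustration of reachability, not how the CLR is implemented: the Node type, the MarkDemo class, and the use of names as object identities are all assumptions made for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a heap object; the real GC works on raw memory, not on a type like this.
public sealed class Node
{
    public string Name { get; }
    public List<Node> References { get; } = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class MarkDemo
{
    // Walks the object graph from the roots and returns the names of every
    // reachable node. Names are assumed to be unique in this toy model.
    public static HashSet<string> Mark(IEnumerable<Node> roots)
    {
        var marked = new HashSet<string>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!marked.Add(current.Name)) continue; // already marked
            foreach (Node child in current.References) pending.Push(child);
        }
        return marked;
    }

    public static HashSet<string> RunScenario()
    {
        var childA = new Node("ChildA");
        var childB = new Node("ChildB");
        var parentA = new Node("ParentA");
        parentA.References.Add(childA);
        parentA.References.Add(childB);
        var parentB = new Node("ParentB");
        parentB.References.Add(childB);

        // pB has been removed, so the only remaining root is pA, which points at ParentA.
        return Mark(new[] { parentA });
    }

    public static void Main()
    {
        HashSet<string> live = RunScenario();
        foreach (string name in new[] { "ParentA", "ParentB", "ChildA", "ChildB" })
            Console.WriteLine($"{name}: {(live.Contains(name) ? "live" : "unreachable")}");
    }
}
```

Running it reports ParentA, ChildA, and ChildB as live and ParentB as unreachable, matching the walkthrough.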
Relocating Phase:
After marking, the GC updates references to objects that will be compacted. This involves changing pointers in the object graph to ensure that references point to the new locations of objects.
This step is crucial for managing objects that need to be moved to reduce fragmentation.
Compacting Phase:
The GC compacts the heap by moving the marked (live) objects towards the beginning of the heap.
This process reclaims the space occupied by dead (unreachable) objects and ensures that the remaining objects are arranged contiguously.
Compacting the heap helps reduce fragmentation and optimize memory usage.
This process is related to the classic mark-and-sweep algorithm, although the compacting step makes it more precisely a mark-and-compact collector:
Mark: Identify all reachable (live) objects.
Sweep: Reclaim the memory held by all unmarked (unreachable) objects.
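To make the relocating and compacting phases more concrete, here is a toy sketch that treats the heap as a simple list of slots. It is a simplification under stated assumptions: the CompactDemo class, the slot model, and the extra "Temp" object are invented for illustration, and the real GC computes new addresses and patches references before it moves memory.

```csharp
using System;
using System.Collections.Generic;

public static class CompactDemo
{
    // Slides the live entries of a toy heap toward the front and returns an
    // old-slot -> new-slot map, the information a relocating phase would use
    // to update references that point at moved objects.
    public static Dictionary<int, int> Compact(List<string> heap, HashSet<string> live)
    {
        var forwarding = new Dictionary<int, int>();
        int next = 0;
        for (int old = 0; old < heap.Count; old++)
        {
            if (!live.Contains(heap[old])) continue; // dead object: its slot is reclaimed
            forwarding[old] = next;
            heap[next] = heap[old];                  // move the survivor down
            next++;
        }
        heap.RemoveRange(next, heap.Count - next);   // the tail is now free space
        return forwarding;
    }

    public static void Main()
    {
        var heap = new List<string> { "ParentA", "ParentB", "ChildA", "Temp", "ChildB" };
        var live = new HashSet<string> { "ParentA", "ChildA", "ChildB" };

        Dictionary<int, int> forwarding = Compact(heap, live);

        Console.WriteLine(string.Join(", ", heap)); // ParentA, ChildA, ChildB
        foreach (KeyValuePair<int, int> pair in forwarding)
            Console.WriteLine($"slot {pair.Key} -> slot {pair.Value}");
    }
}
```

After compaction the survivors sit contiguously at the start of the list, and the map records that ChildA moved from slot 2 to slot 1 and ChildB from slot 4 to slot 2.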
Survival and promotions
Objects that aren't reclaimed in a garbage collection are known as survivors and are promoted to the next generation:
Objects that survive a generation 0 garbage collection are promoted to generation 1.
Objects that survive a generation 1 garbage collection are promoted to generation 2.
Objects that survive a generation 2 garbage collection remain in generation 2.
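Promotion can be observed with GC.GetGeneration, as in the sketch below. The PromotionDemo class and its method name are hypothetical. The runtime does not guarantee that every collection promotes every survivor, so the exact numbers may differ from run to run.

```csharp
using System;

public static class PromotionDemo
{
    // Records the generation reported for one long-lived object: when new,
    // after a first forced collection, and after a second one.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keeps the local rooted up to this point
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine(
            $"new: gen {seen[0]}, after 1st GC: gen {seen[1]}, after 2nd GC: gen {seen[2]}");
    }
}
```

A typical run shows the object moving from generation 0 to 1 and then to 2, where it stays.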
When Do Gen 1 and Gen 2 Collections Happen?
Generation 1 Garbage Collection
Generation 1 (Gen 1) acts as a buffer between short-lived objects in Generation 0 (Gen 0) and long-lived objects in Generation 2 (Gen 2). Garbage collection for Gen 1 occurs in the following situations:
Promotion from Gen 0: When objects in Gen 0 survive a garbage collection, they are promoted to Gen 1. If Gen 0 collections happen frequently and promote many objects, it can trigger a Gen 1 collection.
Memory Pressure: If the heap's memory usage grows, leading to insufficient space in Gen 1, a Gen 1 garbage collection may be triggered to free up space and make room for new allocations.
Generation 2 Collection: A Gen 2 collection also includes a Gen 1 collection. This means whenever a Gen 2 collection occurs, objects in Gen 1 are also considered for collection.
Explicit Collection: While it's generally not recommended, developers can explicitly trigger a garbage collection using the GC.Collect() method. Calling GC.Collect(1) triggers a collection of both Gen 0 and Gen 1.
Generation 2 Garbage Collection
Generation 2 (Gen 2) stores long-lived objects and typically undergoes garbage collection less frequently than Gen 0 and Gen 1. Gen 2 collections occur in the following situations:
Promotion from Gen 1: When objects in Gen 1 survive a garbage collection, they are promoted to Gen 2. Over time, if many objects are promoted, this can lead to a Gen 2 collection.
High Memory Usage: If the overall memory usage of the application grows significantly, leading to memory pressure, a Gen 2 collection may be triggered to reclaim space from long-lived objects.
Large Object Heap (LOH) Pressure: The LOH, which is collected together with Gen 2, stores large objects. If there is memory pressure in the LOH, it can trigger a Gen 2 collection to reclaim space.
Allocation Thresholds: The GC monitors allocation patterns and adjusts thresholds for triggering collections. If many large objects are allocated or there is consistent memory pressure, a Gen 2 collection may be initiated.
Explicit Collection: Developers can explicitly trigger a Gen 2 collection by calling GC.Collect() with no arguments or GC.Collect(2). This collects objects from Gen 0, Gen 1, and Gen 2.
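The effect of explicit collections on each generation can be checked with GC.CollectionCount, which reports how many times a generation has been collected. The sketch below is for experimentation only, since forcing collections is rarely a good idea in production code. CollectDemo, Counts, and Measure are illustrative names.

```csharp
using System;

public static class CollectDemo
{
    // Snapshot of how many times Gen 0, Gen 1, and Gen 2 have been collected so far.
    public static int[] Counts() =>
        new[] { GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2) };

    // Takes a snapshot at the start, after GC.Collect(1), and after GC.Collect(2).
    public static int[][] Measure()
    {
        int[] start = Counts();

        GC.Collect(1);               // collects Gen 0 and Gen 1
        int[] afterGen1 = Counts();

        GC.Collect(2);               // collects Gen 0, Gen 1, and Gen 2
        int[] afterGen2 = Counts();

        return new[] { start, afterGen1, afterGen2 };
    }

    public static void Main()
    {
        int[][] snapshots = Measure();
        for (int gen = 0; gen <= 2; gen++)
        {
            Console.WriteLine(
                $"gen {gen}: {snapshots[0][gen]} -> {snapshots[1][gen]} -> {snapshots[2][gen]}");
        }
    }
}
```

The Gen 0 and Gen 1 counts rise after GC.Collect(1), and all three counts rise after GC.Collect(2), which shows that collecting an older generation also collects the younger ones.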
Balancing Allocations and Collections
The GC continually balances two priorities:
Keeping the application's working set from growing too large, which is what happens when garbage collection is delayed for too long.
Ensuring garbage collection does not run too frequently, which could lead to performance overhead.
By analyzing the root sets and managing object lifetimes across generations, the GC optimizes memory usage and application performance.
Conclusion
The .NET Garbage Collector is a sophisticated memory management tool that automates the allocation and deallocation of memory. By understanding its inner workings and following best practices, developers can build high-performance, memory-efficient applications.
Written by Shahzad Ahamad
