HashSet in C#

Generic definition
HashSet is a data structure used to store a collection of unique elements. Unlike arrays or lists, the elements in a HashSet are not kept in any particular order. When a value is added to a HashSet, it is first passed through a hash function to compute a hash code, which is then used to determine the index (or bucket) where the value will be stored internally.
Key features of HashSet include:
No duplicate values: Each element must be unique. If an element that already exists is added again, it is ignored.
Unordered storage: The elements are not stored in the order they are added.
Fast operations: HashSet offers constant-time performance (O(1)) for basic operations like add, remove, and contains, assuming a good hash function and a low collision rate.
HashSet in C#
In C#, HashSet is found in the System.Collections.Generic namespace and is a type of generic collection. HashSet is similar to a Hashtable, but it only stores values (no key-value pairs).
In the case of a Hashtable, the index is calculated based on the hash code of the key, whereas in a HashSet, the index is calculated based on the hash code of the value.
Duplicate values are not accepted in a HashSet, although duplicate indexes are possible if two values generate the same hash code. We'll discuss how this is handled in C# later.
Declaring Syntax
HashSet<T> referenceVariable = new HashSet<T>();
How elements are added in HashSet
To add an element to a HashSet, its index is generated first using the formula:
Index = hash code % capacity
In the above line:
hash code refers to the hash code of the value.
capacity refers to the current size of the internal array in the HashSet.
Capacity is kept as a prime number, and it automatically increases as elements are added. The exact starting size is an implementation detail of the runtime and should not be relied upon.
Here, capacity means the size of the internal array where values are actually stored.
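As a rough sketch of the index formula above: the capacity value of 17 here is a hypothetical number picked for illustration, since the real internal array size is not exposed by HashSet. Clearing the sign bit before taking the remainder is one common way to keep the index non-negative; the actual runtime may compute it differently.

```csharp
using System;

// Hypothetical capacity, chosen only for illustration.
int capacity = 17;

int value = 42;
int hashCode = value.GetHashCode();               // for int, the hash code is the value itself
int index = (hashCode & 0x7FFFFFFF) % capacity;   // clear the sign bit, then take the remainder

Console.WriteLine($"hash code = {hashCode}, index = {index}");  // hash code = 42, index = 8

// A negative hash code still maps to a valid index once the sign bit is cleared.
int negativeIndex = ((-5).GetHashCode() & 0x7FFFFFFF) % capacity;
Console.WriteLine(negativeIndex >= 0 && negativeIndex < capacity);  // True
```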
Hash codes are generated using the GetHashCode() method, which is defined on the System.Object class. This method is overridden in classes like System.String, where the hash code is calculated from the characters in the string using an algorithm that considers both the position and the value of each character. It is not a simple sum of character codes, and on modern .NET the hash code of the same string can differ from one program run to the next.
For numerical values, the number itself is often used as the hash code; this is the case for int.
For custom classes, we generally override GetHashCode() together with Equals(), so that objects considered equal return the same hash code (often based on an ID). The hash code does not have to be unique, but the more evenly it spreads values, the fewer collisions occur.
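A minimal sketch of the custom-class case, assuming a hypothetical Employee class whose identity is its Id. Both the class and its members are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

var employees = new HashSet<Employee>();
bool firstAdded = employees.Add(new Employee(1, "Ron"));
bool duplicateAdded = employees.Add(new Employee(1, "Ron"));  // same Id, so treated as a duplicate

Console.WriteLine($"{firstAdded} {duplicateAdded} {employees.Count}");  // True False 1

// Hypothetical class: two employees are considered equal when their Id matches.
public sealed class Employee
{
    public int Id { get; }
    public string Name { get; }

    public Employee(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Equals and GetHashCode are overridden together so they stay consistent.
    public override bool Equals(object obj) => obj is Employee other && other.Id == Id;

    public override int GetHashCode() => Id.GetHashCode();
}
```

Basing both methods on the same field keeps the rule "equal objects have equal hash codes" easy to verify.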
How to Use HashSet with Custom Classes:
When working with collections in C# like HashSet, if you use custom classes (which are reference types), the default behavior for determining whether two objects are equal is based on reference equality, not value equality. This means that even if two objects have the same data, they are considered different objects if they are located at different memory addresses.
By default, C# does not know how to compare custom objects for equality. If you add objects of a custom class (a reference type) to a HashSet, all objects will be inserted even if there are duplicate objects, because HashSet uses the default Equals() and GetHashCode() methods, which compare objects based on their reference in memory (not the values they contain).
public class Department
{
    public string Name { get; set; }
    public int YearFormed { get; set; }
    public int NumberOfMembers { get; set; }
}

HashSet<Department> departments = new HashSet<Department>();
departments.Add(new Department() { YearFormed = 1990, Name = "Finance", NumberOfMembers = 25 });
departments.Add(new Department() { YearFormed = 2000, Name = "Marketing", NumberOfMembers = 30 });
departments.Add(new Department() { YearFormed = 2010, Name = "HR", NumberOfMembers = 15 });
// Same data as the first entry, but a different object, so it is still added.
departments.Add(new Department() { YearFormed = 1990, Name = "Finance", NumberOfMembers = 25 });
// departments.Count is now 4.
Suppose we have a Department class, and we want to use it in a HashSet. In order to ensure that duplicate departments (based on certain criteria) are not added, we need to implement the IEqualityComparer<Department> interface. This allows us to define custom comparison logic for Department objects when adding them to the HashSet.
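class DepartmentComparer : IEqualityComparer<Department>
{
    // Two departments are equal when their names match, ignoring case.
    public bool Equals(Department x, Department y)
    {
        return x.Name.Equals(y.Name, StringComparison.InvariantCultureIgnoreCase);
    }

    // Must agree with Equals: names that differ only by case need the same hash code.
    public int GetHashCode(Department obj)
    {
        return StringComparer.InvariantCultureIgnoreCase.GetHashCode(obj.Name);
    }
}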
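Now, if the HashSet is created with this comparer (new HashSet<Department>(new DepartmentComparer())), an attempt to insert a department whose name matches an existing one is rejected, and Add returns false.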
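Here is a small self-contained sketch that puts the Department class and a name-based comparer together. The hash code is taken from a case-insensitive StringComparer so that it agrees with the case-insensitive Equals; treat the exact comparison rules as an assumption about what "duplicate" should mean for your data.

```csharp
using System;
using System.Collections.Generic;

// Pass the comparer to the constructor so the set uses it instead of reference equality.
var departments = new HashSet<Department>(new DepartmentComparer());

departments.Add(new Department { YearFormed = 1990, Name = "Finance", NumberOfMembers = 25 });
departments.Add(new Department { YearFormed = 2000, Name = "Marketing", NumberOfMembers = 30 });
departments.Add(new Department { YearFormed = 2010, Name = "HR", NumberOfMembers = 15 });
bool added = departments.Add(new Department { YearFormed = 1990, Name = "finance", NumberOfMembers = 25 });

Console.WriteLine(added);              // False
Console.WriteLine(departments.Count);  // 3

public class Department
{
    public string Name { get; set; } = "";
    public int YearFormed { get; set; }
    public int NumberOfMembers { get; set; }
}

public class DepartmentComparer : IEqualityComparer<Department>
{
    // Departments are equal when their names match, ignoring case.
    public bool Equals(Department x, Department y) =>
        string.Equals(x?.Name, y?.Name, StringComparison.InvariantCultureIgnoreCase);

    // Uses the matching case-insensitive comparer so equal names share a hash code.
    public int GetHashCode(Department obj) =>
        StringComparer.InvariantCultureIgnoreCase.GetHashCode(obj.Name);
}
```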
How duplicate indexes are handled in HashSet
When two values map to the same index in a C# HashSet, it results in a hash collision. The HashSet handles collisions internally using a strategy called chaining, which involves storing multiple items in the same bucket (index). Conceptually, each bucket holds a linked list of entries. The steps that happen behind the scenes when a hash collision occurs are listed below.
Steps:
Hashing: Both values are passed through the hash function. Their hash codes may be identical, or they may differ and still map to the same index.
Bucket index: The hash code is then used (usually modulo the number of buckets) to determine the storage index.
Collision detection: The HashSet sees that the bucket already contains an item.
Equality check: The HashSet uses the Equals() method to check if the new value is equal to any existing item in the bucket:
If it is equal → The new item is not added (since duplicates aren't allowed).
If it is not equal → The new item is added to the same bucket (typically stored in a linked list or similar structure).
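The steps above can be sketched with a tiny teaching class. SimpleChainedSet is a hypothetical name and is not how the real HashSet is written; it uses a fixed number of buckets and one LinkedList per bucket purely to make the chaining visible.

```csharp
using System;
using System.Collections.Generic;

var set = new SimpleChainedSet<int>(bucketCount: 7);

Console.WriteLine(set.Add(3));        // True
Console.WriteLine(set.Add(10));       // True: 10 % 7 == 3, so it shares a bucket with 3
Console.WriteLine(set.Add(3));        // False: the equality check finds the existing 3
Console.WriteLine(set.Contains(10));  // True
Console.WriteLine(set.Count);         // 2

// Hypothetical teaching class, not the real HashSet<T> implementation.
public class SimpleChainedSet<T>
{
    private readonly LinkedList<T>[] _buckets;

    public int Count { get; private set; }

    public SimpleChainedSet(int bucketCount)
    {
        _buckets = new LinkedList<T>[bucketCount];
    }

    // Steps 1 and 2: hash the value, then map the hash code to a bucket index.
    private int BucketIndex(T item)
    {
        int hash = item is null ? 0 : item.GetHashCode();
        return (hash & 0x7FFFFFFF) % _buckets.Length;
    }

    public bool Add(T item)
    {
        int index = BucketIndex(item);
        LinkedList<T> bucket = _buckets[index] ??= new LinkedList<T>();

        // Steps 3 and 4: the bucket may already hold items, so compare with each one.
        foreach (T existing in bucket)
        {
            if (EqualityComparer<T>.Default.Equals(existing, item))
            {
                return false;  // equal item found: duplicates are not added
            }
        }

        bucket.AddLast(item);  // not equal to anything in the bucket: chain it
        Count++;
        return true;
    }

    public bool Contains(T item)
    {
        LinkedList<T> bucket = _buckets[BucketIndex(item)];
        return bucket != null && bucket.Contains(item);
    }
}
```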
Important HashSet operations and properties in C#
Creating a HashSet object:
HashSet<string> messages = new HashSet<string>();
The left-hand side declares the reference variable, and the right-hand side creates the HashSet object.
Initializing:
HashSet<string> messages = new HashSet<string> { "Good Morning", "How are you", "Have a good day!" };
Count Property:
The Count property returns the number of unique values present in the HashSet, not the underlying array size.
int count = messages.Count;
Add, Remove, Contains, Clear, and iterating through a HashSet:
The Add method adds an element. It returns true if the element was added and false if it was already present:
messages.Add("Yo");
The Remove method removes an element and returns true if successful:
bool isRemoved = messages.Remove("Welcome");
RemoveWhere removes elements based on a condition/predicate:
int removedCount = messages.RemoveWhere(m => m.EndsWith("you"));
It returns the number of elements removed.
Contains checks if an element exists:
bool exists = messages.Contains("Yo");
Note: Contains is fast because it avoids a linear search and goes directly to the computed index.
Iteration:
foreach (string message in messages)
{
    Console.WriteLine(message);
}
While iterating through a HashSet, the order is not guaranteed (unlike Lists). In practice you may see insertion-like order, but you shouldn't rely on it.
Clear removes all elements:
messages.Clear();
The Clear() method doesn't return anything and does not dispose of the HashSet itself; it just clears its contents. It's safe to call even if the set is already empty.
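To see the return values of these operations side by side, here is a short sketch using the same kind of messages set. The sample strings are illustrative, and the ordinal comparison in EndsWith is a choice made here to keep the result independent of the current culture.

```csharp
using System;
using System.Collections.Generic;

var messages = new HashSet<string> { "Good Morning", "How are you", "Have a good day!" };

bool addedFirst = messages.Add("Yo");       // True: "Yo" was not in the set yet
bool addedAgain = messages.Add("Yo");       // False: the duplicate is ignored
bool removed = messages.Remove("Welcome");  // False: "Welcome" was never added

// Removes "How are you" and reports how many elements matched the predicate.
int removedCount = messages.RemoveWhere(m => m.EndsWith("you", StringComparison.Ordinal));  // 1

bool exists = messages.Contains("Yo");      // True
int countBeforeClear = messages.Count;      // 3

messages.Clear();
Console.WriteLine(messages.Count);          // 0
```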
UnionWith and IntersectWith methods:
These are two important methods for set operations.
Example of UnionWith:
HashSet<string> employee2021 = new HashSet<string>() { "Ron", "Tiffany", "Robin" };
HashSet<string> employee2022 = new HashSet<string>() { "Alice", "Frank", "Lucy" };
employee2021.UnionWith(employee2022);
This adds every value from employee2022 that is not already present to employee2021, so employee2021 now holds all six names.
Example of IntersectWith:
HashSet<string> employee2021 = new HashSet<string>() { "Ron", "Tiffany", "Robin" };
HashSet<string> employee2022 = new HashSet<string>() { "Tiffany", "Robin", "Alice", "Frank", "Lucy" };
employee2021.IntersectWith(employee2022);
Use case: To find employees who worked in both 2021 and 2022. After the call, employee2021 contains only "Tiffany" and "Robin". "Ron" won't be part of the result set since he left.
Real-World Use Case:
Suppose you want to show user hobbies as filters/autofill options in a search input. Many users may have the same hobbies, and if you collect hobbies directly from all users using a List, you'll likely get duplicates.
To avoid this, use a HashSet to automatically filter out duplicates:
HashSet<string> hobbies = new HashSet<string>(listOfHobbies);
This ensures you only get unique hobbies.
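A minimal runnable version of the hobbies idea, with made-up sample data. The second set is an optional variation that assumes you want "Chess" and "chess" to count as the same hobby, which is done by passing a case-insensitive comparer to the constructor.

```csharp
using System;
using System.Collections.Generic;

// Sample data invented for illustration: hobbies collected from several users.
var listOfHobbies = new List<string> { "Reading", "Chess", "Reading", "Hiking", "chess" };

// Default comparison is case-sensitive, so "Chess" and "chess" are both kept.
var hobbies = new HashSet<string>(listOfHobbies);
Console.WriteLine(hobbies.Count);            // 4

// Optional: ignore case when deciding what counts as a duplicate.
var hobbiesIgnoreCase = new HashSet<string>(listOfHobbies, StringComparer.OrdinalIgnoreCase);
Console.WriteLine(hobbiesIgnoreCase.Count);  // 3
```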
🙏 Thank You!
Thank you for taking the time to read this article on HashSet in C#. I hope it helped you understand the core concepts, usage patterns, and practical applications of this powerful collection type. If you have any feedback or questions, feel free to share. Happy coding! 😊
Written by

D_Arya
I am a .NET developer passionate about exploring and understanding the intricate world of computers, one line of code at a time.