Internal Implementation of HashSet in Java
In the previous article, we learned what a HashSet is and its different methods. In this article, we are going to learn about the internal implementations of a HashSet in Java. In order to learn any implementation in Java, it is essential to go through the Java source code in your IDE.
Before diving into its implementation, we ought to remember that:
A HashSet does not accept duplicate keys.
It permits at most only one null element.
HashMap as a Backing Data Strcture to a HashSet
Ideally, a HashSet's internal implementation uses a HashMap with a generic key of type E
and Object
as the value (Remember: To implement a HashMap you need a key and its associated value). Let's have a look at this snippet:
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, java.io.Serializable
{
@java.io.Serial
static final long serialVersionUID = -5024744406713321676L;
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
/**
* Constructs a new, empty set; the backing {@code HashMap} instance has
* default initial capacity (16) and load factor (0.75).
*/
public HashSet() {
map = new HashMap<>();
}
}
If we recall, a HashMap permits duplicate objects as values to keys. This attribute is efficiently applied in the HashSet implementation.
A close review of the above snippet, we can see that whenever a key is added, that element is put in a HashMap with the static field PRESENT
as its value. What this means is that for every element added, we expect its value to be of type Object
and a constant PRESENT.
Consider the following implementation of add ()
method:
public class SetandHashSet {
public static void main(String[]args){
HashSet<String > names = new HashSet<>();
names.add("Kamau");
names.add("John");
names.add("Kiraitu");
names.add("Mugeni");
names.add("Thoithi");
}
}
In the above example, I have created a HashSet of type String
and added a few names. Internally, this implementation will create a HashMap of type String with mappings as shown below:
We can see that the value of each key is stored as PRESENT which is defined by the code private static final Object PRESENT = new Object();
Now, let us look at a few methods and how they apply.
add(E e) Method
The add()
method adds the specified element in the HashSet if it does not exist. If it exists, the HashSet remains unchanged and it returns false. consider the following example:
public class SetandHashSet {
public static void main(String[]args){
HashSet<String > names = new HashSet<>();
boolean name1 = names.add("Kamau");
boolean name2 = names.add("Kamau");
System.out.println(name1);
System.out.println(name2);
}
}
The above code will return
true
false
This is because name2
is a duplicate of name1
, thus the addition returns false
. The internal implementation of the add ()
method uses put()
from HashMap.
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
In the above implementation, the put()
method returns the previous value associated with key, or null
if there was no mapping for key. This means that if the element is not present in the HashSet, the put ()
method returns null
then the add()
method returns null (i.e null == null)
as well and the element is added into the HashSet.
However, if the element is present, put()
executes to true since a value is PRESENT
which is not null
. Thus, the add()
execution resolves to false (i.e PRESENT == null).
remove() Method
If the HashSet contained the specified element, the remove()
method resolves to true.
public boolean remove(Object o) {
return map.remove(o)==PRESENT;
}
This method uses the remove()
method of the HashMap. The method returns the value of the element in the HashSet (i.e PRESENT
). If no value is found, the method then returns null
;
Other Methods Implementation
iterator()
: This method returns an iterator over the HashSet. The returned elements are not ordered.
public Iterator<E> iterator() {
return map.keySet().iterator();
}
size()
: Returns the size of the HashSet based on the numbers present.
public int size() {
return map.size();
}
isEmpty():
Returnstrue
if the HashSet contains no elements andfalse
if it contains elements
public boolean isEmpty() {
return map.isEmpty();
}
contains()
: Returnstrue
if the HashSet contains the specified element
public boolean contains(Object o) {
return map.containsKey(o);
}
clear():
Removes all elements in the HashSet
public void clear() {
map.clear();
}
clone():
Returns a shallow copy of the HashSet instance.
public Object clone() {
try {
HashSet<E> newSet = (HashSet<E>) super.clone();
newSet.map = (HashMap<E, Object>) map.clone();
return newSet;
} catch (CloneNotSupportedException e) {
throw new InternalError(e);
}
}
Subscribe to my newsletter
Read articles from Denis Kinyua directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Denis Kinyua
Denis Kinyua
I am a Backend Software Engineer with extensive experience in building scalable applications utilizing technologies such as Java, Spring, SQL, and NoSQL databases, AWS and Oracle Cloud. I am an Oracle Cloud Certified Data Management Associate, Oracle Cloud Infrastructure Certified Associate, and Oracle Cloud AI Certified Associate.