Advanced HashMap Series: The equals() and hashCode() Contract

Harshavardhanan

Most devs think they understand equals() and hashCode().

They don’t.

They override one and forget the other. They assume a working HashMap means a correct one. They let mutability creep into keys, and then act surprised when retrieval fails or duplicates sneak in.

Here's the truth:

  • If you override equals(), you must override hashCode().

  • If two objects are equal, they must return the same hash.

Break these, and your hash-based logic doesn’t throw errors — it just quietly goes wrong.

You don’t need 1000 lines of bad code to break a system.
One forgotten method is enough.

This post is not a tutorial. It’s a teardown — of every silent bug you invite by breaking the contract.


Let's Build a Broken System

We’ll use a basic Employee class — a common domain model.

class Employee {
    private final String id;
    private final String department;
    private final int yearOfJoining;

    public Employee(String id, String department, int yearOfJoining) {
        this.id = id;
        this.department = department;
        this.yearOfJoining = yearOfJoining;
    }
}

Seems harmless.

Now add this test:

@Test
void shouldBeEqualWhenDataMatches() {
    Employee e1 = new Employee("E100", "Finance", 2022);
    Employee e2 = new Employee("E100", "Finance", 2022);

    assertEquals(e1, e2); // ❌ Fails
}

Of course, it fails. Java’s default equals() checks for reference equality:

public boolean equals(Object obj) {
    return this == obj;
}

Every Java class inherits from Object, and Object implements equals(), so every class has an equals() implementation by default.

Two objects with the same data? Still not equal, unless they’re the same instance. The default method returns true only when both references point to the very same object.

Welcome to the first pitfall.


Fixing equals() the Right Way

Here’s a proper override for equals():

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;

    Employee that = (Employee) o;

    if (yearOfJoining != that.yearOfJoining) return false;
    if (!Objects.equals(id, that.id)) return false;
    return Objects.equals(department, that.department);
}

This makes value equality work.

Now the test shouldBeEqualWhenDataMatches passes.

But the problem just got deeper — because this fix is incomplete.

Before moving on, let’s understand some prime directives for overriding the equals() method.


Prime Directives for equals()

You don’t get to override equals() casually. It must obey 5 rules:

  1. Reflexive: x.equals(x) must be true.

  2. Symmetric: if x.equals(y) is true, then y.equals(x) must be true.

  3. Transitive: if x.equals(y) and y.equals(z) are true, then x.equals(z) must be true.

  4. Consistent: repeated calls must return the same result as long as the fields used in the comparison don’t change.

  5. Non-null: x.equals(null) must always return false.

This isn’t style. This is contract.

Break it and you’ll get unpredictable behavior in every collection that relies on equality — which includes almost all of them.
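The five rules can be spot-checked in code. What follows is a minimal sketch, not a full test suite: it restates Employee with the equals() override from above (plus a matching hashCode(), so the class stays safe to drop into a collection) and evaluates each rule on three instances. EqualsContractCheck and holds() are names made up for this example.

```java
import java.util.Objects;

class EqualsContractCheck {

    // Same fields and equals() logic as the Employee class above.
    static final class Employee {
        private final String id;
        private final String department;
        private final int yearOfJoining;

        Employee(String id, String department, int yearOfJoining) {
            this.id = id;
            this.department = department;
            this.yearOfJoining = yearOfJoining;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Employee that = (Employee) o;
            return yearOfJoining == that.yearOfJoining
                    && Objects.equals(id, that.id)
                    && Objects.equals(department, that.department);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, department, yearOfJoining);
        }
    }

    // Hypothetical helper: true if all five rules hold for this trio.
    static boolean holds(Object x, Object y, Object z) {
        boolean reflexive = x.equals(x);
        boolean symmetric = x.equals(y) == y.equals(x);
        // Transitivity only applies when both premises are true.
        boolean transitive = !(x.equals(y) && y.equals(z)) || x.equals(z);
        boolean consistent = x.equals(y) == x.equals(y);
        boolean nonNull = !x.equals(null);
        return reflexive && symmetric && transitive && consistent && nonNull;
    }

    public static void main(String[] args) {
        Employee x = new Employee("E100", "Finance", 2022);
        Employee y = new Employee("E100", "Finance", 2022);
        Employee z = new Employee("E100", "Finance", 2022);
        Employee other = new Employee("E200", "HR", 2020);

        System.out.println(holds(x, y, z));     // true
        System.out.println(holds(x, y, other)); // true: the rules hold for unequal objects too
    }
}
```

A check like this only samples a few instances. It can expose a broken equals(), but it cannot prove a correct one.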


The HashMap Contract: Why equals() Alone is Not Enough

Let’s test this:

Employee e1 = new Employee("E100", "Finance", 2022);
Employee e2 = new Employee("E100", "Finance", 2022);

Map<Employee, String> map = new HashMap<>();
map.put(e1, "Employee 1");
map.put(e2, "Employee 2");

System.out.println(map.size()); // ❌ Prints 2

Wait — same employee, but two entries?

Yes. Because equals() was overridden, but hashCode() wasn’t. The most important thing to remember about hashCode(): every time you override equals(), you have to override hashCode() as well.

So why does the map end up with two entries? To answer that, let’s first look at the default behavior of hashCode().


What Does hashCode() Do by Default?

The default hashCode() provided by the Object class:

  • Returns an integer identity hash chosen by the JVM. Older Javadoc describes it as typically derived from the object’s memory address, but that is an implementation detail rather than a guarantee, and the value is never the address itself.

  • Ignores the object’s contents. Unless we override hashCode() in the Employee class, two distinct instances will almost always have different hash codes, even if their contents are identical. (Uniqueness is not guaranteed; collisions are possible, just rare.) To a HashMap, that means different employees.

Note: hashCode() in Object is declared native. Unlike equals(), it is not written in Java; the JVM supplies a platform-specific implementation, typically in C/C++.
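A quick way to see the default behavior is to compare it with System.identityHashCode(), which is documented to return the same value the inherited hashCode() would. This is only an illustration; DefaultHashDemo and PlainEmployee are hypothetical names, and the actual numbers vary from run to run.

```java
class DefaultHashDemo {

    // No equals() or hashCode() override: both are inherited from Object.
    static final class PlainEmployee {
        final String id;

        PlainEmployee(String id) {
            this.id = id;
        }
    }

    public static void main(String[] args) {
        PlainEmployee e1 = new PlainEmployee("E100");
        PlainEmployee e2 = new PlainEmployee("E100");

        // The inherited hashCode() is the identity hash of the instance.
        System.out.println(e1.hashCode() == System.identityHashCode(e1)); // true

        // It stays stable for the lifetime of the object.
        System.out.println(e1.hashCode() == e1.hashCode()); // true

        // It has nothing to do with the contents, so this is almost always false.
        System.out.println(e1.hashCode() == e2.hashCode());
    }
}
```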


The hashCode() Rule

When you put a key into a HashMap, this is what happens:

  1. Compute the key’s hashCode() — find the bucket

  2. Inside the bucket, use equals() to resolve key collisions

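If hashCode() returns different values, the objects usually land in different buckets. Even when they happen to share a bucket, OpenJDK’s HashMap compares the stored hash values first, so equals() isn’t even called.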

So, even if your data is identical — HashMap treats them as separate keys.
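To make "find the bucket" concrete, here is a rough sketch of how a hash code becomes a bucket index. It mirrors the approach OpenJDK's HashMap has used since Java 8 (fold the high bits into the low bits, then mask by the table size), but that is an implementation detail, not part of the public contract. BucketIndexSketch and its methods are made-up names.

```java
class BucketIndexSketch {

    // Fold the upper 16 bits into the lower 16 so small tables still see them.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // capacity is assumed to be a power of two, as in HashMap's internal table.
    static int bucketIndex(Object key, int capacity) {
        int h = (key == null) ? 0 : spread(key.hashCode());
        return h & (capacity - 1);
    }

    public static void main(String[] args) {
        // "HR".hashCode() is 72 * 31 + 82 = 2314, and 2314 & 15 is 10.
        System.out.println(bucketIndex("HR", 16));             // 10
        // An equal key has an equal hash code, so it maps to the same bucket.
        System.out.println(bucketIndex(new String("HR"), 16)); // 10
        // A different hash code usually, but not always, means a different bucket.
        System.out.println(bucketIndex("Finance", 16));
    }
}
```

The takeaway is that the index depends only on hashCode(). If two "equal" keys disagree on their hash code, nothing later in the pipeline can bring them back together.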

Let’s fix this:

@Override
public int hashCode() {
    int result = id != null ? id.hashCode() : 0;
    result = 31 * result + (department != null ? department.hashCode() : 0);
    result = 31 * result + yearOfJoining;
    return result;
}

This implementation ensures:

  • Objects with same content → same hashCode.

  • Every field that equals() compares feeds into the hash, and the odd prime multiplier (31) spreads typical values reasonably well across buckets.

Now run the test again: size is 1.
The second put() replaced the old value with the new one.
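For reference, here is the whole class in one place, with a small driver that repeats the map experiment. Treat it as a sketch: it swaps the hand-written hashCode() for java.util.Objects.hash(), which combines the fields with the same multiply-by-31 scheme (the resulting numbers differ from the manual version, but the contract is satisfied the same way). FixedEmployeeDemo and buildMap() are made-up names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class FixedEmployeeDemo {

    static final class Employee {
        private final String id;
        private final String department;
        private final int yearOfJoining;

        Employee(String id, String department, int yearOfJoining) {
            this.id = id;
            this.department = department;
            this.yearOfJoining = yearOfJoining;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Employee that = (Employee) o;
            return yearOfJoining == that.yearOfJoining
                    && Objects.equals(id, that.id)
                    && Objects.equals(department, that.department);
        }

        // Uses exactly the fields that equals() compares.
        @Override
        public int hashCode() {
            return Objects.hash(id, department, yearOfJoining);
        }
    }

    static Map<Employee, String> buildMap() {
        Map<Employee, String> map = new HashMap<>();
        map.put(new Employee("E100", "Finance", 2022), "Employee 1");
        map.put(new Employee("E100", "Finance", 2022), "Employee 2");
        return map;
    }

    public static void main(String[] args) {
        Map<Employee, String> map = buildMap();
        System.out.println(map.size()); // 1
        System.out.println(map.get(new Employee("E100", "Finance", 2022))); // Employee 2
    }
}
```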


Prime Directives for hashCode()

  1. Must return the same result across multiple invocations (unless the object mutates).

  2. If x.equals(y) is true, then x.hashCode() == y.hashCode() must be true.

  3. If x.equals(y) is false, their hashCodes can be equal — but you should reduce that probability.

  4. Unequal hashCodes? Then equals() must return false.

These aren’t style guides. These are invariants that every hash-based collection trusts and nothing verifies for you.
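Rule 3 is worth seeing once. The strings "Aa" and "BB" are a well-known pair that are unequal yet share a hash code under String's documented hashing formula. The small sketch below (CollisionDemo and build() are hypothetical names) shows that a HashMap still keeps them apart, because equals() has the final say inside the bucket.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {

    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112, and 'B' * 31 + 'B' = 66 * 31 + 66 = 2112.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false

        Map<String, Integer> map = build();
        // Same bucket, but equals() tells the keys apart.
        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}
```

A collision costs an extra equals() call, not correctness. Only the opposite mistake, equal objects with unequal hashes, breaks the map.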


HashSet Behaves the Same Way

Set<Employee> set = new HashSet<>();
set.add(e1);
set.add(e2);

Even if e1.equals(e2) is true — unless hashCode() matches, both will be stored.

Sets rely on the same hashCode-then-equals pipeline.
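Here is a self-contained comparison, as a sketch under a few assumptions: EqualsOnly and Complete are hypothetical stand-ins for Employee, reduced to a single id field, and distinctCount() is a helper invented for the demo.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashSetDedupDemo {

    // Overrides equals() only: hashCode() is still the identity hash.
    static final class EqualsOnly {
        final String id;

        EqualsOnly(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnly && Objects.equals(id, ((EqualsOnly) o).id);
        }
    }

    // Overrides both methods, based on the same field.
    static final class Complete {
        final String id;

        Complete(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Complete && Objects.equals(id, ((Complete) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id);
        }
    }

    // Hypothetical helper: how many elements survive in a HashSet?
    static int distinctCount(Object a, Object b) {
        Set<Object> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        // Almost certainly 2: the identity hashes of two instances rarely coincide.
        System.out.println(distinctCount(new EqualsOnly("E100"), new EqualsOnly("E100")));
        // Always 1: equal hash codes lead to the equals() check, which matches.
        System.out.println(distinctCount(new Complete("E100"), new Complete("E100")));
    }
}
```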


Three Gotchas That Break Real Systems

1. Mutable Fields Inside Keys

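// Assumes a mutable variant of Employee: department is non-final,
// has a setter, and is used by both equals() and hashCode().
Map<Employee, String> map = new HashMap<>();
Employee emp = new Employee("E100", "Finance", 2022);
map.put(emp, "Finance");

emp.setDepartment("HR"); // mutation changes the hash code

map.get(emp); // ❌ Returns null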

Why?

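Because the mutated object now produces a different hash code. The entry still sits in the bucket chosen by the old hash, so the lookup searches in the wrong place. You’re holding the same reference, but it’s invisible to the map.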

Lesson: Never mutate fields used in equals() or hashCode() after insertion. Either make them final, or never use the object as a key.
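A runnable version of this gotcha, as a sketch: MutableEmployee is a hypothetical two-field variant of Employee whose department can change after construction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MutableKeyDemo {

    static final class MutableEmployee {
        private final String id;
        private String department; // mutable, and part of equals()/hashCode()

        MutableEmployee(String id, String department) {
            this.id = id;
            this.department = department;
        }

        void setDepartment(String department) {
            this.department = department;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof MutableEmployee)) return false;
            MutableEmployee that = (MutableEmployee) o;
            return Objects.equals(id, that.id) && Objects.equals(department, that.department);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, department);
        }
    }

    public static void main(String[] args) {
        Map<MutableEmployee, String> map = new HashMap<>();
        MutableEmployee emp = new MutableEmployee("E100", "Finance");
        map.put(emp, "Finance");

        emp.setDepartment("HR"); // the hash code changes, the stored entry does not move

        System.out.println(map.get(emp));         // null
        System.out.println(map.containsKey(emp)); // false
        System.out.println(map.size());           // 1: the entry exists but is unreachable

        emp.setDepartment("Finance"); // restore the original state
        System.out.println(map.get(emp));         // Finance
    }
}
```

The last line is the unsettling part: the entry was never lost. It was simply filed under a hash the key no longer produces.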


2. Constant HashCode Destroys Performance

@Override
public int hashCode() {
    return 1;
}

This satisfies the contract — but now every object lands in the same bucket.

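Congratulations: your HashMap is now effectively a linked list. Lookups degrade from O(1) to O(n). (Since Java 8, HashMap converts overly long buckets into trees, but that only rescues you when the keys implement Comparable.)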

This might not be visible in a dev environment. But in prod? On a hot path? You’ve silently killed performance.

Use a hash that distributes well: combine every field that equals() compares, with a prime multiplier like 31 or with Objects.hash().
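Timing benchmarks are noisy, so this sketch counts equals() calls instead. It is an illustration under stated assumptions: Key, its constantHash switch, and equalsCallsForLookup() are invented for the demo, and the key count is kept small so the bucket stays a plain linked list rather than a tree.

```java
import java.util.HashMap;
import java.util.Map;

class ConstantHashCost {

    // Counts how often Key.equals() runs. Not thread-safe; demo only.
    static int equalsCalls = 0;

    static final class Key {
        final int id;
        final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() {
            // Either the degenerate constant or a hash that varies with id.
            return constantHash ? 1 : Integer.hashCode(id);
        }
    }

    // Fills a map with `keys` entries, then counts equals() calls for one lookup.
    static int equalsCallsForLookup(boolean constantHash, int keys) {
        Map<Key, String> map = new HashMap<>();
        for (int i = 0; i < keys; i++) {
            map.put(new Key(i, constantHash), "value-" + i);
        }
        equalsCalls = 0; // ignore the calls made while inserting
        map.get(new Key(keys - 1, constantHash));
        return equalsCalls;
    }

    public static void main(String[] args) {
        // The constant hash forces a walk through every colliding entry.
        System.out.println("constant hash: " + equalsCallsForLookup(true, 5));
        System.out.println("varying hash:  " + equalsCallsForLookup(false, 5));
    }
}
```

With the constant hash, the number of equals() calls grows with the number of entries in the map. With a varying hash, it stays flat.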


3. False Duplicates or Misses

If:

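  • You override hashCode() but not equals() → logically equal objects land in the same bucket, but the inherited reference comparison still treats them as different keys.

  • You override equals() but not hashCode() → logically equal objects get different hash codes and usually live in different buckets, so equals() is never consulted.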

In both cases:

  • HashMap.put() inserts duplicate keys

  • HashSet.add() doesn’t deduplicate

  • get() fails even with same data

  • Performance degrades unpredictably
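The hashCode()-only case is easy to check directly. In this sketch, HashOnlyEmployee is a hypothetical class that overrides hashCode() and leaves equals() alone. Two instances with identical data share a hash code, yet the map keeps both, because the inherited equals() still compares references.

```java
import java.util.HashMap;
import java.util.Map;

class HashCodeOnlyDemo {

    static final class HashOnlyEmployee {
        final String id;

        HashOnlyEmployee(String id) {
            this.id = id;
        }

        // equals() is NOT overridden, so it is still reference equality.
        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    static Map<HashOnlyEmployee, String> build() {
        Map<HashOnlyEmployee, String> map = new HashMap<>();
        map.put(new HashOnlyEmployee("E100"), "Employee 1");
        map.put(new HashOnlyEmployee("E100"), "Employee 2");
        return map;
    }

    public static void main(String[] args) {
        Map<HashOnlyEmployee, String> map = build();
        // Same bucket, same hash, but equals() says "different": both are kept.
        System.out.println(map.size()); // 2
        // A third instance with the same data matches neither stored key.
        System.out.println(map.get(new HashOnlyEmployee("E100"))); // null
    }
}
```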


HashMap Internals: How Retrieval Works

Let’s break down a get() call in a HashMap:

map.get(someKey);

Here’s what actually happens:

  1. Call someKey.hashCode() → jump to a bucket

  2. In that bucket, iterate entries → call equals() on each key

  3. If a match is found, return the value. Else, return null.

So:

  • Bad hashCode() → wrong bucket → retrieval fails

  • Bad equals() → right bucket, no match → retrieval fails

You can hold the same reference, and get() will still return null if the object was mutated or contracts were broken.
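The lookup steps above fit in a few lines of code. TinyHashMap below is a deliberately naive toy written for this post, not how java.util.HashMap is implemented: it has a fixed number of buckets, no resizing, and no tree bins. It only exists to show the hashCode-then-equals pipeline end to end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class TinyHashMap<K, V> {

    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    TinyHashMap(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: hashCode() picks the bucket.
    private List<Entry<K, V>> bucketFor(Object key) {
        int index = Math.floorMod(Objects.hashCode(key), buckets.size());
        return buckets.get(index);
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (Objects.equals(key, e.key)) { // existing key: overwrite
                e.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
    }

    V get(Object key) {
        // Step 2: equals() picks the entry inside that bucket.
        for (Entry<K, V> e : bucketFor(key)) {
            if (Objects.equals(key, e.key)) {
                return e.value;
            }
        }
        // Step 3: no match in the only bucket we looked at.
        return null;
    }

    public static void main(String[] args) {
        TinyHashMap<String, String> map = new TinyHashMap<>(8);
        map.put("E100", "Finance");
        map.put("E100", "HR"); // equal key: replaces the value
        System.out.println(map.get("E100")); // HR
        System.out.println(map.get("E999")); // null
    }
}
```

Notice that get() never looks outside the bucket that hashCode() selected. That single design choice is why a wrong hash makes a perfectly good equals() irrelevant.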


De-Duplication ≠ Just equals()

In hash-based collections:

  • hashCode() determines where to look

  • equals() decides whether it matches

Both are required. Without both:

  • Objects may duplicate

  • Entries may overwrite unexpectedly

  • Lookups may silently fail

No compiler will warn you. No exception will be thrown.
The system will just silently behave wrong.


Final Word

You can write a 100,000-line application and ignore equals() and hashCode().
It’ll work — until it doesn’t.
And when it fails, it won’t break loudly. It will rot silently.

Your logs won’t help.
Your tests won’t catch it.
Your data will lie.

The only fix is to respect the contract.
And to write your classes like you're designing keys for a vault — not just objects for a list.


Up Next

We’ve now laid down the fundamentals of identity and hashing. Next up in the series — a deep comparison between HashMap, LinkedHashMap, and TreeMap:

  • How they differ in internal structure

  • Performance tradeoffs across read/write-heavy workloads

  • Ordering guarantees and their cost

  • Real-world use cases where one breaks and another shines

You can read it here.
