๐Ÿ” Today I Learned Hash Tables โ€” Collisions, Resizing, and a Duplicate Finder in Python

kasumbi philkasumbi phil
2 min read

๐Ÿ“˜ What are Hash Tables?

A hash table (also known as a dictionary or map) stores data in key-value pairs. It's like a super-fast lookup table.

Example:

student_scores = {
    "Alice": 90,
    "Bob": 85,
    "Charlie": 78
}
  • Keys ("Alice") are hashed to a memory index.

  • Values (90) are stored at that index.

  • This makes lookup, insert, and delete operations nearly O(1) in time.


๐Ÿ’ฅ What is a Collision?

A collision happens when two keys are hashed to the same memory location.

Example: If "cat" and "tac" both land on index 5, the hash table has to handle that.

๐Ÿ”ง Collision handling methods:

  • Chaining: Use a list at each index to store multiple values.

  • Open addressing: Find another empty slot (linear probing, quadratic, etc.)


๐Ÿ“ฆ What is Resizing?

Hash tables grow dynamically.

  • When the table gets too full (usually past 70% load), it resizes itself.

  • This involves:

    1. Creating a new bigger array (usually double the size).

    2. Rehashing all existing keys into this new array.

Resizing helps maintain fast access time even as more elements are added.


๐Ÿงช Task: Finding Duplicates Using a Hash Table

I practiced by solving a real problem: finding all duplicate numbers in a list.

๐Ÿง  Python Code:

array = [10, 3, 6, 9, 10, 10, 3, 5, 8, 60, 30, 4, 3, 5]

count_dict = {}
duplicates = []

# Step 1: Count each element
for num in array: 
    if num in count_dict:
        count_dict[num] += 1
    else:
        count_dict[num] = 1

# Step 2: Collect duplicates
for key in count_dict:
    if count_dict[key] > 1:
        duplicates.append(key)

# Step 3: Print results
print("duplicates:", duplicates)

for n in duplicates:
    print(f"{n} appeared {count_dict[n]} times")

โœ… Output:

duplicates: [10, 3, 5]
10 appeared 3 times
3 appeared 3 times
5 appeared 2 times

๐Ÿง  What I Learned

  • Hash tables are ideal for quick lookup tasks.

  • Collisions are normal, and we need strategies to handle them.

  • Resizing keeps the hash table efficient as it grows.

  • Real-world use: Hash tables make duplicate detection fast and simple.

0
Subscribe to my newsletter

Read articles from kasumbi phil directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

kasumbi phil
kasumbi phil