🚀 "Mastering Rabin–Karp: The Rolling Hash Trick Every C++ Coder Must Know"


Rabin-Karp Algorithm: String Matching with Rolling Hash
String matching is one of the fundamental problems in Data Structures and Algorithms (DSA).
Among the most famous algorithms for pattern searching is the Rabin-Karp Algorithm, which uses hashing to efficiently find all occurrences of a pattern inside a text.
🔹 Theory & Approach
The Rabin-Karp algorithm solves the substring search problem:
Given a text
T
of lengthn
and a patternP
of lengthm
, find all the occurrences ofP
inT
.
Instead of checking each substring of the text directly (which would be slow), Rabin-Karp uses a hash function to convert strings into numbers. This allows us to quickly compare numbers instead of comparing full substrings.
Key Idea:
Compute the hash value of the pattern
P
.Compute the hash value of the first window of size m in
T
.Slide the window one character at a time:
Update the hash value in O(1) using a rolling hash formula.
If the hash values match, do a final character check (to avoid collisions).
Continue until the end of the text.
🔹 Visual Explanation
Imagine you have:
Text: A B A C A B A
Pattern: A B A
Compute hash of
"ABA"
→ say it is123
.Compute hash of the first substring
"ABA"
of text →123
. ✅ Match found.Slide the window to
"BAC"
→ hash456
≠123
. ❌ Not a match.Slide to
"ACA"
,"CAB"
, and so on until the end.
👉 The rolling hash lets us compute the next hash without recomputing the whole substring.
For example: NewHash = (OldHash - FirstChar Base^(m-1)) Base + NewChar
🔹 Pseudocode
patternHash = hash(pattern)
textHash = hash(first m characters of text)
for i in 0 to n - m:
if patternHash == textHash:
if substring(text, i, m) == pattern:
print "Pattern found at index i"
# Slide window: compute new hash
if i < n - m:
textHash = roll(textHash, text[i], text[i+m])
🔹 C++ Implementation
#include <bits/stdc++.h>
using namespace std;
// Function to perform Rabin-Karp pattern search
void rabinKarp(string text, string pattern) {
int n = text.size();
int m = pattern.size();
int base = 256; // Number of possible characters
int mod = 101; // A prime number (to reduce collisions)
int patternHash = 0; // Hash of the pattern
int textHash = 0; // Hash of current text window
int h = 1; // base^(m-1) % mod
// Precompute h = pow(base, m-1) % mod
for (int i = 0; i < m - 1; i++)
h = (h * base) % mod;
// Compute initial hash of pattern and first window
for (int i = 0; i < m; i++) {
patternHash = (base * patternHash + pattern[i]) % mod;
textHash = (base * textHash + text[i]) % mod;
}
// Slide the window over the text
for (int i = 0; i <= n - m; i++) {
// If hash values match, check characters one by one
if (patternHash == textHash) {
bool match = true;
for (int j = 0; j < m; j++) {
if (text[i + j] != pattern[j]) {
match = false;
break;
}
}
if (match)
cout << "Pattern found at index " << i << endl;
}
// Compute hash for next window
if (i < n - m) {
textHash = (base * (textHash - text[i] * h) + text[i + m]) % mod;
if (textHash < 0)
textHash = (textHash + mod);
}
}
}
// Driver code
int main() {
string text = "ABABACABA";
string pattern = "ABA";
cout << "Text: " << text << endl;
cout << "Pattern: " << pattern << endl;
rabinKarp(text, pattern);
return 0;
}
🔹 Time and Space Complexity
Average Case Time Complexity:
O(n + m)
Worst Case Time Complexity:
O(n * m)
(when many hash collisions occur)Space Complexity:
O(1)
(constant extra space, apart from input storage)
🔹 Related Problems
Here are some famous problems where Rabin-Karp or hashing-based string search can be applied:
LeetCode 28: Find the Index of the First Occurrence in a String
LeetCode 187: Repeated DNA Sequences
LeetCode 1044: Longest Duplicate Substring
GFG: Pattern Searching
Problems from Blind 75 / NeetCode 150 / Striver’s A2Z DSA Sheet under Strings.
🔹 GitHub Repository
👉 Check the full solutions on my GitHub Repository
https://github.com/UmairDevloper/DSA-in-C-.git
🔹 Tags
#DSA #Algorithms #RabinKarp #StringMatching #Hashing #C++ #LeetCode #Blind75 #NeetCode #Hashnode #GitHub
Subscribe to my newsletter
Read articles from Muhammad Umair Ullah directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
