🚀 Mastering KMP (Knuth–Morris–Pratt) String Matching Algorithm

📌 Introduction

When working with text processing, one of the most common problems is finding a substring (pattern) inside a larger string (text). The naive approach checks all possibilities, but this can take a lot of time for large strings.

The Knuth–Morris–Pratt (KMP) algorithm is a powerful solution that avoids unnecessary comparisons. It preprocesses the pattern to build a prefix table (LPS array) that helps skip repeated work, achieving linear time complexity.

📝 Problem Statement

Given two strings:

A text string T of length n.
A pattern string P of length m.

👉 Determine whether the pattern P occurs in T. If yes, return the starting index/indices.

🔍 Concept (with Visualization)

The naive algorithm re-checks characters after every mismatch:

Text:    A B A C A B A B A
Pattern: A B A B

At index 0, it matches A B A, then mismatches on the C.
The naive method goes back and starts again at index 1.
This causes repeated work. ❌

✅ KMP optimization:
KMP builds an LPS (Longest Prefix Suffix) array for the pattern.
The LPS tells us where to resume in the pattern instead of starting all over again.

Example LPS for pattern "ABABAC":

Pattern	A	B	A	B	A	C
LPS	0	0	1	2	3	0

That means:

After mismatch at C, instead of starting over, jump to index 3 in the pattern (skipping redundant checks).

Thus, KMP ensures O(n + m) complexity compared to O(n × m) for the naive approach.

✅ Example

Input:

Text    = "ABABABCABABABCABAB"
Pattern = "ABABAC"

Process (with LPS):

Match fails at index 5, but instead of restarting, KMP uses LPS to skip ahead.

Output:

Pattern found at index 0
Pattern found at index 7
Pattern found at index 12

💻 C++ Implementation

#include <bits/stdc++.h>
using namespace std;

// Function to build the LPS array
vector<int> computeLPS(string pattern) {
    int m = pattern.size();
    vector<int> lps(m, 0);

    int len = 0; // length of previous longest prefix suffix
    int i = 1;

    while (i < m) {
        if (pattern[i] == pattern[len]) {
            len++;
            lps[i] = len;
            i++;
        } else {
            if (len != 0) {
                len = lps[len - 1]; // fallback
            } else {
                lps[i] = 0;
                i++;
            }
        }
    }
    return lps;
}

// KMP algorithm function
void KMP(string text, string pattern) {
    int n = text.size();
    int m = pattern.size();

    vector<int> lps = computeLPS(pattern);

    int i = 0; // index for text
    int j = 0; // index for pattern

    while (i < n) {
        if (pattern[j] == text[i]) {
            i++;
            j++;
        }

        if (j == m) {
            cout << "Pattern found at index " << i - j << endl;
            j = lps[j - 1];
        } else if (i < n && pattern[j] != text[i]) {
            if (j != 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
}

int main() {
    string text = "ABABABCABABABCABAB";
    string pattern = "ABABAC";

    cout << "Text: " << text << endl;
    cout << "Pattern: " << pattern << endl;
    KMP(text, pattern);

    return 0;
}

⚖️ Time & Space Complexity

Time Complexity:
- Preprocessing LPS: O(m)
- Searching: O(n)
- Total: O(n + m)
Space Complexity:
- LPS array: O(m)

✅ Advantages

Linear time complexity.
Efficient for multiple pattern searches.
No backtracking in the text.

❌ Disadvantages

Slightly complex to implement compared to naive search.
Needs preprocessing (LPS) even for one-time searches.

LeetCode 28: Implement strStr()
LeetCode 686: Repeated String Match
Pattern Searching in large DNA sequences
Detecting plagiarism in text

🔗 GitHub Repository

👉 My GitHub Repository Link ( Check Out )

🏷️ Tags

Algorithms C++ String Matching KMP DSA Competitive Programming LeetCode

"KMP Algorithm – The Secret Behind Efficient String Search"

🚀 Mastering KMP (Knuth–Morris–Pratt) String Matching Algorithm

📌 Introduction

📝 Problem Statement

🔍 Concept (with Visualization)

✅ Example

💻 C++ Implementation

⚖️ Time & Space Complexity

✅ Advantages

❌ Disadvantages

🔗 GitHub Repository

🏷️ Tags

Subscribe to my newsletter

Muhammad Umair Ullah

Muhammad Umair Ullah

"KMP Algorithm – The Secret Behind Efficient String Search"

🚀 Mastering KMP (Knuth–Morris–Pratt) String Matching Algorithm

📌 Introduction

📝 Problem Statement

🔍 Concept (with Visualization)

✅ Example

💻 C++ Implementation

⚖️ Time & Space Complexity

✅ Advantages

❌ Disadvantages

📚 Related Problems (LeetCode & CP)

🔗 GitHub Repository

🏷️ Tags

Subscribe to my newsletter

Muhammad Umair Ullah

Muhammad Umair Ullah