Z-Algorithm Explained with C++ Code: Fast String Pattern Matching

๐Ÿš€ Mastering Z-Algorithm: Fast String Matching Explained with C++

๐Ÿ“Œ Problem Statement

Given a text string T and a pattern string P, we want to find all occurrences of P inside T efficiently.
Traditional string matching (like the brute force approach) takes O(n*m) time, which becomes slow for large strings.
We need a faster algorithm that works in linear time.


๐Ÿ’ก Solution: Z-Algorithm

The Z-Algorithm is a powerful string matching technique that preprocesses a string and computes an array Z[], where each Z[i] represents the longest substring starting from index i that is also a prefix of the string.

By cleverly combining the pattern and the text, we can find all matches in O(n + m) time.


๐Ÿ“– Definition

  • Z-Array: An array of length n for a string S where Z[i] is the length of the longest substring starting at i which is also a prefix of S.

  • Example: For S = "aabxaabx", the Z array is [0, 1, 0, 0, 4, 1, 0, 0].


๐Ÿ”Ž Theory Approach

  1. Construct a new string as:
    S = P + "$" + T
    where $ is a special delimiter not present in either string.

  2. Compute the Z-array for S.

  3. Whenever Z[i] == length(P), it means the pattern P occurs in T at position (i - length(P) - 1).

This reduces the problem to a simple array computation instead of repeated comparisons.


๐Ÿงฎ Example

Input:

Text = "abcabcabc" Pattern = "abc"

markdown Copy code

Processing:

  • Construct string: abc$abcabcabc

  • Compute Z-array.

  • Matches found at positions: 0, 3, 6.

Output:

Pattern found at index 0 Pattern found at index 3 Pattern found at index 6

arduino Copy code


๐Ÿ’ป Solution in C++

#include <bits/stdc++.h>
using namespace std;

// Function to create Z-array
vector<int> computeZ(string s) {
    int n = s.length();
    vector<int> Z(n);
    int L = 0, R = 0;
    for (int i = 1; i < n; i++) {
        if (i <= R) 
            Z[i] = min(R - i + 1, Z[i - L]);
        while (i + Z[i] < n && s[Z[i]] == s[i + Z[i]])
            Z[i]++;
        if (i + Z[i] - 1 > R) {
            L = i;
            R = i + Z[i] - 1;
        }
    }
    return Z;
}

 / Function for Z-Algorithm string matching
void zAlgorithmStringMatching(string text, string pattern) {
    string combined = pattern + "$" + text;
    vector<int> Z = computeZ(combined);
    int m = pattern.length();

    for (int i = 0; i < Z.size(); i++) {
        if (Z[i] == m) {
            cout << "Pattern found at index " << (i - m - 1) << endl;
        }
    }
}

int main() {
    string text = "abcabcabc";
    string pattern = "abc";
    zAlgorithmStringMatching(text, pattern);
    return 0;
}

โณ Time and Space Complexity

  • Time Complexity:

    • Preprocessing Z-array: O(n + m)

    • Pattern matching: O(n + m)
      โœ… Overall: O(n + m) (linear)

  • Space Complexity:

    • Extra Z-array storage: O(n + m)
      โœ… Overall: O(n + m)

๐Ÿ“ Further Examples to Try

Example 1:

Text = "aaaaa"
Pattern = "aa"

Output:

Pattern found at index 0
Pattern found at index 1
Pattern found at index 2
Pattern found at index 3

Example 2:

Text = "abcdabc"
Pattern = "abc"

Output:

Pattern found at index 0
Pattern found at index 4

Example 3:

Text = "hello world"
Pattern = "world"

Output:

Pattern found at index 6

๐Ÿ”— GitHub Repository

๐Ÿ‘‰ GitHub Repo Link Here


๐Ÿท๏ธ Tags

#Algorithms #DataStructures #StringMatching #ZAlgorithm #C++ #CompetitiveProgramming #CodingInterview #Hashnode #GitHubProjects #DSA

0
Subscribe to my newsletter

Read articles from Muhammad Umair Ullah directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Muhammad Umair Ullah
Muhammad Umair Ullah