Regular Expression Matching (LeetCode #10)

Hi there! Here’s my first entry where I document my experiences solving LeetCode problems using the PREP method. I mostly plan to employ this method when working through problems on my own, and later record the approaches I went through in these articles, eventually maybe adding on more reflections for me to consider in the future! Today, I’m going over the “Regular Expression Matching” problem which I initially received through the Daily Coding Problem newsletter. I conveniently missed the part that this was labeled “Hard” but decided to attempt it anyway. The description appeared simple enough and I figured the solution would be straightforward, though possibly tedious.

Description

Given an input string s and a pattern p, implement regular expression matching with support for '.' and '*' where:

  • '.' Matches any single character.​​​​

  • '*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Initial Approach (Bad)

My first idea was a convoluted strategy which involved:

  1. breaking up the regular expression string into chunks where the wildcard symbols are separated from the direct matches; e.g. given the regex string ".*at..*", I would break this up into [".*", "at", ".", ".*"]

  2. matching the regex chunks to the input string as linearly as possible, i.e. match the direct characters first and the wildcard characters later

I took some time pseudo-coding down this route for a while since I didn’t have a better idea but eventually noticed this led to a more streamlined solution. For the wildcard chunks, I would need some sort of loop to figure out how many chunks matched the wildcard parts specifically. How would I know which “at” would match the exact characters if I had "aaatatat" with a regex of ".*at"? This then reminded me of recursion, which inspired my next approach.

Second Approach (Recursive)

With a recursive approach, on each step of the problem, validating the regex match boils down to checking if the first couple characters match and then recursing on the remainder of the strings. That is:

  1. If regex starts with an asterisk wildcard pair (i.e. “[character]*”), run a loop to see incrementally check how many characters match with the wildcard from start of input string. With each matching character, check if the remainder of the string matches with the remainder of the regex string.

  2. If regex starts with a single character, check if first character of the string matches, then check if the remainder of the string matches with the remainder of the regex string.

My eventual solution then looked like this in PSEUDOCODE:

match(s, reg):
    if len(reg) == 0:
        return s == ""
    elif len(reg) == 1:
        return len(s) == 1 and matches single character

    if first two characters of reg contain "*"
        index = 0
        while True:
            if s[index] matches reg[0]:
                if match(s[index+1:], reg[2:]) is True:
                    return True
                else:
                    index += 1
            else:
                return False
    else:
         check if s[0] matches single regex character and match(s[1:], reg[1:]) returns True

On the surface, this looks like it would have a horrid time complexity but could be potentially improved with a dynamic programming solution. Additionally, it assumes that the regex string provided is valid and well-formed and doesn’t handle cases where there may not be a preceding character behind an asterisk.

Other Solutions

I believe the better dynamic programming solution is where the real trickiness of this problem lies. It’s easy to stop with the recursive solution but in practice this would be too inefficient to actually employ. In the future, I’m hoping to be able to understand the dynamic programming approach since this is not yet intuitive enough to me!

Total Time Spent

This took me about an hour from start to finish, which is extremely long! Most of the time went into going down the first approach which was way more complicated than I think would be acceptable in a technical interview.

0
Subscribe to my newsletter

Read articles from Sudeshna Pontula directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Sudeshna Pontula
Sudeshna Pontula