Understanding Regular Expressions (Regex)

Vukani Gcabashe

Introduction

Regular expressions, commonly known as regex, are patterns used to match text within strings.

They let you search, edit, and manipulate text based on specific patterns, which makes tasks like validation, searching, and text processing more efficient.

This article explains regex category by category in simple terms, with a use case for each category and JavaScript code examples with their results. I created it as a reference for myself to look back on when I need a refresher, and I hope it helps you too.

1. Basic Characters and Metacharacters

1.1. Literal Characters

Description: Literal characters are the simplest form of regex. They match exactly what they represent.

Example:

const text = "The cat sat on the mat.";
const pattern = /cat/;
const match = text.match(pattern);
console.log(match ? match[0] : "No match");

Result:

cat

Use Case: Finding specific words in a text.
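
As a minimal sketch of this use case, the snippet below checks whether a literal word occurs in a text and where. The findWord helper and the sample sentence are hypothetical names introduced only for illustration; test and search are standard RegExp and String methods.

```javascript
// Hypothetical helper: reports whether and where a pattern occurs.
function findWord(text, pattern) {
  return {
    found: pattern.test(text),   // true if the pattern occurs anywhere
    index: text.search(pattern), // position of the first match, or -1
  };
}

const sentence = "The cat sat on the mat.";
console.log(findWord(sentence, /cat/));        // found: true, index: 4
console.log(findWord(sentence, /dog/));        // found: false, index: -1
console.log(findWord(sentence, /CAT/i).found); // true, the i flag ignores case
```

Literal patterns are case-sensitive by default, which is why the last call needs the i flag.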

1.2. Metacharacters

Description: Metacharacters are characters with special meanings in regex.

Examples:

  • . (dot): Matches any single character except line terminators such as newline.

    const text = "The cat sat on the mat.";
    const pattern = /c.t/;
    const match = text.match(pattern);
    console.log(match);
    

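    Result (without the g flag, the returned array also carries index, input, and groups properties, which are omitted from the results in this article):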

    ['cat']
    
  • \d: Matches any digit (0-9).

    const text = "abc123";
    const pattern = /\d/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['1', '2', '3']
    

Use Case: General pattern matching and validation.
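
Because metacharacters have special meanings, matching one literally requires escaping it with a backslash. The sketch below, which uses a made-up price string, compares an unescaped and an escaped dot.

```javascript
// Made-up sample text for illustration.
const price = "Total: 3.50 (was 3x50)";

// Unescaped dot: "." stands for any character, so "3x50" matches too.
const loose = price.match(/3.50/g);
// Escaped dot: "\." matches only a literal period.
const strict = price.match(/3\.50/g);

console.log(loose);  // ['3.50', '3x50']
console.log(strict); // ['3.50']
```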

2. Character Classes

2.1. Predefined Character Classes

Description: Predefined character classes are shorthand escapes that each match any one character from a fixed set.

Examples:

  • \w: Matches any word character (ASCII letters, digits, and underscore).

    const text = "hello_123";
    const pattern = /\w/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['h', 'e', 'l', 'l', 'o', '_', '1', '2', '3']
    
  • \s: Matches any whitespace character.

    const text = "hello world";
    const pattern = /\s/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    [' ']
    

Use Case: Validating and parsing text inputs.
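
A rough sketch of this use case, assuming free-form user input: \s+ splits the input into tokens, and the uppercase class \W (the negation of \w) flags characters that are not word characters. The tokenize and hasNonWordChar helpers are hypothetical names used only for this example.

```javascript
// Hypothetical helper: split input into tokens on runs of whitespace.
function tokenize(input) {
  return input.trim().split(/\s+/);
}

// Hypothetical helper: true if the string contains any character outside \w.
function hasNonWordChar(s) {
  return /\W/.test(s);
}

console.log(tokenize("  hello   world\tagain ")); // ['hello', 'world', 'again']
console.log(hasNonWordChar("hello_123"));         // false
console.log(hasNonWordChar("hello world"));       // true, the space is not a word character
```

The same uppercase convention applies to the other shorthands: \D matches any non-digit and \S any non-whitespace character.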

2.2. Custom Character Classes

Description: Custom character classes are defined by enclosing characters in square brackets; the class matches any one of the enclosed characters.

Examples:

  • [aeiou]: Matches any lowercase vowel.

    const text = "hello world";
    const pattern = /[aeiou]/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['e', 'o', 'o']
    
  • [0-9]: Matches any digit (same as \d).

    const text = "abc123";
    const pattern = /[0-9]/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['1', '2', '3']
    

Use Case: Filtering specific sets of characters.
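
One way to filter characters is a negated class: a ^ placed right after the opening bracket matches anything not listed. The sketch below uses two hypothetical helpers, keepDigits and countVowels, and a made-up phone number.

```javascript
// Hypothetical helper: remove every character that is not a digit.
function keepDigits(s) {
  return s.replace(/[^0-9]/g, "");
}

// Hypothetical helper: count vowels; the i flag also covers uppercase.
function countVowels(s) {
  // match returns null when nothing matches, so fall back to an empty array.
  return (s.match(/[aeiou]/gi) || []).length;
}

console.log(keepDigits("+27 (0)31 555-0199")); // '270315550199'
console.log(countVowels("Hello World"));       // 3
console.log(countVowels("rhythm"));            // 0
```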

3. Quantifiers

3.1. Basic Quantifiers

Description: Quantifiers specify how many instances of a character or group must be present.

Examples:

  • *: Matches 0 or more occurrences.

    const text = "aaaab";
    const pattern = /a*/g;
    const match = text.match(pattern);
    console.log(match);
    

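    Result (a* can also match the empty string, which happens once at the position before b and once at the end of the text):

    ['aaaa', '', '']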
    
  • +: Matches 1 or more occurrences.

    const text = "aaaab";
    const pattern = /a+/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['aaaa']
    
  • ?: Matches 0 or 1 occurrence.

    const text = "abc";
    const pattern = /a?/g;
    const match = text.match(pattern);
    console.log(match);
    

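    Result (a? matches the empty string at every position where no a is present, including the end of the text):

    ['a', '', '', '']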
    

Use Case: Defining flexible patterns in text.
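
As an illustrative sketch of flexible patterns, ? can make a single letter optional and + can collect runs of digits. The findSpellings and findNumbers helpers and their sample inputs are assumptions made for this example.

```javascript
// ? makes the preceding "u" optional, so both spellings match.
function findSpellings(text) {
  return text.match(/colou?r/g) || [];
}

// + requires at least one digit, so it never yields empty matches.
function findNumbers(text) {
  return text.match(/\d+/g) || [];
}

console.log(findSpellings("color and colour"));          // ['color', 'colour']
console.log(findNumbers("Order 66 shipped 1024 units")); // ['66', '1024']
console.log(findNumbers("no digits here"));              // []
```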

3.2. Range Quantifiers

Description: Range quantifiers specify an exact count, a minimum, or a range of occurrences.

Examples:

  • {n}: Matches exactly n occurrences.

    const text = "aaaab";
    const pattern = /a{3}/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['aaa']
    
  • {n,}: Matches at least n occurrences.

    const text = "aaaab";
    const pattern = /a{2,}/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['aaaa']
    
  • {n,m}: Matches between n and m occurrences.

    const text = "aaaab";
    const pattern = /a{2,4}/g;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['aaaa']
    

Use Case: Controlling the number of repetitions in patterns.
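
Exact counts are handy for fixed-width formats. The sketch below assumes dates written as YYYY-MM-DD; findIsoDates is a hypothetical helper, and it only checks the shape of the text, not whether the date is valid.

```javascript
// Hypothetical helper: find substrings shaped like YYYY-MM-DD.
function findIsoDates(text) {
  return text.match(/\d{4}-\d{2}-\d{2}/g) || [];
}

const log = "Created 2024-01-15, updated 2024-3-2, closed 2024-02-29.";
// "2024-3-2" is skipped because {2} demands exactly two digits per part.
console.log(findIsoDates(log)); // ['2024-01-15', '2024-02-29']
```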

4. Anchors

4.1. Start and End Anchors

Description: Anchors are used to match positions within a string.

Examples:

  • ^: Matches the start of a string.

    const text = "The quick brown fox.";
    const pattern = /^The/;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['The']
    
  • $: Matches the end of a string.

    const text = "The quick brown fox.";
    const pattern = /fox\.$/;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['fox.']
    

Use Case: Ensuring patterns appear at specific positions.
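
Using both anchors together is a common way to validate a whole string rather than a fragment of it. In this sketch, isFiveDigitCode and firstLetters are hypothetical helpers; the second one relies on the m flag, which makes ^ and $ match at the start and end of each line as well.

```javascript
// Hypothetical helper: the entire string must be exactly five digits.
function isFiveDigitCode(s) {
  return /^\d{5}$/.test(s);
}

// Hypothetical helper: with the m flag, ^ also matches after each line break.
function firstLetters(text) {
  return text.match(/^\w/gm) || [];
}

console.log(isFiveDigitCode("40001"));          // true
console.log(isFiveDigitCode("40001-1234"));     // false
console.log(/\d{5}/.test("x40001y"));           // true, unanchored patterns match anywhere
console.log(firstLetters("alpha\nbeta\ngamma")); // ['a', 'b', 'g']
```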

4.2. Word Boundaries

Description: Word boundaries match positions between a word character and a non-word character, including the start or end of the string when a word character sits there.

Examples:

  • \b: Matches a word boundary.

    const text = "The cat sat.";
    const pattern = /\bcat\b/;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['cat']
    
  • \B: Matches any position that is not a word boundary, so the pattern below only finds "cat" inside a longer word.

    const text = "Please concatenate.";
    const pattern = /\Bcat\B/;
    const match = text.match(pattern);
    console.log(match);
    

    Result:

    ['cat']
    

Use Case: Precise word matching.
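
To illustrate precise word matching, the sketch below replaces a word with and without \b. The sample string and both helper names are made up for the example.

```javascript
// Made-up sample text for illustration.
const pets = "cat category concat cat.";

// Without boundaries, every occurrence of the letters "cat" is replaced.
function replaceAnywhere(text) {
  return text.replace(/cat/g, "dog");
}

// With \b on both sides, only the standalone word "cat" is replaced.
function replaceWholeWord(text) {
  return text.replace(/\bcat\b/g, "dog");
}

console.log(replaceAnywhere(pets));  // 'dog dogegory condog dog.'
console.log(replaceWholeWord(pets)); // 'dog category concat dog.'
```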

5. Groups and Alternations

5.1. Capturing Groups

Description: Groups let part of a regex be treated as a single unit, and capturing groups also store the text they match for later use (match[1], match[2], and so on).

Example:

  • (): Denotes a capturing group.

    const text = "The cat sat on the mat.";
    const pattern = /(cat)/;
    const match = text.match(pattern);
    console.log(match ? match[1] : "No match");

    Result:

    cat
    

Use Case: Extracting specific parts of a match.
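
A small sketch of extraction with several groups, assuming dates in YYYY-MM-DD form: each pair of parentheses captures one part, and the same groups can be referenced as $1, $2, and $3 in a replacement string. The parseDate and toDayFirst helpers are hypothetical.

```javascript
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// Hypothetical helper: pull year, month, and day out of the first date found.
function parseDate(text) {
  const m = text.match(datePattern);
  if (!m) return null;
  // m[0] is the whole match; the captured groups start at m[1].
  const [, year, month, day] = m;
  return { year, month, day };
}

// Hypothetical helper: reorder the captured parts in a replacement string.
function toDayFirst(text) {
  return text.replace(datePattern, "$3/$2/$1");
}

console.log(parseDate("Released on 2024-06-30.")); // year '2024', month '06', day '30'
console.log(parseDate("No date here"));            // null
console.log(toDayFirst("2024-06-30"));             // '30/06/2024'
```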

5.2. Alternation

Description: Alternation matches one pattern or another.

Example:

  • |: Denotes alternation (OR).

    const text = "The cat and the dog.";
    const pattern = /cat|dog/g;
    const match = text.match(pattern);
    console.log(match);

    Result:

    ['cat', 'dog']
    

Use Case: Matching multiple patterns.
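
One detail worth sketching: alternation has the lowest precedence of any regex operator, so anchors or surrounding text apply to only one alternative unless the alternatives are grouped. The pattern names below are made up for the example.

```javascript
// Without a group this reads as (^cat) OR (dog$).
const loosePet = /^cat|dog$/;
// Grouping the alternatives applies both anchors to either word.
const strictPet = /^(cat|dog)$/;

console.log(loosePet.test("catalog"));  // true, it starts with "cat"
console.log(loosePet.test("hotdog"));   // true, it ends with "dog"
console.log(strictPet.test("catalog")); // false
console.log(strictPet.test("dog"));     // true
```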

Conclusion

Regular expressions are versatile tools for matching, searching, and manipulating strings based on patterns.

Each category covered here, from literal characters and character classes to quantifiers, anchors, groups, and alternation, handles one part of describing a pattern. Combining them lets you handle complex text-processing tasks with compact, readable code.

Happy Coding
