Solving Strings with Python Tries

Have you ever relied on a spell checker to catch those elusive typos or experienced the convenience of autocomplete while typing a search query? If so, you have seen with your own eyes the power of experimentation. In this blog post, we will look at the realm of Tries.

A Trie is a tree-like data structure that stores a dynamic set of strings, where each node represents a single character of the string. Unlike arrays or linked lists, Tries are great at performing string operations with extreme efficiency. The most significant advantage lies in the structure's ability to share common prefixes among different strings, reducing redundancy and increasing storage space.

Implementing A Trie Data structure In Golang(Search engine) | by itachi  sasuke | Medium

Each node in the tree usually consists of the following components:

  1. Value/Character: The character associated with the node.

  2. Children: pointers to child nodes, representing subsequent characters in the string.

  3. End of Word Marker: A marker that marks the end of a word.

Implementing a Trie in Python:

class Trie():
    def __init__(self):
        # Initialize the Trie with an empty root node
        self.root = Node('')

    def has_word(self, word):
        # Check if the Trie contains a given word
        return self.root.has_continuation(word)

    def add_word_to_the_trie(self, word):
        # Add a word to the Trie
        self.root.add_continuation(word)

    def display_trie(self):
        # Display the Trie structure
        self.root.display_node(0)

    def get_all_words(self):
        # Retrieve all words stored in the Trie
        return self.root.all_words()


class Node():
    def __init__(self, letter, parent=None):
        # Initialize a Trie node with a letter and an optional parent node
        self.children = []
        self.parent = parent
        self.letter = letter

    def add_continuation(self, string):
        # Add a continuation to the Trie
        if string:
            # If the first character of the continuation is not in children, add a new node
            if string[0] not in {child.letter for child in self.children}:
                new_child = Node(string[0], parent=self)
                new_child.parent = self
                self.children.append(new_child)
                # Sort the children for consistent ordering
                self.children.sort(key=lambda child: child.letter)
                new_child.add_continuation(string[1:])
            else:
                # If the first character exists, continue adding the rest of the word
                correct_child = next(child for child in self.children if child.letter == string[0])
                correct_child.add_continuation(string[1:])

    def has_continuation(self, string):
        # Check if the Trie has a given continuation
        if string:
            # If the first character is not in children, the continuation is not present
            if string[0] not in {child.letter for child in self.children}:
                return False
            else:
                # If the first character exists, check the continuation in the correct child
                correct_child = next(child for child in self.children if child.letter == string[0])
                return correct_child.has_continuation(string[1:])
        else:
            # If the continuation is empty, the word is present in the Trie
            return True

    def display_node(self, level):
        # Display the node and its children with indentation based on the level
        print('  ' * level + self.letter)
        for child in self.children:
            child.display_node(level + 1)

    def upword(self, continuation):
        # Construct the full word by going upward in the Trie
        if self.letter == '':
            return continuation
        else:
            return self.parent.upword(self.letter + continuation)

    def all_words(self):
        # Retrieve all words stored in the subtree rooted at this node
        if self.children == []:
            return [self.upword('')]
        else:
            all_children_words = []
            for child in self.children:
                all_children_words.extend(child.all_words())
            return all_children_words

# Example usage:
list_of = ['variable', 'lion', 'list', 'variant', 'class', 'classroom', 'classic', 'string']
my_trie = Trie()
for item in list_of:
    my_trie.add_word_to_the_trie(item)

# Display the Trie structure
my_trie.display_trie()

# Retrieve and print all words stored in the Trie
print(my_trie.get_all_words())

Applications of Tries:

  1. Spell Checking and Autocorrection

  2. Autocomplete

  3. IP Routing

  4. Symbol Tables in Compilers

  5. Data Compression

  6. Text and DNA Matching

  7. Network Databases

  8. Contact Management and Address Books

  9. Routing Algorithms in Networking

  10. Recommendation Systems

Tries in Python provide a powerful solution for efficiently managing and retrieving strings. Their versatility makes them indispensable in various applications where quick and effective string operations are paramount. By understanding the basic structure and operations of Tries, developers can leverage this data structure to enhance the performance of their applications, ultimately unlocking new possibilities in data processing and manipulation.

1
Subscribe to my newsletter

Read articles from Khrystyna Klapushchak directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Khrystyna Klapushchak
Khrystyna Klapushchak