Solving Strings with Python Tries


Have you ever relied on a spell checker to catch those elusive typos or experienced the convenience of autocomplete while typing a search query? If so, you have seen with your own eyes the power of experimentation. In this blog post, we will look at the realm of Tries.
A Trie is a tree-like data structure that stores a dynamic set of strings, where each node represents a single character of the string. Unlike arrays or linked lists, Tries are great at performing string operations with extreme efficiency. The most significant advantage lies in the structure's ability to share common prefixes among different strings, reducing redundancy and increasing storage space.
Each node in the tree usually consists of the following components:
Value/Character: The character associated with the node.
Children: pointers to child nodes, representing subsequent characters in the string.
End of Word Marker: A marker that marks the end of a word.
Implementing a Trie in Python:
class Trie():
def __init__(self):
# Initialize the Trie with an empty root node
self.root = Node('')
def has_word(self, word):
# Check if the Trie contains a given word
return self.root.has_continuation(word)
def add_word_to_the_trie(self, word):
# Add a word to the Trie
self.root.add_continuation(word)
def display_trie(self):
# Display the Trie structure
self.root.display_node(0)
def get_all_words(self):
# Retrieve all words stored in the Trie
return self.root.all_words()
class Node():
def __init__(self, letter, parent=None):
# Initialize a Trie node with a letter and an optional parent node
self.children = []
self.parent = parent
self.letter = letter
def add_continuation(self, string):
# Add a continuation to the Trie
if string:
# If the first character of the continuation is not in children, add a new node
if string[0] not in {child.letter for child in self.children}:
new_child = Node(string[0], parent=self)
new_child.parent = self
self.children.append(new_child)
# Sort the children for consistent ordering
self.children.sort(key=lambda child: child.letter)
new_child.add_continuation(string[1:])
else:
# If the first character exists, continue adding the rest of the word
correct_child = next(child for child in self.children if child.letter == string[0])
correct_child.add_continuation(string[1:])
def has_continuation(self, string):
# Check if the Trie has a given continuation
if string:
# If the first character is not in children, the continuation is not present
if string[0] not in {child.letter for child in self.children}:
return False
else:
# If the first character exists, check the continuation in the correct child
correct_child = next(child for child in self.children if child.letter == string[0])
return correct_child.has_continuation(string[1:])
else:
# If the continuation is empty, the word is present in the Trie
return True
def display_node(self, level):
# Display the node and its children with indentation based on the level
print(' ' * level + self.letter)
for child in self.children:
child.display_node(level + 1)
def upword(self, continuation):
# Construct the full word by going upward in the Trie
if self.letter == '':
return continuation
else:
return self.parent.upword(self.letter + continuation)
def all_words(self):
# Retrieve all words stored in the subtree rooted at this node
if self.children == []:
return [self.upword('')]
else:
all_children_words = []
for child in self.children:
all_children_words.extend(child.all_words())
return all_children_words
# Example usage:
list_of = ['variable', 'lion', 'list', 'variant', 'class', 'classroom', 'classic', 'string']
my_trie = Trie()
for item in list_of:
my_trie.add_word_to_the_trie(item)
# Display the Trie structure
my_trie.display_trie()
# Retrieve and print all words stored in the Trie
print(my_trie.get_all_words())
Applications of Tries:
Spell Checking and Autocorrection
Autocomplete
IP Routing
Symbol Tables in Compilers
Data Compression
Text and DNA Matching
Network Databases
Contact Management and Address Books
Routing Algorithms in Networking
Recommendation Systems
Tries in Python provide a powerful solution for efficiently managing and retrieving strings. Their versatility makes them indispensable in various applications where quick and effective string operations are paramount. By understanding the basic structure and operations of Tries, developers can leverage this data structure to enhance the performance of their applications, ultimately unlocking new possibilities in data processing and manipulation.
Subscribe to my newsletter
Read articles from Khrystyna Klapushchak directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
