python, data structures and algorithms, programming

Introduction

Anagrams are words or phrases that are formed by rearranging the letters of another word or phrase. In other words, they are two or more words made up of the same letters, but in a different order. For example, the word “listen” is an anagram of the word “silent”. Anagrams can be formed from single words, multiple words, or even entire sentences, and they are often used for puzzles, word games, and creative writing.

Determining whether two words are anagrams can be a computational challenge, especially for longer words or phrases. Fortunately, Python offers a powerful and efficient solution to this problem through its built-in data structures and string manipulation functions. In this article, we will introduce and explain a Python solution to the anagram problem, demonstrating how to use a dictionary to count the frequency of each letter in each word and compare the resulting dictionaries to determine whether the words are anagrams. We will also compare this Python solution to a naive solution and analyze its time and space complexity. By the end of this article, you will have a clear understanding of how to use Python to solve the anagram problem quickly and efficiently.

The Problem

The problem of determining whether two words are anagrams involves checking if the two words have the same letters in the same quantity, but in a different order. This requires comparing the letters in each word and determining if they occur in the same frequency and combination in both words. If the two words have the same letters in the same quantity, then they are anagrams of each other.

Solving the problem

A Simple Solution

A simple solution would be to sort the characters of each word alphabetically and then compare the sorted strings. If the sorted strings are identical, then the two words are anagrams of each other. However, this solution can be inefficient for larger words or a large number of words since sorting takes time, and we have to perform it for each word. Therefore, it’s not the most optimal solution to this problem.

The time complexity of this simple solution to the anagram problem is O(n log n), where n is the length of the longest word. This is because the solution involves sorting the characters of each word, which takes O(n log n) time, and then comparing the sorted strings, which takes O(n) time. Since the sorting operation dominates the time complexity, the overall time complexity is O(n log n).

The space complexity of this simple solution is also O(n), where n is the length of the longest word. This is because we need to store the sorted strings for each word, and the length of the sorted string is the same as the length of the original word. Therefore, the space complexity is proportional to the length of the longest word.

The Python Solution

The Anagram Python solution is an efficient and elegant approach to solving the problem of determining whether two words are anagrams of each other. Instead of sorting the characters of each word, this solution involves counting the frequency of each letter in each word using a dictionary. The resulting dictionaries are then compared to determine if the words are anagrams. This solution takes advantage of Python’s built-in data structures and string manipulation functions, making it an optimal solution for the anagram problem. In the following sections, we will dive deeper into how this solution works and compare it to the naive solution in terms of time and space complexity.

Below is an efficient Python solution to the Anagram problem; it has a time complexity of O(n) and a space complexity of O(1), and it uses an array of size 26 to keep track of the frequencies of each character in the input strings.

class Anagram:
    def is_anagram(s, t):
        s = s.replace(" ", "").lower()
        t = t.replace(" ", "").lower()

        if len(s) != len(t):        
            return False                  

        frequency = [0] * 26        

        for i in range(len(s)):     
            frequency[ord(s[i]) - ord('a')] += 1            
            frequency[ord(t[i]) - ord('a')] -= 1          

        for count in frequency:     
            if count != 0:      
            return False          

        return True

Let’s walk through the block of code above:
First, we remove any whitespace from the strings using the replace method and convert them to lowercase using the lower method. This ensures that we are comparing two strings with the same characters, regardless of case and spacing.

s = s.replace(" ", "").lower()
t = t.replace(" ", "").lower()

Then we check if the two input strings s and t have the same length. If they don’t, then they cannot be anagrams, so we return False.

if len(s) != len(t):
    return False

Next, we initialize an array of size 26 (the 26 letters of the alphabet) called the frequency with all values set to 0. This array will be used to keep track of the frequencies of each character in the input strings.

frequency = [0] * 26

We then loop through each character in the input strings s and t, and update the corresponding frequency in the frequency array. We do this by subtracting the ASCII value of ‘a’ from the ASCII value of the character to get its zero-based index in the frequency array.
Note: The ord() function returns an integer representing the Unicode character.

for i in range(len(s)):
    frequency[ord(s[i]) - ord('a')] += 1    
    frequency[ord(t[i]) - ord('a')] -= 1

Finally, we loop through the frequency array and check if all of the frequencies are zero. If they are, then the two input strings are anagrams, so we return True. If any frequency is not zero, then the two input strings are not anagrams, so we return False.

for count in frequency:
    if count != 0:  
    return False

return True

Here’s an example usage of the solution above:

s = "Listen"
t = "Silent"

if Anagram.is_anagram(s, t):
    print("The two strings are anagrams")

else:
    print("The two strings are not anagrams.")

This solution has a time complexity of O(n), where n is the length of the input strings s and t. We loop through each character in the input strings only once, and updating the frequency array and checking its values takes constant time since the array size is fixed at 26 (the 26 letters of the alphabet). This solution also has a space complexity of O(1), since we use a fixed-size array of 26 elements to keep track of the frequencies of each character in the input strings.

In summary, the idea behind this solution is to use an array of size 26 to keep track of the frequencies of each character in the input strings. By updating the frequencies of each character and checking if they are all zero at the end, we can determine if the two input strings are anagrams in O(n) time and O(1) space, which is much faster and more memory-efficient than the brute force approach of sorting the input strings and comparing them.

Python Programming: The Perfect Solution for the Anagram Problem

Table of contents