Secret Codes Unlocked: How to Implement Substitution Ciphers

Stackzero

Are you ready to implement substitution ciphers with your own hands and take your encryption skills to the next level?
In our previous article, we explored the fascinating world of substitution ciphers and how we can use them to encode our secret messages.
But if you’re serious about learning cryptography, you must learn how to implement these ciphers in a programming language. And what better scripting language to use than Python? It’s one of the most popular and versatile languages in the world of cybersecurity and data science!

In this article, we’ll take you through the world of substitution ciphers and show you how to implement some of the most classic ones, including Caesar’s and Vigenere’s ciphers.
Whether you’re a beginner or an experienced coder, our step-by-step tutorial will guide you through the process of creating your own secret codes and encrypting your messages like a pro.

Of course, I am joking: if you have read the previous article, you should know the weaknesses of such ciphers by now. However, theoretical knowledge alone is not enough for a deep understanding.
So buckle up, grab your favourite beverage, and get ready to unlock the power of cryptography. It’s time to implement your own substitution ciphers!

Atbash Cipher

This is perhaps the simplest substitution cipher of all, so implementing it is the best way to warm up.
In the previous article, “Substitution ciphers? An overview of the basics”, we only mentioned it without going into depth.
Like all monoalphabetic substitution ciphers, it replaces each letter with a corresponding one.

In this case, however, there is no key: the algorithm replaces each letter with the one at the same index in the reversed alphabet.

The code

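def atbash_cipher(alphabet, plain_text):
    # The "key" is simply the alphabet in reverse order
    key = alphabet[::-1]
    cipher_text = ""
    for c in plain_text:
        if alphabet.find(c) >= 0:
            # Replace the letter with the one at the same index in the key
            cipher_text += key[alphabet.index(c)]
        else:
            # Leave characters outside the alphabet untouched
            cipher_text += c
    return cipher_text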

I have tried not to skip any steps so as to make everything as clear as possible. However, I feel it necessary for the sake of the record to comment on all the main points.

  1. The function atbash_cipher takes two parameters: alphabet and plain_text. alphabet is a string that contains all the letters of the alphabet in order, and plain_text is the text that needs to be encrypted.

  2. The variable key is created by reversing the alphabet string using slicing notation. This creates a new string that contains the same letters as alphabet, but in reverse order.

  3. The variable cipher_text is initialized as an empty string.

  4. The code then loops through each character c in the plain_text string.

  5. The code checks whether the character c is in the alphabet string (i.e., it’s not a space, punctuation mark, or another non-letter character). The find() method returns -1 when the character is missing, so a result of 0 or more means it is a letter to encrypt.

  6. The function then looks up the corresponding “opposite” letter, the one at the mirrored position in the alphabet, and appends that letter to the cipher_text string.

  7. If the character c is not a letter in the alphabet string, it is simply appended to the cipher_text string as is.

  8. Once all the characters in plain_text have been processed, the cipher_text string is returned as the output of the function.
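
A quick way to get a feel for the steps above is to run the function on a sample word. The sketch below repeats the function so it can be run on its own; the uppercase alphabet constant and the sample message are assumptions chosen for illustration. It also shows a handy property of Atbash: applying it twice returns the original text, so the same function both encrypts and decrypts.

```python
import string

# Assumption: we work with the 26 uppercase Latin letters.
ALPHABET = string.ascii_uppercase


def atbash_cipher(alphabet, plain_text):
    # Reversed alphabet: index i maps to the letter at the mirrored position
    key = alphabet[::-1]
    cipher_text = ""
    for c in plain_text:
        if alphabet.find(c) >= 0:
            cipher_text += key[alphabet.index(c)]
        else:
            cipher_text += c
    return cipher_text


encrypted = atbash_cipher(ALPHABET, "STACKZERO")
print(encrypted)                           # HGZXPAVIL
print(atbash_cipher(ALPHABET, encrypted))  # STACKZERO
```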

Caesar’s Cipher

We have already covered Caesar’s cipher in theory; however, before seeing its implementation, let us refresh our memory for a moment.
Caesar’s cipher, also called the shift cipher, works by shifting each letter in a message a fixed number of places down the alphabet.
For example, if you shift each letter in the message “STACKZERO” by 3 places, you get “VWDFNCHUR”.

The code

But now it is time to move on to the code:

def cesar_cipher(alphabet, key, plain_text):
    cipher_text = ""
    for i in range(len(plain_text)):
        if alphabet.find(plain_text[i]) >= 0:
            new_index = (alphabet.index(plain_text[i])+key)%len(alphabet)
            cipher_text += alphabet[new_index]
        else:
            cipher_text += plain_text[i]
    return cipher_text

I tried to make the code self-explanatory; however, a step-by-step analysis can only do us good.

  1. The cesar_cipher function takes three parameters: alphabet, key, and plain_text.

  2. The function initializes an empty string cipher_text that will eventually hold the encrypted message.

  3. The for loop iterates over every position in the plain_text string using the range() and len() functions.

  4. The if statement checks whether the current character is in the alphabet string. If it is, the code proceeds to encrypt it; otherwise, the character is added to the cipher_text string as is.

  5. Inside the if statement, the code finds the index of the current character in the alphabet string using the index() method.

  6. The code adds the key to the index found in step 5 and gets the remainder of the result when divided by the length of the alphabet using the modulo operator (%). This step ensures that the index is always within the bounds of the alphabet.

  7. The code retrieves the character at the new index in the alphabet string and appends it to the cipher_text string.

  8. The for loop continues until all characters in plain_text have been processed.

  9. The cipher_text string is returned as the output of the function.
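
To see the steps above in action, here is a small usage sketch that reproduces the “STACKZERO” example. The function is repeated so the snippet runs on its own; the uppercase alphabet constant and the second sample message are assumptions for illustration. Since Python’s % operator always returns a non-negative result for a positive divisor, decryption can be done by calling the same function with the negated key.

```python
import string

# Assumption: we work with the 26 uppercase Latin letters.
ALPHABET = string.ascii_uppercase


def cesar_cipher(alphabet, key, plain_text):
    cipher_text = ""
    for i in range(len(plain_text)):
        if alphabet.find(plain_text[i]) >= 0:
            # Shift the index by key and wrap around the end of the alphabet
            new_index = (alphabet.index(plain_text[i]) + key) % len(alphabet)
            cipher_text += alphabet[new_index]
        else:
            cipher_text += plain_text[i]
    return cipher_text


encrypted = cesar_cipher(ALPHABET, 3, "STACKZERO")
print(encrypted)                              # VWDFNCHUR
# A negative key shifts in the opposite direction, undoing the encryption
print(cesar_cipher(ALPHABET, -3, encrypted))  # STACKZERO
print(cesar_cipher(ALPHABET, 3, "HELLO, WORLD!"))  # KHOOR, ZRUOG!
```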

Vigenere’s Cipher

This is the first polyalphabetic cipher in our list. It’s a bit more complex than the previous ones, but much harder to break.
Even though it has an entire dedicated paragraph in our introductory article, I want to spend a few words on it.

It shifts each letter in a message by a different amount based on a secret keyword. The idea behind this algorithm is to repeat the keyword until it matches the length of the message, and then shift each letter by the alphabet position of the corresponding keyword character.
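
Before looking at the cipher itself, it can help to visualize the repeated keyword and the shift each position receives. The following is only an illustrative sketch: the message “ATTACKATDAWN”, the keyword “LEMON”, and the uppercase alphabet are assumptions chosen for the example.

```python
from itertools import cycle
import string

# Assumption: we work with the 26 uppercase Latin letters.
ALPHABET = string.ascii_uppercase

message = "ATTACKATDAWN"  # sample message, chosen for illustration
keyword = "LEMON"         # sample keyword, chosen for illustration

# Repeat the keyword until it covers the whole message
repeated_key = "".join(k for k, _ in zip(cycle(keyword), message))

# Each keyword letter stands for a shift equal to its alphabet index
shifts = [ALPHABET.index(k) for k in repeated_key]

print(message)       # ATTACKATDAWN
print(repeated_key)  # LEMONLEMONLE
print(shifts)        # [11, 4, 12, 14, 13, 11, 4, 12, 14, 13, 11, 4]
```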

The code

But again, the best way to understand it is to do it:

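from itertools import cycle

def vigenere_cipher(alphabet, key, plain_text):
    cipher_text = ""
    # cycle() repeats the key; zip() stops when plain_text is exhausted
    for k, c in zip(cycle(key), plain_text):
        if alphabet.find(c) >= 0:
            # Shift c by the alphabet position of the current key letter
            index = (alphabet.index(c) + alphabet.index(k)) % len(alphabet)
            cipher_text += alphabet[index]
        else:
            # Characters outside the alphabet are copied as is
            cipher_text += c
    return cipher_text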

I’m sure that the behaviour of this code is clear to you; nevertheless, let us go through the algorithm together step by step!

  1. The vigenere_cipher function takes three parameters: alphabet, key, and plain_text.

  2. The function initializes an empty string cipher_text that will eventually hold the encrypted message.

  3. The for loop iterates through each pair of characters in the key and plain_text strings using the zip() function and the cycle() function from the itertools module. The cycle() function repeats the key indefinitely, and zip() stops as soon as plain_text is exhausted, so the key is effectively repeated to match the length of plain_text.

  4. The if statement checks whether the current character c in the plain_text string is in the alphabet string. If it is, the code proceeds to encrypt it; otherwise, the character is added to the cipher_text string as is.

  5. Inside the if statement, the code finds the index of the current character c in the alphabet string using the index() method.

  6. It also finds the index of the current key character k in the alphabet string using the index() method.

  7. It adds the indices found in steps 5 and 6 and gets the remainder of the result when divided by the length of the alphabet using the modulo operator (%). This step ensures that the index is always within the bounds of the alphabet.

  8. The code retrieves the character at the new index in the alphabet string and appends it to the cipher_text string.

  9. The for loop continues until all characters in plain_text have been processed.

  10. The cipher_text string is returned as the output of the function.
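
One detail worth noticing: in the function above, zip() pairs every character of plain_text with a key letter, so spaces and punctuation also consume a key letter. Many descriptions of the cipher advance the key only on letters. The sketch below shows that variant together with decryption; the function name vigenere_skip_nonletters and its decrypt flag are hypothetical names introduced here for illustration, not part of the code above.

```python
import string

# Assumption: we work with the 26 uppercase Latin letters.
ALPHABET = string.ascii_uppercase


def vigenere_skip_nonletters(alphabet, key, text, decrypt=False):
    # Hypothetical variant: the key position advances only when a
    # character of the alphabet is actually encrypted or decrypted.
    result = []
    key_pos = 0
    sign = -1 if decrypt else 1  # subtract the shift when decrypting
    for c in text:
        if c in alphabet:
            shift = alphabet.index(key[key_pos % len(key)])
            new_index = (alphabet.index(c) + sign * shift) % len(alphabet)
            result.append(alphabet[new_index])
            key_pos += 1
        else:
            # Non-alphabet characters are copied and do not use up a key letter
            result.append(c)
    return "".join(result)


encrypted = vigenere_skip_nonletters(ALPHABET, "LEMON", "ATTACK AT DAWN")
print(encrypted)  # LXFOPV EF RNHR
print(vigenere_skip_nonletters(ALPHABET, "LEMON", encrypted, decrypt=True))
```

With this variant, the ciphertext for a message is the same whether or not the spaces are removed before encryption.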

Implement Substitution Ciphers Bonus

We have seen how to implement several substitution ciphers step by step. This section is not necessary for understanding the algorithms; however, you may find it challenging and fun, so I recommend you give it a read.

This is a quick overview of how to rewrite the previous algorithms in just one line:

Atbash Cipher

def oneline_atbash(alphabet, plain_text):
    return "".join([c if alphabet.find(c) == -1 else alphabet[len(alphabet)-alphabet.index(c)-1] for c in plain_text])

For each character in the plain_text, the function checks whether it’s in the alphabet string using the find() method:

  • If the character is not in the alphabet, it’s added to the output string as is.

  • Else the code calculates the index of its mirror image by subtracting its current index and 1 from the length of the alphabet.

The code then retrieves the character at the new index in the alphabet string and appends it to the output string. The list comprehension continues until all characters in plain_text have been processed, and the output string is returned as the output of the function.

Caesar’s Cipher

def oneline_cesar(alphabet, key, plain_text):
    return "".join([c if alphabet.find(c) == -1 else alphabet[(alphabet.index(c)+key)%len(alphabet)] for c in plain_text])

The function uses a list comprehension to iterate through each character c in the plain_text string. For each character, the list comprehension checks whether it’s in the alphabet string using the find() method.

  • If the character is not in the alphabet, it’s added to the output string as is.

  • Else the code calculates its new index in the alphabet string by adding the key parameter to its current index and taking the result modulo the length of the alphabet.

The code then retrieves the character at the new index in the alphabet string and appends it to the output string. The list comprehension continues until all characters in the plain_text have been processed, and the output string is returned as the output of the function.

Vigenere’s Cipher

def oneline_vigenere(alphabet, key, plain_text):
    return "".join([c if alphabet.find(c) == -1 else alphabet[(alphabet.index(k)+alphabet.index(c))%len(alphabet)] for k, c in zip(cycle(key), plain_text)])

The function uses a list comprehension to iterate through each character pair k, c in the zip() of cycle(key) and plain_text. The cycle() function ensures that the keyword is repeated as necessary to match the length of the message.

For each character pair, the list comprehension checks whether c is in the alphabet string using the find() method. If c is not in the alphabet, it’s added to the output string as is.

If c is in the alphabet, the code calculates its new index in the alphabet string by adding the index of k in the alphabet string to the index of c in the alphabet string, and taking the result modulo the length of the alphabet.

The code then retrieves the character at the new index in the alphabet string and appends it to the output string. The list comprehension continues until the processing of all character pairs, and the function’s result is the output string.
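
As a quick sanity check, the three one-liners can be run against values worked out by hand. This is only a sketch: the functions are copied from this section so the snippet runs on its own, while the uppercase alphabet and the “ATTACKATDAWN” / “LEMON” sample are assumptions for illustration.

```python
from itertools import cycle
import string

# Assumption: we work with the 26 uppercase Latin letters.
ALPHABET = string.ascii_uppercase


def oneline_atbash(alphabet, plain_text):
    return "".join([c if alphabet.find(c) == -1 else alphabet[len(alphabet)-alphabet.index(c)-1] for c in plain_text])


def oneline_cesar(alphabet, key, plain_text):
    return "".join([c if alphabet.find(c) == -1 else alphabet[(alphabet.index(c)+key)%len(alphabet)] for c in plain_text])


def oneline_vigenere(alphabet, key, plain_text):
    return "".join([c if alphabet.find(c) == -1 else alphabet[(alphabet.index(k)+alphabet.index(c))%len(alphabet)] for k, c in zip(cycle(key), plain_text)])


print(oneline_atbash(ALPHABET, "STACKZERO"))               # HGZXPAVIL
print(oneline_cesar(ALPHABET, 3, "STACKZERO"))             # VWDFNCHUR
print(oneline_vigenere(ALPHABET, "LEMON", "ATTACKATDAWN")) # LXFOPVEFRNHR
```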

Conclusion

Finally, we are at the end. As you may have guessed, my goal in writing about how to implement substitution ciphers is to help readers develop a thorough understanding of how they work.

I believe that practice is the best way to learn, and I hope that this article provides both valuable content and a helpful approach.
If you’ve found this useful, I encourage you to follow our blog and social media for more similar content. I will try to do my best to deliver high-quality content that keeps improving over time.

Thank you for your time!
