Tiny PowerShell Project 3 - Checksum verification
I cannot count the number of times I have had to work with engineers who held fancy titles but simply didn't know how to do a basic checksum verification.
MD5 stands for "Message Digest Algorithm 5." It is a widely used hash function that takes an input (message or data) of arbitrary length and produces a fixed-size 128-bit (16-byte) hash value, usually written as 32 hexadecimal characters. It was designed to produce a compact fingerprint of the input in a way that makes it challenging to reverse-engineer the original data or to find two different inputs that produce the same hash (collision resistance). The fingerprint is not truly unique, since an unlimited number of inputs map onto a finite set of outputs, but two different files sharing a hash by accident is extremely unlikely.
It's important to note that MD5 is considered cryptographically broken and unsuitable for security purposes due to its vulnerability to collision attacks. Because of these vulnerabilities, MD5 is not recommended for cryptographic purposes like secure password storage or digital signatures. However, MD5 still has non-security-related uses, like checksum verification for data integrity in non-critical applications or generating unique identifiers for non-security-critical tasks.
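To make the "fixed-size output" point concrete, here is a minimal sketch that hashes a short and a long string. It calls the .NET hashing classes directly, which is my own choice for illustration; the variable names are hypothetical. SHA-256 is included only as an example of a stronger alternative with a longer digest.

```powershell
# Two inputs of very different length (5 bytes and 5000 bytes)
$shortBytes = [System.Text.Encoding]::UTF8.GetBytes('hello')
$longBytes  = [System.Text.Encoding]::UTF8.GetBytes('hello' * 1000)

$md5    = [System.Security.Cryptography.MD5]::Create()
$sha256 = [System.Security.Cryptography.SHA256]::Create()
try {
    # BitConverter prints bytes as "5D-41-..."; strip the dashes to get plain hex
    $md5Short    = [System.BitConverter]::ToString($md5.ComputeHash($shortBytes)) -replace '-', ''
    $md5Long     = [System.BitConverter]::ToString($md5.ComputeHash($longBytes)) -replace '-', ''
    $sha256Short = [System.BitConverter]::ToString($sha256.ComputeHash($shortBytes)) -replace '-', ''
}
finally {
    $md5.Dispose()
    $sha256.Dispose()
}

# MD5 always yields 32 hex characters (128 bits); SHA-256 yields 64 (256 bits)
Write-Host "MD5 (short input):     $md5Short ($($md5Short.Length) chars)"
Write-Host "MD5 (long input):      $md5Long ($($md5Long.Length) chars)"
Write-Host "SHA-256 (short input): $sha256Short ($($sha256Short.Length) chars)"
```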
To get the MD5 hash of a single file using PowerShell, you can use the Get-FileHash cmdlet. Here's how you can do it:
# Define the file path of the image
$filePath = "C:\Path\To\Your\Image.jpg"
# Get the MD5 hash of the file
$hashInfo = Get-FileHash -Algorithm MD5 -Path $filePath
# Output the MD5 hash value
Write-Host "MD5 hash of $($filePath): $($hashInfo.Hash)"
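Computing the hash is only half of a checksum verification; the other half is comparing it with the value the publisher lists. The following is a rough sketch of that comparison. Test-FileChecksum is a hypothetical helper name, and defaulting to SHA256 is my assumption rather than something the walkthrough prescribes. The demo writes a throwaway file so it can run anywhere.

```powershell
# Hypothetical helper: returns $true when the file's hash matches the published value
function Test-FileChecksum {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $ExpectedHash,
        [string] $Algorithm = 'SHA256'   # assumption: prefer a stronger default than MD5
    )

    $actual = (Get-FileHash -LiteralPath $Path -Algorithm $Algorithm).Hash

    # -eq compares strings case-insensitively, so lowercase hex from a website
    # still matches the uppercase hex that Get-FileHash returns
    return $actual -eq $ExpectedHash.Trim()
}

# Demo: a temporary file containing exactly the five bytes of "hello"
$demoFile = New-TemporaryFile
Set-Content -LiteralPath $demoFile.FullName -Value 'hello' -NoNewline

$goodResult = Test-FileChecksum -Path $demoFile.FullName -Algorithm MD5 -ExpectedHash '5d41402abc4b2a76b9719d911017c592'
$badResult  = Test-FileChecksum -Path $demoFile.FullName -Algorithm MD5 -ExpectedHash '00000000000000000000000000000000'

Write-Host "Correct checksum accepted: $goodResult"
Write-Host "Wrong checksum accepted:   $badResult"

Remove-Item -LiteralPath $demoFile.FullName
```

In practice you would pass the path of the downloaded file and paste the checksum from the vendor's download page as -ExpectedHash.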
Before we dive into today's tiny project, let's talk about hashtables. In PowerShell, a hashtable is a data structure used to store key-value pairs. It allows you to associate a value (data) with a unique key (identifier). Hashtables are useful when you need to quickly look up values based on their keys, making them efficient for data retrieval.
In PowerShell, you can create a hashtable using the @{} notation. Within each pair, the key and its value are separated by =, and the pairs themselves are separated by line breaks or semicolons. Here's the general syntax of a hashtable:
$hashtable = @{
    Key1 = Value1
    Key2 = Value2
    ...
}
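As a small illustration of that syntax, the sketch below shows a few variations you are likely to run into: pairs on a single line, an ordered hashtable, and a quoted key. The variable names are made up for the example.

```powershell
# Pairs written on one line are separated by semicolons
$inline = @{ Name = 'John Doe'; Age = 30 }

# A plain @{} makes no promise about key order; [ordered] keeps insertion order
$ordered = [ordered]@{
    First  = 1
    Second = 2
    Third  = 3
}

# Keys that contain spaces need to be quoted
$quoted = @{ 'Full Name' = 'John Doe' }

Write-Host "Inline pairs: $($inline.Count)"
Write-Host "Ordered keys: $($ordered.Keys -join ', ')"
Write-Host "Quoted key:   $($quoted['Full Name'])"
```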
Now, let's go through some CRUD (Create, Read, Update, Delete) examples with hashtables:
1. Create a hashtable: You can create a new hashtable and add key-value pairs to it using the @{} notation:
# Create a new hashtable
$personInfo = @{
    Name = "John Doe"
    Age = 30
    Occupation = "Software Engineer"
}
2. Read from a hashtable: To access the value associated with a specific key, you can use the key inside square brackets:
# Access specific values using keys
$personName = $personInfo["Name"]
$personAge = $personInfo["Age"]
Write-Host "Name: $personName, Age: $personAge"
3. Update a hashtable: You can update the value associated with a key or add new key-value pairs to an existing hashtable:
# Update values or add new key-value pairs
$personInfo["Age"] = 31 # Update Age to 31
$personInfo["City"] = "New York" # Add a new key-value pair
# Print the updated hashtable
$personInfo
4. Delete from a hashtable: To remove a key-value pair from a hashtable, you can use the Remove() method:
# Delete a key-value pair
$personInfo.Remove("City")
# Print the hashtable after deletion
$personInfo
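A few edge cases around the four operations above are worth seeing once. This is a minimal sketch with hypothetical variable names, showing what happens with missing keys and with keys that already exist.

```powershell
$personInfo = @{ Name = 'John Doe'; Age = 30 }

# Dot notation reads the same value as the square-bracket form
$nameViaDot = $personInfo.Name

# Reading a key that does not exist yields $null rather than an error
$missing = $personInfo['Email']

# Index assignment overwrites quietly; the Add() method throws if the key is already present
$personInfo['Age'] = 31
$addFailed = $false
try {
    $personInfo.Add('Age', 32)
}
catch {
    $addFailed = $true
}

# Remove() on a key that is not there does nothing
$personInfo.Remove('Email')

Write-Host "Name via dot notation: $nameViaDot"
Write-Host "Missing key is null:   $($null -eq $missing)"
Write-Host "Add() on existing key failed: $addFailed (Age is still $($personInfo['Age']))"
```

If overwriting is what you want, index assignment is the simpler choice; Add() is handy when a duplicate key should be treated as a mistake.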
Now that we have learned how to calculate the MD5 hash values of files and how to work with hashtables in PowerShell, let's craft a script to detect duplicate images. The underlying idea is that files with identical content produce identical hash values, so a repeated hash value points to a duplicated copy. We store each computed hash value, along with the name of the file it came from, in a hashtable. For every file we then look up whether its hash value is already present. If it is not, we add it to the hashtable; if it already exists, we have found a duplicate image.
# Define the folder path where the images are located
$folderPath = "C:\Users\Username\Pictures"
# Initialize a hashtable that maps each MD5 hash to the first file seen with that hash
$hashTable = @{}
# Get all image files in the folder and subfolders (-File skips directories)
$imageFiles = Get-ChildItem -Path $folderPath -Recurse -File -Include *.jpg, *.jpeg, *.png, *.bmp, *.gif
# Loop through each image file
foreach ($imageFile in $imageFiles) {
    # Get the MD5 hash of the image file (-LiteralPath copes with [ ] in file names)
    $hashInfo = Get-FileHash -Algorithm MD5 -LiteralPath $imageFile.FullName
    $hash = $hashInfo.Hash
    # Check if the hash already exists in the hashtable
    if ($hashTable.ContainsKey($hash)) {
        # The hash already exists, so the image file is a duplicate
        Write-Host "Duplicate image found: $($imageFile.FullName) (same content as $($hashTable[$hash]))"
    } else {
        # Add the hash to the hashtable
        $hashTable.Add($hash, $imageFile.FullName)
    }
}
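One variation on the script above is to collect every file per hash instead of printing duplicates as they are found, so each group of identical files can be reported together. The sketch below is one way to do that, not the only one; Find-DuplicateFile is a hypothetical function name, and the demo builds a small temporary folder so it does not depend on your Pictures directory.

```powershell
# Hypothetical helper: returns one object per hash that is shared by two or more files
function Find-DuplicateFile {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Algorithm = 'MD5'
    )

    # Maps each hash to the list of files that produced it
    $filesByHash = @{}
    foreach ($file in Get-ChildItem -Path $Path -Recurse -File) {
        $hash = (Get-FileHash -LiteralPath $file.FullName -Algorithm $Algorithm).Hash
        if (-not $filesByHash.ContainsKey($hash)) {
            $filesByHash[$hash] = [System.Collections.Generic.List[string]]::new()
        }
        $filesByHash[$hash].Add($file.FullName)
    }

    # Only hashes with more than one file are duplicates
    foreach ($entry in $filesByHash.GetEnumerator()) {
        if ($entry.Value.Count -gt 1) {
            [pscustomobject]@{
                Hash  = $entry.Key
                Files = $entry.Value.ToArray()
            }
        }
    }
}

# Demo: two files with the same content and one that differs
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $demoDir | Out-Null
Set-Content -Path (Join-Path $demoDir 'a.txt') -Value 'same content'
Set-Content -Path (Join-Path $demoDir 'b.txt') -Value 'same content'
Set-Content -Path (Join-Path $demoDir 'c.txt') -Value 'different content'

$duplicates = @(Find-DuplicateFile -Path $demoDir)
foreach ($group in $duplicates) {
    Write-Host "$($group.Hash):"
    $group.Files | ForEach-Object { Write-Host "  $_" }
}

Remove-Item -LiteralPath $demoDir -Recurse -Force
```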
There you have it! Before we end today's writing, let's look at some of the useful methods and properties available to us.
In PowerShell, hashtables are implemented as System.Collections.Hashtable objects, which means they come with several useful methods and properties for working with the data they contain. Here are some of the commonly used ones:
Add(key, value): Adds a new key-value pair to the hashtable.
$myHashtable = @{}
$myHashtable.Add("Name", "John")
$myHashtable.Add("Age", 30)
Remove(key): Removes a key-value pair from the hashtable based on the specified key.
$myHashtable.Remove("Age")
ContainsKey(key): Checks if the hashtable contains a specific key and returns True or False.
if ($myHashtable.ContainsKey("Name")) {
    Write-Host "Name exists in the hashtable."
}
ContainsValue(value): Checks if the hashtable contains a specific value and returns True or False.
if ($myHashtable.ContainsValue("John")) {
    Write-Host "The value 'John' exists in the hashtable."
}
Clear(): Removes all key-value pairs from the hashtable, making it empty.
$myHashtable.Clear()
GetEnumerator(): Returns an enumerator that allows you to iterate through the key-value pairs in the hashtable.
$enumerator = $myHashtable.GetEnumerator()
while ($enumerator.MoveNext()) {
    Write-Host "$($enumerator.Key): $($enumerator.Value)"
}
Count: A property (not a method) that holds the number of key-value pairs in the hashtable.
$numberOfItems = $myHashtable.Count
Write-Host "Number of items in the hashtable: $numberOfItems"
Keys: A property that returns a collection containing all the keys in the hashtable.
$keysArray = $myHashtable.Keys
Values: A property that returns a collection containing all the values in the hashtable.
$valuesArray = $myHashtable.Values
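To round off the list, here is a short sketch of two patterns that come up when iterating: a foreach loop over GetEnumerator(), and taking a snapshot of Keys. The variable names are hypothetical. The snapshot matters because Keys refers back to the hashtable rather than being a separate copy.

```powershell
$myHashtable = @{ Name = 'John'; Age = 30 }

# foreach over GetEnumerator() hands you one entry (Key and Value) per pair
foreach ($entry in $myHashtable.GetEnumerator()) {
    Write-Host "$($entry.Key): $($entry.Value)"
}

# A plain hashtable has no guaranteed order, so sort the keys when order matters
$sortedKeys = $myHashtable.Keys | Sort-Object

# Keys is a live view of the hashtable; wrapping it in @() copies it into an array
$liveKeys = $myHashtable.Keys
$snapshot = @($myHashtable.Keys)

$myHashtable['City'] = 'New York'

Write-Host "Sorted keys: $($sortedKeys -join ', ')"
Write-Host "Live view now has $($liveKeys.Count) keys; the snapshot still has $($snapshot.Count)"
```

Taking a snapshot first is a cautious habit when a loop needs to add or remove entries while walking the keys.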
For more detail, see Microsoft's about_Hash_Tables help page and the .NET documentation for the Hashtable class.
Written by
Hooman Pegahmehr