Tiny PowerShell Project 8 - Data Extraction
What's often the most significant expense in a project? It's integration. To put it simply for the non-technical reader, companies purchase technologies from various vendors, and integration is the crucial step of making sure these different technologies function cohesively together.
In the context of software engineering, "integration" refers to the process of combining different software components or systems to function as one unified entity. The term can be used in various contexts and can denote different activities, including:
Software Integration: This involves combining individual software modules or components that have been developed separately. These components might be functions, classes, libraries, or even entire systems. The goal is to ensure that they work harmoniously together.
System Integration: This is the process of connecting different IT systems, including both software and hardware, to work together within an organization. System integrators might use various techniques and products, such as middleware, to make disparate systems communicate and function as one cohesive unit.
In each context, the core idea of integration is to ensure different pieces of software, whether they're components, systems, or services, can effectively work together to achieve a desired result. Proper integration is crucial in software engineering to ensure system consistency, reliability, and scalability.
In the realm of application support, it's commonplace to encounter scenarios requiring data transfer from one system to another. This challenge arises frequently because many technologies lack compatible software stacks or well-defined APIs. You might be surprised to learn how many companies choose batch processing for records over API integration.
Today's tiny project focuses on writing a data extraction. In this scenario, we'll connect to a database, run our query, and store the extracted information in our destination.
The first step is to securely capture the needed credentials. In essence, we have to write code that does the following:
Prompts the user for a password.
Converts the password to a SecureString.
Encrypts the SecureString.
Writes the encrypted string to a file.
Uses this credential to query our database.
Stores the extracted information at a given destination.
Let's see what the code would look like:
# Define constants
$PasswordFilePath = "C:\Users\$env:USERNAME\Downloads\password.txt"
$User = "yourUserName"
$DatabaseServer = "YourServerName"
$DatabaseName = "yourDatabaseName"
$CsvOutputPath = "C:\Users\$env:USERNAME\Downloads\output.csv"

# Capture and save the password if it doesn't already exist
if (-not (Test-Path $PasswordFilePath)) {
    Read-Host "Enter password to secure" -AsSecureString |
        ConvertFrom-SecureString | Out-File $PasswordFilePath
}

# Load database credentials
$Pass = Get-Content $PasswordFilePath | ConvertTo-SecureString
$Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $Pass
function Get-Data {
    [CmdletBinding()]
    param()
    process {
        try {
            $SQLquery = @"
USE [DatabaseName];
SELECT T1.[FirstNameColumn]
    ,T1.[LastNameColumn]
    ,T2.PhoneColumn
    ,T2.ID1
    ,T2.ID2
    ,T2.ID3
    ,T3.DescriptionColumn
    ,GETDATE() AS ExtractDateColumn
    ,T4.OfficeNameColumn
    ,T5.[LanguageColumn]
    ,T6.EmailColumn
FROM [DatabaseName].[schema].[Table1] AS T1
INNER JOIN [DatabaseName].[schema].[LanguageTable] AS T5 ON T1.LanguageIDColumn = T5.LanguageID
INNER JOIN [DatabaseName].[schema].[Table2] AS T6 ON T1.IDColumn = T6.IDColumn AND T6.RoleColumn = 'RoleName'
INNER JOIN [DatabaseName].[schema].[Table3] AS T2 ON T2.IDColumn = T6.IDColumn AND T2.PhoneColumn != '' AND T2.DateColumn IS NULL
INNER JOIN [DatabaseName].[schema].[StatusTable] AS T2Status ON T2.IDColumn = T2Status.IDColumn AND (T2Status.StatusDescriptionColumn = 'Status1' OR T2Status.StatusDescriptionColumn = 'Status2')
INNER JOIN [DatabaseName].[schema].[CareTable] AS T3 ON T2Status.CareIDColumn = T3.CareID AND T3.SettingColumn = 'SettingValue'
INNER JOIN [DatabaseName].[schema].[OfficeTable] AS T4 ON T4.OfficeIDColumn = T2.OfficeIDColumn
"@
            $result = Invoke-Sqlcmd -Query $SQLquery -ServerInstance $DatabaseServer -Database $DatabaseName -Credential $Credential
            $result | ConvertTo-Csv -Delimiter '|' -NoTypeInformation |
                ForEach-Object { $_.Replace('"', '') } | Out-File $CsvOutputPath -Encoding ascii
        }
        catch {
            Write-Error "$($_.Exception.Message) - Line Number: $($_.InvocationInfo.ScriptLineNumber)"
        }
    }
}

# Call the function
Get-Data
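Once the extract has run, a quick way to spot-check the pipe-delimited output is to read it back with Import-Csv. This is a minimal sketch, assuming the output path used above:

```powershell
# Read the pipe-delimited extract back in and preview the first rows
$rows = Import-Csv -Path "C:\Users\$env:USERNAME\Downloads\output.csv" -Delimiter '|'
$rows | Select-Object -First 5 | Format-Table -AutoSize

# Confirm the row count matches expectations
Write-Host "Extracted $(@($rows).Count) rows"
```

Because the script strips the quote characters from the CSV, this round trip only works cleanly if none of the extracted fields contain the `|` delimiter themselves.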
Let's break down the code further:
Read-Host "Enter password to secure" -AsSecureString: This prompts the user to enter a password. The -AsSecureString parameter ensures that the input is treated as a SecureString. A SecureString is a special string in .NET that keeps the string data encrypted in memory to protect sensitive information, such as a password. The characters you type will be obscured as you type them (e.g., displayed as asterisks).

ConvertFrom-SecureString: This cmdlet converts a SecureString into an encrypted standard string. The encryption is based on the Windows Data Protection API (DPAPI). By default, only the user who encrypted the data (on the same computer where it was encrypted) can decrypt it.

Out-File $PasswordFilePath: This cmdlet writes the output (the encrypted standard string from the previous step) to a file. The path to that file is stored in the $PasswordFilePath variable, which is defined at the top of the script.
This is often used as a way to save passwords or other sensitive data to disk in a more secure manner than plain text, although the protection it offers is primarily for casual security (since the decryption can be done by the same user on the same machine). For more robust security solutions, you would typically look into other encryption methods or secrets management tools.
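To see the DPAPI behavior end to end, here is a minimal round-trip sketch (the file path is an assumption for illustration). The plain-text recovery at the end is for demonstration only, to confirm the round trip works:

```powershell
# Encrypt: prompt for a secret and persist it with DPAPI (current-user scope)
$secure = Read-Host "Enter password to secure" -AsSecureString
$secure | ConvertFrom-SecureString | Out-File "$env:TEMP\password.txt"

# Decrypt: read the file back into a SecureString.
# Only the same user on the same machine can do this.
$restored = Get-Content "$env:TEMP\password.txt" | ConvertTo-SecureString

# Demonstration only -- never recover plain text like this in production code
$plain = [System.Net.NetworkCredential]::new('', $restored).Password
Write-Host "Round trip OK: recovered $($plain.Length) characters"
```

If a different user (or the same user on a different machine) tries the ConvertTo-SecureString step, it throws a decryption error, which is exactly the protection DPAPI provides.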
The next portion of the code retrieves the encrypted password from the file, decrypts it, and then creates a PowerShell credential object with a username and the decrypted password. Let's break it down:
$Pass = Get-Content $PasswordFilePath | ConvertTo-SecureString:

Get-Content $PasswordFilePath: This cmdlet retrieves the content of the file specified by $PasswordFilePath. The content of this file is expected to be the encrypted string version of a password, as saved by the earlier code.

| ConvertTo-SecureString: The pipe (|) passes the encrypted string to the ConvertTo-SecureString cmdlet, which decrypts it. It uses the Windows Data Protection API (DPAPI) to do this decryption, and as mentioned before, by default, only the user who encrypted the data (on the same computer where it was encrypted) can decrypt it. The result is a SecureString object, which is then stored in the $Pass variable.

$Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $Pass:

New-Object -TypeName System.Management.Automation.PSCredential: This creates a new object of type PSCredential, which represents a set of credentials in PowerShell (username and password).

-ArgumentList $User, $Pass: The PSCredential type requires two arguments: a username (a plain string) and a password (a SecureString). $User contains the username, and $Pass (from the previous step) contains the password as a SecureString. These are passed to the PSCredential constructor using the -ArgumentList parameter.

The result is a $Credential object, which contains both the username and the SecureString password. This object can then be used in various PowerShell commands that require credentials, such as remote operations or accessing secure resources.
The biggest problem with this extraction is that we're running a query against the database instead of a stored procedure. Using stored procedures over raw SQL queries embedded in application code or scripts offers several benefits:
Maintainability: Once a stored procedure is tested and added to the database, it can be used across multiple applications or scripts without having to duplicate the SQL. If there's ever a need to modify the query, the change is done in one location (the stored procedure) rather than multiple locations in application/script code.
Performance: SQL Server can optimize the execution plan of stored procedures, making them faster to execute over time compared to ad-hoc SQL queries.
Security: Stored procedures provide an extra layer of security. You can restrict permissions on the base tables and give execute permissions on stored procedures, thus ensuring that users can only perform the actions allowed by the stored procedure and nothing more.
Reduced SQL Injection Risk: While parameterized queries already reduce the risk of SQL injection, stored procedures can further decrease this risk, since the execution of a stored procedure is typically a non-dynamic, single command to the server.
Encapsulation of Business Logic: This allows for business logic to be housed within the database itself, leading to consistent data operations, especially when multiple applications interact with the same database.
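Even short of a stored procedure, values should not be concatenated directly into the query text. One option with Invoke-Sqlcmd is sqlcmd scripting variables via the -Variable parameter. This is a sketch using the placeholder table and column names from the query above; note that scripting variables are textual substitution rather than true typed parameters, so they help readability more than security:

```powershell
# Single-quoted here-string so PowerShell does not expand $(Role) itself;
# sqlcmd substitutes it at execution time.
$query = @'
SELECT T6.EmailColumn
FROM [DatabaseName].[schema].[Table2] AS T6
WHERE T6.RoleColumn = '$(Role)'
'@

# The value is supplied out-of-band as "name=value"
Invoke-Sqlcmd -Query $query -ServerInstance $DatabaseServer -Database $DatabaseName -Credential $Credential -Variable "Role=RoleName"
```

A stored procedure with typed parameters remains the safer choice, since the substitution above is still plain text replacement.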
It would be ideal to place the SQL query inside a stored procedure. Not only would this encapsulate the logic within the database, but it would also make the PowerShell script cleaner and easier to read. Instead of a lengthy SQL query within the script, you'd have a single invoke-sqlcmd
call to execute the stored procedure.
For instance, instead of:
$SQLquery = @"
USE [DatabaseName];
-- A lot of SQL here...
"@
$result = Invoke-Sqlcmd -Query $SQLquery -ServerInstance $DatabaseServer -Database $DatabaseName -Credential $Credential
You could have:
$result = Invoke-Sqlcmd -Query "EXEC [schema].[YourProcedureName]" -ServerInstance $DatabaseServer -Database $DatabaseName -Credential $Credential
This assumes you've created a stored procedure named "YourProcedureName" that encapsulates the SQL logic. (Invoke-Sqlcmd has no dedicated stored-procedure parameter; you execute one by passing an EXEC statement to -Query.) It's a cleaner, more maintainable, and potentially safer approach.
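For completeness, the stored procedure itself could be deployed once with a script like the following. The procedure name, schema, and the abbreviated SELECT are placeholders; the full extraction query with all its JOINs would go in the body. CREATE OR ALTER requires SQL Server 2016 SP1 or later:

```powershell
# One-time deployment: wrap the extraction query in a stored procedure
$createProc = @'
CREATE OR ALTER PROCEDURE [schema].[YourProcedureName]
AS
BEGIN
    SET NOCOUNT ON;
    -- The full SELECT with all its JOINs from the script goes here
    SELECT T1.[FirstNameColumn]
        ,T1.[LastNameColumn] -- , ...remaining columns
    FROM [DatabaseName].[schema].[Table1] AS T1;
END
'@
Invoke-Sqlcmd -Query $createProc -ServerInstance $DatabaseServer -Database $DatabaseName -Credential $Credential
```

After this one-time step, the extraction script shrinks to the single EXEC call shown above, and any future changes to the query happen in the database rather than in the script.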
Written by
Hooman Pegahmehr