Grumpy Guys Guide To SQL Through AI


I am middle aged grumpy guy and a bit amused as well irritated by all this hype around AI. So decided to use the AI thing to help me with something I was unable to master or refused to master in my career, SQL. Hate it or love it, you must have run into some long, cryptic SQL query written by another Grumpy guy, which made no head or tails to you.
So here I am a 40 something lazy guy who want to use AI to overcome one of my innumerable shortcomings called SQL.
Where is a database when you need one?
Yeah, I work with Databases all the time but when you need one, I could not find any half decent database to play around with. Frustrated a spun up a new SQL database on Azure. Be careful I used the additional settings to populate it with adventure works sample data/schema. And no I am not going to give you a crash course on Azure SQL DB sorry.
Few points to remember
Add your IP address to the allowed list or else you cannot connect.
If possible become the Entra Admin for the DB
For ease of use I added a SQL user and a login
--Run this against the master DB
CREATE LOGIN coolusername WITH PASSWORD = 'somebrilliantpassword';
-- Run this against the DB you just created
CREATE USER coolusername FOR LOGIN coolusername;
ALTER ROLE db_datareader ADD MEMBER coolusername;
ALTER ROLE db_datawriter ADD MEMBER coolusername;
Yeah! Provision Azure Open AI Resource
Ok half of you are intelligent enough to guess what this entails but to those who are really thick and do not understand what I am talking about please follow these steps.
In your Azure Portal
Search for Azure OpenAI in the top search bar.
Click Create.
Fill in: Subscription
Resource group
Region Name
Pricing tier
Click Review + Create
then press Create.
Please record the endpoint and the API key you will need it later in code.
But wait you are not done yet, now you need to go for the Foundry experience, what ever that means. Basically open up the AI studio.
Go to deployments
Click on Deploy Model
Choose GPT-3.5-Turbo, you will see a deployment name, this will be needed later.
Fill in Name, Model Version etc.
Click Deploy
Why did I choose this particular model, money baby money its all about the money, I needed to keep the cost down.
.NET Code for Azure Open AI
We need to connect to our open AI resource and then query the database in a human readable form. And no SQL Query isn’t human readable, to most of us it looks and feels like gobbledygook .
Create a console application and in the Program class add few class level variable.
private static string connectionString =
"Server=tcp:abctestsqlserver.database.windows.net,1433;" +
"Initial Catalog=TestLabsOpenAIDB;" +
"Persist Security Info=False;" +
"User ID=xyzall;" +
"Password=somestrongpassword;" +
"MultipleActiveResultSets=False;" +
"Encrypt=True;" +
"TrustServerCertificate=False;" +
"Connection Timeout=30;";
private static string endpoint = "https://abctestsqlserver.openai.azure.com/";
private static string apiKey = "8Ndl7gWrGO16333222222dedccdedddd";
private static string deploymentName = "gpt-35-turbo"; // e.g., gpt-35-turbo or gpt-4
Schema for dumb AI
As you might be aware, if not then I am making you aware that this engine needs the database schema which you are going to use. There are two ways of doing it. Either you add a schema information manually, which gives you the ability to include only those tables which are need for the query or you add the whole schema using a query and using the output of the query as the schema definition needed by the AI engine.
As you may have guessed, I am a lazy grumpy guy so obviously I chose the latter.
private static async Task<string> GetDatabaseSchemaAsync()
{
var schemaBuilder = new StringBuilder();
using (SqlConnection conn = new SqlConnection(connectionString))
{
await conn.OpenAsync();
string schemaQuery = @"
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION;
";
using (SqlCommand cmd = new SqlCommand(schemaQuery , conn))
using (SqlDataReader reader = await cmd.ExecuteReaderAsync())
{
string tempTable = "";
while (await reader.ReadAsync())
{
string schema = reader.GetString(0);
string table = reader.GetString(1);
string column = reader.GetString(2);
string type = reader.GetString(3);
if (table != tempTable )
{
schemaBuilder.AppendLine($"\nTable {schema}.{table}:");
tempTable = table;
}
schemaBuilder.AppendLine($" {column} ({type})");
}
}
}
return schemaBuilder.ToString();
}
AzureOpenAIClient the master of all
This is the real deep end, so you may like to grab a life boat or a coffee whichever you prefer. We will be writing method to instantiate the main class which connects with Open AI and runs your query. So here is the code
private static async Task<string> GenerateSqlFromQuestionAsync(string question, string schema)
{
AzureOpenAIClient client = new Azure.AI.OpenAI.AzureOpenAIClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
ChatClient chatClient = client.GetChatClient(deploymentName);
var messages = new ChatMessage[]
{
new SystemChatMessage($"You are a SQL assistant. Generate only valid SQL queries. Schema:\n{schema}," +
$"Given an input question, create a syntactically correct {{dialect}} query\r\nto run, then look at the results of the query and return the answer." +
$"Unless the user specifies a specific number of examples they wish to\r\nobtain, **ALWAYS** limit your query to at most {{top_k}} results.\r\n- You can order the results by a relevant column to return the most\r\ninteresting examples in the database.\r\n- Never query for all the columns from a specific table, only ask for\r\nthe relevant columns given the question.\r\n- You have access to tools for interacting with the database.\r\n- You MUST double check your query before executing it.If you get an error\r\nwhile executing a query,rewrite the query and try again." +
$"DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.)\r\nto the database.\r\n- DO NOT MAKE UP AN ANSWER OR USE PRIOR KNOWLEDGE, ONLY USE THE RESULTS\r\nOF THE CALCULATIONS YOU HAVE DONE.\r\n- Your response should be in Markdown. However, **when running a SQL Query\r\nin \"Action Input\", do not include the markdown backticks**.\r\nThose are only for formatting the response, not for executing the command.\r\n- ALWAYS, as part of your final answer, explain how you got to the answer\r\non a section that starts with: \"Explanation:\". Include the SQL query as\r\npart of the explanation section.\r\n- If the question does not seem related to the database, just return\r\n\"I don\\'t know\" as the answer.\r\n- Only use the below tools. Only use the information returned by the\r\nbelow tools to construct your query and final answer.\r\n- Do not make up table names"),
new UserChatMessage(question)
};
ChatCompletion completion = await chatClient.CompleteChatAsync(messages);
string sql = completion.Content[0].Text.Trim();
return sql;
}
First two lines are pretty self explanatory so I will not insult your intelligence by explaining what we are trying to achieve.
Next important part is the Message or should I say the prompt for the model. DISCLAIMER: I have copied this prompt from a python based tutorial. But you get the gist. As you can see we have passe a schema variable which is the string the earlier method returned to us. As for the question variable again this the string the user will ask going forward on the console.
We send the prompt and the question to open AI by calling CompleteChatAsync. Whatever value or text is returned by Open AI is fed to the main program.
The main program, taskmaster
static async Task Main(string[] args)
{
string schema = await GetDatabaseSchemaAsync();
Console.WriteLine("Mortal ask away the oracle of sequel is online.(say 'mercy' to quit):");
while (true)
{
Console.Write(">> ");
string question = Console.ReadLine();
if (question?.ToLower() == "mercy")
break;
string sqlQuery = await GenerateSqlFromQuestionAsync(question, schema);
Console.WriteLine("HUSH! Oracle of sequel speaketh");
Console.WriteLine(sqlQuery);
}
}
Do I need to explain the lines above or you have understood? If not then please take up some other profession may be kick boxing or knitting because clearly computer science isn’t for you.
The Output, FLAWLESS VICTORY!
Whatever may be the effort but when the code runs as expected it is the best feeling in the world. Yes yes I know the model isn’t perfect but neither are you so, simmer down and admire a thing of beauty
So my dear student where to next, shall we decipher the meaning of this existence or shall we try our hand with some JSON parsing. Honestly even I am not sure.
Subscribe to my newsletter
Read articles from Sudeep Mukherjee directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
