Season of AI for Developers!

Table of contents
- 📺 Episode 1 – Introduction to Azure OpenAI
- 📺 Episode 2 – Considerations for Implementing Models in Azure OpenAI
- 📺 Episode 3 – What’s New from Microsoft Build: PHI-3, GPT-4o, Azure Content Safety & Azure AI Studio
- 📺 Episode 4 – Getting Started with Semantic Kernel
- 📺 Episode 5 – Build Your Own Copilot with Semantic Kernel

If you’re passionate about Artificial Intelligence and application development, this series is for you. Season of AI for Developers is a special 5-episode season from Microsoft Reactor, where experts share everything from the fundamentals of Azure OpenAI to the latest announcements from Microsoft Build 2024, and advanced frameworks like Semantic Kernel for building truly intelligent applications.
Each session blends theory and practice, with clear explanations, live demos, and resources to help you apply what you’ve learned right away. You’ll discover how to integrate language models into your solutions, optimize performance, manage costs, implement patterns like RAG, and even develop your own personalized copilot.
Get ready for a complete journey through the tools, techniques, and best practices that are shaping the future of AI-powered development. Here’s a recap of each episode along with the links to watch them whenever you like.
📺 Episode 1 – Introduction to Azure OpenAI
In this opening episode of Season of AI for Developers, Luis Beltrán and Pabito welcome us to a five-part series designed to take developers from the basics of Azure OpenAI to building intelligent copilots. The session begins with a clear explanation of what Generative AI is, how it differs from traditional AI, and why Large Language Models (LLMs) like GPT have revolutionized human–machine interaction. They also introduce concepts like “tokens,” “probability in generation,” and the transformer architecture that powers these models.
Throughout the talk, the presenters share practical examples of how these models can generate text, images, code, and more—while debunking common misconceptions. They explain how tokenization works, how models are trained on massive datasets, and why parameters like temperature and top-p are important for balancing creativity and accuracy. Viewers also learn the differences between model versions (e.g., GPT-3.5 vs. GPT-4) and how training cut-off dates affect the knowledge available in each.
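To make the temperature and top-p discussion concrete, here is a minimal sketch of how those two parameters shape next-token sampling. The logits are made up for illustration; real models expose these knobs through the API rather than letting you sample by hand:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Toy next-token sampler: softmax with temperature, then a nucleus (top-p) cut."""
    rng = rng or random.Random(0)
    # Temperature rescales the logits: values < 1.0 sharpen the distribution
    # (more deterministic), values > 1.0 flatten it (more creative).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    # Nucleus sampling: keep the smallest set of tokens whose cumulative
    # probability mass reaches top_p, then renormalize and draw from it.
    probs.sort(key=lambda p: p[1], reverse=True)
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    norm = sum(p for _, p in kept)
    r, acc = rng.random() * norm, 0.0
    for i, p in kept:
        acc += p
        if acc >= r:
            return i
    return kept[-1][0]
```

With a low temperature and a tight top-p, the sampler collapses to the most likely token, which is exactly the "accuracy over creativity" trade-off the episode describes.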
Finally, we see a full demonstration in Azure OpenAI Studio: creating a resource in Azure, deploying a model, setting system messages, and using the playground for live queries. The presenters also showcase .NET and Python integrations, including multimodal examples with image inputs and DALL·E-powered image generation. It’s a solid foundation for understanding the technical basics and setting the stage for the more advanced topics to come.
📺 Episode 2 – Considerations for Implementing Models in Azure OpenAI
In this second session of Season of AI for Developers, the presenters dive into the practical aspects of deploying and customizing Azure OpenAI models. The episode focuses on two key approaches for enriching large language models (LLMs) with private knowledge: fine-tuning, which adapts a model to specific contexts or tasks, and Retrieval-Augmented Generation (RAG), which combines document retrieval with LLMs to deliver accurate, context-aware answers. The presenters highlight how RAG overcomes common LLM limitations—such as outdated training data or lack of internal company knowledge—by integrating both up-to-date public data and proprietary information into the response process.
The session walks through the RAG architecture, explaining how documents are ingested, chunked, and indexed in Azure AI Search, and how user queries flow through document retrieval, semantic re-ranking, and finally into a GPT model to produce grounded answers. They demonstrate hybrid search techniques (combining keyword and vector search), the role of embeddings in capturing semantic meaning, and how cosine similarity helps identify the most relevant results. Real-world demos show RAG in action with both Python and .NET, including a scenario where company policy documents are queried to answer benefit-related questions, complete with citations.
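The hybrid search idea (blending keyword matching with vector similarity) can be sketched with toy two-dimensional vectors. This is a minimal illustration of the scoring, not what Azure AI Search does internally; the documents, vectors, and the `alpha` blend weight are all made up:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors, the metric the episode mentions."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, doc):
    """Fraction of query words that appear in the document (crude keyword search)."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """docs: list of (text, vector). Blend keyword and vector scores, best first."""
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    return [text for _, text in sorted(scored, reverse=True)]
```

In a real RAG pipeline the vectors would come from an embedding model, and the top-ranked chunks would be passed to the GPT model as grounding context, exactly as the policy-document demo shows.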
Beyond RAG, the presenters address important considerations around security, privacy, and quotas. They outline Azure’s enterprise-grade protections, including role-based access, data isolation, and responsible AI safeguards. On the performance side, they explain how token-per-minute (TPM) quotas work, the impact of exceeding limits, and strategies like request redirection, retry logic with exponential backoff, and API Management for load balancing. The episode closes with resources, GitHub repos, and best practices for scaling production-grade AI solutions in Azure.
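The retry-with-exponential-backoff strategy mentioned above can be sketched in a few lines. The `RateLimitError` here is a stand-in for the 429 response a throttled Azure OpenAI call would return; the delay values are illustrative defaults:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the 429 error a throttled Azure OpenAI request raises."""

def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on rate-limit errors, doubling the wait (plus jitter) each time."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # Exponential backoff: 1s, 2s, 4s, ... plus a little random jitter
            # so many clients don't all retry at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

Injecting `sleep` makes the helper easy to test and to combine with the other strategies the episode covers, such as redirecting requests to a second deployment behind API Management.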
📺 Episode 3 – What’s New from Microsoft Build: PHI-3, GPT-4o, Azure Content Safety & Azure AI Studio
In this mid-season episode of Season of AI for Developers, the presenters unpack the most significant AI announcements from Microsoft Build, focusing on PHI-3, GPT-4o, Azure Content Safety, and Azure AI Studio. They begin with PHI-3, Microsoft’s family of Small Language Models (SLMs) designed for efficiency, on-premises deployment, and cost reduction. These open-source models—available in sizes like mini, small, and medium—can run locally or in the cloud, making them ideal for scenarios with strict privacy, compliance, or offline requirements. Despite their smaller size, PHI-3 models can be fine-tuned, support prompt engineering, and even integrate with the RAG pattern for grounded responses.
The session then shifts to GPT-4o and its multimodal capabilities. Unlike traditional text-only models, GPT-4o can process both text and images in a single prompt, enabling powerful use cases like extracting structured data from receipts, analyzing contracts, describing complex images, or generating travel content directly from photos. Through live demos, they showcase GPT-4o’s ability to handle mathematical queries from diagrams, summarize documents, and even create marketing copy with a specific tone and format. They also highlight practical considerations—like the model’s faster speed and lower cost compared to earlier GPT-4 variants—and preview how audio inputs may soon extend its multimodal reach.
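The "text and images in a single prompt" idea boils down to how the chat message is structured. Here is a sketch of the kind of mixed-content payload the Chat Completions API accepts for the receipt-extraction scenario; the model name, instruction text, and URL are placeholders, not values from the episode:

```python
def build_receipt_prompt(image_url):
    """Build a multimodal chat request: one user message carrying both an
    instruction and an image, shaped like the OpenAI/Azure OpenAI chat payload."""
    return {
        "model": "gpt-4o",  # placeholder deployment/model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Extract merchant, date, and total from this receipt as JSON."},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```

The same single-message shape works for the other demos in the episode, such as answering math questions from a diagram, by swapping the instruction text and image.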
Finally, the presenters cover Azure AI Studio, Microsoft’s unified platform for exploring, deploying, and managing AI models. They walk through the model catalog, which includes OpenAI models, PHI-3, LLaMA, Mistral, and others, plus Hugging Face integrations. Without signing in, developers can test models and compare benchmarks; by signing in with an Azure account, they can provision deployments with either dedicated compute or cost-effective serverless hosting. The episode closes with a brief grounding demo inside Azure AI Studio using the RAG pattern—retrieving answers from internal documents with citations—and a look ahead to the next topic: Semantic Kernel.
📺 Episode 4 – Getting Started with Semantic Kernel
In this fourth episode of Season of AI for Developers, the presenters introduce Semantic Kernel—an open-source SDK from Microsoft designed to help developers integrate AI capabilities into their applications more easily. Semantic Kernel acts as a bridge between AI researchers and enterprise developers, unifying workflows regardless of programming language or AI provider. It enables you to orchestrate your own code (“native functions”) alongside AI-powered capabilities, so you can infuse existing apps with LLM-driven features without rewriting everything from scratch. The SDK supports C#, Python, and (in preview) Java, and its design allows developers to switch between providers like Azure OpenAI, OpenAI, and Hugging Face with minimal changes.
The presenters explain the core concepts: the kernel as the central orchestrator, planners to break down user requests into actionable steps, semantic functions (prompts), and native functions (existing code). They show how planners leverage function calling to decide which functions to execute and in what order. They also cover optional components like memories (for context persistence), connectors (for external data), and how Semantic Kernel’s flexibility supports various app types—from console utilities to web APIs and mobile apps. Prompt templates and variable injection are demonstrated, with the ability to store prompts in structured plugin folders using config.json and .skprompt.txt files for reusability.
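The variable-injection idea is easy to sketch. This is not the Semantic Kernel template engine itself, just a minimal illustration of how `{{$name}}` placeholders in a prompt file get replaced before the prompt is sent to the model; the template text and folder path in the comment are hypothetical:

```python
import re

# A Semantic Kernel-style prompt template, as it might live in a plugin folder
# (e.g. Plugins/WriterPlugin/Summarize/skprompt.txt -- layout illustrative).
TEMPLATE = "Summarize the following text in {{$max_words}} words:\n{{$input}}"

def render(template, variables):
    """Replace {{$name}} placeholders with values, mimicking SK variable injection."""
    def sub(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])
    return re.sub(r"\{\{\$(\w+)\}\}", sub, template)
```

Keeping the prompt in its own file and injecting variables at call time is what makes these templates reusable across functions and even across apps.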
In the demos, the presenters walk through building a Semantic Kernel app from scratch in C#, connecting to Azure OpenAI for chat completions, and running three types of prompts: simple requests, controlled-length outputs, and streaming responses. They showcase native plugins like a Time Plugin to fetch the current date, and a custom GitHub Plugin to query a user’s repositories—illustrating how the AI automatically decides which plugin to invoke based on the query. The session closes with a preview of image generation using DALL·E through Semantic Kernel and a look ahead to Episode 5, where they’ll dive into building full copilots with planners, memories, RAG, and unit testing.
📺 Episode 5 – Build Your Own Copilot with Semantic Kernel
In the season finale of Season of AI for Developers, the presenters take everything covered in previous episodes and show how to combine it into a full copilot solution using Semantic Kernel. They begin by deep-diving into plugins—both built-in and custom—and explain how decorators like KernelFunction and detailed descriptions expose your application’s native functions to large language models (LLMs). This lets AI agents automatically decide which functions to call (via function calling) to answer user questions or perform multi-step tasks. Examples include simple native plugins like the Time Plugin and API integrations such as a weather service, where the AI invokes the plugin twice for two different cities and compares the results to answer the query.
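The decoration-plus-description idea can be sketched in plain Python. This is not Semantic Kernel's actual API, just an illustration of the pattern: a decorator attaches a natural-language description so the model can choose the right function, and the planner later looks it up and calls it. The weather data is fake:

```python
REGISTRY = {}

def kernel_function(description):
    """Register a native function with a description the LLM uses to pick it."""
    def wrap(fn):
        REGISTRY[fn.__name__] = {"fn": fn, "description": description}
        return fn
    return wrap

@kernel_function("Get the current weather for a city.")
def get_weather(city: str) -> str:
    # Hypothetical stand-in for a real weather API call.
    fake_data = {"Prague": "12C, cloudy", "Madrid": "24C, sunny"}
    return fake_data.get(city, "unknown")

def invoke(name, **kwargs):
    """What the planner does after the model picks a function: look it up, call it."""
    return REGISTRY[name]["fn"](**kwargs)
```

Asking "is it warmer in Madrid or Prague?" is exactly the case from the episode: the model would request two `get_weather` calls and compare the results in its final answer.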
From there, they introduce planners, which orchestrate multiple plugins to fulfill a user’s request. Using function calling as the primary mechanism, planners analyze all available functions, select the appropriate ones, and execute them in the correct order—passing results back into the conversation until a final response is ready. The process is illustrated step-by-step, from generating JSON schemas for each function to invoking them with arguments, chaining calls, and returning integrated responses. Developers can choose to let the AI auto-invoke functions or retain full manual control.
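The first step of that pipeline, generating a JSON schema for each function, can be sketched from a Python signature. This is a simplified version of what SDKs do under the hood, covering only a few primitive types:

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema types.
PY_TO_JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}

def function_schema(fn, description=""):
    """Derive a function-calling JSON schema from a Python signature (simplified)."""
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value -> the model must supply it
    return {
        "name": fn.__name__,
        "description": description,
        "parameters": {"type": "object", "properties": props, "required": required},
    }
```

The model receives a list of these schemas alongside the conversation; it then responds with the function name and arguments to invoke, and the planner loops until no further calls are requested.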
The session then moves to memories, showing how to integrate RAG (Retrieval-Augmented Generation) with Semantic Kernel to extend an LLM’s knowledge using embeddings and a vector database. They demonstrate storing embeddings in Azure AI Search, automatically creating indexes and vector profiles, and querying them for semantically relevant matches. In the demo, a set of text documents is indexed, and the copilot answers questions based solely on that content, maintaining chat history for context. Finally, they showcase a ready-to-run copilot reference app that combines local memory, plugins, and external data sources, serving as a starting point for integrating Semantic Kernel into web or enterprise applications. The episode closes with a comparison of Semantic Kernel and LangChain, and an invitation to the community to continue exploring real-world AI integration scenarios.
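The memory concept can be reduced to a tiny in-memory stand-in for the vector store. In the episode the store is Azure AI Search with real embeddings; here the "embedding" is just word counts over a made-up vocabulary, enough to show the add/search shape:

```python
import math
import re

class VectorMemory:
    """Tiny in-memory stand-in for a vector store such as Azure AI Search."""

    def __init__(self, embed):
        self.embed = embed  # embedding function: text -> list of floats
        self.items = []     # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, self.embed(text)))

    def search(self, query, top_k=1):
        qv = self.embed(query)
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self.items, key=lambda it: cos(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

def bag_of_words_embed(text, vocab=("vacation", "laptop", "policy", "office")):
    """Toy embedding: word counts over a tiny fixed vocabulary (not a real model)."""
    words = re.findall(r"\w+", text.lower())
    return [float(words.count(w)) for w in vocab]
```

The copilot flow then mirrors the demo: store document chunks in the memory, retrieve the best matches for each user question, and hand them to the LLM together with the chat history.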
Don’t miss it! Rewatch each episode to discover how you can take your applications to the next level with Microsoft AI.
Enhance your AI skills with the Season of AI for Developers learning collection on Microsoft Learn — packed with resources, hands-on labs, and guidance to help you apply everything from the series in real-world scenarios. Explore the collection here.
For additional technical content, insights, and related community discussions, check out the Season of AI for Developers page on Microsoft Tech Community.
Stay curious, keep learning!
Written by

Pablo Piovano
As a Microsoft MVP in AI, I am passionate about applying artificial intelligence and machine learning to enhance our solutions and transform our customers' and partners' businesses. I communicate our technological vision and AI strategies and collaborate with IT strategy and Microsoft technology solutions development teams. I also implement and oversee AI solutions, from ideation to implementation and maintenance, ensuring customer satisfaction and quality standards. Additionally, I share my knowledge and expertise by writing publications and participating in webinars on AI, Generative AI, Azure AI Services, and Power Platform.