My Favorite OpenAI Agents SDK Feature (And The Most Understated!)

In our previous tutorial, we built a restaurant customer support chatbot using OpenAI's Agents SDK. In this follow-up, we’ll explore guardrails—a critical feature that enhances AI chatbot safety and reliability.

What Are Guardrails in AI Agents?

Guardrails act as a safety net for AI agents, ensuring they operate within predefined boundaries and preventing misuse.

They run alongside agents, validating both user inputs and agent outputs to safeguard against errors and inappropriate responses.

There are two types of guardrails:

  • Input Guardrails: Validate user inputs before processing.

  • Output Guardrails: Ensure the final response is appropriate before delivering it to the user.

Let’s see them in action!

Input Guardrails

  • These validate the initial user input before it reaches a slower, more expensive model, so misuse can be rejected early and cheaply.

  • They operate in three steps: receiving input, running validation functions, and triggering errors if misuse is detected.

Input guardrails validate, sanitize, and preprocess user input before it reaches an AI model. They help prevent:

  • Malicious injections (e.g., prompt injection attacks)

  • Profanity, hate speech, and harmful language

  • Unstructured or irrelevant input that reduces model efficiency

  • Bias amplification

By implementing input guardrails, developers can ensure that AI models receive well-structured and appropriate input, leading to better and safer outputs.

Why Are Input Guardrails Important?

  1. Security: Helps block prompt injections, SQL injections, and other adversarial inputs before they reach the agent.

  2. Quality Assurance: Filters out irrelevant or poorly structured queries.

  3. Bias Mitigation: Helps remove explicit bias in prompts.

  4. User Experience: Ensures clear and understandable input for meaningful responses.

  5. Compliance: Adheres to ethical AI principles and regulatory requirements.

How to Implement Input Guardrails

Create a guardrail agent, then call it from a function decorated with @input_guardrail.

The SDK documentation covers guardrails in more detail.

Input guardrails run in 3 steps:

  1. First, the guardrail receives the same input passed to the agent.

  2. Next, the guardrail function runs to produce a GuardrailFunctionOutput, which is then wrapped in an InputGuardrailResult.

  3. Finally, we check if .tripwire_triggered is true. If true, an InputGuardrailTripwireTriggered exception is raised, so you can appropriately respond to the user or handle the exception.
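
The three steps above can be sketched in plain Python. This is only a rough model of the flow, not the SDK's API: SketchGuardrailOutput and SketchInputTripwire are hypothetical stand-ins that mirror the shape of GuardrailFunctionOutput and InputGuardrailTripwireTriggered, and the keyword list is an assumption made for illustration (the tutorial's guardrail asks a guardrail agent instead of matching words).

```python
from dataclasses import dataclass


# Hypothetical stand-in that only mirrors the shape of the SDK's
# GuardrailFunctionOutput; it is not the real class.
@dataclass
class SketchGuardrailOutput:
    output_info: str
    tripwire_triggered: bool


class SketchInputTripwire(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


# Assumed word list, for illustration only.
ABUSIVE_WORDS = {"pathetic", "useless", "stupid"}


def abuse_guardrail(user_input: str) -> SketchGuardrailOutput:
    """Step 2: inspect the input and report whether the tripwire should fire."""
    words = {w.strip(".,!?").lower() for w in user_input.split()}
    hits = sorted(words & ABUSIVE_WORDS)
    return SketchGuardrailOutput(
        output_info=f"abusive words: {hits}" if hits else "clean",
        tripwire_triggered=bool(hits),
    )


def check_input(user_input: str) -> SketchGuardrailOutput:
    # Step 1: the guardrail receives the same input the agent would get.
    result = abuse_guardrail(user_input)
    # Step 3: a triggered tripwire is turned into an exception.
    if result.tripwire_triggered:
        raise SketchInputTripwire(result.output_info)
    return result
```

Calling check_input("Where is my order?") returns a result with tripwire_triggered set to False, while an input containing one of the listed words raises SketchInputTripwire.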


Pass the guardrail to the triage agent through its input_guardrails argument.

Handle the InputGuardrailTripwireTriggered exception.

With the input “Why is my order delayed? You guys are pathetic”, the guardrail trips and the exception is raised.
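
A minimal sketch of the attach-and-handle pattern, again in plain Python rather than the SDK itself. SketchAgent, SketchInputTripwire, no_insults, and the fallback message are all hypothetical names chosen for illustration; in the real SDK the agent, the runner, and the exception come from the agents package.

```python
from typing import Callable, List


class SketchInputTripwire(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


def no_insults(user_input: str) -> None:
    # Assumed rule, for illustration only.
    if "pathetic" in user_input.lower():
        raise SketchInputTripwire("abusive language detected")


class SketchAgent:
    """Hypothetical stand-in for an agent that carries input guardrails."""

    def __init__(self, name: str, input_guardrails: List[Callable[[str], None]]):
        self.name = name
        self.input_guardrails = input_guardrails

    def run(self, user_input: str) -> str:
        # In this sketch, every guardrail runs before the simulated reply.
        for guardrail in self.input_guardrails:
            guardrail(user_input)
        return f"{self.name} handled: {user_input}"


# The guardrail is passed to the agent when it is created.
triage_agent = SketchAgent("triage_agent", input_guardrails=[no_insults])


def answer(user_input: str) -> str:
    try:
        return triage_agent.run(user_input)
    except SketchInputTripwire:
        # Reply politely instead of letting the exception reach the user.
        return "Sorry, I can't help with that message. Could you rephrase it?"
```

Catching the exception at the call site keeps the decision about what to tell the user in application code, separate from the guardrail logic.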

Output Guardrails

  • These validate the final outputs generated by agents before they are delivered to users.

  • They operate similarly to input guardrails but focus on the output stage to ensure accuracy and safety.

Output guardrails run in 3 steps:

  1. First, the guardrail receives the output produced by the agent.

  2. Next, the guardrail function runs to produce a GuardrailFunctionOutput, which is then wrapped in an OutputGuardrailResult.

  3. Finally, we check if .tripwire_triggered is true. If true, an OutputGuardrailTripwireTriggered exception is raised, so you can appropriately respond to the user or handle the exception.

In this example, we check whether the order_agent's response contains the word “card” and raise an exception if it does.

Add the output guardrail to the final agent in the workflow through its output_guardrails argument.

To test this, we simulated the order_agent's response for order 12346 so that it contains the word “card”. The guardrail trips, and the OutputGuardrailTripwireTriggered exception is caught.
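
The “card” check can be sketched as follows, in plain Python rather than the SDK. The names fake_order_agent, SketchGuardrailOutput, and SketchOutputTripwire are hypothetical stand-ins, the response texts are invented, and order 12345 is an extra order added for contrast; only order 12346 and the word “card” come from the tutorial.

```python
from dataclasses import dataclass


# Hypothetical stand-in that mirrors the shape of GuardrailFunctionOutput.
@dataclass
class SketchGuardrailOutput:
    output_info: str
    tripwire_triggered: bool


class SketchOutputTripwire(Exception):
    """Hypothetical stand-in for OutputGuardrailTripwireTriggered."""


def fake_order_agent(order_id: str) -> str:
    # Simulated responses; order 12346 deliberately mentions "card".
    responses = {
        "12345": "Order 12345 is out for delivery.",
        "12346": "Order 12346 was charged to your card.",
    }
    return responses.get(order_id, "Order not found.")


def card_guardrail(agent_output: str) -> SketchGuardrailOutput:
    # The guardrail inspects the agent's output, not the user's input.
    found = "card" in agent_output.lower()
    return SketchGuardrailOutput(
        output_info="mentions 'card'" if found else "ok",
        tripwire_triggered=found,
    )


def respond(order_id: str) -> str:
    draft = fake_order_agent(order_id)
    result = card_guardrail(draft)
    if result.tripwire_triggered:
        raise SketchOutputTripwire(result.output_info)
    return draft


def safe_respond(order_id: str) -> str:
    try:
        return respond(order_id)
    except SketchOutputTripwire:
        # The flagged draft is never shown to the user.
        return "Sorry, I can't share that information."
```

A plain substring test like this also fires on words such as “discard”, so a real guardrail would likely need a stricter rule or a model-based check.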

Code

https://github.com/zahere-dev/openai-agents-sdk-tutorial

Conclusion

  • Guardrails are vital components of AI agent systems, ensuring they operate safely and efficiently.

  • By implementing guardrails, developers can enhance user trust and prevent misuse scenarios effectively.

Written by

Zahiruddin Tavargere

I am a Journalist-turned-Software Engineer. I love coding and the associated grind of learning every day. A firm believer in social learning, I owe my dev career to all the tech content creators I have learned from. This is my contribution back to the community.