How to get structured outputs from OpenAI every time

GuruGen

Get Structured Outputs from OpenAI Using Pydantic

Are you struggling to get structured outputs from OpenAI? Tired of inconsistent formats and hallucinations? 🤯

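Well, worry no more! OpenAI now supports Structured Outputs, which the Python SDK exposes under its beta namespace. The feature guarantees the shape of the response, not the accuracy of its contents. Here's how you can take advantage of it.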

Using Pydantic Models for Structured Outputs

First, you need to define your Pydantic models, which serve as a schema for the values you want to extract or generate.

Step 1: Define Your Schema

Let's define a schema to extract novel details from OpenAI. Create a file called schema.py and add the following:

from pydantic import BaseModel, Field
from typing import List
from datetime import date

class NovelDetails(BaseModel):
    novel_name: str = Field(..., description="The name of the novel")
    writer_name: str = Field(..., description="The author's name")
    year_published: date = Field(..., description="The publication date of the novel (YYYY-MM-DD)")

class Novel(BaseModel):
    novels: List[NovelDetails] = Field(..., description="List of novels with their details")
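To see what this schema actually asks the model for, you can print the JSON schema that Pydantic generates from it. The snippet below is a minimal sketch, assuming Pydantic v2 (where `model_json_schema()` is available); it repeats the two models so it runs on its own.

```python
from datetime import date
from typing import List

from pydantic import BaseModel, Field


class NovelDetails(BaseModel):
    novel_name: str = Field(..., description="The name of the novel")
    writer_name: str = Field(..., description="The author's name")
    year_published: date = Field(..., description="The publication date of the novel (YYYY-MM-DD)")


class Novel(BaseModel):
    novels: List[NovelDetails] = Field(..., description="List of novels with their details")


# Pydantic v2 places nested models under "$defs" in the generated schema.
schema = Novel.model_json_schema()
details_schema = schema["$defs"]["NovelDetails"]

# Every field declared with Field(...) is required.
print(details_schema["required"])

# A `date` field is described as a string with the "date" format.
print(details_schema["properties"]["year_published"])
```

Note that `year_published` is typed as `date`, so the model is asked for a full ISO date string rather than a bare year.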

Step 2: Use OpenAI's Structured Output API

Now, let's use OpenAI's beta structured output API to extract these fields.

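from openai import OpenAI
from schema import Novel

# Pass the key explicitly, or omit api_key to read it from the OPENAI_API_KEY environment variable.
client = OpenAI(api_key="xxxxxxxxxxxxxxxxxxxxx")

# parse() (rather than create()) accepts a Pydantic class and returns a validated instance.
response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",  # Structured Outputs needs gpt-4o-2024-08-06 or later, or gpt-4o-mini
    messages=[
        {"role": "system", "content": "You are a book expert with vast knowledge about books. Answer user questions accurately."},
        {"role": "user", "content": "Give me a list of 10 thriller novels."}
    ],
    response_format=Novel,  # the SDK converts the model into a strict JSON schema
)

# message.parsed is already a Novel instance (None if the model refused).
parsed_data = response.choices[0].message.parsed
print(parsed_data.model_dump_json(indent=2))

What’s Happening Here?

  1. Define the schema: The Novel class describes the exact format the API response must follow.
  2. Send a request: Using client.beta.chat.completions.parse with response_format=Novel, we ask OpenAI to return data that matches the schema. Older models such as gpt-4-turbo do not support this feature.
  3. Parse the response: The SDK validates the reply against the Novel model, so message.parsed is a ready-to-use Novel instance.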
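The three steps above can be wrapped in one small function that also handles the case where the model declines to answer. This is a sketch, not the SDK's own helper: `extract_novels` is a hypothetical name, the default model name is an assumption, and `client` is expected to be an `openai.OpenAI` instance (it is passed in so the snippet carries no network dependency of its own).

```python
from datetime import date
from typing import List

from pydantic import BaseModel, Field


class NovelDetails(BaseModel):
    novel_name: str = Field(..., description="The name of the novel")
    writer_name: str = Field(..., description="The author's name")
    year_published: date = Field(..., description="The publication date of the novel (YYYY-MM-DD)")


class Novel(BaseModel):
    novels: List[NovelDetails] = Field(..., description="List of novels with their details")


def extract_novels(client, prompt: str, model: str = "gpt-4o-2024-08-06") -> Novel:
    """Hypothetical helper: request a Novel and fail loudly if none comes back."""
    completion = client.beta.chat.completions.parse(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        response_format=Novel,
    )
    message = completion.choices[0].message

    # When the model declines, `refusal` carries the reason and `parsed` is None.
    if message.refusal:
        raise RuntimeError(f"Model refused the request: {message.refusal}")
    if message.parsed is None:
        raise RuntimeError("No parsed output was returned")
    return message.parsed
```

With a real client, usage would look like `extract_novels(OpenAI(), "Give me a list of 10 thriller novels.")`.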

Output Example (truncated to two entries)

{
  "novels": [
    {
      "novel_name": "Gone Girl",
      "writer_name": "Gillian Flynn",
      "year_published": "2012-05-24"
    },
    {
      "novel_name": "The Girl with the Dragon Tattoo",
      "writer_name": "Stieg Larsson",
      "year_published": "2005-08-23"
    }
  ]
}
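If you store this JSON and load it again later, the same models can re-validate it. The sketch below assumes Pydantic v2 (`model_validate_json`) and uses a shortened copy of the output above; the malformed payload at the end is made up purely to show what a validation failure looks like.

```python
from datetime import date
from typing import List

from pydantic import BaseModel, Field, ValidationError


class NovelDetails(BaseModel):
    novel_name: str = Field(..., description="The name of the novel")
    writer_name: str = Field(..., description="The author's name")
    year_published: date = Field(..., description="The publication date of the novel (YYYY-MM-DD)")


class Novel(BaseModel):
    novels: List[NovelDetails] = Field(..., description="List of novels with their details")


raw = """
{
  "novels": [
    {"novel_name": "Gone Girl", "writer_name": "Gillian Flynn", "year_published": "2012-05-24"},
    {"novel_name": "The Girl with the Dragon Tattoo", "writer_name": "Stieg Larsson", "year_published": "2005-08-23"}
  ]
}
"""

# Valid JSON becomes typed objects; the date strings are converted to datetime.date.
result = Novel.model_validate_json(raw)
for novel in result.novels:
    print(f"{novel.novel_name} by {novel.writer_name} ({novel.year_published.year})")

# A made-up payload with two required fields missing is rejected.
bad = '{"novels": [{"novel_name": "Untitled"}]}'
try:
    Novel.model_validate_json(bad)
    error_count = 0
except ValidationError as exc:
    error_count = exc.error_count()
print(error_count)
```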