Comment Moderation API with FastAPI and Google Generative AI

In this guide, we'll develop a comment moderation API using Python's FastAPI framework and Google Generative AI. The API will analyze comments for potentially harmful content and return a JSON object indicating if the comment should be flagged. This tutorial walks through each step of the code and explains how it works.

Project Setup and Requirements

This project leverages:

  • FastAPI: A modern, fast (high-performance) web framework for building APIs.

  • Google Generative AI: The google-generativeai SDK, which gives access to the Gemini models used here to classify comments.

  • Pydantic: For data validation and parsing.

  • Dotenv (python-dotenv): Loads configuration such as API keys from a .env file instead of hard-coding it.

Before running the code, install the required packages:

pip install fastapi google-generativeai pydantic python-dotenv

Code Breakdown

1. Initializing FastAPI and Setting Up Environment Variables

First, we import necessary modules and initialize our FastAPI app:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import os
import json
from dotenv import load_dotenv
import google.generativeai as genai

app = FastAPI()

Next, load_dotenv() reads environment variables from a .env file, which keeps sensitive data such as API keys out of the source code:

# Load environment variables from .env file
load_dotenv()
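
For reference, the .env file lives next to your application file and holds the key that the configuration step below reads (replace the placeholder value with your real key):

# .env
GOOGLE_API_KEY=your_api_key_here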

2. Configuring Google Generative AI

To use Google Generative AI, configure it with an API key stored in your .env file:

genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")  # Use the appropriate model version

The GenerativeModel object allows you to specify the AI model to use. In this case, we are using "gemini-1.5-flash".

3. Setting up Pydantic for Data Validation

Using Pydantic's BaseModel, we define a class for incoming comment data. Pydantic ensures that the data is structured correctly, reducing errors.

class CommentData(BaseModel):
    comment: str
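
To see what this validation buys us, here is a small standalone sketch (not part of the API) showing the error raised when the comment field is missing; FastAPI converts the same failure into an automatic 422 response for the client:

from pydantic import BaseModel, ValidationError

class CommentData(BaseModel):
    comment: str

try:
    CommentData()  # the required "comment" field is missing, so this raises
except ValidationError as error:
    print(error)

print(CommentData(comment="hello").comment)  # prints: hello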

4. Creating the Moderation Endpoint

The main logic of the API resides in this endpoint. Here’s the detailed breakdown:

@app.post("/moderate/")
def moderate_comment(comment_data: CommentData):
    """
    Analyzes the given comment for harmful content.

    Args:
        comment_data (CommentData): A Pydantic model containing the comment to be moderated.

    Returns:
        dict: A dictionary with a single key "result" whose value is "True" if the comment is harmful and "False" otherwise.
    """
    comment = comment_data.comment

When a POST request is made to /moderate/, the API receives a JSON object with a single field, comment, which it then analyzes.

5. Constructing the Moderation Prompt

The prompt instructs Google Generative AI to check the comment for harmful content and to reply with nothing but a small JSON object, which keeps parsing simple.

prompt = f"""
Role: You are a sophisticated content moderation system designed to analyze text comments for harmful content.

Task: Given a comment, you must strictly check for the presence of any racist, harmful, misleading, or abusive content.
Your output must be a JSON object with a single key `result` whose value is the string "True" or "False". If any of the
specified harmful attributes are detected, return "True". Otherwise, return "False".

Important: Return only the JSON object, without any other text or formatting.

Comment: "{comment}"
"""

6. Sending the Request to Google Generative AI

To obtain a moderation response, we send the constructed prompt to the AI model.

try:
    # Generate content using the model
    response = model.generate_content(prompt)

    # Access the candidates returned by the model
    candidates = response.candidates

The response may contain several candidates; we only work with the first one, candidates[0].

7. Handling Safety Stops and Extracting JSON Content

If Gemini's own safety filters block the comment, the candidate's finish reason is reported as SAFETY. We treat that as a positive detection and flag the comment immediately:

# Check if the response triggered a safety stop
# (finish_reason may be an enum rather than a plain string depending on the SDK version,
#  so compare its textual form)
if candidates and hasattr(candidates[0], 'finish_reason') and str(candidates[0].finish_reason).endswith("SAFETY"):
    return {"result": "True"}  # Automatically mark as harmful

If no "SAFETY" stop occurs, we analyze the response’s content:

if candidates and hasattr(candidates[0], 'content'):
    content = candidates[0].content
    if hasattr(content, 'parts') and content.parts:
        response_text = content.parts[0].text.strip()

This code pulls the raw text out of the first candidate so it can be cleaned up and parsed in the next step.

8. Cleaning Up and Parsing the Response

The model sometimes wraps its reply in a Markdown code fence, so we strip the fence markers before parsing:

# Clean up the response (remove backticks and extraneous text)
if response_text.startswith("```json"):
    response_text = response_text[7:].strip()
if response_text.endswith("```"):
    response_text = response_text[:-3].strip()
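
As a quick illustration of what this cleanup achieves, here is the same logic applied to a made-up fenced reply:

# Hypothetical raw reply wrapped in a Markdown code fence
response_text = '```json\n{"result": "False"}\n```'
if response_text.startswith("```json"):
    response_text = response_text[7:].strip()
if response_text.endswith("```"):
    response_text = response_text[:-3].strip()
print(response_text)  # {"result": "False"}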

After cleanup, we parse the JSON string:

# Normalize quotes (single to double) so the text is valid JSON
valid_json = response_text.replace("'", '"')
result = json.loads(valid_json)
return result

With the quotes normalized, json.loads turns the model's reply into a Python dictionary, which is returned to the caller.

9. Error Handling

If any errors occur during processing, a 500 HTTP response is returned with an error message:

except Exception as e:
    raise HTTPException(status_code=500, detail=f"Error processing the comment: {str(e)}")
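
When this branch is hit, FastAPI serializes the HTTPException into a JSON body, so the client sees a response along these lines (the message varies with the underlying error):

{
  "detail": "Error processing the comment: <original error message>"
}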

Complete Code

Below is the complete code for our content moderation API:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import os
import json
from dotenv import load_dotenv
import google.generativeai as genai

app = FastAPI()

# Load environment variables from .env file
load_dotenv()

# Configure Google Generative AI API with API key
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")  # You can use the appropriate model version

# Pydantic model to handle the input
class CommentData(BaseModel):
    comment: str

@app.post("/moderate/")
def moderate_comment(comment_data: CommentData):
    """
    Analyzes the given comment for harmful content.

    Args:
        comment_data (CommentData): A Pydantic model containing the comment to be moderated.

    Returns:
        dict: A dictionary with a single key "result" whose value is "True" if the comment is harmful and "False" otherwise.
    """
    comment = comment_data.comment

    # Construct the moderation prompt
    prompt = f"""
    Role: You are a sophisticated content moderation system designed to analyze text comments for harmful content.

    Task: Given a comment, you must strictly check for the presence of any racist, harmful, misleading, or abusive content.
    Your output must be a JSON object with a single key `result` whose value is the string "True" or "False". If any of the
    specified harmful attributes are detected, return "True". Otherwise, return "False".

    Important: Return only the JSON object, without any other text or formatting.

    Comment: "{comment}"
    """

    default_response = {"result": "True"}  # Conservative default: flag the comment if the response cannot be parsed

    try:
        # Generate content using the model
        response = model.generate_content(prompt)

        # Access candidates from the response
        candidates = response.candidates

        # Check if the response triggered a safety stop
        # (finish_reason may be an enum rather than a plain string depending on the SDK version,
        #  so compare its textual form)
        if candidates and hasattr(candidates[0], 'finish_reason') and str(candidates[0].finish_reason).endswith("SAFETY"):
            return {"result": "True"}  # Automatically mark as harmful

        # If no safety stop, process the first candidate
        if candidates and hasattr(candidates[0], 'content'):
            content = candidates[0].content
            if hasattr(content, 'parts') and content.parts:
                response_text = content.parts[0].text.strip()

                # Clean up the response (remove backticks and extraneous text)
                if response_text.startswith("```json"):
                    response_text = response_text[7:].strip()
                if response_text.endswith("```"):
                    response_text = response_text[:-3].strip()

                # Normalize quotes (single to double) so the text is valid JSON
                valid_json = response_text.replace("'", '"')

                # Parse the valid JSON
                result = json.loads(valid_json)
                return result

        # In case of any issues, return default response
        return default_response

    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error processing the comment: {str(e)}")

Deployment and Testing

Running the FastAPI Server

To run this FastAPI application locally, use the following command:

uvicorn main:app --reload

This starts the server on http://127.0.0.1:8000, where the /moderate/ endpoint can now accept POST requests with comment data.

Testing the API with cURL or Postman

To test the moderation endpoint, you can use a tool like Postman or cURL. Here’s an example cURL command to send a POST request:

curl -X POST "http://127.0.0.1:8000/moderate/" -H "Content-Type: application/json" -d '{"comment": "This is a test comment"}'

You should receive a response indicating whether the comment contains harmful content:

{
  "result": "False"
}

If harmful or inappropriate content is detected, "result": "True" will be returned.
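
If you prefer testing from Python, a short script using the requests package (assumed to be installed separately) does the same thing:

import requests

resp = requests.post(
    "http://127.0.0.1:8000/moderate/",
    json={"comment": "This is a test comment"},
)
print(resp.status_code)  # 200 on success
print(resp.json())       # e.g. {"result": "False"}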

Enhancements and Considerations

1. Improving Accuracy with Additional Parameters

Google Generative AI models can be fine-tuned or configured to improve content moderation accuracy. Testing and adjusting prompt phrasing may also yield better responses. Monitoring API output in real-time will help you refine the model's moderation criteria over time.
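
For example, the google-generativeai SDK accepts a generation_config: lowering the temperature makes the classification more deterministic, and recent SDK versions can ask Gemini 1.5 models for raw JSON output, which would make the fence-stripping step unnecessary. A sketch, assuming your installed SDK version supports these options:

# Assumes a google-generativeai version that supports GenerationConfig and response_mime_type
response = model.generate_content(
    prompt,
    generation_config=genai.types.GenerationConfig(
        temperature=0.0,                        # less randomness for a yes/no classification
        response_mime_type="application/json",  # request plain JSON, no Markdown fences
    ),
)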

2. Handling Sensitive Topics

Certain sensitive topics may need additional logic or specific moderation criteria. Depending on your use case, you can add additional checks before finalizing the moderation result.
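
As one example of such a check, a small blocklist lookup before the model call can enforce terms your policy always rejects; the terms below are placeholders for your own list:

# Hypothetical, policy-specific blocklist; the terms here are placeholders
BLOCKED_TERMS = {"example-slur", "example-threat"}

def violates_blocklist(comment: str) -> bool:
    lowered = comment.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

# Inside moderate_comment, before calling the model:
# if violates_blocklist(comment):
#     return {"result": "True"}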

3. Logging and Analytics

Adding logging and analytics to your API could help track flagged comments, understand trends in moderation requests, and improve user experience by better addressing common content issues.
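
A minimal starting point is Python's built-in logging module; the sketch below records the outcome of each moderation call so flagged comments can be counted later (how and where you persist logs is up to you):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("moderation")

# Inside moderate_comment, just before returning the result:
# logger.info("comment_length=%d result=%s", len(comment), result["result"])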

4. Rate Limiting and Throttling

To avoid excessive requests to the Google API (and potential costs), consider adding rate limiting or caching to handle repeated requests effectively.
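
As a sketch of the caching idea, an in-process dictionary keyed by the comment text avoids paying for the same classification twice; this is deliberately naive and will not work across multiple worker processes:

# Naive in-memory cache: identical comments are only sent to the model once
_moderation_cache: dict = {}

# Inside moderate_comment, before calling the model:
# cached = _moderation_cache.get(comment)
# if cached is not None:
#     return cached

# ...and after a successful parse:
# _moderation_cache[comment] = result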

5. Error Handling and Retries

Integrate retry logic for network errors when calling the Google API. Adding structured error handling with more informative error messages will help both developers and end-users understand what went wrong in edge cases.
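
One way to add retries is a small helper with exponential backoff around the model call; this is a sketch, and in practice you would narrow the except clause to the transient errors you actually want to retry:

import time

def generate_with_retries(prompt: str, attempts: int = 3):
    """Call the model, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return model.generate_content(prompt)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1s, then 2s, ... before retrying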

Security Considerations

  • Environment Variables: Keep sensitive information, like API keys, out of your source code and version control by using environment variables stored in a .env file; a fail-fast check for the key is sketched after this list.

  • Input Validation: Always validate and sanitize input data, especially when dealing with user-generated content.

  • API Key Management: Restrict your Google API key permissions to only the services required for this application and regularly rotate API keys for security.
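
For the first point, a small fail-fast check at startup surfaces a missing key immediately instead of producing a confusing error on the first request (a sketch of how the configuration step could be hardened):

api_key = os.getenv("GOOGLE_API_KEY")
if not api_key:
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
genai.configure(api_key=api_key)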

Conclusion

This API leverages the power of Google Generative AI to automate comment moderation for harmful or inappropriate content. The setup can be further extended to handle different types of content or support more sophisticated moderation tasks. Integrating this API with a frontend or a backend system enables it to support a live application, ensuring that user-generated content stays respectful and safe.

This tutorial covered creating a robust, AI-powered content moderation API using FastAPI and Google Generative AI. This API effectively assesses comments for harmful content and flags those that might need review. Such a solution is valuable for any platform that hosts user comments, providing an automated, scalable, and efficient way to maintain safe online spaces.

With this setup, you can easily integrate this API into a larger application, such as a social media platform, online community, or customer review system. Adapting the API to various forms of user input or expanding its moderation capabilities makes it a powerful tool for modern content management.

FAQ

1. Can I use a different AI model for content moderation?
Yes, Google Generative AI is one option, but other platforms like OpenAI or custom NLP models could also be used, depending on your needs and budget.

2. What other content types can this model moderate?
This API moderates text-based comments but could be adapted to handle other forms, such as short-form text in emails, messages, or even titles.

3. How do I improve the accuracy of the moderation?
Experimenting with prompt engineering, using more precise language in your prompts, and fine-tuning the model where available can significantly enhance moderation accuracy.

4. Is there a way to get more information on why a comment was flagged?
The response only provides a binary result for simplicity. However, you can customize the prompt to return more detailed explanations, which could be helpful for deeper insights.

5. How does this solution handle updates to harmful content detection standards?
The model’s parameters and prompt structure can be updated as moderation standards evolve, and if Google’s models improve, you can switch to a more advanced version for better results.

This content moderation API is a scalable, efficient, and user-friendly solution that will help keep digital spaces safe, allowing administrators and users alike to maintain focus on engaging content without worrying about harmful interactions.

Written by Shivam Vishwakarma