Step-by-Step: Build a Search-Enhanced Gemini API with Flask

Ever wanted to build an application that doesn't just store information, but can understand, research, and converse about it? The rise of powerful Large Language Models (LLMs) like Google's Gemini makes this more accessible than ever.

But what if you want your AI to have access to up-to-the-minute information from the web? That's where combining an LLM with a search engine comes in handy!

In this post, we'll walk step-by-step through creating a powerful backend API using Flask, Google Gemini, and DuckDuckGo Search. This API will be able to:

Chat conversationally, remembering context.
Optionally perform web searches using DuckDuckGo to inform its chat responses.
Handle direct queries, using web search results to generate detailed answers.
Offer a "deep analysis" mode for more comprehensive responses.

Let's dive in!

(Suggested Cover Image: A graphic combining logos of Flask, Google Gemini, DuckDuckGo, and maybe a brain/network graphic)

Prerequisites

Before we start coding, make sure you have:

Python 3 installed.
pip (Python package installer).
A Google Cloud Project set up with the Generative AI API (Gemini) enabled. You'll need an API Key. (You can often get free tier access to start).
Basic understanding of Python and Flask (though we'll explain each step).
A text editor or IDE (like VS Code)

or you can clone the repository (ensure to fork and contribute)

%[https://github.com/Joshuaatanu/search-ai-beta]

Step 1: Project Setup & Dependencies

First, let's set up our project environment.

Create a Project Directory:

 mkdir flask-gemini-api
 cd flask-gemini-api

Create a Virtual Environment: (Highly recommended!)

 # On macOS/Linux
 python3 -m venv venv
 source venv/bin/activate

 # On Windows
 python -m venv venv
 .\venv\Scripts\activate

Install Required Libraries: We need Flask for the web server, Flask-CORS to allow frontend access, google-generativeai for Gemini, duckduckgo-search for web searches, python-dotenv to manage our API key securely, and requests (often useful, included here).

Create a file named requirements.txt and add the following lines:
```
 Flask
 Flask-CORS
 google-generativeai
 duckduckgo-search
 python-dotenv
 requests
```
Now, install them:
```
 pip install -r requirements.txt
```
Set up Environment Variables: Create a file named .env in your project root. Never commit this file to Git! Add your Gemini API key like this:
```
 GEMINI_API_KEY=YOUR_GOOGLE_GEMINI_API_KEY_HERE
```
Replace YOUR_GOOGLE_GEMINI_API_KEY_HERE with your actual key.

Step 2: Building the Flask App (`app.py`)

Create a file named app.py. This is where our core logic will reside.

Imports and Initial Setup:

# app.py
from flask import Flask, request, jsonify, render_template
from flask_cors import CORS
import requests
from duckduckgo_search import DDGS
import os
from dotenv import load_dotenv
# Import the Gemini client from Google's generative AI library
import google.generativeai as genai

# Load environment variables (like our API key)
load_dotenv()

# Initialize Flask app
app = Flask(__name__)
# Enable CORS (Cross-Origin Resource Sharing) - Allows requests from web browsers
CORS(app)

print("Flask App Initialized and CORS enabled.") # Debug print

We import necessary libraries, load the .env file to access GEMINI_API_KEY, initialize Flask, and enable CORS.

DuckDuckGo Search Integration:

We need functions to query DuckDuckGo and format the results.

# app.py (continued)

def query_duckduckgo_text(query):
    """
    Uses DuckDuckGo's text search via the duckduckgo_search library to get search results.
    """
    try:
        print(f"Attempting DuckDuckGo search for: '{query}'") # Debug print
        ddgs = DDGS()
        # We limit results to keep context concise for the LLM
        results = ddgs.text(keywords=query, region="wt-wt", safesearch="moderate", max_results=3)
        print(f"DDG Search Results for '{query}': {results}") # Debugging print
        return results if results else [] # Ensure empty list if no results
    except Exception as e:
        print(f"DuckDuckGo text search error for query '{query}': {e}")
        return [] # Return empty list on error

def generate_search_context(search_results):
    """
    Generates a combined text string from search results to provide context to Gemini.
    """
    if not search_results: # Handle cases where search fails or returns nothing
        print("No search results provided to generate_search_context.") # Debug print
        return "No relevant search results found."
    combined_text = ""
    for result in search_results:
        title = result.get("title", "No title")
        snippet = result.get("body", "No snippet available")
        url = result.get("href", "")
        combined_text += f"Title: {title}\nSnippet: {snippet}\nURL: {url}\n\n"
    # print(f"Generated Search Context:\n{combined_text}") # Debug print (can be long)
    return combined_text.strip() # Remove trailing newline

query_duckduckgo_text: Takes a search query, uses DDGS().text() to fetch results (limiting to 3), and returns them. Includes basic error handling and debug prints.
generate_search_context: Takes the list of results and formats them into a clean string that Gemini can easily understand. Handles empty results.

Google Gemini Integration:

Now for the AI magic! We need functions to interact with the Gemini API.

# app.py (continued)

def query_gemini(prompt, deep_analysis=False):
    """
    Sends a direct prompt to Gemini, configuring based on deep_analysis.
    Used primarily by the /api/query endpoint.
    """
    try:
        print(f"Configuring Gemini API for query (deep_analysis={deep_analysis})...") # Debug
        genai.configure(api_key=os.getenv("GEMINI_API_KEY"))

        model_name = "gemini-1.5-flash" # Use a capable model
        config = {
             "max_output_tokens": 2048 if deep_analysis else 1024,
             "temperature": 0.7 if deep_analysis else 0.4 # Adjusted temps
        }
        print(f"Using model: {model_name}, Config: {config}") # Debug

        model = genai.GenerativeModel(model_name, generation_config=config)

        print("Sending prompt to Gemini for query...") # Debug
        response = model.generate_content(prompt)

        # --- Safety Check ---
        if not response.parts:
             print("Gemini Warning: Response has no parts (potentially blocked). Checking candidates...") # Debug
             if response.candidates:
                 candidate = response.candidates[0]
                 print(f"Candidate finish reason: {candidate.finish_reason}") # Debug
                 if hasattr(candidate, 'safety_ratings') and candidate.safety_ratings:
                     print(f"Safety Ratings: {candidate.safety_ratings}") # Debug
                     # Provide a more informative error based on ratings if possible
                     blocked_categories = [rating.category for rating in candidate.safety_ratings if rating.probability in ['MEDIUM', 'HIGH']]
                     if blocked_categories:
                         return f"The response could not be generated because it was blocked due to content related to: {', '.join(map(str, blocked_categories))}."
             # Fallback error message
             return "The response could not be generated, possibly due to safety filters or an internal issue."

        print("Received response from Gemini for query.") # Debug
        return response.text

    except Exception as e:
        print(f"Gemini API direct query error: {e}")
        import traceback
        traceback.print_exc() # Print full traceback for debugging
        return None


def query_gemini_chat(messages, context=None, deep_analysis=False, use_search=False):
    """
    Uses the Gemini API chat endpoint, managing history, context, search, and deep analysis.
    """
    try:
        print(f"Configuring Gemini API for chat (deep_analysis={deep_analysis}, use_search={use_search})...") # Debug
        genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
        # Use a model suitable for chat, like 1.5 Flash or Pro
        model = genai.GenerativeModel("gemini-1.5-flash")

        gemini_messages = [] # History format expected by Gemini API

        # 1. Add optional context first (needs careful role handling)
        if context:
            print("Adding provided context to chat history.") # Debug
            gemini_messages.append({"role": "user", "parts": [{"text": f"Important Context for our Conversation: {context}"}]})
            # Add a model acknowledgment to maintain turn structure
            gemini_messages.append({"role": "model", "parts": [{"text": "Understood. I will keep that context in mind."}]})

        # 2. Perform search if requested and add results
        if use_search and messages: # Need a message to search based on
            last_user_message = next((msg for msg in reversed(messages) if msg.get('role') == 'user'), None)
            if last_user_message:
                last_user_message_content = last_user_message['content']
                print(f"Performing search for chat query: '{last_user_message_content}'") # Debug
                search_results = query_duckduckgo_text(last_user_message_content)
                search_context_text = generate_search_context(search_results)
                if search_context_text and search_context_text != "No relevant search results found.":
                    print("Adding search results to chat history.") # Debug
                    # Add search results as user input, then a model acknowledgement
                    gemini_messages.append({"role": "user", "parts": [{"text": f"I found these web search results related to your last message:\n{search_context_text}"}]})
                    gemini_messages.append({"role": "model", "parts": [{"text": "Thank you for the search results. I'll consider them in my response."}]})
                else:
                    print("No usable search results found or generated.") # Debug
            else:
                 print("Search enabled, but couldn't find the last user message to search for.") # Debug


        # 3. Add the actual conversation history (ensuring valid alternating roles)
        print("Processing provided message history...") # Debug
        valid_history = []
        last_role = None # Track the last role added ('user' or 'model')

        # Infer starting role based on history context added
        if gemini_messages and gemini_messages[-1]['role'] == 'model':
            last_role = 'model'

        for msg in messages:
             current_role = 'model' if msg.get('role') == 'assistant' else 'user' # Convert 'assistant' to 'model'
             # Ensure content exists
             content = msg.get('content', '').strip()
             if not content:
                 print(f"Skipping message with empty content for role {current_role}") # Debug
                 continue

             if current_role != last_role:
                 valid_history.append({"role": current_role, "parts": [{"text": content}]})
                 last_role = current_role
             else:
                 # Append content to the last message of the *same* role
                 print(f"Warning: Consecutive messages from role '{current_role}'. Appending content.") # Debug
                 if valid_history and valid_history[-1]['role'] == current_role:
                     valid_history[-1]['parts'][0]['text'] += "\n" + content
                 else:
                      # This might happen if the very first message matches an initial context role, handle gracefully
                      print(f"Edge case: Consecutive role '{current_role}' at the beginning or after context.") # Debug
                      valid_history.append({"role": current_role, "parts": [{"text": content}]})
                      # Don't update last_role here, as the sequence was already broken


        # Combine context/search preamble with the processed history
        gemini_messages.extend(valid_history)

        # Ensure the history ends with a user message if possible for send_message
        if not gemini_messages or gemini_messages[-1]['role'] != 'user':
            print("Warning: Chat history doesn't end with a user message. This might cause issues.") # Debug
            # Depending on strictness, either return an error or try to proceed cautiously.
            # If history is totally empty:
            if not gemini_messages:
                print("Error: No valid messages processed to send to Gemini chat.") # Debug
                return "Error: No conversation history to process."


        # 4. Apply deep analysis prompt engineering *if needed* to the last message
        prompt_prefix = ""
        if deep_analysis:
            print("Applying deep analysis prefix.") # Debug
            prompt_prefix = (
                "Please provide a deep and comprehensive analysis in your response. "
                "Consider multiple perspectives, potential implications, and structure your answer clearly. "
                "Based on our conversation so far:\n\n"
            )

        # Modify the *last* message's content (which should ideally be user role)
        if gemini_messages: # Check if there are messages to modify
            # Don't assume [-1] is always 'user', check it:
            if gemini_messages[-1]['role'] == 'user':
                gemini_messages[-1]['parts'][0]['text'] = prompt_prefix + gemini_messages[-1]['parts'][0]['text']
            else:
                # Handle case where last message isn't 'user' but deep analysis is requested.
                # Option: Add a new user message? Or maybe log a warning?
                print("Warning: Deep analysis requested, but the last message in history isn't 'user'. Prefix not applied directly to history.") # Debug
                 # For safety, let's *not* modify the last message if it's 'model'.
                 # Instead, perhaps the frontend should ensure the final message is 'user'.


        # 5. Start chat and send the message
        print(f"Total processed Gemini messages: {len(gemini_messages)}") # Debug

        # Gemini's `start_chat` prefers history *excluding* the very last message.
        # The last message content is sent via `send_message`.

        chat_history_for_model = []
        current_message_to_send_content = "..." # Default or placeholder

        if len(gemini_messages) > 1:
            chat_history_for_model = gemini_messages[:-1]
            current_message_to_send_content = gemini_messages[-1]['parts'][0]['text']
        elif len(gemini_messages) == 1:
             # If only one message, it's the current one, no history.
             chat_history_for_model = []
             current_message_to_send_content = gemini_messages[0]['parts'][0]['text']
        else:
             # Should have been caught earlier, but double-check
              print("Error: Cannot proceed with Gemini chat, no valid messages.") # Debug
              return "Error: Cannot start chat without messages."

        # Ensure history ends correctly before passing to start_chat if strict API behavior
        # (Check Google's API docs if specific start/end roles are mandated for history)

        chat = model.start_chat(history=chat_history_for_model)
        print(f"Sending to Gemini Chat. History length: {len(chat_history_for_model)}") # Debugging
        # print(f"Current message content being sent: {current_message_to_send_content}") # Debug (potentially very long)

        response = chat.send_message(current_message_to_send_content)

        # --- Safety Check (similar to query_gemini) ---
        if not response.parts:
             print("Gemini Chat Warning: Response has no parts (potentially blocked). Checking candidates...") # Debug
             if response.candidates:
                 candidate = response.candidates[0]
                 print(f"Candidate finish reason: {candidate.finish_reason}") # Debug
                 if hasattr(candidate, 'safety_ratings') and candidate.safety_ratings:
                     print(f"Safety Ratings: {candidate.safety_ratings}") # Debug
                     blocked_categories = [rating.category for rating in candidate.safety_ratings if rating.probability in ['MEDIUM', 'HIGH']]
                     if blocked_categories:
                         return f"The response could not be generated because it was blocked due to content related to: {', '.join(map(str, blocked_categories))}."
             return "The chat response could not be generated, possibly due to safety filters or an internal issue."

        print("Received response from Gemini Chat.") # Debug
        return response.text

    except Exception as e:
        print(f"Gemini API chat error: {e}")
        import traceback
        traceback.print_exc() # Print full traceback for debugging
        return None # Indicate error to the caller

query_gemini: Handles direct prompts, configures Gemini based on deep_analysis, and includes basic safety checks.
query_gemini_chat: Manages conversational history, optionally incorporates context and search results, handles deep_analysis prompt injection, and ensures message roles alternate correctly. Includes more detailed debug prints and safety checks.

Helper for Query Endpoint Prompting:

This function creates a specific prompt structure for the /api/query endpoint.

# app.py (continued)

def generate_answer_from_search(user_query, search_results, deep_analysis=False):
    """
    Constructs a detailed prompt for Gemini based on search results and the user query,
    adjusting instructions based on deep_analysis.
    """
    print(f"Generating answer prompt for query: '{user_query}', deep_analysis={deep_analysis}") # Debug
    combined_text = generate_search_context(search_results) # Reuse formatting

    # Base instructions for Gemini
    base_prompt = (
        "You are an AI assistant tasked with answering a user's query based *only* on the provided web search results.\n\n"
        "Follow these steps:\n"
        "1. Carefully analyze the user's query.\n"
        "2. Examine the provided search results (titles, snippets, URLs).\n"
        "3. Synthesize the information *strictly* from the relevant search results to directly address the query.\n"
    )

    # Add specific instructions based on deep_analysis mode
    if deep_analysis:
        base_prompt += (
            "4. Go beyond simple summarization. Provide a **deep and comprehensive analysis** based *only* on the provided texts.\n"
            "5. Consider multiple perspectives, potential implications, and connections between the information snippets found in the results.\n"
            "6. If technical details are present in the results, explain underlying mechanisms based on that information.\n"
            "7. Structure your response logically with clear headings or bullet points where appropriate.\n"
            "8. Do not add information not present in the search results. You can state if the results are insufficient.\n\n"
        )
    else:
        base_prompt += (
            "4. Extract the key facts and insights directly related to the query from the provided results.\n"
            "5. Provide a clear, concise, and factual answer based *only* on the search results.\n"
            "6. Avoid speculation or information not present in the provided text.\n\n"
        )

    # Combine instructions, search results, and the query into the final prompt
    prompt = (
        f"{base_prompt}"
        "--- PROVIDED SEARCH RESULTS START ---\n"
        f"{combined_text}\n"
        "--- PROVIDED SEARCH RESULTS END ---\n\n"
        "USER QUERY:\n"
        f"{user_query}\n\n"
        "Based *ONLY* on the provided search results above, generate your response now:\n"
    )

    # print(f"--- Prompt for /api/query ---\n{prompt}\n--------------------------") # Debugging (can be very long)
    print(f"Prompt generated for query '{user_query}'. Calling query_gemini...") # Debug

    # Call the simpler Gemini query function
    return query_gemini(prompt, deep_analysis)

This function builds a specific prompt telling Gemini exactly how to use the provided search snippets, changing instructions based on deep_analysis.

Flask API Endpoints:

These routes expose our functionality over HTTP.

# app.py (continued)

@app.route("/api/chat", methods=["POST"])
def handle_chat():
    """Handles POST requests to the /api/chat endpoint."""
    print("Received request to /api/chat") # Debug
    if not request.is_json:
        print("Error: Request is not JSON") # Debug
        return jsonify({"error": "Request must be JSON"}), 415

    data = request.json
    messages = data.get("messages", [])
    context = data.get("context") # Optional context string
    deep_analysis_enabled = data.get("deep_analysis", False) # bool
    use_search_enabled = data.get("use_search", False) # bool
    print(f"Chat Request Params: deep={deep_analysis_enabled}, search={use_search_enabled}, history_len={len(messages)}") # Debug

    if not messages:
        print("Error: No messages provided in chat request") # Debug
        return jsonify({"error": "No messages provided"}), 400

    # Call our Gemini chat function
    gemini_response_text = query_gemini_chat(
        messages,
        context,
        deep_analysis=deep_analysis_enabled,
        use_search=use_search_enabled
    )

    if gemini_response_text is None:
        print("Error: Failed to get response from query_gemini_chat") # Debug
        return jsonify({"error": "Error generating chat response from Gemini"}), 500
    elif "The response could not be generated" in gemini_response_text:
         print(f"Info: Gemini response indicates blocking or issue: {gemini_response_text}") # Debug
         # Return the specific error from Gemini
         return jsonify({"response": gemini_response_text, "error": "Content generation issue", "deep_analysis": deep_analysis_enabled, "use_search": use_search_enabled}), 200 # Or maybe 503 Service Unavailable?


    # Return the successful response along with the flags used
    final_response = {
        "response": gemini_response_text,
        "deep_analysis": deep_analysis_enabled, # Return flags for frontend awareness
        "use_search": use_search_enabled
    }
    print("Successfully generated chat response. Sending back.") # Debug
    # print(f"Chat Response Sent: {final_response}") # Debug - careful logging full AI responses
    return jsonify(final_response)


@app.route("/api/query", methods=["POST"])
def handle_query():
    """Handles POST requests to the /api/query endpoint."""
    print("Received request to /api/query") # Debug
    if not request.is_json:
         print("Error: Request is not JSON") # Debug
         return jsonify({"error": "Request must be JSON"}), 415

    data = request.json
    user_query = data.get("query", "")
    deep_analysis = data.get("enable_deep_analysis", False) # bool flag for this endpoint
    print(f"Query Request Params: deep={deep_analysis}, query='{user_query}'") # Debug

    if not user_query:
        print("Error: No query provided in query request") # Debug
        return jsonify({"error": "No query provided"}), 400

    # 1. Perform search
    search_results = query_duckduckgo_text(user_query)
    # Note: search_results might be [] if DDG fails or finds nothing

    # 2. Generate answer using search results and Gemini
    generated_answer = generate_answer_from_search(
        user_query,
        search_results,
        deep_analysis
    )

    if generated_answer is None:
        print("Error: Failed to get answer from generate_answer_from_search") # Debug
        return jsonify({"error": "Error generating answer from Gemini"}), 500
    elif "The response could not be generated" in generated_answer:
         print(f"Info: Gemini response indicates blocking or issue: {generated_answer}") # Debug
         # Return the specific error from Gemini
         return jsonify({"query": user_query, "answer": generated_answer, "error": "Content generation issue", "deep_analysis": deep_analysis}), 200


    # Return query, results (optional, maybe trim?), answer, and flag
    final_response = {
        "query": user_query,
        # Exclude full search results unless needed by frontend to reduce payload size
        # "search_results_summary": [res.get('title', 'No Title') for res in search_results],
        "answer": generated_answer,
        "deep_analysis": deep_analysis
    }
    print("Successfully generated query answer. Sending back.") # Debug
    # print(f"Query Response Sent: {final_response}") # Debug
    return jsonify(final_response)


@app.route("/")
def home():
    """Serves a simple message at the root."""
    print("Accessed root '/' endpoint.") # Debug
    # Optional: Render a simple HTML template
    # from flask import render_template
    # return render_template("index.html") # Requires a 'templates/index.html' file
    return "Flask-Gemini-API with DuckDuckGo Search is running!"


# Run the Flask app
if __name__ == "__main__":
    # Get port from environment variable or default to 10000
    port = int(os.getenv("PORT", 10000))
    # Run in debug mode for development (auto-reloads on code change)
    # host='0.0.0.0' makes it accessible on your local network, not just localhost
    print(f"Starting Flask server on host 0.0.0.0, port {port}, debug=True") # Debug
    app.run(host='0.0.0.0', debug=True, port=port)

/api/chat: Handles conversational requests, manages state flags. Added error handling for non-JSON requests and Gemini blocking.
/api/query: Handles direct query requests, performs search, then generates answer. Added error handling for non-JSON requests and Gemini blocking.
/: Simple root route.
if __name__ == "__main__":: Runs the Flask development server.

Step 3: Running and Testing

Activate Virtual Environment: Make sure your venv is active (source venv/bin/activate or .\venv\Scripts\activate).
Ensure .env File Exists: Double-check your GEMINI_API_KEY is in the .env file in your project root.
Run the App:
```
 python app.py
```
You should see output indicating the server is running, likely on http://0.0.0.0:10000/, along with the debug print statements.

Test with curl or an API Client (like Postman/Insomnia):

(Make sure the server app.py is running in a separate terminal)

Test Root: Open http://localhost:10000/ in your browser or use:

 curl http://localhost:10000/

Test Chat Endpoint (Basic):

 curl -X POST http://localhost:10000/api/chat \
      -H "Content-Type: application/json" \
      -d '{
            "messages": [
              {"role": "user", "content": "Hello! Briefly explain what Flask is."}
            ]
          }'

Test Chat Endpoint (With Search):

 curl -X POST http://localhost:10000/api/chat \
      -H "Content-Type: application/json" \
      -d '{
            "messages": [
              {"role": "user", "content": "What is the latest version of the google-generativeai Python library?"}
            ],
            "use_search": true
          }'

Test Chat Endpoint (With History & Deep Analysis):

 curl -X POST http://localhost:10000/api/chat \
      -H "Content-Type: application/json" \
      -d '{
            "messages": [
              {"role": "user", "content": "What are the core principles of object-oriented programming?"},
              {"role": "assistant", "content": "Key principles include Encapsulation, Abstraction, Inheritance, and Polymorphism."},
              {"role": "user", "content": "Can you elaborate on polymorphism with a simple analogy?"}
            ],
            "deep_analysis": true
          }'

Test Query Endpoint (Standard Analysis):

 curl -X POST http://localhost:10000/api/query \
      -H "Content-Type: application/json" \
      -d '{
            "query": "What are common use cases for Redis?",
            "enable_deep_analysis": false
          }'

Test Query Endpoint (Deep Analysis):

 curl -X POST http://localhost:10000/api/query \
      -H "Content-Type: application/json" \
      -d '{
            "query": "Compare the consensus mechanisms Proof-of-Work vs Proof-of-Stake",
            "enable_deep_analysis": true
          }'

Check the terminal where app.py is running to see the debug output!

How It Works: The Flow

A client (e.g., your frontend app, curl) sends a POST request with JSON data to either /api/chat or /api/query.
Flask receives the request and routes it to the appropriate function (handle_chat or handle_query).
The handler function parses the JSON data (messages, query, context, flags).
(Conditional - Chat with use_search=true / Query Endpoint): query_duckduckgo_text is called to fetch relevant web search results.
(Conditional): Search results are formatted by generate_search_context.
The appropriate Gemini function is called (query_gemini_chat or generate_answer_from_search which uses query_gemini):
- The API key is loaded via genai.configure.
- A prompt or chat history is constructed, potentially including context, search results, and special instructions (like the deep analysis prefix or query guidelines).
- The request is sent to the Google Gemini API (model.generate_content or chat.send_message).
Gemini processes the request and sends back a generated text response (or indicates blocking).
Our Flask function receives the Gemini response.
The response is packaged into a JSON object.
Flask sends the JSON response back to the client.

Next Steps & Potential Enhancements

Build a Frontend: Create a React, Vue, Svelte, or simple HTML/JS frontend to interact with your new API.
Refine Error Handling: Add more specific error handling for API rate limits, network issues, invalid Gemini API keys, and malformed input JSON.
Asynchronous Operations: For better performance under load, especially with network calls (search, Gemini), explore Flask's async capabilities or libraries like Celery/Redis for background tasks.
Streaming Responses: Implement response streaming from Gemini for the /api/chat endpoint to make the chat feel more responsive. The google-generativeai library supports this (stream=True).
Caching: Cache DuckDuckGo search results for a short period (e.g., 5-10 minutes) for identical queries to reduce search requests.
Authentication/Authorization: Secure your API if it's not meant for public use (e.g., using API keys, JWT tokens).
Database Integration: Store chat histories or query logs in a database for persistence and analysis.
Configuration: Move model names, temperature settings, max tokens, etc., into configuration files or environment variables instead of hardcoding them.
Cost Management: Be mindful of Gemini API costs. Implement logging or tracking of token usage.

Conclusion

Congratulations! You've built a versatile and intelligent API using Flask, Google Gemini, and DuckDuckGo Search. This powerful combination allows you to create applications that can converse naturally and ground their responses in fresh information retrieved from the web.

The flexibility of having distinct chat and query endpoints, along with the "deep analysis" toggle, provides a solid foundation for building sophisticated AI-powered tools, from custom research assistants to dynamic content generators and knowledgeable chatbots.

This project demonstrates how relatively easy it is to integrate multiple services (web framework, search API, LLM API) using Python. Feel free to experiment, enhance, and adapt this code for your own amazing projects! Happy coding!

Build an AI Researcher: Flask API for Deep Analysis with Gemini

Table of contents

Prerequisites

Step 1: Project Setup & Dependencies

Step 2: Building the Flask App (`app.py`)

Imports and Initial Setup:

DuckDuckGo Search Integration:

Google Gemini Integration:

Helper for Query Endpoint Prompting:

Flask API Endpoints:

Step 3: Running and Testing

How It Works: The Flow

Next Steps & Potential Enhancements

Conclusion

Subscribe to my newsletter

Joshua Atanu

Joshua Atanu

Build an AI Researcher: Flask API for Deep Analysis with Gemini

Table of contents

Prerequisites

Step 1: Project Setup & Dependencies

Step 2: Building the Flask App (app.py)

Imports and Initial Setup:

DuckDuckGo Search Integration:

Google Gemini Integration:

Helper for Query Endpoint Prompting:

Flask API Endpoints:

Step 3: Running and Testing

How It Works: The Flow

Next Steps & Potential Enhancements

Conclusion

Subscribe to my newsletter

Joshua Atanu

Joshua Atanu

Step 2: Building the Flask App (`app.py`)