Build Your Own AI Language Tutor

Martin BowlingMartin Bowling
3 min read

Hey there, fellow developers! ๐Ÿ‘‹ I'm excited to share with you my latest project: the AI Language Learning Assistant. This little gem came to life during a #Hack@Home session when I couldn't make it to the Berkeley AI Hackathon. But who says you need to be at a physical event to innovate, right? ๐Ÿš€

Let's dive into the nitty-gritty of this AI-powered polyglot helper!

The Tech Stack ๐Ÿ› ๏ธ

First things first, let's talk about the awesome tech that makes this possible:

  • Groq API: The powerhouse behind our natural language processing

  • Gradio: For that slick, user-friendly interface

  • Replit: Our trusty hosting platform

  • Python: The glue that holds it all together

Key Features ๐ŸŒŸ

  1. Multilingual Support: We're talking 9 languages here! From English to Portuguese, we've got you covered.

  2. Real-time Speech Recognition: Thanks to Groq's Whisper API, we can turn your speech into text in a snap.

  3. Intelligent Feedback: Grammar, vocabulary, pronunciation - we analyze it all!

  4. Contextual Responses: Our AI doesn't just respond; it converses naturally.

  5. Instant Translations: Because sometimes you need that extra help to understand.

The Code Breakdown ๐Ÿ’ป

Let's look at some of the cool parts of our code:

Speech Recognition

We're using Groq's Whisper API for transcription. Check out this async function:

async def transcribe_audio(audio_file):
    try:
        form_data = FormData()
        form_data.add_field('file', audio_file, filename='audio.wav', content_type='audio/wav')
        form_data.add_field('model', 'whisper-large-v3')
        # ... more fields ...

        async with aiohttp.ClientSession() as session:
            async with session.post("https://api.groq.com/openai/v1/audio/transcriptions",
                                    headers={"Authorization": f"Bearer {GROQ_API_KEY}"},
                                    data=form_data) as response:
                # ... handle response ...
                return result["text"]
    except ClientError as e:
        raise GroqError(f"Network error during transcription: {str(e)}")

This function handles the audio file, sends it to Groq's API, and returns the transcribed text. Pretty neat, huh?

Language Analysis

Once we have the text, we analyze it:

async def analyze_language(text, target_language, history):
    prompt = f"""Analyze the following text in {target_language}, considering the conversation history:
    # ... prompt details ...
    """

    response = await async_groq_client.chat.completions.create(
        model="llama3-70b-8192",
        messages=[{"role": "system", "content": "You are an expert language tutor."},
                  {"role": "user", "content": prompt}],
        temperature=0.3,
        max_tokens=1000,
        response_format={"type": "json_object"})
    return json.loads(response.choices[0].message.content)

This function crafts a prompt for Groq's language model, asking it to analyze grammar, vocabulary, and more. We get back a structured JSON response with all the juicy details.

Conversation Management

To keep things flowing naturally, we maintain a conversation history:

def update_conversation_history(user_id, target_language, new_entry):
    key = f"{user_id}_{target_language}"
    history = get_conversation_history(user_id, target_language)
    history.append(new_entry)
    history = history[-5:]  # Keep only the last 5 interactions
    db[key] = history

This function updates our conversation history, keeping the last 5 interactions for context. It's like giving our AI a short-term memory!

The User Experience ๐ŸŽจ

Using Gradio, we've created an interface that's both powerful and easy to use:

iface = gr.Interface(
    fn=language_tutor,
    inputs=[
        gr.Audio(type="filepath", label="Speak here"),
        gr.Dropdown(["English", "Chinese", "German", "Spanish", "Russian", "Korean",
                     "French", "Japanese", "Portuguese"],
                    label="Select target language"),
        # ... more inputs ...
    ],
    outputs=[
        gr.Textbox(label="Session ID"),
        gr.Textbox(label="Target Language"),
        gr.Textbox(label="Transcription"),
        gr.Textbox(label="Language Analysis"),
        gr.Textbox(label="AI Response"),
        gr.Textbox(label="Translation"),
        gr.Textbox(label="Session Management Message")
    ],
    title="AI Language Learning Assistant",
    description="Speak in your target language and get instant feedback and responses!"
)

This creates a web interface where users can speak, select their language, and get instant feedback. It's like having a language tutor in your browser! ๐ŸŒ๐Ÿ‘ฉโ€๐Ÿซ

Wrapping Up ๐ŸŽ

And there you have it, folks! A full-fledged AI language tutor that you can spin up with just a few clicks on Replit. It's been a blast putting this together, and I hope it inspires you to create your own AI-powered tools.

Remember, you don't need to be at a fancy hackathon to innovate. Sometimes, the best ideas come when you're #Hack@Home! ๐Ÿ ๐Ÿ’ก

So, what are you waiting for? Fork the Replit template, add your Groq API key, and start your polyglot journey today!

Happy coding, and may your conversations be ever multilingual! ๐ŸŒ๐Ÿ—ฃ๏ธ

0
Subscribe to my newsletter

Read articles from Martin Bowling directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Martin Bowling
Martin Bowling