Presentr AI: Video to Presentations

The Presentation Dilemma

Creating an impactful presentation isn’t easy. Imagine recording a YouTube video filled with valuable insights, then spending hours crafting a presentation to accompany it. Sound exhausting? It is. But what if you could eliminate that manual effort entirely?

The idea for Presentr AI arose from here. For creators, educators, and professionals, crafting engaging presentations is often a tedious and time-consuming process. What if technology could take the heavy lifting off your shoulders?

Essential Links

Live Demo: presentr.vercel.app
Presentr Repository: https://github.com/nidhinsankar/Presentr
Hypermode Modus (Backend): https://github.com/nidhinsankar/presentr-modus

Introducing Presentr AI

Meet Presentr AI – your ultimate presentation tool. With advanced AI technology, Presentr AI transforms your YouTube videos into professional, visually stunning, and logically structured presentations in minutes.

Presentr AI is designed for anyone who needs to present their ideas effectively without wasting time. Whether you're an educator, a marketer, or a YouTuber, this tool helps you easily turn your videos into engaging slides.

How Presentr AI Works

It’s as simple as 1-2-3:

Upload Your Video: Start by providing a YouTube link of your desired video.
AI-Powered Analysis: The AI dives into your content, analyzing its key points, tone, and context. It doesn’t just convert; it transforms.
Generate Presentation: Within moments, your polished presentation is ready—complete with slide headings, bullet points, and visuals tailored to your video’s message.
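
As a rough illustration of the first step, here is a minimal TypeScript sketch of how a pasted YouTube link could be reduced to a video ID before any transcript is fetched. The helper name `extractVideoId` is hypothetical and not taken from the Presentr codebase, and the 11-character ID pattern is an assumption based on how YouTube IDs look today.

```typescript
// Hypothetical helper (not from the Presentr repository): pull the video ID
// out of a YouTube link, or return null if the link is not recognized.
function extractVideoId(link: string): string | null {
  let url: URL;
  try {
    url = new URL(link);
  } catch {
    return null; // not a parseable URL
  }

  const host = url.hostname.replace(/^www\./, "");
  let id: string | null = null;

  if (host === "youtu.be") {
    // Short links keep the ID as the first path segment.
    id = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com" || host === "m.youtube.com") {
    // Standard watch links keep the ID in the "v" query parameter.
    id = url.searchParams.get("v");
  }

  // Assumption: video IDs are 11 characters drawn from this alphabet.
  return id !== null && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

Rejecting unknown hosts early keeps obviously invalid input away from the transcript and AI steps.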

Presentr Features

Video-to-Slide Conversion 🎥 → 🌍

Intelligent Content Extraction
Transform full video content into precise, concise slides
Capture and distill core ideas with AI precision

AI-Powered Structuring 🔄

Automatic slide organization
Ensures logical content flow
Maximizes audience engagement and comprehension

Visual Insight Generation 🎨

One-click video insight transformation
Convert complex data into intuitive infographics
Simplifies sophisticated information presentation

Flexible Export Options 🔗

Seamless export to PowerPoint format
Instant presentation-ready output

Converting Idea to Presentation

Research

The concept for Presentr AI was born from a recurring pain point: transforming long-form content into a visually appealing and digestible format. As video content dominates modern media, the need for efficient tools to repurpose this content became clear.

Design

Keeping the user experience in mind, I crafted Presentr AI’s interface to be intuitive and user-friendly. By simplifying complex AI functionality into straightforward workflows, I ensured that even non-technical users can harness the tool’s power.

Development

Developing Presentr AI involved integrating robust AI models for speech-to-text and natural language processing. Here’s how I tackled it:

Speech-to-Text Conversion: I used a YouTube API from RapidAPI to fetch accurate subtitle transcripts, even for videos with diverse accents.
Content Summarization: I used the GPT-3.5-turbo model to extract and organize key insights into slide-ready content.

I created a query to check if the models are functioning correctly. Here's what the working query looks like:

Testing in Postman: The GraphQL APIs that Modus generates can be tested with the Postman client, using a GraphQL-type request.
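
The same kind of request can also be sent from code. The sketch below is a hedged TypeScript example, not the project's actual client: it assumes Modus exposes the exported `cleanTranscript(data: string)` function as a GraphQL query field of the same name, and that the endpoint expects a bearer token as the manifest's "auth": "bearer-token" setting suggests. The names `buildCleanTranscriptRequest` and `postGraphQL` are hypothetical, so check the field names against the schema Modus generates for your build.

```typescript
interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
}

// Build the request body for cleanTranscript.
// Assumption: the query field and argument mirror the exported function signature.
function buildCleanTranscriptRequest(lines: string[]): GraphQLRequest {
  return {
    query: "query CleanTranscript($data: String!) { cleanTranscript(data: $data) }",
    // cleanTranscript calls JSON.parse<string[]>(data), so send a JSON-encoded array.
    variables: { data: JSON.stringify(lines) },
  };
}

// POST the request to the GraphQL endpoint with a bearer token.
async function postGraphQL(
  endpoint: string,
  token: string,
  body: GraphQLRequest,
): Promise<unknown> {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`GraphQL request failed with status ${response.status}`);
  }
  return response.json();
}
```

During local development the endpoint would typically be something like http://localhost:8686/graphql, but treat that URL as an assumption and use whatever address your Modus runtime prints on startup.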

Configuration with Modus: To streamline AI integration, I used the Modus manifest (modus.json) to declare the model and the HTTP connections it depends on. Here’s the configuration setup:

  {
    "$schema": "https://schema.hypermode.com/modus.json",
    "endpoints": {
      "default": {
        "type": "graphql",
        "path": "/graphql",
        "auth": "bearer-token"
      }
    },
    "models": {
      "gpt-3-5-turbo": {
        "sourceModel": "gpt-3.5-turbo",
        "path": "v1/chat/completions",
        "provider": "openai",
        "connection": "openai"
      }
    },
    "connections": {
      "zenquotes": {
        "type": "http",
        "baseUrl": "https://zenquotes.io/"
      },
      "rapidapi": {
        "type": "http",
        "baseUrl": "https://yt-api.p.rapidapi.com/",
        "headers": {
          "x-rapidapi-key": "{{API_KEY}}",
          "x-rapidapi-host": "yt-api.p.rapidapi.com"
        }
      },
      "openai": {
        "type": "http",
        "baseUrl": "https://api.openai.com/",
        "headers": {
          "Authorization": "Bearer {{OPENAI_KEY}}"
        }
      }
    }
  }
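
The `{{API_KEY}}` and `{{OPENAI_KEY}}` placeholders in the manifest are filled in from secrets rather than hard-coded. As I understand the Modus convention, each placeholder is read from an environment variable named MODUS_<CONNECTION>_<PLACEHOLDER>; treat that naming rule as an assumption and confirm it in the Modus documentation. The hypothetical helper below (not part of Modus or Presentr) lists the variable names a set of connections would need under that assumption.

```typescript
interface Connection {
  type: string;
  baseUrl?: string;
  headers?: Record<string, string>;
}

// List the environment variable names implied by {{PLACEHOLDER}} tokens in
// connection headers, ASSUMING the MODUS_<CONNECTION>_<PLACEHOLDER> convention.
function secretEnvNames(connections: Record<string, Connection>): string[] {
  const names: string[] = [];
  for (const connName of Object.keys(connections)) {
    const headers = connections[connName].headers ?? {};
    // Connection names are upper-cased, with hyphens turned into underscores.
    const prefix = connName.toUpperCase().replace(/-/g, "_");
    for (const header of Object.keys(headers)) {
      const pattern = /\{\{\s*([A-Za-z0-9_]+)\s*\}\}/g;
      let match: RegExpExecArray | null;
      while ((match = pattern.exec(headers[header])) !== null) {
        names.push(`MODUS_${prefix}_${match[1]}`);
      }
    }
  }
  return names;
}
```

Applied to the connections above, this sketch yields MODUS_RAPIDAPI_API_KEY and MODUS_OPENAI_OPENAI_KEY; the zenquotes connection has no headers and therefore needs no secret.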

Model Invocation with Modus: I used the Modus SDK to invoke the ChatGPT model, first to clean up the transcript and then to reshape it into slide content. The backend exposes four functions:
cleanTranscript: Parses the content passed from the frontend, fixes the grammar and punctuation errors found in the auto-generated transcript, and returns the tidied text.
convertTextToArray: Transforms the long-form text into an array of objects, each with a title and a list of content points.
generateTitleAndDescription: Generates a suitable title and description for the presentation.
ImprovePresentationContent: Elaborates on the slide content to make it more engaging and better suited to a presentation.
Here’s how the invocation works:

        import { models } from "@hypermode/modus-sdk-as";
        import {
          OpenAIChatModel,
          ResponseFormat,
          SystemMessage,
          UserMessage,
        } from "@hypermode/modus-sdk-as/models/openai/chat";
        import { JSON } from "json-as";

        // This name must match the model key defined in the modus.json manifest.
        const modelName: string = "gpt-3-5-turbo";


        export function cleanTranscript(data: string): string {
          const instruction =
            "You are an AI designed to tidy text. When asked to generate data, always provide it in the requested format without including any extra characters like \\n or #, explanations, or unnecessary text. Output only the data the user asked for.";
          let parsedData: string[] = JSON.parse<string[]>(data);
          const newData = parsedData.join(" ");
          //   console.log(newData);

          const requestBody = `Tidy the grammar and punctuation of the following text 
            which was autogenerated from a YouTube video.  Where appropriate correct the words which are spelled incorrectly.just give the appropriate text result.When asked to generate data, always provide it in the requested format without including any extra characters, explanations like Here's the tidied text with corrected spelling and proper punctuation:, or unnecessary text. Only output data i requested . : ${newData}`;
          const model = models.getModel<OpenAIChatModel>(modelName);

          const input = model.createInput([
            new SystemMessage(instruction),
            new UserMessage(requestBody),
          ]);

          input.temperature = 0.7;
          const output = model.invoke(input);

          let result: string = output.choices[0].message.content;
          console.log(result);

          return result;
        }

        export function convertTextToArray(text: string, slideCount: string): string {
          const instruction =
            "Output only a clean array of objects(JSON) like  [{title: 'Title', content: ['Point 1', 'Point 2']}]. DO NOT use any newlines (\\n) or escape characters (/). Don't give unwanted explanations like (here is the content, the generated content).";
          const requestBody = `From the string ${text}, create an array of objects with a title and content property. The content property should be an array of strings. The array should have ${slideCount} objects.
                Generate an Array of objects with properties title (string) and content (array of strings). . Do not just split the text into parts. You need to reword it, improve it and make it friendly for a presentation. I have provided a schema for you to follow.
                    There should be a minimum of 3 content items per objects and a maximum of 4. No string in the content object should exceed 170 characters.
                  `;

          const model = models.getModel<OpenAIChatModel>(modelName);

          const input = model.createInput([
            new SystemMessage(instruction),
            new UserMessage(requestBody),
          ]);

          input.temperature = 0.7;
          const output = model.invoke(input);

          return output.choices[0].message.content;
        }

        export function generateTitleAndDescription(contentArray: string): string {
          let instruction =
            'You are an AI designed to create a title and description for a presentation. Output only a JSON object. DO NOT use any newlines (\\n) or escape characters (/). Don\'t give unwanted explanations; just generate data like this: {"title":"string","description":"string"}';
          const requestBody = `From the array of objects ${contentArray} create a title and description suitable for a presentation.give the output in form of a object with title and description property The title should be no longer than 15 words and the description should be no longer than 35 words.`;

          const model = models.getModel<OpenAIChatModel>(modelName);

          const input = model.createInput([
            new SystemMessage(instruction),
            new UserMessage(requestBody),
          ]);

          input.temperature = 0.7;
          const output = model.invoke(input);

          return output.choices[0].message.content.trim();
        }

        export function ImprovePresentationContent(content: string): string {
          const instruction =
            "You are an AI designed to improve presentation content. DO NOT use any newlines (\\n) or escape characters (/). Don't give unwanted explanations like (here is the content, the generated content); just generate data as an array like [{'title': 'Title 1', 'content': ['String 1', 'String 2']}]";
          const requestBody = `I am giving you an array of objects and each represents content for a presentation. I want you to loop over each object and improve the content and elaborate upon it so it reaches around 250 characters, remove any unnecessary information and make it more engaging. No string in the content array should be longer than 250 characters.
                  remove any references to the content coming from a YouTube video or any other source. I have provided a schema for how I want the data returned and the data to improve is follows: ${content}
                `;

          const model = models.getModel<OpenAIChatModel>(modelName);

          const input = model.createInput([
            new SystemMessage(instruction),
            new UserMessage(requestBody),
          ]);

          input.temperature = 0.7;
          const output = model.invoke(input);

          return output.choices[0].message.content.trim();
        }

This approach ensured smooth integration and accurate results across all four functions.
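
Because these functions return the model's reply as a raw string, whatever consumes them has to turn that string into slide data. The following TypeScript sketch shows one defensive way to do that on the frontend. The `Slide` type and `parseSlides` helper are hypothetical and not taken from the Presentr repository; the shape simply mirrors the title and content structure requested in the prompts.

```typescript
interface Slide {
  title: string;
  content: string[];
}

// Parse and validate a model reply; throws if it is not a usable slide array.
function parseSlides(raw: string): Slide[] {
  // Tolerate stray text around the array (for example "Here you go: [...]").
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) {
    throw new Error("No JSON array found in model output");
  }

  const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of slides");
  }

  return parsed.map((item: unknown, index: number): Slide => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`Slide ${index} is not an object`);
    }
    const { title, content } = item as { title?: unknown; content?: unknown };
    if (
      typeof title !== "string" ||
      !Array.isArray(content) ||
      !content.every((point: unknown) => typeof point === "string")
    ) {
      throw new Error(`Slide ${index} does not match { title, content[] }`);
    }
    return { title, content: content as string[] };
  });
}
```

A check like this matters because `JSON.parse` rejects single-quoted keys and strings, so a reply that copies the single-quoted example from a prompt would fail here with a clear error instead of producing broken slides.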

Deployment

Deployment was the final milestone. With Vercel’s scalable infrastructure, I launched Presentr AI as a seamless web app that’s fast, reliable, and accessible.

Hopefully, this walkthrough gives you a sense of what the hackathon journey looked like.

Tech Stack

For the frontend, I used Next.js (TypeScript) with Tailwind CSS and shadcn/ui, and deployed it on Vercel.

For the backend, I used Hypermode, Prisma, and Supabase.

Conclusion

AI is becoming part of day-to-day life, offering new possibilities and efficiencies. Presentr AI is one example of this shift: it lets users concentrate on crafting and delivering impactful ideas while the underlying technology handles the tedious conversion work. Users can spend more time on creativity and less on technical details.

I would like to thank Hypermode and Hashnode for organizing this hackathon. Here’s to transforming the way we create presentations, one step at a time! 🚀
