Decibel - Audio to Text AI Note Taker
I sincerely want to express my gratitude to AWS Amplify and Hashnode, the organizers of this hackathon, for providing us with this remarkable opportunity.
#AWSAmplify #AWSAmplifyHackathon
In this article, the author introduces a web application that serves as an advanced voice-to-text converter and summarizer, designed to make the process of idea generation and documentation more efficient. The application, built using Next.js, AWS Amplify, and DynamoDB, offers key features such as voice-to-text conversion, text summarization, and a user-friendly interface. The author also highlights the benefits of using AWS Amplify features like the UI library, authentication, GraphQL API, S3 storage, and in-app messaging.
Team Members
Demo Video
Live Website Link
Description
What is Decibel?
Our web application is an advanced voice-to-text converter and summarizer. It is designed to help users quickly and easily convert their spoken ideas into written text and summaries. Users upload voice notes, which are then transcribed into text and summarized, giving them a concise, easy-to-read version of their thoughts.
Why build this?
This platform was born out of the realization that many people find it easier to express their thoughts verbally rather than in written form. Traditional note-taking can be time-consuming and often disrupts the flow of ideas. By providing a tool that transcribes and summarizes spoken thoughts, we aim to make the process of idea generation and documentation more efficient and less daunting.
Key Features and Benefits
Voice-to-Text Conversion: Our software proficiently transcribes user-provided audio recordings into written text, effectively removing the necessity for manual note-taking. This functionality enables users to concentrate on their thoughts and ideas without distraction.
Text Summarization: The software generates succinct summaries of the transcribed text, streamlining the review process and facilitating rapid comprehension of the notes. This feature conserves users' time and energy.
Notion-Like Editor: Our application comes with a text editor similar to Notion's. This editor offers a clean and minimalistic interface, making it easy for users to edit and format their transcriptions. Features like lists, headers, links, dividers, quotes, and callouts are available to help users structure their notes effectively.
Favorites: Users can mark important notes as favorites. This feature allows users to quickly access their most important or frequently used notes, enhancing user experience and productivity.
Cloud Syncing: The application syncs all transcriptions and summaries to the user’s cloud storage. This ensures that users can access their notes anytime, anywhere, and from any device.
User-friendly Interface: The software has a straightforward and intuitive interface, so uploading voice notes and getting back to existing notes takes little effort.
Why should you use it?
This application revolutionizes the process of note-taking by automating the transcription and summarization of spoken thoughts. By using this tool, you can focus on your ideas and conversations without worrying about missing or forgetting important details. It not only helps you save time and effort but also enhances your productivity and efficiency.
Tech Stack
Technologies Utilized
Frontend: Next.js, Amplify UI Library
Backend: AWS Amplify GraphQL APIs
Database: DynamoDB (Amplify)
AWS Amplify Features used and their Benefits
AWS UI Library
The AWS Amplify UI Library for React ships with pre-built components and features. Here is how we used it:
Amplify Studio's ability to connect with Figma helped us design complex UI components and make changes with ease. We were able to design most of our navigation items, which included sidebars, top bars, note cards, audio cards, and other UI components. We also built all our basic components, like buttons, using this feature. It was very user-friendly and easy to modify.
We were also able to create and directly integrate our data model with UI components using the AWS UI library, which eliminated a lot of hassle.
We also used the Collection component, which let us create a list of notes with very little effort. With collections, we could fetch the list of notes through the GraphQL AppSync APIs and render it directly.
We utilized the Storage Manager component to create a new note and also a Text Field to enter the note title.
We also used AWS Amplify's pre-built components and themes in our app, which allowed us to design with ease and create a responsive layout even without using Tailwind CSS. The feature I appreciated the most was how simple it is to change CSS styles for different breakpoints. With Amplify UI, a style prop can take an object that maps breakpoint names to values, as in this example:
left={{ base: '0', medium: '300px' }}
Because of this, we could make our application responsive on mobile screens with ease.
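To make the mobile-first behaviour concrete, here is a minimal sketch of how a breakpoint-keyed value like the one above can be resolved for a given viewport width. It is only an illustration, not Amplify UI's internal code: the `resolveResponsive` helper is hypothetical, and the pixel thresholds are assumed to match the default Amplify UI theme.

```typescript
// Assumed min-width thresholds in px; check your own theme for the real values.
const breakpoints = {
  base: 0,
  small: 480,
  medium: 768,
  large: 992,
  xl: 1280,
  xxl: 1536,
} as const;

type Breakpoint = keyof typeof breakpoints;
type ResponsiveValue<T> = Partial<Record<Breakpoint, T>>;

// Hypothetical helper: pick the value of the widest breakpoint that is both
// defined in the object and not wider than the viewport (mobile-first).
function resolveResponsive<T>(value: ResponsiveValue<T>, viewportWidth: number): T | undefined {
  const ordered = (Object.keys(breakpoints) as Breakpoint[]).sort(
    (a, b) => breakpoints[a] - breakpoints[b],
  );
  let resolved: T | undefined;
  for (const key of ordered) {
    const candidate = value[key];
    if (viewportWidth >= breakpoints[key] && candidate !== undefined) {
      resolved = candidate;
    }
  }
  return resolved;
}

// Same shape as the `left` prop shown above.
const leftOffset: ResponsiveValue<string> = { base: '0', medium: '300px' };
console.log(resolveResponsive(leftOffset, 375));  // phone-sized viewport
console.log(resolveResponsive(leftOffset, 1024)); // desktop-sized viewport
```

Under these assumptions, the sidebar offset stays at `0` on narrow screens and only switches to `300px` once the viewport reaches the medium breakpoint.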
Amplify Authentication
Amplify Authentication Integration: We have integrated AWS Amplify Authentication modules into our application. This not only provides a robust and production-ready user authentication system but also offers various authentication APIs and building blocks, simplifying the development process.
Support for Multiple Authentication Methods: Our application supports email/password-based authentication as well as OAuth with Google. This gives our users the flexibility to choose their preferred method of login, enhancing user experience.
Amplify also takes care of email verification: it sends the user a verification code and then checks it, which made this very easy for us to work with.
Google Auth Integration: To make the sign-in process even more seamless, we have integrated Google authentication using Amazon Cognito. This allows users to sign in with just one click, using their Google accounts.
AWS GraphQL API
It was incredibly beneficial to work with automatically generated GraphQL APIs, as they were easy to integrate into our application. This efficiency allowed us to deliver the finished application in a significantly shorter time frame.
You could consider this the backbone of our application, as nothing would function without it. From creating and updating notes to searching and many other features, all of these capabilities stem from this essential component.
With the power of AWS GraphQL APIs, which were generated for us automatically, we built features such as search, creating and updating notes, auto-integration with Amplify UI, adding and removing notes from the favorites section, and much more.
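As a rough illustration of how search and favorites can sit on top of an auto-generated list query, the sketch below builds the filter variables for such a query. The `Note` model, its `title` and `isFavorite` fields, and the `buildNoteFilter` helper are assumptions made for this example and may not match the project's actual schema; only the general filter shape (`contains`, `eq`, `and`) follows the convention Amplify uses for generated models.

```typescript
// Hypothetical Note model fields; the real schema may use different names.
interface NoteFilter {
  title?: { contains: string };
  isFavorite?: { eq: boolean };
  and?: NoteFilter[];
}

// Query text in the shape Amplify typically generates for a model named Note (assumed).
const listNotesQuery = /* GraphQL */ `
  query ListNotes($filter: ModelNoteFilterInput, $limit: Int, $nextToken: String) {
    listNotes(filter: $filter, limit: $limit, nextToken: $nextToken) {
      items { id title isFavorite }
      nextToken
    }
  }
`;

// Combine an optional search term and a favorites toggle into one filter.
// Returns undefined when no condition applies, so the query lists everything.
function buildNoteFilter(searchTerm: string, favoritesOnly: boolean): NoteFilter | undefined {
  const conditions: NoteFilter[] = [];
  const term = searchTerm.trim();
  if (term !== '') conditions.push({ title: { contains: term } });
  if (favoritesOnly) conditions.push({ isFavorite: { eq: true } });
  if (conditions.length === 0) return undefined;
  return conditions.length === 1 ? conditions[0] : { and: conditions };
}

const request = {
  query: listNotesQuery,
  variables: { filter: buildNoteFilter('standup', true), limit: 20 },
};
console.log(JSON.stringify(request.variables));
```

The `request` object would then be handed to whichever GraphQL client the app uses. Toggling a favorite would presumably be an update mutation on the same model that flips the boolean field.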
AWS Amplify Storage using S3
Amplify Storage allowed us to store our audio files securely and access them with proper authentication. We implemented access levels so that only signed-in users can upload or delete files in the S3 bucket. Integration was very straightforward.
Pagination for Storage
Audio files are queried in paginated batches of 8 files at a time to reduce the load on the API call and save bandwidth. This is achieved using the "nextToken" parameter returned by the API call to query files in sequence.
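The token-based paging described above can be sketched as follows. This is a minimal sketch, not the project's code: `createPager`, `makeFakeLister`, and the `Page` shape are hypothetical names, and the in-memory lister only stands in for the real storage call so that the example runs offline.

```typescript
// One page of results plus an optional continuation token.
interface Page<T> {
  results: T[];
  nextToken?: string | undefined;
}

type ListPage<T> = (options: {
  pageSize: number;
  nextToken?: string | undefined;
}) => Promise<Page<T>>;

const PAGE_SIZE = 8; // batch size mentioned in the article

// Hypothetical pager: remembers the last token and fetches the next batch on demand.
function createPager<T>(listPage: ListPage<T>, pageSize: number = PAGE_SIZE) {
  let nextToken: string | undefined;
  let done = false;
  return {
    hasMore: (): boolean => !done,
    async next(): Promise<T[]> {
      if (done) return [];
      const page = await listPage({ pageSize, nextToken });
      nextToken = page.nextToken;
      done = nextToken === undefined; // no token means this was the last page
      return page.results;
    },
  };
}

// In-memory stand-in for the storage API; the token is simply the next start index.
function makeFakeLister(keys: string[]): ListPage<string> {
  return async ({ pageSize, nextToken }) => {
    const start = nextToken === undefined ? 0 : Number(nextToken);
    const end = start + pageSize;
    return {
      results: keys.slice(start, end),
      nextToken: end < keys.length ? String(end) : undefined,
    };
  };
}

// Walk through all pages and record how many files each batch contained.
async function collectBatchSizes(total: number): Promise<number[]> {
  const keys = Array.from({ length: total }, (_, i) => `audio-${i}.mp3`);
  const pager = createPager(makeFakeLister(keys));
  const sizes: number[] = [];
  while (pager.hasMore()) sizes.push((await pager.next()).length);
  return sizes;
}

collectBatchSizes(20).then((sizes) => console.log(sizes)); // logs [ 8, 8, 4 ]
```

In the real app the `listPage` argument would presumably wrap the Amplify Storage list call, and `next()` would be triggered by a "load more" action rather than a loop. The exact options and return shape depend on the library version in use.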
AWS In-App Messaging using Pinpoint
To enhance our user experience and make it more friendly, we leveraged the In-App Messaging feature provided by AWS Amplify using AWS Pinpoint. This powerful tool enabled us to guide our users through each step of their journey within the application, offering real-time assistance and support. By incorporating these interactive messages, we were able to create a more engaging and intuitive experience for our users, ensuring they could easily navigate and utilize the full potential of our platform.
a. When you open the /create_note page, a guide to creating a note starts with a message asking you to enter the title of the note.
b. Once you enter the note title, the message now asks you to upload an audio file.
c. After you upload the audio, the message asks you to tap the "create note" button to generate the summary, which you can then edit in the WYSIWYG Notion-like editor.
These features of in-app messaging help us guide our users and enhance the experience.
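The three guide steps above amount to a small decision based on the state of the create-note form. A minimal sketch of that logic follows. The event names and the `nextGuideEvent` helper are hypothetical: the real triggers are whatever the Pinpoint campaigns were configured to listen for.

```typescript
// Hypothetical event names, one per guide message.
type GuideEvent = 'guide_enter_title' | 'guide_upload_audio' | 'guide_create_note';

interface CreateNoteFormState {
  title: string;
  audioUploaded: boolean;
}

// Decide which guide message should be shown for the current form state.
function nextGuideEvent(state: CreateNoteFormState): GuideEvent {
  if (state.title.trim() === '') return 'guide_enter_title'; // step a
  if (!state.audioUploaded) return 'guide_upload_audio';     // step b
  return 'guide_create_note';                                // step c
}

console.log(nextGuideEvent({ title: '', audioUploaded: false }));
console.log(nextGuideEvent({ title: 'Standup notes', audioUploaded: false }));
console.log(nextGuideEvent({ title: 'Standup notes', audioUploaded: true }));
```

Each returned name could then be dispatched as an event to the in-app messaging client whenever the form state changes, so the matching message appears.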
Challenges Faced
Designing a protected backend API for transcribing and summarizing the audio files from S3 Storage where only authenticated users can generate notes was a bit tough to figure out from the docs, but we finally managed to do it with SSR features from Amplify.
Additionally, audio files longer than 4-5 minutes did not work because the serverless function kept timing out. As a result, the app currently only supports summarizing audio files shorter than 5 minutes, which could be considered a drawback.
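One way to surface this limit to users before the serverless function times out is a duration check ahead of the upload. The sketch below is only a suggestion under that assumption: the article does not say the app performs such a check, and `checkAudioDuration` is a hypothetical helper.

```typescript
const MAX_DURATION_SECONDS = 5 * 60; // "below 5 minutes", as stated above

type DurationCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical guard: reject audio that is unreadable or at/over the limit.
function checkAudioDuration(durationSeconds: number): DurationCheck {
  if (!isFinite(durationSeconds) || durationSeconds <= 0) {
    return { ok: false, reason: 'Could not determine the audio duration.' };
  }
  if (durationSeconds >= MAX_DURATION_SECONDS) {
    const minutes = (durationSeconds / 60).toFixed(1);
    return { ok: false, reason: `Audio is ${minutes} min long; the limit is under 5 min.` };
  }
  return { ok: true };
}

console.log(checkAudioDuration(180)); // 3-minute clip
console.log(checkAudioDuration(420)); // 7-minute clip
```

In a browser, the duration could be read from the selected file's audio metadata before calling this function.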
Code Repository
https://github.com/Sparklabs0/decibel
Demo Credentials
email: wotem10291@sportrid.com
password: Temporary@123
Demo audio you can try: click here to download
Written by swaraj bachu