Building a NotebookLM Mini-Clone Powered by Amazon Bedrock


Chatbots are valuable tools for giving customers and employees access to internal documents and information. When powered by a large language model (LLM), a chatbot can enhance the user experience and minimize the need for human involvement. However, LLMs have limitations, as they rely on pre-trained data and may not always provide contextually relevant responses.
To overcome these limitations, businesses can implement the Retrieval-Augmented Generation (RAG) technique. RAG combines retrieval-based and generative AI models to deliver more accurate and context-aware responses. It works by representing document content and user queries as vector embeddings, which are then retrieved and passed to an LLM to generate responses with enhanced context.
In this article, I will deploy a PDF chatbot application that utilizes retrieval-augmented generation to answer queries based on embeddings created from PDF documents. The chatbot will use the LangChain framework and Amazon Bedrock to generate embeddings and responses. The user interface will be built with Streamlit, and the application will be deployed as an Amazon ECS Service. You will configure the system and interact with the chatbot to evaluate its functionality.
Configuring AWS Credentials
Introduction
In this step, you will configure the AWS credentials needed to use the AWS SAM CLI.
Instructions
Use the instructions here to download and set up the AWS SAM CLI for your AWS account:
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html
In the terminal, enter the following set of commands to configure the AWS account credentials:
aws configure set aws_access_key_id AKIA4AEEWDFAKDIDBFJEA && aws configure set aws_secret_access_key yadgahdadsdysdsdbsdjsdndjcjs && aws configure set default.region us-west-2
Note: Replace the values with your own access key ID and secret access key.
Enter aws configure list to confirm the credentials have been set correctly.
Before moving on to the next lab step, download the project folder here. Any directories and files referenced in this lab will appear in this folder.
Deploying the PDF Embedding Solution
Introduction
In this step, you will deploy the PDF embedding solution using AWS SAM. At a high level, the solution extracts metadata from PDF documents and generates embeddings using LangChain and Amazon Bedrock.
The PDF chatbot application will use the embeddings to answer user prompts based on the content of the PDF documents.
The following diagram shows the architecture of the embedding solution:
The solution consists of the following components:
An S3 bucket to store the PDF documents
A DynamoDB table to store the metadata and status of the uploaded documents
A Lambda function to extract metadata from the PDF documents
A Lambda function to generate embeddings from the PDF documents
An SQS queue to store the PDF document processing requests
The embedding process is summarized below:
A PDF document is uploaded to the S3 bucket and an S3 event triggers the ExtractMetadata Lambda function.
The ExtractMetadata Lambda function extracts metadata from the PDF document and sends a message to the SQS queue with the metadata.
The GenerateEmbeddings Lambda function reads the message from the SQS queue and generates embeddings from the PDF document using LangChain and Amazon Bedrock. The embeddings are stored in the same S3 bucket.
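To make the flow concrete, here is a minimal sketch of what the GenerateEmbeddings function could look like. This is not the lab's packaged code: the queue message field, environment variable name, and chunking parameters are assumptions, and it uses the langchain-aws and langchain-community integrations.

```python
# Hypothetical sketch of the GenerateEmbeddings Lambda handler, not the lab's
# exact code. BUCKET_NAME and the "document_key" message field are assumptions.
import json
import os

import boto3
from langchain_aws import BedrockEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

s3 = boto3.client("s3")
BUCKET_NAME = os.environ["BUCKET_NAME"]  # assumed environment variable

def handler(event, context):
    for record in event["Records"]:  # one SQS record per processing request
        message = json.loads(record["body"])
        key = message["document_key"]  # hypothetical metadata field

        # Download the PDF so the loader can read it from local disk
        local_pdf = f"/tmp/{os.path.basename(key)}"
        s3.download_file(BUCKET_NAME, key, local_pdf)

        # Split the document into chunks and embed them with Amazon Titan
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
        chunks = splitter.split_documents(PyPDFLoader(local_pdf).load())
        embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
        index = FAISS.from_documents(chunks, embeddings)

        # save_local writes index.faiss and index.pkl; store them under a
        # folder named after the PDF, e.g. random.pdf/index.faiss
        index.save_local(f"/tmp/{key}")
        for name in ("index.faiss", "index.pkl"):
            s3.upload_file(f"/tmp/{key}/{name}", BUCKET_NAME, f"{key}/{name}")
```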
Instructions
Download the source project folder here
Create an S3 bucket with the name deploy-assets-[random five letters] and allow all public access.
Click the Explorer tab to open the file explorer:
Open the samconfig.toml file in your editor, then paste in the following configuration:
```toml
version = 0.1

[default.global.parameters]
stack_name = "ChatbotStack"

[default.deploy.parameters]
region = "us-west-2"
s3_bucket = "deploy-assets-*****"
s3_prefix = "ntbk-lm"
confirm_changeset = true
capabilities = "CAPABILITY_IAM"
tags = "project=\"ntbk-lab\" stage=\"development\""
```
The configuration above provides the AWS SAM CLI with the deployment parameters for your SAM application.
In the terminal, run the following command to deploy the application:
sam deploy --resolve-image-repos
Enter y when prompted to confirm the deployment:
The deployment may take up to two minutes to complete.
Stepping Through the PDF Chatbot Application Code
Introduction
In this step, you will walk through the Streamlit application code that provides a user interface and logic for the PDF chatbot application.
The application code has already been built and packaged into a Docker image. The Docker image will be deployed to an Amazon ECS Service using the AWS Serverless Application Model (SAM) CLI.
Conversational Retrieval Chains are used to create chatbots that can interact with PDF documents. These chains also allow for follow-up questions and context-aware responses, making them a viable solution for RAG-based chatbots.
Retrievers are LangChain components that retrieve documents from a vector store based on a prompt. The vector store, or index, for each uploaded PDF file, is created using the embedding solution in the previous lab step. The index files will be loaded into the chatbot application's memory and used to retrieve relevant documents based on the user prompt.
Instructions
Expand the src/pdf_chatbot directory in the Explorer tab and open the Main.py file:
This file contains the Streamlit application code that will be deployed to the ECS Service.
The streamlit framework is used to render the user interface and interact with the chatbot application. The langchain framework will be used along with boto3 to interact with the Amazon Bedrock service.
The notable LangChain components used in the application include:
LangChain Hub: LangChain Hub is a version control system for LLM prompts. You can import prompts to use in your applications. You will use a RAG prompt to generate responses for the chatbot.
BedrockLLM: LangChain's BedrockLLM is used to generate responses for the chatbot. The amazon.titan-text-express-v1 model will be used to generate text responses for the chatbot.
BedrockEmbeddings: This class generates embeddings for the uploaded PDF documents. For this lab, the amazon.titan-embed-text-v1 model will be used to generate embeddings for the uploaded documents.
FAISS: This class is used as the vector store to store the embeddings and create the index. FAISS is a library for efficient similarity search and clustering of dense vectors.
ConversationalRetrievalChain: A conversational retrieval chain uses a vector store to represent the documents in the embedding space and uses a retrieval model to retrieve the most relevant documents based on the query.
LangChain Debugging: Calling the set_debug function with True enables debugging mode for the LangChain framework.
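Taken together, the top of Main.py plausibly looks something like the snippet below. The exact package layout is an assumption (BedrockLLM and BedrockEmbeddings ship in langchain-aws, FAISS in langchain-community), so the packaged application may organize these differently.

```python
# Illustrative imports for the components listed above; a sketch, not the
# lab's exact Main.py.
import boto3
import streamlit as st
from langchain import hub
from langchain.chains import ConversationalRetrievalChain
from langchain.globals import set_debug
from langchain_aws import BedrockEmbeddings, BedrockLLM
from langchain_community.vectorstores import FAISS

set_debug(True)  # verbose LangChain logging while working through the lab
```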
The bucket_name variable references the S3 bucket that stores the PDF documents and their embeddings.
Three helper functions are defined in the Main.py file:
upload_to_s3: This function iterates through a list of objects and uploads them to the S3 bucket. The st.toast method is a Streamlit method that displays a message banner at the top of the Streamlit application.
stream_response: This function simulates a streaming response in the Streamlit application.
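As a reference, here is a hedged sketch of how these helpers might be implemented; the bucket variable, toast message, and streaming delay are assumptions rather than the lab's exact code:

```python
import os
import time

import boto3
import streamlit as st

s3 = boto3.client("s3")
bucket_name = os.environ["BUCKET_NAME"]  # injected by the ECS task definition

def upload_to_s3(files):
    """Upload each file-like object returned by st.file_uploader to S3."""
    for file in files:
        s3.upload_fileobj(file, bucket_name, file.name)
        st.toast(f"Uploaded {file.name}")  # banner at the top of the app

def stream_response(payload):
    """Yield the response word by word to simulate streaming in the UI."""
    for word in payload.split():
        yield word + " "
        time.sleep(0.04)
```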
The generate_chat function contains the main logic for the chatbot application. This function employs the retrieval-augmented generation technique to generate responses for the chatbot. It accepts a user prompt and a file_name to begin the conversation. The function performs the following steps:
36-37: The embedding solution referenced in the previous lab step creates a folder with the index files for each document. The index files generated for the uploaded PDF documents are loaded into memory.
40-44: The BedrockLLM is initialized with the amazon.titan-text-express-v1 model and AWS credentials. The Docker image used to deploy this application will have IAM roles attached to it through the Amazon ECS task role, removing the need to hardcode the AWS credentials.
47-51: The BedrockEmbeddings class is initialized with the amazon.titan-embed-text-v1 model. This embeddings object will be used to embed user prompts.
52-57: The index.faiss and index.pkl files are loaded into a FAISS index object. The object can then be used as a retriever to fetch documents from the index vector store relevant to the user prompt.
60: The hub.pull method is used to pull the RAG prompt from the LangChain Hub. The rlm/rag-prompt is used to generate responses for the chatbot.
63-68: The ConversationalRetrievalChain is initialized with the llm, retriever, and rag_prompt. The return_source_documents parameter is set to False to return the generated response only.
71-72: The conversation chain is invoked by passing in the question and chat_history arguments. Chat history is used to maintain the context of the conversation; however, it is not used in this lab.
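Condensed into code, the walkthrough above corresponds roughly to the following sketch. The comments reference the same code lines; details such as the retriever settings and the combine_docs_chain_kwargs wiring are assumptions rather than the lab's exact implementation:

```python
import os

import boto3
from langchain import hub
from langchain.chains import ConversationalRetrievalChain
from langchain_aws import BedrockEmbeddings, BedrockLLM
from langchain_community.vectorstores import FAISS

s3 = boto3.client("s3")
bucket_name = os.environ["BUCKET_NAME"]

def generate_chat(prompt, file_name):
    # 36-37: load the document's index files into memory
    index_dir = f"/tmp/{file_name}"
    os.makedirs(index_dir, exist_ok=True)
    for name in ("index.faiss", "index.pkl"):
        s3.download_file(bucket_name, f"{file_name}/{name}", f"{index_dir}/{name}")

    # 40-44: Titan text model; the ECS task role supplies credentials
    llm = BedrockLLM(model_id="amazon.titan-text-express-v1")

    # 47-51: embeddings object used to embed the user prompt
    embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")

    # 52-57: load the FAISS index and expose it as a retriever
    vector_store = FAISS.load_local(
        index_dir, embeddings, allow_dangerous_deserialization=True
    )
    retriever = vector_store.as_retriever()

    # 60: pull the shared RAG prompt from the LangChain Hub
    rag_prompt = hub.pull("rlm/rag-prompt")

    # 63-68: assemble the conversational retrieval chain
    conversation = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        combine_docs_chain_kwargs={"prompt": rag_prompt},
        return_source_documents=False,
    )

    # 71-72: invoke with the question; chat history stays empty in this lab
    result = conversation.invoke({"question": prompt, "chat_history": []})
    return result["answer"]
```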
The remaining code in the application is responsible for rendering the Streamlit application and handling user interactions:
The Chatbot application will utilize a Streamlit sidebar to upload and select PDF documents from the Amazon S3 bucket.
84-87: The first container in the sidebar uses the st.file_uploader method to upload one or more PDF documents to the S3 bucket. The upload_to_s3 helper function is called to upload the documents to the S3 bucket when the Upload button is clicked.
90-100: The second container in the sidebar calls the s3.list_objects_v2 method to retrieve all the objects in the bucket. The object list is filtered to only include the PDF file names.
102: The st.selectbox method renders a select box to display a list of uploaded PDF document names. The selected_obj variable stores the selected PDF document name.
Although the S3 bucket contains the original PDF documents and their embeddings, the generate_chat function only requires the PDF name. Remember that the index.faiss and index.pkl files are identified by the PDF name, for example, random.pdf/index.faiss and random.pdf/index.pkl.
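For orientation, here is a sketch of the sidebar logic just described; the widget labels and the PDF filter are assumptions, and it reuses the s3 client, bucket_name, and upload_to_s3 helper from the earlier sketches:

```python
with st.sidebar:
    # 84-87: upload one or more PDFs, pushed to S3 when Upload is clicked
    with st.container():
        files = st.file_uploader(
            "Upload a PDF file", type="pdf", accept_multiple_files=True
        )
        if st.button("Upload") and files:
            upload_to_s3(files)

    # 90-100: list the bucket's objects and keep only the PDF file names
    with st.container():
        response = s3.list_objects_v2(Bucket=bucket_name)
        pdf_names = [
            obj["Key"]
            for obj in response.get("Contents", [])
            if obj["Key"].endswith(".pdf")
        ]

    # 102: the selected PDF name drives the rest of the application
    selected_obj = st.selectbox("Select a PDF file", pdf_names, index=None)
```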
107: If the selected_obj is None, a caption is rendered prompting the user to select a PDF file.
109-112: Once a file is selected, the chat_history state is initialized in the Streamlit session state.
114-117: To render the chat history to the screen, this for loop iterates through each message in the session state chat_history and renders the message role and content to the screen. As you and the chatbot interact, the chat history will be updated and rendered in the application. Each message object in the chat_history state contains a role, content, and file attribute. The file attribute is used to identify the PDF document that the message is associated with.
119-120: The st.chat_input method renders a text input box for you to enter a prompt. The if prompt := ... syntax assigns the prompt variable and checks that the prompt is not empty. When you enter a prompt, a message object is initialized and appended to the chat_history state.
122: The chatbot renders your input prompt to the screen with the st.write method.
125: The generate_chat function is called with the prompt and selected_obj arguments.
127: The payload is passed to the stream_response helper function, then the response is rendered to the screen with the st.write_stream method.
130: The chatbot initializes and appends the assistant role's response to the chat_history session state with the payload as its content.
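Putting those pieces together, the chat flow might look like the sketch below, building on the earlier sketches; the prompts and captions are assumptions:

```python
if selected_obj is None:
    # 107: nudge the user to pick a document first
    st.caption("Select a PDF file to start chatting.")
else:
    # 109-112: initialize chat history once per session
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = []

    # 114-117: replay prior messages for the selected file
    for message in st.session_state.chat_history:
        if message["file"] == selected_obj:
            with st.chat_message(message["role"]):
                st.write(message["content"])

    # 119-122: accept a new prompt and echo it to the screen
    if prompt := st.chat_input("Ask a question about the PDF"):
        st.session_state.chat_history.append(
            {"role": "user", "content": prompt, "file": selected_obj}
        )
        st.chat_message("user").write(prompt)

        # 125-127: generate the RAG response and stream it to the screen
        payload = generate_chat(prompt, selected_obj)
        with st.chat_message("assistant"):
            st.write_stream(stream_response(payload))

        # 130: persist the assistant's reply in the session state
        st.session_state.chat_history.append(
            {"role": "assistant", "content": payload, "file": selected_obj}
        )
```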
Open the pdf_chatbot/Dockerfile file in the editor:
The Dockerfile contains the instructions to build the Docker image for the Streamlit application. The Docker image will be built with the Python dependencies defined in the pdf_chatbot/requirements.txt file.
The python:3.9-slim base image is used to build the Docker image. Beginning in the /app working directory, the curl package is installed. The COPY . . command copies the application code and dependencies into the environment, and the pip install -r requirements.txt command installs the Python dependencies.
A tmp directory is created to store the PDF document embeddings retrieved from the S3 bucket. Port 8501 is exposed to allow traffic to the Streamlit application, and a HEALTHCHECK is defined to check the health of the application.
The ENTRYPOINT instruction specifies the command that runs when the Docker container starts. The streamlit run Main.py command starts the Streamlit application.
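Based on this description, the Dockerfile plausibly resembles the sketch below; the health-check endpoint and the package cleanup step are assumptions:

```dockerfile
# Sketch of the described Dockerfile; the packaged image may differ slightly.
FROM python:3.9-slim

WORKDIR /app

# curl is required by the container health check below
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# Copy the application code and install the Python dependencies
COPY . .
RUN pip install -r requirements.txt

# Scratch space for the index files retrieved from S3
RUN mkdir -p tmp

EXPOSE 8501

HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health || exit 1

ENTRYPOINT ["streamlit", "run", "Main.py"]
```

Next, you will configure the Amazon Elastic Container Service (ECS) resources in the template.yaml file.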
Launching the Streamlit Application Using AWS Fargate
In this step, you will deploy the Streamlit application to Amazon ECS using AWS SAM.
Instructions
In the template.yaml file, add the following parameters above the Resources section:

```yaml
Parameters:
  ImageURI:
    Type: String
    Description: The URI of the image to deploy
    Default: "824911993206.dkr.ecr.us-west-2.amazonaws.com/pdf-chatbot-lab:latest"
  VPC:
    Type: String
    Default: "vpc-0cb9d42cfe8a023fb"
  PublicSubnet:
    Type: String
    Default: "subnet-0269991e2b1402394"
```
The Parameters section should be aligned with the Resources section in the template.yaml file.
The ECS cluster will be deployed into the default VPC and public subnet. The VPC and related resources have been created for you at the start of the lab.
The ImageURI parameter specifies the URI of the Docker image to deploy. In the interest of time, you will not build the Docker image in this lab. Instead, you will use a pre-built image.
Add the following resources to the Resources section at the bottom of the template.yaml file:

```yaml
  ECSSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow streamlit app traffic
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          IpProtocol: tcp
          FromPort: 8501
          ToPort: 8501
          Description: Streamlit app port
        - CidrIp: 0.0.0.0/0
          IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          Description: HTTP port
        - CidrIp: 0.0.0.0/0
          IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          Description: HTTPS port
      SecurityGroupEgress:
        - CidrIp: 0.0.0.0/0
          IpProtocol: "-1"
          Description: Allow all outbound
  ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: streamlit-cluster
  ECSService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref ECSCluster
      DesiredCount: 1
      TaskDefinition: !Ref ECSTaskDef
      LaunchType: FARGATE
      ServiceName: streamlit-service
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          SecurityGroups:
            - !Ref ECSSecurityGroup
          Subnets:
            - !Ref PublicSubnet
      DeploymentConfiguration:
        MaximumPercent: 200
        MinimumHealthyPercent: 100
        DeploymentCircuitBreaker:
          Enable: true
          Rollback: true
      DeploymentController:
        Type: ECS
      ServiceConnectConfiguration:
        Enabled: false
  ECSTaskDef:
    Type: AWS::ECS::TaskDefinition
    Properties:
      RequiresCompatibilities:
        - FARGATE
      Cpu: 1024
      Memory: 2048
      NetworkMode: awsvpc
      Family: streamlit-task-definition
      ExecutionRoleArn: !Sub "arn:aws:iam::${AWS::AccountId}:role/ECSExecutionRole"
      TaskRoleArn: !Sub "arn:aws:iam::${AWS::AccountId}:role/ECSTaskRole"
      ContainerDefinitions:
        - Name: streamlit
          Image: !Ref ImageURI
          PortMappings:
            - ContainerPort: 8501
              Protocol: tcp
              HostPort: 8501
              AppProtocol: http
              Name: streamlit-8501-tcp
          Essential: true
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-create-group: true
              awslogs-group: "/ecs/streamlit-task-definition"
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: ecs
          Environment:
            - Name: BUCKET_NAME
              Value: !Ref DocumentBucket
```
Ensure the new resources are aligned with the existing resources in the template.yaml file:
The Amazon ECS resources serve the following purposes:
ECSSecurityGroup: This resource defines the security group for the ECS service. It allows traffic on ports 8501, 80, and 443. The security group is associated with the ECS service which will allow traffic to the Streamlit application.
ECSCluster: This resource defines the ECS cluster where the ECS service will be deployed.
ECSService: This resource defines the ECS service that serves the Streamlit application. It specifies the desired count of tasks, the task definition to use, the launch type, the network configuration, and the deployment configuration. It is set to use the FARGATE launch type with a desired count of 1. The NetworkConfiguration specifies the public subnet to deploy the service into, along with the AssignPublicIp value set to ENABLED to assign public IP addresses to the tasks.
ECSTaskDef: This resource defines the ECS task definition used by the ECS service to deploy tasks. The ExecutionRoleArn is an IAM role that grants the ECS service permissions to pull the Docker image from Amazon ECR. The TaskRoleArn is an IAM role that grants the task permissions to access the S3 bucket, along with any other permissions required by the chatbot application. The ContainerDefinitions section specifies the PortMappings and LogConfiguration for the task. The port required by Streamlit is 8501, and the ECS task logs will be stored in the /ecs/streamlit-task-definition CloudWatch Log Group. The Environment section specifies the BUCKET_NAME variable passed into the Streamlit application.
In the terminal, run the following command to deploy the application:
sam deploy --resolve-image-repos
Enter y when prompted to confirm the deployment:
The update to the stack will take 3 to 4 minutes to complete.
Interacting With the PDF RAG Chatbot
In this step, you will interact with the PDF chatbot by uploading a PDF file and asking the chatbot a question. The chatbot will use the embeddings generated from the PDF file to provide answers to your questions.
Instructions
In the AWS Console, navigate to the Amazon ECS cluster page:
Click the streamlit-cluster name to view the cluster details:
This will take you to the details page, where you can access the ECS service, tasks, and other resources associated with the cluster.
Below the Cluster overview, select the Tasks tab:
The Tasks table should contain one task with the Running status.
Click the Task ID of the running task to view the task details.
This will take you to the task details page, where you can access the task logs and other information.
Scroll down to the Configuration section and copy the Public IP address of the task:
In a new browser tab, paste in the Public IP and append :8501 to the end of the URL to access the Streamlit application.
Example: http://54.202.1.112:8501
Download the sample PDF file below:
This PDF file contains a list of random, distorted facts about various topics. You will use this file to test the chatbot's ability to answer questions based on the content of the PDF file.
In the Upload a PDF file section in the sidebar, drag the downloaded PDF file into the drop zone or click Browse files to select the file:
Click Upload to upload the PDF file.
The chatbot will process the PDF file, generate embeddings for the document, and display the following banner when the processing is complete:
Select the Sample1.pdf file from the Select a PDF file dropdown:
Below the Chat with a PDF file header, a chat input box will appear once a file is selected:
Enter a question into the chat input field and press Enter to submit the question to the chatbot.
To test whether the chatbot can answer questions based on the content of the PDF file, try asking one of the following questions related to the sample PDF file:
What bear is best?
Give me a random fact about the presidents of the United States.
Today is Monday, what color is the sky?
What does AWS stand for?
The chatbot should return a distorted fact, rather than an accurate answer generated by the Amazon Titan model, demonstrating the RAG technique in action:
Optional: Upload additional PDF files and ask the chatbot questions based on the content of the uploaded files.
Summary
By completing this tutorial, you have successfully:
Employed the Retrieval-Augmented Generation (RAG) technique to generate answers to questions based on embeddings from a PDF document
Deployed a PDF chatbot application to an Amazon ECS service
Note: Replace all ARNs with the ARNs for your own roles and resources.