HR Question and Answer Application with Retrieval Augmented Generation

Prakash Agrawal

Prerequisites for this tutorial (install these on your laptop):

  1. Download and install VS Code: Download Visual Studio Tools - Install Free for Windows, Mac, Linux (microsoft.com)

  2. Download and install Python: Download Python | Python.org

  3. Download and install the AWS CLI: Install or update to the latest version of the AWS CLI - AWS Command Line Interface (amazon.com)

  4. Install Boto3 (via pip) and the AWS Toolkit extension from within VS Code Extensions. A quick way to verify the installs is shown below.
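
Once these are installed, you can confirm each tool is available from a terminal. These checks are my own addition, not part of the original steps:

python --version
aws --version
pip show boto3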

  1. Create an IAM user and give it permissions for CLI access.

Attach the policy to the user.

Once the user is created, create an access key (security credential) so that you can access this AWS account from the CLI using the 'aws configure' command.

Once you are able to connect to the AWS account from the AWS CLI, use the command below to verify your connectivity.
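
The original post does not show the command itself; a common way to confirm the CLI can reach your account (my assumption of what was intended) is:

aws sts get-caller-identity

This prints the account ID, user ID, and ARN for the credentials you set up with 'aws configure', which confirms the connection works.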

  2. Install Anaconda Navigator.

  3. Launch VS Code from Anaconda Navigator and install the required packages using the following commands (see the note after this list for additional packages):

a. pip3 install flask-sqlalchemy

b. pip install pypdf

c. pip install faiss-gpu
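
Note: the backend and frontend code below also import langchain, langchain_community, and streamlit, so if they are not already present in your Anaconda environment you will likely need them as well (boto3 is used by the Bedrock clients under the hood):

pip install langchain langchain-community streamlit boto3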

Create the first file for the RAG backend: rag_backend.py

Close VS Code if it is already open, and reopen it only from Anaconda Navigator. Then create a file named rag_backend.py and add the following code.

# Import os and the LangChain document loader, text splitter, embeddings,
# vector store and Bedrock wrappers
import os

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS
from langchain.indexes import VectorstoreIndexCreator
from langchain.llms.bedrock import Bedrock


def hr_index():
    # Define the data source and load the HR policy document with the PDF loader
    data_load = PyPDFLoader('https://…./leave_policy_usa.pdf')

    # Optional check: load and split the PDF into pages and inspect the result
    data_test = data_load.load_and_split()
    print(len(data_test))
    print(data_test[2])

    # Split the text loaded from the PDF using RecursiveCharacterTextSplitter
    data_split = RecursiveCharacterTextSplitter(
        separators=["\n\n", "\n", " ", ""],
        chunk_size=100,
        chunk_overlap=10
    )

    # Optional check: split a sample string to see how the splitter behaves
    data_sample = ("Communication skills are crucial skills to excel in the academic world "
                   "exhibiting breakneck competition. It is becoming increasingly important "
                   "for students to cultivate exceptional oral and written communication "
                   "skills to stand out.")
    data_split_test = data_split.split_text(data_sample)
    print(data_split_test)

    # Create the embeddings client connection (Amazon Titan embeddings on Bedrock)
    data_embedding = BedrockEmbeddings(
        credentials_profile_name='default',
        model_id='amazon.titan-embed-text-v1'
    )

    # Create the vector DB, store the embeddings and build the index used for the search
    data_index = VectorstoreIndexCreator(
        text_splitter=data_split,
        embedding=data_embedding,
        vectorstore_cls=FAISS
    )

    # Create the index for the HR policy document
    db_index = data_index.from_loaders([data_load])
    return db_index


# Function to connect to the foundation model (Anthropic Claude v2 on Bedrock)
def hr_llm():
    llm = Bedrock(
        credentials_profile_name='default',
        model_id='anthropic.claude-v2',
        model_kwargs={
            "max_tokens_to_sample": 300,
            "temperature": 0.1,
            "top_p": 0.9
        }
    )
    return llm


# Query the index with the user's question, using the Bedrock LLM to generate the answer
def hr_rag_response(index, question):
    new_rag_llm = hr_llm()
    new_hr_rag_query = index.query(question=question, llm=new_rag_llm)
    return new_hr_rag_query
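
Before building the frontend, you can optionally sanity-check the backend from a small throwaway script. This is my own sketch (the file name and the sample question are hypothetical), and it assumes your 'default' AWS profile has access to the Titan embeddings and Claude v2 models in Bedrock:

# test_backend.py - hypothetical quick check for rag_backend.py
import rag_backend as demo

index = demo.hr_index()  # builds the FAISS index from the HR policy PDF
answer = demo.hr_rag_response(index=index,
                              question="How many paid leave days do employees get?")
print(answer)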

Create the frontend file, rag_frontend.py, with the following code.

import streamlit as st
import rag_backend as demo

st.set_page_config(page_title="Question and Answer with RAG")

# Page title rendered as styled HTML
page_title = '<p style="font-family:sans-serif; color:Green; font-size: 45px;">Question and Answer with RAG</p>'
st.markdown(page_title, unsafe_allow_html=True)

# Build the vector index once and cache it in the session state
if 'vector_index' not in st.session_state:
    with st.spinner("Wait, your page is being loaded"):
        st.session_state.vector_index = demo.hr_index()

input_text = st.text_area("Input_Text", label_visibility="collapsed")
button1 = st.button("Generate Question and Answer Response", type="primary")

if button1:
    with st.spinner("I am lucky, I want to learn AI more and more"):
        response_content = demo.hr_rag_response(index=st.session_state.vector_index,
                                                question=input_text)
        st.write(response_content)

Once this frontend file is created, you can run it using the command:

streamlit run rag_frontend.py

Once you run the command, the HR Q&A application will open in your local browser.

About me: I am an independent technical writer. If you are an organization that wants to hire me, I can be contacted at techonlinewriter@gmail.com.
