AI & Java: Integrating GPT with Spring Boot using Spring AI, Retrieval-Augmented Generation (RAG), and the PGVector Database
In today's dynamic software development landscape, staying ahead means integrating cutting-edge technologies seamlessly into our projects. Spring Boot, with its rapid application development capabilities, is a popular choice for building enterprise-level applications. When coupled with the power of artificial intelligence, particularly large language models (LLMs) and Retrieval-Augmented Generation (RAG), Spring Boot applications can reach new levels of efficiency and intelligence. In this blog post, we'll explore how to integrate Spring Boot with Spring AI, leveraging an LLM and Retrieval-Augmented Generation for enhanced functionality.
Introducing Spring AI
Spring AI combines the robustness of Spring Boot with the capabilities of artificial intelligence models. It enables developers to integrate AI seamlessly into their Spring Boot applications, opening doors to advanced functionality and improved user experiences.
Retrieval-Augmented Generation (RAG)
A technique termed Retrieval Augmented Generation (RAG) has emerged to address the challenge of incorporating relevant data into prompts for accurate AI model responses.
The approach involves a batch-processing style programming model, where the job reads unstructured data from your documents, transforms it, and then writes it into a vector database. At a high level, this is an ETL (Extract, Transform, and Load) pipeline. The vector database is used in the retrieval part of the RAG technique.
As part of loading the unstructured data into the vector database, one of the most important transformations is to split the original document into smaller pieces. The procedure of splitting the original document into smaller pieces has two important steps:
1. Split the document into parts while preserving the semantic boundaries of the content. For example, for a document with paragraphs and tables, avoid splitting in the middle of a paragraph or table; for code, avoid splitting in the middle of a method's implementation.
2. Split those parts further into pieces whose size is a small percentage of the AI model's token limit.
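The two-step splitting above can be sketched in plain Java. This is a simplified illustration, not Spring AI's TokenTextSplitter: it splits on blank lines (preserving paragraph boundaries) and then caps each piece at a character budget standing in for a token budget.

```java
import java.util.ArrayList;
import java.util.List;

public class NaiveSplitter {

    // Step 1: split on blank lines so paragraphs stay intact.
    // Step 2: cap each paragraph at maxChars (a stand-in for a token budget).
    public static List<String> split(String text, int maxChars) {
        List<String> chunks = new ArrayList<>();
        for (String paragraph : text.split("\\n\\s*\\n")) {
            String p = paragraph.strip();
            for (int i = 0; i < p.length(); i += maxChars) {
                chunks.add(p.substring(i, Math.min(p.length(), i + maxChars)));
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "First paragraph, short.\n\nSecond paragraph that is a little longer.";
        for (String chunk : split(doc, 25)) {
            System.out.println(chunk);
        }
    }
}
```

A real splitter would measure size in model tokens rather than characters, which is exactly what Spring AI's TokenTextSplitter (used later in this post) does.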
The next phase in RAG is processing user input. When a user’s question is to be answered by an AI model, the question and all the “similar” document pieces are placed into the prompt that is sent to the AI model. This is the reason to use a vector database. It is very good at finding similar content.
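The "similarity" the vector database computes is typically cosine similarity between embedding vectors. A minimal illustration in two dimensions (real OpenAI embeddings have 1536 dimensions, matching the table definition later in this post):

```java
public class CosineSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Values near 1.0 mean the vectors (and the texts they embed) are similar.
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0};
        double[] similarDoc = {0.9, 0.1};
        double[] unrelatedDoc = {0.0, 1.0};
        System.out.printf("similar:   %.2f%n", cosine(query, similarDoc));
        System.out.printf("unrelated: %.2f%n", cosine(query, unrelatedDoc));
    }
}
```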
There are several concepts that are used in implementing RAG. The concepts map onto classes in Spring AI:
- DocumentReader: A Java functional interface responsible for loading a List<Document> from a data source. Common data sources are PDF, Markdown, and JSON.
- Document: A text-based representation of your data source that also contains metadata to describe the contents.
- DocumentTransformer: Responsible for processing the data in various ways (for example, splitting documents into smaller pieces or adding additional metadata to the Document).
- DocumentWriter: Lets you persist the Documents into a database (most commonly, in the AI stack, a vector database).
- Embedding: A representation of your data as a List<Double> that the vector database uses to compute the "similarity" of a user's query to relevant documents.
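These concepts compose into the ETL pipeline described earlier. A plain-Java sketch of how they fit together (these are simplified stand-ins for illustration, not Spring AI's actual signatures):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

public class RagConcepts {

    // Document: a text-based piece of your data plus metadata describing it.
    public record Document(String content, Map<String, Object> metadata) {}

    // DocumentReader: loads a List<Document> from a data source.
    public interface DocumentReader extends Supplier<List<Document>> {}

    // DocumentTransformer: processes documents, e.g. splits them into pieces.
    public interface DocumentTransformer extends Function<List<Document>, List<Document>> {}

    // DocumentWriter: persists documents, e.g. into a vector database.
    public interface DocumentWriter extends Consumer<List<Document>> {}

    // Read -> transform: the first two stages of the ETL pipeline.
    public static List<Document> runPipeline() {
        DocumentReader reader =
                () -> List.of(new Document("hello world", Map.of("source", "demo")));
        DocumentTransformer splitter = docs -> docs; // identity "split" for the demo
        return splitter.apply(reader.get());
    }

    public static void main(String[] args) {
        // Write: the load stage, here just printing instead of a real vector store.
        DocumentWriter writer =
                docs -> System.out.println("wrote " + docs.size() + " document(s)");
        writer.accept(runPipeline());
    }
}
```

In Spring AI the same read-transform-write shape appears as reader.get() feeding a splitter, whose output is handed to the vector store.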
Steps to implement:
1. Load the document from a PDF source and insert it into the vector database using PostgreSQL.
2. Do a similarity search of the prompt string in the vector database and get a list of documents in response.
3. Use the list of documents in the prompt template and get the response from the LLM.
Inserting Document/PDF content into PGVector Database
What is PGvector?
PGvector is an open-source extension for PostgreSQL that enables storing and searching over machine learning-generated embeddings. It provides different capabilities that let users identify both exact and approximate nearest neighbors. It is designed to work seamlessly with other PostgreSQL features, including indexing and querying.
Prerequisites
OpenAI Account: Create an account at OpenAI Signup and generate the token at API Keys.
Access to a PostgreSQL instance with the pgvector extension. The docker-compose.yml shown later in this post runs a suitable Postgres/PGVector database locally in a Docker container.
On startup, the PgVectorStore will attempt to install the required database extensions and create the required vector_store table with an index. Optionally, you can do this manually like so:
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE IF NOT EXISTS vector_store (
id uuid DEFAULT uuid_generate_v4() PRIMARY KEY,
content text,
metadata json,
embedding vector(1536)
);
CREATE INDEX ON vector_store USING HNSW (embedding vector_cosine_ops);
Configuration
To set up PgVectorStore, you need to provide (via application.yaml) the connection settings for your PostgreSQL database.
Additionally, you’ll need to provide your OpenAI API Key. Set it as an environment variable like so:
export SPRING_AI_OPENAI_API_KEY='Your_OpenAI_API_Key'
Repository
To acquire Spring AI artifacts, declare the Spring Milestone and Snapshot repositories:
<repositories>
<repository>
<id>spring-milestones</id>
<name>Spring Milestones</name>
<url>https://repo.spring.io/milestone</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>spring-snapshots</id>
<name>Spring Snapshots</name>
<url>https://repo.spring.io/snapshot</url>
<releases>
<enabled>false</enabled>
</releases>
</repository>
</repositories>
Dependency Management
Use the Spring AI BOM in your application's build script so you don't have to specify and maintain the dependency versions yourself:
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-bom</artifactId>
<version>0.8.0</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
Dependencies
Add these dependencies to your project:
- PostgreSQL connection and JdbcTemplate auto-configuration:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<scope>runtime</scope>
</dependency>
- OpenAI (required for calculating embeddings), the PDF document reader, and the PGVector store starter:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pdf-document-reader</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>
- PGvector
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pgvector-store</artifactId>
</dependency>
Sample Code
To configure PgVectorStore in your application, add your datasource settings to application.yml (using your DB credentials):
spring:
  datasource:
    url: jdbc:postgresql://localhost/vector_store
    username: postgres
    password: postgres
To run Postgres with the pgvector extension in Docker, use this docker-compose.yml:
version: '3.7'
services:
  postgres:
    image: ankane/pgvector:v0.5.0
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=vector_store
      - PGPASSWORD=postgres
    logging:
      options:
        max-size: 10m
        max-file: "3"
    ports:
      - '5432:5432'
    healthcheck:
      test: "pg_isready -U postgres -d vector_store"
      interval: 2s
      timeout: 20s
      retries: 10
  pgadmin:
    container_name: pgadmin_container
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: ${PGADMIN_DEFAULT_EMAIL:-pgadmin4@pgadmin.org}
      PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_DEFAULT_PASSWORD:-admin}
    volumes:
      - ./servers.json:/pgadmin4/servers.json
    ports:
      - "${PGADMIN_PORT:-5050}:80"

Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project. This provides you with an implementation of the embeddings client. Then register the vector store as a bean:
@Bean
public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingClient embeddingClient) {
    return new PgVectorStore(jdbcTemplate, embeddingClient);
}
In your main code, load the document from the PDF and add it to the vector database:
@Autowired
VectorStore vectorStore;

@Autowired
JdbcTemplate jdbcTemplate;

/*
 * This PDF contains the context that we have to provide to the LLM
 * so that it can respond accordingly. It may contain the FAQs if the
 * application is to be used as a chatbot.
 */
@Value("${pdf.file}")
private Resource pdfResource;

public void retrainAI() {
    // Before storing, delete the existing embeddings
    jdbcTemplate.update("delete from vector_store");

    // Document reader config
    var config = PdfDocumentReaderConfig.builder()
            .withPageExtractedTextFormatter(new ExtractedTextFormatter.Builder()
                    .withNumberOfBottomTextLinesToDelete(3)
                    .withNumberOfTopPagesToSkipBeforeDelete(1)
                    .build())
            .withPagesPerDocument(1)
            .build();

    var pdfReader = new PagePdfDocumentReader(pdfResource, config);
    var textSplitter = new TokenTextSplitter();

    // Split the documents, embed them, and write them to the vector database
    vectorStore.accept(textSplitter.apply(pdfReader.get()));
}
Do a similarity search of the prompt string in the vector database and get a list of documents in response:
var listOfSimilarDocuments = this.vectorStore.similaritySearch(message);
var documents = listOfSimilarDocuments
        .stream()
        .map(Document::getContent)
        .collect(Collectors.joining(System.lineSeparator()));
Use the list of documents in the prompt template and get the response from the LLM
We'll need a template to feed the documents into:
private final String template = """
You're assisting with questions about services offered by India Insurance.
India Insurance is a life insurance company that offers a variety of products
and services. They also offer long-term care insurance, investments, retirement
plan services, institutional asset management, and annuities.
Use the information from the DOCUMENTS section to provide accurate answers but act as if you knew this information innately.
If unsure, simply state that you don't know.
DOCUMENTS:
{documents}
""";
Communicate with the AI chat client using the template and the documents:
var systemMessage = new SystemPromptTemplate(this.template)
.createMessage(Map.of("documents", documents));
var userMessage = new UserMessage(message);
var prompt = new Prompt(List.of(systemMessage, userMessage));
var aiResponse = aiClient.call(prompt);
return aiResponse.getResult().getOutput().getContent();
In the response, we'll get the AI's answer back, grounded in the context that we provided in the form of documents.
Voila!! The chatbot is ready.
We used the OpenAI library and the gpt-4 LLM here, but many others can be used, like Ollama, Amazon Bedrock, Vertex AI, etc.