Building a Custom Chatbot with Azure OpenAI, Azure AI Search, and LangChain4j

Taehyeong Lee
Jun 05, 2024

Overview

  • Developing custom chatbots with LLMs and RAG is currently one of the hottest topics in the industry. This article outlines how to deploy the GPT-4o model on securely isolated Azure infrastructure, integrate it with an equally isolated Azure AI Search service, and build a custom chatbot using the LangChain4j library.

Steps

  • Create Azure OpenAI Instance

  • Create Azure OpenAI Deployment

  • Create Azure AI Search Service

  • Store custom data in Azure AI Search

  • Develop a custom chatbot

Creating Azure OpenAI Instance

  • To deploy OpenAI models on your own infrastructure, create an Azure OpenAI instance in your preferred region.
Microsoft Azure Portal
→ [Azure OpenAI]
→ [Create Azure OpenAI]

# [1] Basics
# Project Details
→ Select Subscription: {your-subscription}
→ Resource group: [Create new] → Enter Name: {your-resource-group} → [OK]
# Instance Details
→ Select Region: {your-region}
→ Enter name: {your-instance-name}
→ Select Pricing tier: {your-pricing-tier}
→ [Next]

# [2] Network
→ Select Type: [All networks, including the internet, can access this resource.]
→ [Next]
→ [Next]

# [4] Review + submit
→ [Create]

Creating Azure OpenAI Deployment: GPT-4o

  • Deploy GPT-4o, OpenAI's latest conversational multimodal model, as the LLM that will answer the questions.
Microsoft Azure Portal
→ [Azure OpenAI]
→ {your-instance}
→ [Model deployments]
→ [Manage Deployments]

# Azure AI Studio
→ [Create new deployment]

# Deploy model
→ Select a model: [gpt-4o]
→ Model version: [2024-05-13]
→ Select Deployment type: [Standard]
→ Enter Deployment name: {your-deployment-name}
→ Tokens per Minute Rate Limit (thousands): 150K
→ Select Enable Dynamic Quota: [Enabled]
→ [Create]

Creating Azure OpenAI Deployment: text-embedding-ada-002

  • Create the embedding model responsible for converting the original text into vectors. Here, text-embedding-ada-002, OpenAI's 1,536-dimension text embedding model, is used.
Microsoft Azure Portal
→ [Azure OpenAI]
→ {your-instance}
→ [Model deployments]
→ [Manage Deployments]

# Azure AI Studio
→ [Create new deployment]

# Deploy model
→ Select a model: [text-embedding-ada-002]
→ Model version: [2]
→ Select Deployment type: [Standard]
→ Enter Deployment name: {your-deployment-name}
→ Tokens per Minute Rate Limit (thousands): 350K
→ Select Enable Dynamic Quota: [Enabled]
→ [Deploy]

Creating Azure AI Search Service

  • Create Azure AI Search to use as the RAG repository and search engine.
Microsoft Azure Portal
→ [Azure AI Search]
→ [Create Azure AI Search]

# [1] Basics
# Project Details
→ Select Subscription: {your-subscription}
→ Select Resource Group: {your-resource-group}

# Instance Details
→ Enter Service name: {your-ai-search-name}
→ Location: {your-region}

→ Select Pricing tier: {your-pricing-tier}
→ [Next]

→ [Create]
  • When selecting the Pricing tier, note that the Semantic ranker feature used for Hybrid Search with re-ranking requires the Basic tier or higher.

Creating Azure AI Search Index

  • Create an index to store RAG data.
Microsoft Azure Portal
→ [AI Search]
→ {your-ai-search}
→ [Add index]
→ [Add index (JSON)]
→ (Paste below JSON content)
→ [Save]
  • Below is the Azure AI Search index template used by LangChain4j, modified so that the content field, which holds the original text, can store up to the maximum size allowed per document (16 MB) instead of the default 32,766 bytes.
{
  "name": "{your-ai-search-index-name}",
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "stored": true,
      "sortable": true,
      "facetable": true,
      "key": true,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "vectorEncoding": null,
      "synonymMaps": []
    },
    {
      "name": "content",
      "type": "Edm.String",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "stored": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "vectorEncoding": null,
      "synonymMaps": []
    },
    {
      "name": "content_vector",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "stored": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": 1536,
      "vectorSearchProfile": "vector-search-profile",
      "vectorEncoding": null,
      "synonymMaps": []
    },
    {
      "name": "metadata",
      "type": "Edm.ComplexType",
      "fields": [
        {
          "name": "source",
          "type": "Edm.String",
          "searchable": true,
          "filterable": true,
          "retrievable": true,
          "stored": true,
          "sortable": true,
          "facetable": true,
          "key": false,
          "indexAnalyzer": null,
          "searchAnalyzer": null,
          "analyzer": null,
          "normalizer": null,
          "dimensions": null,
          "vectorSearchProfile": null,
          "vectorEncoding": null,
          "synonymMaps": []
        },
        {
          "name": "attributes",
          "type": "Collection(Edm.ComplexType)",
          "fields": [
            {
              "name": "key",
              "type": "Edm.String",
              "searchable": true,
              "filterable": true,
              "retrievable": true,
              "stored": true,
              "sortable": false,
              "facetable": true,
              "key": false,
              "indexAnalyzer": null,
              "searchAnalyzer": null,
              "analyzer": null,
              "normalizer": null,
              "dimensions": null,
              "vectorSearchProfile": null,
              "vectorEncoding": null,
              "synonymMaps": []
            },
            {
              "name": "value",
              "type": "Edm.String",
              "searchable": true,
              "filterable": true,
              "retrievable": true,
              "stored": true,
              "sortable": false,
              "facetable": true,
              "key": false,
              "indexAnalyzer": null,
              "searchAnalyzer": null,
              "analyzer": null,
              "normalizer": null,
              "dimensions": null,
              "vectorSearchProfile": null,
              "vectorEncoding": null,
              "synonymMaps": []
            }
          ]
        }
      ]
    }
  ],
  "scoringProfiles": [],
  "corsOptions": null,
  "suggesters": [],
  "analyzers": [],
  "normalizers": [],
  "tokenizers": [],
  "tokenFilters": [],
  "charFilters": [],
  "encryptionKey": null,
  "similarity": {
    "@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
    "k1": null,
    "b": null
  },
  "semantic": {
    "defaultConfiguration": "semantic-search-config",
    "configurations": [
      {
        "name": "semantic-search-config",
        "prioritizedFields": {
          "titleField": null,
          "prioritizedContentFields": [
            {
              "fieldName": "content"
            }
          ],
          "prioritizedKeywordsFields": [
            {
              "fieldName": "content"
            }
          ]
        }
      }
    ]
  },
  "vectorSearch": {
    "algorithms": [
      {
        "name": "vector-search-algorithm",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,
          "efConstruction": 400,
          "efSearch": 500
        },
        "exhaustiveKnnParameters": null
      }
    ],
    "profiles": [
      {
        "name": "vector-search-profile",
        "algorithm": "vector-search-algorithm",
        "vectorizer": null,
        "compression": null
      }
    ],
    "vectorizers": [],
    "compressions": []
  }
}
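  • If you prefer to script the index creation instead of pasting the JSON into the portal, the same definition can be pushed through the Azure AI Search REST API. Below is a minimal Kotlin sketch using the JDK HttpClient; the local file name azure-ai-search-index.json and the api-version value are assumptions, so adjust them to your environment.
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import java.nio.file.Files
import java.nio.file.Paths

fun main() {
    // Index definition saved from the JSON template above (assumed local file name)
    val indexJson = Files.readString(Paths.get("azure-ai-search-index.json"))

    // Create or update the index through the Azure AI Search REST API
    val request = HttpRequest.newBuilder()
        .uri(URI.create("https://{your-azure-ai-search-service-name}.search.windows.net/indexes/{your-ai-search-index-name}?api-version=2024-07-01"))
        .header("Content-Type", "application/json")
        .header("api-key", "{your-azure-ai-search-admin-key}")
        .PUT(HttpRequest.BodyPublishers.ofString(indexJson))
        .build()

    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    // 201 Created on first creation, 204 No Content on update
    println("Status: ${response.statusCode()}")
}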

build.gradle.kts

  • Infrastructure preparation is now complete. Add the LangChain4j dependencies to the project for the actual coding work.
val langChain4jVersion = "0.35.0"
dependencies {
    implementation("dev.langchain4j:langchain4j-core:$langChain4jVersion")
    implementation("dev.langchain4j:langchain4j-embeddings:$langChain4jVersion")
    implementation("dev.langchain4j:langchain4j-easy-rag:$langChain4jVersion")
    implementation("dev.langchain4j:langchain4j-open-ai:$langChain4jVersion")
    implementation("dev.langchain4j:langchain4j-azure-open-ai:$langChain4jVersion")
    implementation("dev.langchain4j:langchain4j-azure-ai-search:$langChain4jVersion")
}

Creating Embedding, LLM Model, and RAG Objects

  • LangChain4j supports the Azure OpenAI and Azure AI Search ecosystems out of the box. The related objects can be created as shown below.
// Create EmbeddingModel object for Azure OpenAI text-embedding-ada-002
val embeddingModel: EmbeddingModel = AzureOpenAiEmbeddingModel.builder()
    .endpoint("https://{your-azrue-open-ai-text-embedding-ada-002-deployment-name}.openai.azure.com")
    .serviceVersion("2023-05-15")
    .apiKey("{your-azure-openai-instance-api-key}")
    .deploymentName("{your-azrue-open-ai-text-embedding-ada-002-deployment-name}")
    .build()

// Create Document Splitter object
val documentSplitter = DocumentSplitters.recursive(
    8191,
    256,
    OpenAiTokenizer("gpt-4o-2024-05-13")
)

// Create ContentRetriever object for Azure AI Search with Hybrid Search applied
// (declared with its concrete type so it can also be used as an EmbeddingStore when ingesting data below)
val contentRetriever: AzureAiSearchContentRetriever = AzureAiSearchContentRetriever.builder()
    .endpoint("https://{your-azure-ai-search-service-name}.search.windows.net")
    .apiKey("{your-azure-ai-search-admin-key}")
    .dimensions(1536)
    .indexName("{your-azure-ai-search-index-name}")
    .createOrUpdateIndex(false)
    .embeddingModel(embeddingModel)
    .queryType(AzureAiSearchQueryType.HYBRID_WITH_RERANKING)
    .maxResults(50)
    .minScore(0.0)
    .build()

// Create ChatLanguageModel object for Azure OpenAI GPT-4o
val chatLanguageModel = AzureOpenAiChatModel.builder()
    .endpoint("https://{your-azrue-open-ai-gpt-4o-deployment-name}.openai.azure.com")
    .apiKey("{your-azure-openai-instance-api-key}")
    .deploymentName("{your-azrue-open-ai-gpt-4o-deployment-name}")
    .serviceVersion("2024-02-01")
    .timeout(Duration.ofSeconds(360))
    .temperature(0.3)
    .topP(0.3)
    .build()
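  • Before moving on, it is worth verifying that the embedding deployment is reachable and returns vectors of the expected size. Below is a small sanity-check sketch that embeds an arbitrary sample sentence with the embeddingModel object created above.
// Sanity check: embed a sample sentence and confirm the vector dimension matches the index schema (1536)
val sampleEmbedding = embeddingModel.embed("Hello, Azure OpenAI!").content()
println("Embedding dimension: ${sampleEmbedding.dimension()}")
check(sampleEmbedding.dimension() == 1536) { "Unexpected embedding dimension" }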

Storing Custom Data in Azure AI Search

  • Azure AI Search can store the original text and its vector representation (a multi-dimensional floating-point array) together as a single document. This is what enables Hybrid Search, which combines vector-based semantic search with keyword search.

  • Below is an example of cloning the LangChain4j library source code and storing it in Azure AI Search. First, clone the related repositories in the project root directory.

# Clone LangChain4j library repository
$ git clone https://github.com/langchain4j/langchain4j.git
$ git clone https://github.com/langchain4j/langchain4j-embeddings.git
$ git clone https://github.com/langchain4j/langchain4j-examples.git
  • Convert the cloned source code files to Vector data using the text-embedding-ada-002 embedding model and store them with the original text in Azure AI Search.
// Store only the source code files of the cloned LangChain4j library in Azure AI Search
val embeddings: MutableList<Pair<Embedding, TextSegment>> = mutableListOf()
arrayOf("langchain4j", "langchain4j-embeddings", "langchain4j-examples").forEach { directory ->
    Files.walk(Paths.get(directory)).use { paths ->
        paths.filter {
            Files.isRegularFile(it) && arrayOf(
                "md",
                "xml",
                "gradle",
                "kts",
                "java",
                "kt"
            ).contains(it.fileName.toString().substringAfterLast('.', ""))
        }
            .forEach { path ->
                val document = Document.document(path.toFile().readText())
                val segments = documentSplitter.split(document)
                segments.forEach { segment ->
                    embeddings.add(Pair(embeddingModel.embed(segment).content(), segment))
                }
            }
    }
    try {
        contentRetriever.addAll(embeddings.map { it.first }, embeddings.map { it.second })
    } catch (ex: IndexBatchException) {
        ex.indexingResults.filter { !it.isSucceeded }.forEach {
            // Print error messages that occurred during storage in Azure AI Search
            println(it.errorMessage)
        }
    }
    // Clear the buffer so segments from this repository are not re-indexed with the next one
    embeddings.clear()
}
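  • To confirm that the ingestion worked, you can check the number of documents stored in the index with the azure-search-documents client, which langchain4j-azure-ai-search already pulls in transitively. The snippet below is an optional sanity-check sketch reusing the placeholders from earlier.
import com.azure.core.credential.AzureKeyCredential
import com.azure.search.documents.SearchClientBuilder

// Count the documents currently stored in the Azure AI Search index
val searchClient = SearchClientBuilder()
    .endpoint("https://{your-azure-ai-search-service-name}.search.windows.net")
    .credential(AzureKeyCredential("{your-azure-ai-search-admin-key}"))
    .indexName("{your-azure-ai-search-index-name}")
    .buildClient()
println("Indexed documents: ${searchClient.documentCount}")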

Querying LLM Model with RAG Integration

  • Now that the LangChain4j-related source code is stored in Azure AI Search, you can retrieve the relevant content with RAG and include it in the LLM prompt when asking a question.
// [1] Write LLM question
val question = "Please explain in detail about Hybrid Search combining Semantic Search and Keyword Search from RAG. Also, write an example of storing and querying data in Azure AI Search using LangChain4j."

// [2] Acquire Hybrid Search results with Re-ranking applied from Azure AI Search
val contents: List<Content> = contentRetriever.retrieve(Query.from(question))

// [3] Acquire LLM answer
val aiMessage = chatLanguageModel.generate(
    SystemMessage(
"""
You are a coding assistant who helps guide the usage of the LangChain4j library and writes examples. Please create the examples in the Kotlin language. Refer to the Information below to answer the questions.

Information: \\\
${contents.take(25).joinToString("\n\n") { it.textSegment().text() }}
\\\
""".trimIndent()
    ),
    UserMessage(question)
)

// [4] Print LLM answer
println(aiMessage.content().text())
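  • As an alternative to assembling the prompt manually as above, LangChain4j's AiServices can wire the chat model and the content retriever together behind a plain interface. Below is one possible setup reusing the chatLanguageModel and contentRetriever objects created earlier; the LangChain4jAssistant interface name is made up for this example.
import dev.langchain4j.service.AiServices
import dev.langchain4j.service.SystemMessage

// The implementation of this interface is generated by LangChain4j at runtime
interface LangChain4jAssistant {
    @SystemMessage("You are a coding assistant who helps guide the usage of the LangChain4j library and writes examples in the Kotlin language.")
    fun chat(message: String): String
}

// RAG results from the ContentRetriever are injected into the prompt automatically
val assistant = AiServices.builder(LangChain4jAssistant::class.java)
    .chatLanguageModel(chatLanguageModel)
    .contentRetriever(contentRetriever)
    .build()

println(assistant.chat("How do I store and query data in Azure AI Search using LangChain4j?"))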
