Building a Retrieval-Augmented Generation (RAG) System with Java 21, Quarkus and Vaadin

Connect a modern LLM backend to a clean Vaadin UI using Java 21 microservices
In this article, we'll build a modern LLM-powered Retrieval-Augmented Generation (RAG) system using Java 21, Quarkus for the backend, and Vaadin for a clean interactive UI. This end-to-end system demonstrates embedding, prompt engineering, and LLM integration in a microservices architecture.
What You'll Build
Two Java 21 microservices:

rag-service (Quarkus)
- Accepts a user query
- Generates embeddings (stubbed)
- Retrieves similar documents (mocked vector store)
- Calls the LLM (stubbed or a real OpenAI/GPT model)
- Returns the generated response

rag-ui (Vaadin)
- A clean web UI where users ask questions
- Sends questions to the backend
- Displays the AI-generated answers
Project Structure
rag-poc/
├── rag-service/   # Quarkus backend
└── rag-ui/        # Vaadin frontend
rag-service
A reactive RAG API built with Quarkus
pom.xml
<dependencies>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-resteasy-reactive</artifactId>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-jackson</artifactId>
</dependency>
</dependencies>
RagService.java
import java.util.List;

import io.smallrye.mutiny.Uni;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/rag")
@ApplicationScoped
public class RagService {

    @Inject EmbeddingService embeddingService;
    @Inject DocumentStore documentStore;
    @Inject LlmClient llmClient;

    @POST
    @Produces(MediaType.TEXT_PLAIN)
    @Consumes(MediaType.TEXT_PLAIN)
    public Uni<String> ask(String userQuestion) {
        return embeddingService.embed(userQuestion)
            .onItem().transformToUni(vector -> documentStore.searchSimilarDocuments(vector))
            .onItem().transformToUni(docs -> {
                String prompt = buildPrompt(userQuestion, docs);
                return llmClient.ask(prompt);
            });
    }

    private String buildPrompt(String question, List<String> docs) {
        String context = String.join("\n", docs);
        return """
            You are a helpful assistant. Use the context below to answer the question.

            Context:
            %s

            Question: %s
            Answer:""".formatted(context, question);
    }
}
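The prompt template itself needs nothing from Quarkus: it is plain Java 21, a text block plus String.formatted(). A standalone sketch you can run to inspect the prompt before it reaches the LLM (class name and sample strings are illustrative, not part of the project):

```java
import java.util.List;

public class PromptDemo {

    /** Mirrors the buildPrompt logic: joins retrieved docs into a context block. */
    static String buildPrompt(String question, List<String> docs) {
        String context = String.join("\n", docs);
        return """
            You are a helpful assistant. Use the context below to answer the question.

            Context:
            %s

            Question: %s
            Answer:""".formatted(context, question);
    }

    public static void main(String[] args) {
        String prompt = buildPrompt(
            "What is an embedding?",
            List.of("Doc 2: Embeddings convert text into vectors."));
        System.out.println(prompt);
    }
}
```

Printing the assembled prompt like this is a cheap way to catch template bugs (missing newlines, context in the wrong slot) before wiring in a real model.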
Stubbed Services
EmbeddingService.java
@ApplicationScoped
public class EmbeddingService {

    public Uni<float[]> embed(String text) {
        // Stub: returns a fixed vector; swap in a real embedding model later
        return Uni.createFrom().item(new float[] {0.1f, 0.2f, 0.3f});
    }
}
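A fixed vector means every query "looks the same" to the retriever. If you want the stub to at least be deterministic per input (identical questions retrieve identical documents, different questions usually differ), one throwaway approach is to derive a pseudo-vector from a hash of the text. This is not a real embedding, just a placeholder idea:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StubEmbedding {

    /** Derives a fixed-length pseudo-vector from a SHA-256 hash of the text. */
    public static float[] embed(String text) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(text.getBytes(StandardCharsets.UTF_8));
            float[] vector = new float[8];
            for (int i = 0; i < vector.length; i++) {
                // Map each byte to [0, 1) so the stub resembles a normalized embedding
                vector[i] = (hash[i] & 0xFF) / 256f;
            }
            return vector;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available on the JVM
        }
    }
}
```

Hash-derived vectors carry no semantic meaning, so similar questions do not get similar vectors; this only makes the stub's behavior reproducible while you develop the rest of the pipeline.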
DocumentStore.java
@ApplicationScoped
public class DocumentStore {

    public Uni<List<String>> searchSimilarDocuments(float[] queryVector) {
        // Mock vector store: ignores the query vector and returns canned documents
        return Uni.createFrom().item(List.of(
            "Doc 1: LLMs are models that generate text.",
            "Doc 2: Embeddings convert text into vectors."
        ));
    }
}
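When you swap the mock for a real store (or even a simple in-memory one), retrieval boils down to ranking documents by cosine similarity against the query vector. A minimal plain-Java sketch of that core step (class and method names are illustrative):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SimilaritySearch {

    /** Cosine similarity between two equal-length vectors. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns document texts sorted by descending similarity to the query. */
    static List<String> topMatches(float[] query, Map<String, float[]> docs) {
        return docs.entrySet().stream()
            .sorted(Comparator.comparingDouble(
                (Map.Entry<String, float[]> e) -> -cosine(query, e.getValue())))
            .map(Map.Entry::getKey)
            .toList();
    }
}
```

Dedicated vector databases like Qdrant or Weaviate do exactly this ranking, but with approximate-nearest-neighbor indexes so it stays fast at millions of documents.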
LlmClient.java
@ApplicationScoped
public class LlmClient {

    public Uni<String> ask(String prompt) {
        // Stub: returns a canned answer; replace with a real OpenAI/GPT call
        return Uni.createFrom().item("This is the answer based on the context.");
    }
}
rag-ui
A Vaadin frontend on Spring Boot
pom.xml
<dependencies>
<dependency>
<groupId>com.vaadin</groupId>
<artifactId>vaadin-spring-boot-starter</artifactId>
<version>24.4.0</version>
</dependency>
</dependencies>
<properties>
<java.version>21</java.version>
</properties>
RagUiApplication.java
@SpringBootApplication
public class RagUiApplication {

    public static void main(String[] args) {
        SpringApplication.run(RagUiApplication.class, args);
    }
}
MainView.java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import com.vaadin.flow.component.button.Button;
import com.vaadin.flow.component.notification.Notification;
import com.vaadin.flow.component.orderedlayout.VerticalLayout;
import com.vaadin.flow.component.textfield.TextArea;
import com.vaadin.flow.component.textfield.TextField;
import com.vaadin.flow.router.PageTitle;
import com.vaadin.flow.router.Route;

@Route("")
@PageTitle("RAG Assistant")
public class MainView extends VerticalLayout {

    // Reuse one client for all requests instead of creating one per call
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    private final TextField questionField = new TextField("Ask a question:");
    private final Button submitButton = new Button("Submit");
    private final TextArea resultArea = new TextArea("Answer");

    public MainView() {
        resultArea.setWidthFull();
        resultArea.setHeight("200px");
        resultArea.setReadOnly(true);
        submitButton.addClickListener(e -> askQuestion());
        add(questionField, submitButton, resultArea);
    }

    private void askQuestion() {
        String question = questionField.getValue();
        if (question == null || question.isBlank()) {
            Notification.show("Please enter a question");
            return;
        }
        try {
            String answer = sendQuestionToRagService(question);
            resultArea.setValue(answer);
        } catch (IOException ex) {
            resultArea.setValue("Error: " + ex.getMessage());
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            resultArea.setValue("Error: " + ex.getMessage());
        }
    }

    private String sendQuestionToRagService(String question) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8080/rag"))
            .header("Content-Type", "text/plain")
            .POST(HttpRequest.BodyPublishers.ofString(question))
            .build();
        HttpResponse<String> response = CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
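One caveat with the view above: CLIENT.send(...) blocks the Vaadin UI thread until the backend answers, which freezes the page on slow LLM calls. java.net.http also offers sendAsync, which returns a CompletableFuture; in a real Vaadin view you would combine it with @Push and ui.access(...) to update resultArea when the answer arrives. A sketch of just the client half, under those assumptions (the Vaadin push wiring is omitted):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class AsyncRagClient {

    private final HttpClient client = HttpClient.newHttpClient();
    private final URI endpoint;

    public AsyncRagClient(URI endpoint) {
        this.endpoint = endpoint;
    }

    /** Sends the question without blocking the calling thread. */
    public CompletableFuture<String> ask(String question) {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(endpoint)
            .header("Content-Type", "text/plain")
            .POST(HttpRequest.BodyPublishers.ofString(question))
            .build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
            .thenApply(HttpResponse::body);
    }
}
```

The caller would chain `.thenAccept(answer -> ui.access(() -> resultArea.setValue(answer)))` so the UI update happens safely on the Vaadin session.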
How to Run It
- Start the backend (rag-service):
  cd rag-service
  ./mvnw quarkus:dev
- Start the frontend (rag-ui). The Quarkus backend already occupies port 8080, so first set server.port=8081 in rag-ui's application.properties:
  cd rag-ui
  ./mvnw spring-boot:run
- Open your browser at: http://localhost:8081
Next Steps
Integrate real OpenAI embeddings and completions
Replace stubbed vector search with Qdrant or Weaviate
Add document upload + ingestion pipeline
Stream answers from the backend using Server-Sent Events (SSE)
Why This Matters
This architecture demonstrates how modern Java can play a significant role in building practical AI systems. By combining Quarkus for performant APIs and Vaadin for elegant UIs, you can create powerful, interactive RAG systems that leverage the best of LLMs, all in Java.
Repo Template?
Interested in a GitHub repo template or Docker Compose setup? Let me know in the comments!