Building Your Simple ChatBot API with Java Spring AI

Yogesh Bali
4 min read · Feb 2, 2025


Spring AI is built on Spring Framework 6.0 and requires Spring Boot 3.x.

Spring AI is an application framework designed for AI engineering. Its aim is to bring the Spring ecosystem’s core principles, such as portability and modular design, to the AI domain. It also encourages the use of POJOs (Plain Old Java Objects) as the fundamental building blocks for AI applications.

At its core, Spring AI addresses the fundamental challenge of AI integration: connecting your enterprise data and APIs with AI models.

Features

  • Support for Major AI Model Providers: Includes providers such as Anthropic, OpenAI, Microsoft, Amazon, Google, and Ollama. Supported model types include chat completion, embeddings, text-to-image, audio transcription, and text-to-speech.
  • Portable API Support: Seamlessly supports both synchronous and streaming API options across various AI providers.
  • Structured Outputs: Maps AI model outputs to POJOs for easy integration and handling.
  • Vector Database Support: Compatibility with major vector databases, including Apache Cassandra, Azure Vector Search, Chroma, Milvus, MongoDB Atlas, Neo4j, Oracle, PostgreSQL/PGVector, Pinecone, Qdrant, Redis, and Weaviate.
  • Observability: Provides valuable insights into AI-related operations for better monitoring and debugging.
  • ChatClient API: A fluent API for interacting with AI chat models, similar in style to WebClient and RestClient APIs.
  • Support for Chat Conversation Memory & Retrieval-Augmented Generation (RAG): Enables enhanced chat interactions and knowledge retrieval.
  • Spring Boot Auto Configuration and Starters: Easily configure and start AI models and vector stores through start.spring.io, selecting your desired model or vector store.
Spring AI-Driven Application Architecture

Ollama

Ollama is a platform that provides a tool to run large language models (LLMs) locally on your machine, rather than relying on cloud-based services. Since everything runs locally, you have more control over the data you interact with, which can be a key consideration for applications where privacy is important.

Download and install Ollama from https://ollama.com/

Ollama listens on port 11434 by default:

http://127.0.0.1:11434/

To list the AI models installed locally, run:

ollama list

Mistral

Mistral is an open-source, high-performance large language model (LLM) designed for natural language processing tasks. It is known as a lightweight, efficient model that delivers strong performance across a variety of AI applications.

Open a terminal (on Windows, the Command Prompt as administrator) and run the following command to install Mistral:

ollama run mistral

Installing Mistral

Spring Boot Web Application with Spring AI

Generate the project at start.spring.io and include at least the Ollama and Spring Web (REST API) dependencies.
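If you are adding Spring AI to an existing build instead of generating a fresh project, the Maven fragment looks roughly like the sketch below. Note that Spring AI starter artifact ids have changed between releases, so verify the artifact id against the Spring AI version you are using; the name shown here matches the 1.0 milestone releases available around the time this article was written.

```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Artifact id may differ in newer Spring AI releases; check the docs -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
</dependencies>
```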

Edit the project's application.properties:

spring.ai.ollama.base-url=http://127.0.0.1:11434
spring.ai.ollama.chat.options.model=mistral
spring.ai.ollama.chat.options.temperature=0.7

  • spring.ai.ollama.base-url sets the base URL of the Ollama service that the application will communicate with.
  • spring.ai.ollama.chat.options.model specifies the AI model to use when sending chat requests to Ollama.
  • spring.ai.ollama.chat.options.temperature sets the temperature for the AI model's responses. The temperature value controls the randomness and creativity of the model's output.
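To build some intuition for what the temperature option does, here is a tiny self-contained sketch of temperature-scaled softmax (illustrative only, not Spring AI code): a language model samples its next token from a probability distribution over candidates, and the temperature divides the raw scores before that distribution is formed. Low temperatures sharpen the distribution (more deterministic output); high temperatures flatten it (more creative, more random).

```java
import java.util.Arrays;

public class TemperatureDemo {
    // Temperature-scaled softmax: divides each logit by the temperature
    // before exponentiating. T < 1 sharpens the distribution (sampling
    // becomes near-deterministic); T > 1 flattens it (more randomness).
    public static double[] softmax(double[] logits, double temperature) {
        double[] scaled = Arrays.stream(logits).map(l -> l / temperature).toArray();
        double max = Arrays.stream(scaled).max().orElse(0.0); // numerical stability
        double[] exp = Arrays.stream(scaled).map(s -> Math.exp(s - max)).toArray();
        double sum = Arrays.stream(exp).sum();
        return Arrays.stream(exp).map(e -> e / sum).toArray();
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5};
        System.out.println(Arrays.toString(softmax(logits, 0.2))); // near one-hot
        System.out.println(Arrays.toString(softmax(logits, 2.0))); // much flatter
    }
}
```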
package com.example.HelloWorldOllamaSpringAi.Prompt;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Flux;

@RestController
public class PromptController {

    private ChatClient.Builder chatClientBuilder;

    @Autowired
    public PromptController(ChatClient.Builder chatClientBuilder) {
        this.chatClientBuilder = chatClientBuilder;
    }

    @GetMapping("/spring-ai/prompt")
    public Flux<String> promptResponse(@RequestParam("message") String message) {
        // Build the chat client
        ChatClient chatClient = chatClientBuilder.build();

        // Send the message prompt and get the response
        Flux<String> response = chatClient.prompt(message).stream().content();

        // Return the response
        return response;
    }
}

Code Breakdown:

private ChatClient.Builder chatClientBuilder;

This is a builder class specifically designed for constructing a ChatClient object. ChatClient is the core class responsible for communicating with the AI service (such as Ollama). It handles the details of sending prompts, receiving responses, and managing the connection to the service.

public Flux<String> promptResponse(@RequestParam("message") String message)

This is the method that handles incoming GET requests to /spring-ai/prompt. It accepts a message parameter from the query string and returns a Flux<String>, a reactive stream that emits the AI model's response in multiple String chunks.
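Once the application is running, any HTTP client can call this endpoint. The JDK-only sketch below (class name, helper method, and base URL are illustrative, not part of the article's project) builds a correctly URL-encoded request URI for the endpoint, since a raw message with spaces or punctuation would otherwise produce an invalid query string:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PromptUriBuilder {
    // Builds the GET request URI for the /spring-ai/prompt endpoint,
    // URL-encoding the message so spaces and symbols survive transit.
    public static URI buildPromptUri(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/spring-ai/prompt?message=" + encoded);
    }

    public static void main(String[] args) {
        // The resulting URI can be opened in a browser or passed to
        // java.net.http.HttpClient to consume the streamed response.
        System.out.println(buildPromptUri("http://localhost:8080", "Why is the sky blue?"));
    }
}
```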

Flux<String> response = chatClient.prompt(message).stream().content();

This line sends the message to the AI service through the ChatClient and retrieves the response. Let's break this down:

  • chatClient.prompt(message): Sends the message to the AI model as the prompt.
  • .stream(): Requests a streaming response, so the model's output arrives incrementally rather than as one complete block.
  • .content(): Extracts the text content of each streamed chunk. It returns a Flux<String>, a reactive stream that emits the parts of the response as they arrive, which is what enables a chat-like experience where the answer appears piece by piece over time.
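To see why the return type is a stream of parts rather than a single String, here is a plain-Java analogy (no Reactor or Spring AI required, purely illustrative): a streaming chat endpoint emits the answer as token chunks, and the client assembles them in arrival order into the full text.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ChunkStreamDemo {
    // Simulates assembling the chunked output of a streaming chat
    // endpoint: each stream element is a partial piece of the answer,
    // and joining them in order reconstructs the full response.
    public static String assemble(Stream<String> chunks) {
        return chunks.collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("Spring ", "AI ", "is ", "amazing");
        System.out.println(assemble(chunks.stream())); // prints "Spring AI is amazing"
    }
}
```

A real Flux<String> adds asynchrony and backpressure on top of this idea, but the assembly of ordered partial chunks is the same.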
Spring AI is amazing

Written by Yogesh Bali

Senior Technical Lead at Thales