Incorporating an LLM Batch (Ollama/OpenAI) for Auto-Generating Film Tags into Spring Boot
Introduction
In the DVD rental app, there was a requirement to add “mood tags” to films.
These tags capture a film's mood at a finer grain than categories: things like “Action,” “Heartwarming,” “Family-friendly,” “Horror.”
With about 1,000 films, manually tagging them isn’t practical.
So we built a batch process that sends film titles and descriptions to an LLM to auto-generate tags.
Overall Architecture
Spring Boot Batch Process
↓ Fetch untagged films from the film table
↓ Build prompt with title + description
↓ Send to LLM API (Ollama or OpenAI)
↓ Extract tags from response
↓ Save to the taste_tags column in the film table
LLM Provider Selection
For local development we used Ollama (free, privacy-safe); for production we used the OpenAI API.
| Provider | Pros | Cons |
|---|---|---|
| Ollama (llama3) | Free, offline, privacy-safe | Lower accuracy than OpenAI, slower |
| OpenAI (GPT-4o-mini) | High accuracy, fast | API cost, requires internet |
Incorporating the LLM Client into Spring Boot
Dependencies
<!-- pom.xml -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>
<!-- or Ollama -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>
Configuration
# application.yml (when using Ollama)
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3
---
# application-prod.yml (when using OpenAI)
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o-mini
Batch Processing Implementation
Tag Generation Service
import java.util.Arrays;
import java.util.List;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class FilmTagGenerationService {

    private final ChatClient chatClient;
    private final FilmTagMapper filmTagMapper;

    public FilmTagGenerationService(ChatClient.Builder builder, FilmTagMapper mapper) {
        this.chatClient = builder.build();
        this.filmTagMapper = mapper;
    }

    public void generateTagsForAllFilms(int batchSize) {
        // Fetch untagged films
        List<FilmForTagging> films = filmTagMapper.findFilmsWithoutTags(batchSize);
        log.info("Tag generation targets: {} films", films.size());
        for (FilmForTagging film : films) {
            try {
                String[] tags = generateTags(film);
                filmTagMapper.updateTasteTags(film.getFilmId(), tags);
                log.info("Tags applied: {} → {}", film.getTitle(), Arrays.toString(tags));
                // API rate limit countermeasure
                Thread.sleep(500);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop the batch instead of swallowing the interrupt
                Thread.currentThread().interrupt();
                log.warn("Tag generation interrupted: filmId={}", film.getFilmId());
                return;
            } catch (Exception e) {
                // A failure on one film is logged and the batch moves on to the next
                log.error("Tag generation error: filmId={}", film.getFilmId(), e);
            }
        }
    }

    // Sends one prompt per film and returns the parsed tags
    private String[] generateTags(FilmForTagging film) {
        String prompt = buildPrompt(film);
        String response = chatClient.prompt()
                .user(prompt)
                .call()
                .content();
        return parseTagsFromResponse(response);
    }

    private String buildPrompt(FilmForTagging film) {
        return String.format("""
                Generate up to 5 tags that describe the mood of the following film.
                Return the tags comma-separated. No extra explanation needed.
                Title: %s
                Description: %s
                Example output format: Action,Thrilling,Family-friendly,Exhilarating,Battle
                """,
                film.getTitle(),
                film.getDescription()
        );
    }

    // Splits on ASCII or Japanese commas and keeps at most 5 non-empty tags
    private String[] parseTagsFromResponse(String response) {
        return Arrays.stream(response.split("[,、]"))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .limit(5)
                .toArray(String[]::new);
    }
}
Batch Trigger Methods
Manual Execution Endpoint (Called from Admin Screen)
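import java.util.Map;

import lombok.RequiredArgsConstructor;
import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/admin/api/batch")
@PreAuthorize("hasRole('ADMIN')")
@RequiredArgsConstructor // Lombok generates the constructor that injects the final field
public class BatchController {

    private final FilmTagGenerationService tagGenerationService;

    @PostMapping("/generate-tags")
    public ResponseEntity<Map<String, Object>> generateTags(
            @RequestParam(defaultValue = "50") int batchSize) {
        // Runs synchronously: the response is sent only after the whole batch finishes
        tagGenerationService.generateTagsForAllFilms(batchSize);
        return ResponseEntity.ok(Map.of(
                "message", "Tag generation complete",
                // The service returns void, so this is the requested size, not the number actually tagged
                "requestedBatchSize", batchSize
        ));
    }
}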
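Because generateTagsForAllFilms returns void, the endpoint has no way to report how many films were actually tagged. One option, sketched here in plain Java without Spring, is to have the loop return a small result object. TagBatchRunner, BatchResult, and the Function standing in for the LLM call are hypothetical names introduced for illustration; the original service does not define them.

```java
import java.util.List;
import java.util.function.Function;

final class TagBatchRunner {

    // Hypothetical result type: how many films were tagged and how many failed
    record BatchResult(int succeeded, int failed) {
        int attempted() {
            return succeeded + failed;
        }
    }

    // tagger stands in for the LLM call; it may throw for individual titles
    static BatchResult run(List<String> titles, Function<String, String[]> tagger) {
        int succeeded = 0;
        int failed = 0;
        for (String title : titles) {
            try {
                String[] tags = tagger.apply(title);
                if (tags.length == 0) {
                    throw new IllegalStateException("no tags returned for " + title);
                }
                succeeded++;
            } catch (RuntimeException e) {
                // Same policy as the service loop: one bad film does not stop the batch
                failed++;
            }
        }
        return new BatchResult(succeeded, failed);
    }

    public static void main(String[] args) {
        BatchResult result = run(
                List.of("ACADEMY DINOSAUR", "ACE GOLDFINGER", ""),
                title -> title.isEmpty() ? new String[0] : new String[] {"Adventure"});
        System.out.println(result); // prints BatchResult[succeeded=2, failed=1]
    }
}
```

With a return value like this, the controller could report the succeeded count in its response rather than echoing the request parameter.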
Scheduled Execution (Nightly Batch)
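import lombok.RequiredArgsConstructor;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// @Scheduled only fires if @EnableScheduling is declared on a configuration class
@Component
@RequiredArgsConstructor
public class TagGenerationScheduler {
    private final FilmTagGenerationService tagGenerationService;

    @Scheduled(cron = "0 0 2 * * ?") // 2 AM every day
    public void scheduledTagGeneration() {
        tagGenerationService.generateTagsForAllFilms(100);
    }
}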
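# application.yml
# Scheduling is switched on by @EnableScheduling on a configuration class, not by a property.
# The setting below only sizes the scheduler thread pool (1 is the default).
spring:
  task:
    scheduling:
      pool:
        size: 1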
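With both a manual endpoint and a nightly schedule, two runs can overlap and send the same untagged films to the LLM twice. A minimal in-process guard, assuming a single application instance, might look like the sketch below. SingleRunGuard is a hypothetical helper, not part of Spring; a multi-instance deployment would need a shared lock instead.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SingleRunGuard {

    private final AtomicBoolean running = new AtomicBoolean(false);

    // Runs the task unless another run is in progress; returns whether it ran
    boolean tryRun(Runnable task) {
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            // Release the guard even if the task throws
            running.set(false);
        }
    }

    public static void main(String[] args) {
        SingleRunGuard guard = new SingleRunGuard();
        boolean outer = guard.tryRun(() -> {
            // A second trigger arriving mid-run is rejected
            boolean inner = guard.tryRun(() -> { });
            System.out.println("inner ran: " + inner); // prints "inner ran: false"
        });
        System.out.println("outer ran: " + outer); // prints "outer ran: true"
    }
}
```

Both the controller and the scheduler could route their call through one shared guard, so a manual trigger during the nightly run is skipped rather than doubled.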
Example Generated Results
| Film Title | Generated Tags |
|---|---|
| ACADEMY DINOSAUR | Educational, Adventure, Family-friendly, Heartwarming, Fun |
| ACE GOLDFINGER | Spy, Thrilling, Action, Mystery, Adult |
| AFFAIR PREJUDICE | Romance, Drama, Emotional, Deep, Adult |
About 1,000 films were tagged in roughly 8 minutes (Ollama with llama3, local environment).
Pain Points Encountered
① LLM Returns Output Outside the Specified Format
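The LLM sometimes prefixes its answer with a preamble, for example:
"Here are the generated tags: Action, Thrilling, Family-friendly"
With a plain comma split, the first tag then becomes "Here are the generated tags: Action".
Either extract just the tag portion with a regex, or pin down the output format with few-shot prompting.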
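For the regex route, one possible approach is to take the last non-empty line of the response and drop anything up to a colon before splitting. The sketch below is an assumption about how such cleanup might look, not the code used in the batch. TagResponseCleaner and extractTags are hypothetical names, and the heuristic only handles preambles that end in a colon.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagResponseCleaner {

    // Optional preamble ending in a colon (ASCII or full-width), then the tag list
    private static final Pattern PREAMBLE = Pattern.compile("^[^:：]*[:：]\\s*(.+)$");

    static List<String> extractTags(String response, int maxTags) {
        if (response == null || response.isBlank()) {
            return List.of();
        }
        // Use the last line of the trimmed response: preambles tend to come first
        String[] lines = response.strip().split("\\R");
        String line = lines[lines.length - 1].strip();
        Matcher m = PREAMBLE.matcher(line);
        if (m.matches()) {
            line = m.group(1);
        }
        return Arrays.stream(line.split("[,、]"))
                // Trim whitespace plus stray quotes or periods around each tag
                .map(s -> s.strip().replaceAll("^[\"'.]+|[\"'.]+$", ""))
                .filter(s -> !s.isEmpty())
                .distinct()
                .limit(maxTags)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(extractTags(
                "Here are the generated tags: Action, Thrilling, Family-friendly", 5));
        // prints [Action, Thrilling, Family-friendly]
    }
}
```

A tag that itself contains a colon would be cut by this heuristic, which is one reason to combine it with a stricter prompt.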
// Few-shot prompt example
String fewShotPrompt = """
        Instruction: Return mood tags for the film comma-separated.
        Example 1:
        Input: Die Hard (detective/action)
        Output: Action,Thrilling,Suspenseful,Adult,Exciting
        Example 2:
        Input: Toy Story (animated/children)
        Output: Family,Heartwarming,Adventure,Fun,Moving
        Input: %s (%s)
        Output:
        """.formatted(film.getTitle(), film.getDescription());
② API Cost Management
The OpenAI API is billed by token usage, so every call adds cost.
1,000 films × 1 call each comes to at most a few hundred yen with GPT-4o-mini, but costs can balloon if the batch is accidentally re-run over films that are already tagged.
// Skip films that already have tags
filmTagMapper.findFilmsWithoutTags(batchSize)
// → Query that fetches only films where taste_tags IS NULL
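Billing is driven by token counts, so a rough estimate needs per-film token figures and per-token prices. The sketch below works through that arithmetic. Every number in it is an assumption chosen for illustration (token counts per film, prices per million tokens, and the exchange rate), so check the current OpenAI pricing page before relying on it. TagBatchCostEstimate is a hypothetical helper.

```java
final class TagBatchCostEstimate {

    // All arguments are caller-supplied assumptions; nothing here queries real pricing
    static double estimateUsd(int films, int inputTokensPerFilm, int outputTokensPerFilm,
                              double usdPerMillionInput, double usdPerMillionOutput) {
        double inputTokens = (double) films * inputTokensPerFilm;
        double outputTokens = (double) films * outputTokensPerFilm;
        return inputTokens / 1_000_000 * usdPerMillionInput
                + outputTokens / 1_000_000 * usdPerMillionOutput;
    }

    public static void main(String[] args) {
        // Assumed: 1,000 films, ~200 prompt tokens and ~30 response tokens each,
        // USD 0.15 / 0.60 per million input / output tokens, 150 yen per dollar
        double usd = estimateUsd(1_000, 200, 30, 0.15, 0.60);
        System.out.printf("one pass: about %.3f USD (%.1f JPY)%n", usd, usd * 150);
        System.out.printf("ten accidental repeats: about %.1f JPY%n", usd * 150 * 10);
    }
}
```

Under these assumptions a single pass is cheap. The cost scales linearly with every repeat, which is why restricting the query to untagged films matters.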
Summary
- Spring AI makes it easy to incorporate an LLM client into Spring Boot
- The split between Ollama (free) for local development and the OpenAI API for production can be achieved with @Profile
- Fixing the output format with few-shot prompting in tag generation prompts makes results stable
- Keep batches “target only unprocessed” to maintain idempotency (same result no matter how many times it runs)
Using an LLM on data at a scale that manual work can’t handle lets you enrich content in a short time.