Tech Blog

Incorporating an LLM Batch (Ollama/OpenAI) for Auto-Generating Film Tags into Spring Boot

by y104
Spring Boot LLM Java AI

Introduction

In the DVD rental app, there was a requirement to add “mood tags” to films.

These are emotional tags that are finer-grained than categories, such as “Action,” “Heartwarming,” “Family-friendly,” and “Horror.”
With about 1,000 films, tagging them by hand isn’t practical.

So we built a batch process that sends film titles and descriptions to an LLM to auto-generate tags.


Overall Architecture

Spring Boot Batch Process
  ↓ Fetch untagged films from the film table
  ↓ Build prompt with title + description
  ↓ Send to LLM API (Ollama or OpenAI)
  ↓ Extract tags from response
  ↓ Save to the taste_tags column in the film table

LLM Provider Selection

For local development we used Ollama (free, privacy-safe); for production we used the OpenAI API.

Provider | Pros | Cons
Ollama (llama3) | Free, offline, privacy-safe | Lower accuracy than OpenAI, slower
OpenAI (GPT-4o-mini) | High accuracy, fast | API cost, requires internet

Incorporating the LLM Client into Spring Boot

Dependencies

<!-- pom.xml -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

<!-- or Ollama -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

Configuration

# application.yml (when using Ollama)
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3

---
# application-prod.yml (when using OpenAI)
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o-mini

Batch Processing Implementation

Tag Generation Service

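import java.util.Arrays;
import java.util.List;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

// FilmTagMapper and FilmForTagging are application classes; import them from your own packages.
@Service
@Slf4j
public class FilmTagGenerationService {

    private final ChatClient chatClient;
    private final FilmTagMapper filmTagMapper;

    // ChatClient.Builder is provided by the Spring AI starter on the classpath
    public FilmTagGenerationService(ChatClient.Builder builder, FilmTagMapper mapper) {
        this.chatClient = builder.build();
        this.filmTagMapper = mapper;
    }

    public void generateTagsForAllFilms(int batchSize) {
        // Fetch up to batchSize untagged films
        List<FilmForTagging> films = filmTagMapper.findFilmsWithoutTags(batchSize);
        log.info("Tag generation targets: {} films", films.size());

        for (FilmForTagging film : films) {
            try {
                String[] tags = generateTags(film);
                filmTagMapper.updateTasteTags(film.getFilmId(), tags);
                log.info("Tags applied: {} → {}", film.getTitle(), Arrays.toString(tags));

                // API rate limit countermeasure
                Thread.sleep(500);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop instead of swallowing the interruption
                Thread.currentThread().interrupt();
                log.warn("Tag generation interrupted: filmId={}", film.getFilmId());
                return;
            } catch (Exception e) {
                // Log and continue; the film stays untagged and is retried on the next run
                log.error("Tag generation error: filmId={}", film.getFilmId(), e);
            }
        }
    }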
    
    private String[] generateTags(FilmForTagging film) {
        String prompt = buildPrompt(film);
        
        String response = chatClient.prompt()
            .user(prompt)
            .call()
            .content();
        
        return parseTagsFromResponse(response);
    }
    
    private String buildPrompt(FilmForTagging film) {
        return String.format("""
            Generate up to 5 tags that describe the mood of the following film.
            Return the tags comma-separated. No extra explanation needed.
            
            Title: %s
            Description: %s
            
            Example output format: Action,Thrilling,Family-friendly,Exhilarating,Battle
            """,
            film.getTitle(),
            film.getDescription()
        );
    }
    
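    private String[] parseTagsFromResponse(String response) {
        if (response == null || response.isBlank()) {
            // Fail here so the film stays untagged and is retried on the next run
            throw new IllegalStateException("Empty LLM response");
        }
        // Split on ASCII or Japanese commas and keep at most 5 non-empty tags
        return Arrays.stream(response.split("[,、]"))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .limit(5)
            .toArray(String[]::new);
    }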
}
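
The fixed 500 ms pause spaces requests out, but it does not help once a call has already failed, for example on a rate-limit response. One common complement is to retry with exponential backoff. The sketch below is plain Java and is not part of the batch described in this article: RetryWithBackoff and its parameters are hypothetical, and it retries on any RuntimeException because the exception raised for a rate limit depends on the client library.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries a call, doubling the wait after each failure. */
final class RetryWithBackoff {

    private RetryWithBackoff() {
    }

    static <T> T call(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    sleep(delay); // wait before the next attempt
                    delay *= 2;   // e.g. 500 ms, then 1 s, then 2 s
                }
            }
        }
        throw lastFailure; // every attempt failed
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Keep the interrupt flag set so callers can see the interruption
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

Inside generateTags, the LLM call could then be wrapped as RetryWithBackoff.call(() -> chatClient.prompt().user(prompt).call().content(), 3, 500).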

Batch Trigger Methods

Manual Execution Endpoint (Called from Admin Screen)

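import java.util.Map;

import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/admin/api/batch")
@PreAuthorize("hasRole('ADMIN')")  // takes effect only when method security is enabled
public class BatchController {

    private final FilmTagGenerationService tagGenerationService;

    public BatchController(FilmTagGenerationService tagGenerationService) {
        this.tagGenerationService = tagGenerationService;
    }

    @PostMapping("/generate-tags")
    public ResponseEntity<Map<String, Object>> generateTags(
            @RequestParam(defaultValue = "50") int batchSize) {

        // Runs synchronously: the request returns only after the whole batch has finished
        tagGenerationService.generateTagsForAllFilms(batchSize);
        return ResponseEntity.ok(Map.of(
            "message", "Tag generation complete",
            // The service returns void, so only the requested size is known here
            "requestedBatchSize", batchSize
        ));
    }
}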
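
The response would be more useful if it reported how many films were actually tagged rather than the requested batch size. Below is a minimal sketch of a counting loop, written in plain Java so it runs without Spring. BatchCounter, Film, and the tagGenerator and tagSaver parameters are hypothetical stand-ins for the service, FilmForTagging, the LLM call, and the mapper update.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Hypothetical stand-in for the service loop; returns how many films were tagged. */
final class BatchCounter {

    /** Minimal stand-in for FilmForTagging. */
    record Film(int filmId, String title) {
    }

    private BatchCounter() {
    }

    static int tagAll(List<Film> films,
                      Function<Film, String[]> tagGenerator,    // stands in for the LLM call
                      BiConsumer<Integer, String[]> tagSaver) { // stands in for the mapper update
        int processed = 0;
        for (Film film : films) {
            try {
                String[] tags = tagGenerator.apply(film);
                tagSaver.accept(film.filmId(), tags);
                processed++; // count only films whose tags were generated and saved
            } catch (RuntimeException e) {
                // A failed film stays untagged and is picked up again by the next run
                System.err.println("Tag generation error: filmId=" + film.filmId());
            }
        }
        return processed;
    }
}
```

If generateTagsForAllFilms returned a count like this, the controller could put that value into the response.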

Scheduled Execution (Nightly Batch)

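import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class TagGenerationScheduler {

    private final FilmTagGenerationService tagGenerationService;

    public TagGenerationScheduler(FilmTagGenerationService tagGenerationService) {
        this.tagGenerationService = tagGenerationService;
    }

    @Scheduled(cron = "0 0 2 * * ?")  // 2 AM every day (server time zone)
    public void scheduledTagGeneration() {
        tagGenerationService.generateTagsForAllFilms(100);
    }
}

@Scheduled methods only run when scheduling is switched on with @EnableScheduling on a configuration class (for example, the main application class). A spring.task.scheduling.enabled entry in application.yml does not do this; the spring.task.scheduling.* properties, such as pool.size, only tune the scheduler's threads.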

Example Generated Results

Film Title | Generated Tags
ACADEMY DINOSAUR | Educational, Adventure, Family-friendly, Heartwarming, Fun
ACE GOLDFINGER | Spy, Thrilling, Action, Mystery, Adult
AFFAIR PREJUDICE | Romance, Drama, Emotional, Deep, Adult

~1,000 films tagged in about 8 minutes (using Ollama llama3, local environment).


Pain Points Encountered

① LLM Returns Output Outside the Specified Format

The LLM sometimes puts a preamble in front of the tags, like:

"Here are the generated tags: Action, Thrilling, Family-friendly"

Either extract just the tag portion with a regex, or pin down the output format with few-shot prompting.
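
The regex option could look roughly like the sketch below. TagResponseCleaner and the rule it applies (take the last non-empty line, then drop everything up to the last colon) are assumptions for illustration, not the code used in the app. The heuristic would misfire on a tag that itself contains a colon.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: strips a preamble before splitting the response into tags. */
final class TagResponseCleaner {

    // Everything up to and including the last ASCII or full-width colon on the line
    private static final Pattern PREAMBLE = Pattern.compile("^.*[:\\uFF1A]\\s*");
    // Quotes around a tag and a trailing period, which models sometimes add
    private static final Pattern WRAPPERS = Pattern.compile("^[\"']+|[\"'.]+$");

    private TagResponseCleaner() {
    }

    static List<String> extractTags(String response) {
        if (response == null || response.isBlank()) {
            return List.of();
        }
        // Assumption: the tag list sits on the last non-empty line of the response
        String lastLine = response.strip().lines()
            .map(String::strip)
            .filter(line -> !line.isEmpty())
            .reduce((first, second) -> second)
            .orElse("");
        String withoutPreamble = PREAMBLE.matcher(lastLine).replaceFirst("");
        // Split on ASCII or Japanese commas, as parseTagsFromResponse does
        return Arrays.stream(withoutPreamble.split("[,\\u3001]"))
            .map(tag -> WRAPPERS.matcher(tag.strip()).replaceAll(""))
            .map(String::strip)
            .filter(tag -> !tag.isEmpty())
            .limit(5)
            .toList();
    }
}
```

A response that already follows the requested format has no colon, so it passes through unchanged apart from trimming.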

// Few-shot prompt example
String fewShotPrompt = """
    Instruction: Return mood tags for the film comma-separated.

    Example 1:
    Input: Die Hard (detective/action)
    Output: Action,Thrilling,Suspenseful,Adult,Exciting

    Example 2:
    Input: Toy Story (animated/children)
    Output: Family,Heartwarming,Adventure,Fun,Moving

    Input: %s (%s)
    Output:
    """.formatted(film.getTitle(), film.getDescription());

② API Cost Management

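The OpenAI API is billed by the number of input and output tokens, so every call adds cost.
1,000 films × 1 call comes to a few hundred yen at most (GPT-4o-mini), but costs can balloon if the batch is accidentally rerun over films that are already tagged.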

// Skip films that already have tags
filmTagMapper.findFilmsWithoutTags(batchSize)
// → Query that fetches only films where taste_tags IS NULL
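
To illustrate why the IS NULL filter keeps reruns cheap, here is an in-memory sketch. UntaggedFilter and its Film class are hypothetical; in the app the filtering is done by the mapper's SQL query, not in Java.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of "fetch only films where taste_tags IS NULL". */
final class UntaggedFilter {

    /** Mutable stand-in for a row of the film table. */
    static final class Film {
        final int filmId;
        String tasteTags; // null means "not tagged yet"

        Film(int filmId, String tasteTags) {
            this.filmId = filmId;
            this.tasteTags = tasteTags;
        }
    }

    private UntaggedFilter() {
    }

    /** Mirrors the query: untagged rows only, at most batchSize of them. */
    static List<Film> findFilmsWithoutTags(List<Film> table, int batchSize) {
        List<Film> result = new ArrayList<>();
        for (Film film : table) {
            if (result.size() >= batchSize) {
                break;
            }
            if (film.tasteTags == null) {
                result.add(film);
            }
        }
        return result;
    }

    /** Tags every selected film and returns the number of simulated LLM calls. */
    static int runBatch(List<Film> table, int batchSize) {
        List<Film> targets = findFilmsWithoutTags(table, batchSize);
        for (Film film : targets) {
            film.tasteTags = "Placeholder"; // stands in for the LLM result
        }
        return targets.size();
    }
}
```

Running the batch a second time over the same table selects nothing, so no further API calls are made.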

Summary

  • Spring AI makes it easy to incorporate an LLM client into Spring Boot
  • The split between Ollama (free) for local development and the OpenAI API for production can be achieved with Spring profiles (profile-specific files such as application-prod.yml, or @Profile on beans)
  • Fixing the output format with few-shot prompting in tag generation prompts makes results stable
  • Have the batch target only unprocessed films so it stays idempotent (rerunning it never redoes work that is already done)

Using an LLM on data at a scale that manual work can’t handle lets you enrich content in a short time.


