Adding an LLM Batch Job (Ollama/OpenAI) to Spring Boot to Auto-Generate Film Tags

Introduction

In the DVD rental app, there was a requirement to add “mood tags” to films.

These are emotional tags, finer-grained than categories, such as “Action,” “Heartwarming,” “Family-friendly,” and “Horror.”
With about 1,000 films, tagging them by hand isn’t practical.

So we built a batch process that sends film titles and descriptions to an LLM to auto-generate tags.

Overall Architecture

Spring Boot Batch Process
  ↓ Fetch untagged films from the film table
  ↓ Build prompt with title + description
  ↓ Send to LLM API (Ollama or OpenAI)
  ↓ Extract tags from response
  ↓ Save to the taste_tags column in the film table

LLM Provider Selection

For local development we used Ollama (free, and the data never leaves the machine); for production we used the OpenAI API.

Provider	Pros	Cons
Ollama (llama3)	Free, offline, privacy-safe	Lower accuracy than OpenAI, slower
OpenAI (GPT-4o-mini)	High accuracy, fast	API cost, requires internet

Incorporating the LLM Client into Spring Boot

Dependencies

<!-- pom.xml -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

<!-- or Ollama -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M6</version>
</dependency>

Configuration

# application.yml (when using Ollama)
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama3
---
# application-prod.yml (when using OpenAI)
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o-mini

Batch Processing Implementation

Tag Generation Service

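import java.util.Arrays;
import java.util.List;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
@Slf4j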
public class FilmTagGenerationService {
    
    private final ChatClient chatClient;
    private final FilmTagMapper filmTagMapper;
    
    public FilmTagGenerationService(ChatClient.Builder builder, FilmTagMapper mapper) {
        this.chatClient = builder.build();
        this.filmTagMapper = mapper;
    }
    
    public void generateTagsForAllFilms(int batchSize) {
        // Fetch untagged films
        List<FilmForTagging> films = filmTagMapper.findFilmsWithoutTags(batchSize);
        log.info("Tag generation targets: {} films", films.size());
        
        for (FilmForTagging film : films) {
            try {
                String[] tags = generateTags(film);
                filmTagMapper.updateTasteTags(film.getFilmId(), tags);
                log.info("Tags applied: {} → {}", film.getTitle(), Arrays.toString(tags));
                
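                // API rate limit countermeasure
                Thread.sleep(500);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop, so a shutdown request is not swallowed
                Thread.currentThread().interrupt();
                log.warn("Tag generation interrupted at filmId={}", film.getFilmId());
                return;
            } catch (Exception e) {
                // A failed film stays untagged and is picked up again by the next run
                log.error("Tag generation error: filmId={}", film.getFilmId(), e);
            }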
        }
    }
    
    private String[] generateTags(FilmForTagging film) {
        String prompt = buildPrompt(film);
        
        String response = chatClient.prompt()
            .user(prompt)
            .call()
            .content();
        
        return parseTagsFromResponse(response);
    }
    
    private String buildPrompt(FilmForTagging film) {
        return String.format("""
            Generate up to 5 tags that describe the mood of the following film.
            Return the tags comma-separated. No extra explanation needed.
            
            Title: %s
            Description: %s
            
            Example output format: Action,Thrilling,Family-friendly,Exhilarating,Battle
            """,
            film.getTitle(),
            film.getDescription()
        );
    }
    
    private String[] parseTagsFromResponse(String response) {
        return Arrays.stream(response.split("[,、]"))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .limit(5)
            .toArray(String[]::new);
    }
}
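
The fixed Thread.sleep(500) above spaces calls out, but a film whose call fails (for example on a rate-limit error) is simply logged and skipped until the next run. One possible refinement is to retry with exponential backoff. The following is a minimal, library-independent sketch; the class name BackoffRetry, its parameters, and the injectable sleeper are all hypothetical and not part of the original service.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical helper: retries an action, doubling the wait after each failure. */
final class BackoffRetry {

    /** Default sleeper: blocks the current thread and preserves the interrupt flag. */
    static final LongConsumer THREAD_SLEEP = millis -> {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    };

    private BackoffRetry() {
    }

    /**
     * Calls the action up to maxAttempts times. After each failed attempt except the
     * last one it waits (initialDelayMillis, then twice that, and so on) before retrying.
     * If every attempt fails, the last exception is rethrown.
     */
    static <T> T call(Supplier<T> action, int maxAttempts,
                      long initialDelayMillis, LongConsumer sleeper) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleeper.accept(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

Inside the service this might wrap the LLM call, e.g. BackoffRetry.call(() -> generateTags(film), 3, 500, BackoffRetry.THREAD_SLEEP). Passing the sleeper as a parameter is only there so the wait can be replaced in tests.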

Batch Trigger Methods

Manual Execution Endpoint (Called from Admin Screen)

@RestController
@RequestMapping("/admin/api/batch")
@PreAuthorize("hasRole('ADMIN')")
public class BatchController {
    
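    private final FilmTagGenerationService tagGenerationService;

    public BatchController(FilmTagGenerationService tagGenerationService) {
        this.tagGenerationService = tagGenerationService;
    }

    @PostMapping("/generate-tags")
    public ResponseEntity<Map<String, Object>> generateTags(
            @RequestParam(defaultValue = "50") int batchSize) {

        // Runs synchronously: the request returns only after the whole batch finishes
        tagGenerationService.generateTagsForAllFilms(batchSize);
        return ResponseEntity.ok(Map.of(
            "message", "Tag generation complete",
            // Upper bound that was requested, not the number of films actually tagged
            "requestedBatchSize", batchSize
        ));
    }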
}
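
The endpoint above can only echo the batch size it asked for, because generateTagsForAllFilms returns void and swallows per-film errors. If the admin screen should show real numbers, the loop could count successes and failures and hand them back. Below is a minimal sketch in plain Java; BatchRunSummary and CountingBatchRunner are hypothetical names, not classes from the original app.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical result object: how many items succeeded and how many failed. */
final class BatchRunSummary {
    private final int succeeded;
    private final int failed;

    BatchRunSummary(int succeeded, int failed) {
        this.succeeded = succeeded;
        this.failed = failed;
    }

    int succeeded() {
        return succeeded;
    }

    int failed() {
        return failed;
    }

    int processed() {
        return succeeded + failed;
    }
}

/** Hypothetical runner: applies an action to each item and keeps going on failure. */
final class CountingBatchRunner {

    private CountingBatchRunner() {
    }

    static <T> BatchRunSummary run(List<T> items, Consumer<T> action) {
        int succeeded = 0;
        int failed = 0;
        for (T item : items) {
            try {
                action.accept(item);
                succeeded++;
            } catch (RuntimeException e) {
                // One bad item must not abort the rest of the batch
                failed++;
            }
        }
        return new BatchRunSummary(succeeded, failed);
    }
}
```

With something like this, the service method could return the summary and the controller could put summary.succeeded() and summary.failed() into the response instead of the requested batch size.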

Scheduled Execution (Nightly Batch)

@Component
public class TagGenerationScheduler {
    
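    private final FilmTagGenerationService tagGenerationService;

    public TagGenerationScheduler(FilmTagGenerationService tagGenerationService) {
        this.tagGenerationService = tagGenerationService;
    }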
    
    @Scheduled(cron = "0 0 2 * * ?")  // 2 AM every day
    public void scheduledTagGeneration() {
        tagGenerationService.generateTagsForAllFilms(100);
    }
}

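@Scheduled methods only run once scheduling is switched on with @EnableScheduling on a configuration class (Spring Boot has no spring.task.scheduling.enabled property for this):

@Configuration
@EnableScheduling
public class SchedulingConfig {
}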

Example Generated Results

Film Title	Generated Tags
ACADEMY DINOSAUR	Educational, Adventure, Family-friendly, Heartwarming, Fun
ACE GOLDFINGER	Spy, Thrilling, Action, Mystery, Adult
AFFAIR PREJUDICE	Romance, Drama, Emotional, Deep, Adult

About 1,000 films were tagged in roughly 8 minutes (Ollama llama3, local environment).

Pain Points Encountered

① LLM Returns Output Outside the Specified Format

The LLM sometimes prefixes its answer with a preamble, for example:

"Here are the generated tags: Action, Thrilling, Family-friendly"

Either extract just the tag portion with a regex, or pin down the output format with few-shot prompting.

// Few-shot prompt example
String fewShotPrompt = """
    Instruction: Return mood tags for the film comma-separated.

    Example 1:
    Input: Die Hard (detective/action)
    Output: Action,Thrilling,Suspenseful,Adult,Exciting

    Example 2:
    Input: Toy Story (animated/children)
    Output: Family,Heartwarming,Adventure,Fun,Moving

    Input: %s (%s)
    Output:
    """.formatted(film.getTitle(), film.getDescription());
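
The other option mentioned above, extracting the tag portion with a regex, could look roughly like the sketch below. It makes several assumptions that are not from the original code: the tag list is on the last line of the response, any preamble ends with a colon, and a tag consists of letters, digits, spaces, and hyphens (at most 30 characters). TagResponseParser is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical parser that tolerates a preamble and stray quotes around the tag list. */
final class TagResponseParser {

    // Optional preamble up to the first colon, followed by the actual tag list
    private static final Pattern PREAMBLE = Pattern.compile("^[^:\\n]*:\\s*(.+)$");

    // Assumed tag shape: starts with a letter or digit, then letters, digits, spaces, hyphens
    private static final Pattern VALID_TAG =
        Pattern.compile("[\\p{L}\\p{N}][\\p{L}\\p{N} \\-]{0,29}");

    private TagResponseParser() {
    }

    static List<String> extractTags(String response) {
        if (response == null || response.isBlank()) {
            return List.of();
        }
        // Assumption: the tag list is the last line of the response
        String[] lines = response.strip().split("\\R");
        String line = lines[lines.length - 1].strip();

        // Drop "Here are the generated tags:" style preambles
        Matcher matcher = PREAMBLE.matcher(line);
        if (matcher.matches()) {
            line = matcher.group(1);
        }

        return Arrays.stream(line.split("[,、]"))
            // Trim whitespace, quotes, and trailing periods around each tag
            .map(s -> s.replaceAll("^[\\s\"'.]+|[\\s\"'.]+$", ""))
            .filter(s -> VALID_TAG.matcher(s).matches())
            .limit(5)
            .collect(Collectors.toList());
    }
}
```

This only covers the preamble case shown above. A response that contains no tag list at all (a refusal, for instance) would still need separate handling, so few-shot prompting and a tolerant parser are probably best used together.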

② API Cost Management

OpenAI API usage is billed by tokens, so every call costs money.
1,000 films × 1 call comes to a few hundred yen at most with GPT-4o-mini, but costs can balloon if the batch is accidentally re-run over films that are already tagged.

// Skip films that already have tags
filmTagMapper.findFilmsWithoutTags(batchSize)
// → Query that fetches only films where taste_tags IS NULL
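
To make the effect of the "untagged only" query concrete, here is a small in-memory simulation. It is only a sketch: InMemoryFilmTable stands in for the film table (a null value plays the role of taste_tags IS NULL), the generator function stands in for the billable LLM call, and both class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for the film table: title -> taste_tags (null means untagged). */
final class InMemoryFilmTable {
    private final Map<String, String> tasteTagsByTitle = new LinkedHashMap<>();

    void addFilm(String title) {
        tasteTagsByTitle.put(title, null);
    }

    /** Mirrors the idea of the query: only rows whose tags are still null, up to limit. */
    List<String> findFilmsWithoutTags(int limit) {
        List<String> untagged = new ArrayList<>();
        for (Map.Entry<String, String> entry : tasteTagsByTitle.entrySet()) {
            if (entry.getValue() == null && untagged.size() < limit) {
                untagged.add(entry.getKey());
            }
        }
        return untagged;
    }

    void updateTasteTags(String title, String tags) {
        tasteTagsByTitle.put(title, tags);
    }
}

/** Hypothetical batch: one generator call (one billable request) per untagged film. */
final class IdempotentTagBatch {

    private IdempotentTagBatch() {
    }

    /** Runs one batch and returns how many generator calls were made. */
    static int run(InMemoryFilmTable table, int batchSize, Function<String, String> generator) {
        int calls = 0;
        for (String title : table.findFilmsWithoutTags(batchSize)) {
            table.updateTasteTags(title, generator.apply(title));
            calls++;
        }
        return calls;
    }
}
```

Running the batch a second time over the same table makes no generator calls, which is the property that keeps an accidental re-run from being billed again.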

Summary

Spring AI makes it easy to incorporate an LLM client into Spring Boot.
Ollama (free) for local development and the OpenAI API for production can be switched with Spring profiles (application-prod.yml here, or @Profile on beans).
Fixing the output format with few-shot prompting makes the generated tags more stable.
Have batches target only unprocessed rows to keep them idempotent (the end result is the same no matter how many times the batch runs).

Using an LLM on data at a scale that manual work can’t handle lets you enrich content in a short time.

Article Map for This Series

→ Building an End-User DVD Rental App — Vue 3 + Spring Boot Paired with the Admin App, with Article Map