Curiosity-driven LLM-as-a-judge for personalized creative judgment

Vanya Bannihatti Kumar

2025

Computer science - computation and language, computer science - machine learning

Abstract

Modern large language models (LLMs) excel at objective tasks such as evaluating mathematical reasoning and factual accuracy, yet they falter when faced with the nuanced, subjective nature of assessing creativity. In this work, we propose a novel curiosity-driven LLM-as-a-judge for evaluating creative writing, personalized to each individual's creative judgments. We use the Torrance Test of Creative Writing (TTCW) benchmark introduced in Chakrabarty et al. (2024), which has stories annotat

Tags

Human-In-The-Loop › Autonomous Generation
Creativity Evaluation Methods › LLM-Based Evaluation
Creativity Frameworks › Logical Creativity
Creativity Evaluation Methods › Human Evaluation
Creativity Evaluation Methods › Automatic Metrics
Level of Analysis › Document-Level
Model Scale › Small (<3B)
Relationship to Creativity › Implicit
Textual Domain › Literary Texts
Research Focus › Architectural Research
Model Scale › Medium (8-24)
Creativity Frameworks › Linguistic Creativity
Research Focus › Benchmark

Paper ID: f6f31b70-079b-4f46-8e14-ebeac7a35e49

Added: 10/26/2025