CreativityPrism: a holistic benchmark for large language model creativity

Zhaoyi Joey Hou

2025

Computer Science - Computation and Language, Computer Science - Artificial Intelligence

Abstract

Creativity is often seen as a hallmark of human intelligence. While large language models (LLMs) are increasingly perceived as producing creative text, there is still no holistic framework to evaluate their creativity across diverse scenarios. Existing evaluation methods remain fragmented, with dramatic variation across domains and tasks, largely due to differing definitions and measurements of creativity. Inspired by the hypothesis that creativity is not one fixed idea, we propose CreativityPri

Tags

Human-In-The-Loop › Autonomous Generation
Relationship to Creativity › Explicit
Creativity Frameworks › Logical Creativity
Creative Phenomena Studied › Logics
Creativity Evaluation Methods › Automatic Metrics
Creativity Evaluation Methods › Creativity-Specific Evaluation
Level of Analysis › Word-Level
Level of Analysis › Sentence-Level
Level of Analysis › Document-Level
Model Scale › Medium (8-24)
Textual Domain › Literary Texts
Textual Domain › Poetry
Proprietary Models › Anthropic Claude
Proprietary Models › Google Gemini
Research Focus › Benchmark

Search Queries

("LLMs" OR large language models) AND creative BY TITLE - ("LLMs" OR large language models) AND creative

Paper ID: 400637ca-d87c-4932-94ff-3b1f27aefd42
Added: 10/26/2025