Igniting creative writing in small language models: LLM-as-a-judge versus multi-agent refined rewards
Xiaolong Wei
Computer Science: Computation and Language; Computer Science: Artificial Intelligence
Abstract
Large Language Models (LLMs) have demonstrated remarkable creative writing capabilities, yet their substantial computational demands hinder widespread use. Enhancing Small Language Models (SLMs) offers a promising alternative, but current methods like Supervised Fine-Tuning (SFT) struggle with novelty, and Reinforcement Learning from Human Feedback (RLHF) is costly. This paper explores two distinct AI-driven reward strategies within a Reinforcement Learning from AI Feedback (RLAIF) framework to
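The LLM-as-a-judge reward strategy named above can be illustrated with a minimal sketch: a judge model scores a generated story against a rubric, and the parsed score becomes the scalar reward for RLAIF-style training. All names here (`JUDGE_TEMPLATE`, `parse_score`, `reward`, `judge_fn`) are hypothetical illustrations, not the paper's implementation; the judge call is stubbed out.

```python
import re

# Hypothetical rubric prompt; a real system would tune this carefully.
JUDGE_TEMPLATE = (
    "You are a strict creative-writing judge. Rate the story below "
    "from 1 to 10 for novelty, coherence, and style.\n"
    "Reply with only: SCORE: <number>\n\nStory:\n{story}"
)

def parse_score(judge_reply: str, lo: float = 1.0, hi: float = 10.0) -> float:
    """Extract 'SCORE: <number>' from the judge's reply and clamp to [lo, hi]."""
    m = re.search(r"SCORE:\s*(\d+(?:\.\d+)?)", judge_reply)
    if not m:
        return lo  # unparsable replies fall back to the minimum score
    return min(hi, max(lo, float(m.group(1))))

def reward(story: str, judge_fn) -> float:
    """Map a story to a scalar reward in [0, 1] via an LLM judge callable."""
    reply = judge_fn(JUDGE_TEMPLATE.format(story=story))
    return (parse_score(reply) - 1.0) / 9.0  # rescale 1..10 into 0..1

# Stub judge for illustration only; in practice this would call an LLM API.
print(round(reward("Once upon a time...", lambda prompt: "SCORE: 7"), 3))  # → 0.667
```

The multi-agent refined-reward alternative mentioned in the title would replace the single `judge_fn` with several critic agents whose scores are aggregated or iteratively revised before being rescaled.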