Igniting creative writing in small language models: LLM-as-a-judge versus multi-agent refined rewards

Xiaolong Wei

Computer Science: Computation and Language; Computer Science: Artificial Intelligence

Abstract

Large Language Models (LLMs) have demonstrated remarkable creative writing capabilities, yet their substantial computational demands hinder widespread use. Enhancing Small Language Models (SLMs) offers a promising alternative, but current methods like Supervised Fine-Tuning (SFT) struggle with novelty, and Reinforcement Learning from Human Feedback (RLHF) is costly. This paper explores two distinct AI-driven reward strategies within a Reinforcement Learning from AI Feedback (RLAIF) framework to

Paper ID: f0a11a2a-9c43-4fd0-88c2-b617ce9c192b
Added: 10/26/2025