EscapeBench: Towards advancing creative intelligence of language model agents
Cheng Qian
2024
Computer science - computation and language, computer science - artificial intelligence, computer science - machine learning
Abstract
Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments. To address this, we introduce EscapeBench, a benchmark suite of room escape game environments designed to challenge agents with creative reasoning, unconventional tool use, and iterative problem-solving to uncover implicit goals. Our results show that current LM models, despite emplo
Tags
Research Focus › Benchmark
Research Focus › Controllable Generation
Paper ID: 9c01f1cc-47e3-4f53-92f9-2f14eb00b58a
Added: 10/26/2025